SlideShare uma empresa Scribd logo
1 de 102
Baixar para ler offline
Pooyan Jamshidi Miguel Velez Christian Kaestner Norbert Siegmund
Learning to Sample
Exploiting Similarities across Environments to Learn Performance Models for Configurable Systems
Today’s most popular systems are configurable
built
Today’s most popular systems are configurable
built
Today’s most popular systems are configurable
built
Today’s most popular systems are configurable
built
Systems are becoming increasingly more configurable and,
therefore, more difficult to understand their performance behavior
010 7/2012 7/2014
e time
1/1999 1/2003 1/2007 1/2011
0
1/2014
Release time
006 1/2010 1/2014
2.2.14
2.3.4
35
se time
ache
1/2006 1/2008 1/2010 1/2012 1/2014
0
40
80
120
160
200
2.0.0
1.0.0
0.19.0
0.1.0
Hadoop
Numberofparameters
Release time
MapReduce
HDFS
[Tianyin Xu, et al., “Too Many Knobs…”, FSE’15]
Systems are becoming increasingly more configurable and,
therefore, more difficult to understand their performance behavior
010 7/2012 7/2014
e time
1/1999 1/2003 1/2007 1/2011
0
1/2014
Release time
006 1/2010 1/2014
2.2.14
2.3.4
35
se time
ache
1/2006 1/2008 1/2010 1/2012 1/2014
0
40
80
120
160
200
2.0.0
1.0.0
0.19.0
0.1.0
Hadoop
Numberofparameters
Release time
MapReduce
HDFS
[Tianyin Xu, et al., “Too Many Knobs…”, FSE’15]
2180
218
= 2162
Increase in size of
configuration space
Empirical observations confirm that systems are
becoming increasingly more configurable
nia San Diego, ‡Huazhong Univ. of Science & Technology, †NetApp, Inc
tixu, longjin, xuf001, yyzhou}@cs.ucsd.edu
kar.Pasupathy, Rukma.Talwadker}@netapp.com
prevalent, but also severely
software. One fundamental
y of configuration, reflected
parameters (“knobs”). With
m software to ensure high re-
aunting, error-prone task.
nderstanding a fundamental
users really need so many
answer, we study the con-
including thousands of cus-
m (Storage-A), and hundreds
ce system software projects.
ng findings to motivate soft-
ore cautious and disciplined
these findings, we provide
ich can significantly reduce
A as an example, the guide-
ters and simplify 19.7% of
on existing users. Also, we
tion methods in the context
7/2006 7/2008 7/2010 7/2012 7/2014
0
100
200
300
400
500
600
700
Storage-A
Numberofparameters
Release time
1/1999 1/2003 1/2007 1/2011
0
100
200
300
400
500
5.6.2
5.5.0
5.0.16
5.1.3
4.1.0
4.0.12
3.23.0
1/2014
MySQL
Numberofparameters
Release time
1/1998 1/2002 1/2006 1/2010 1/2014
0
100
200
300
400
500
600
1.3.14
2.2.14
2.3.4
2.0.35
1.3.24
Numberofparameters
Release time
Apache
1/2006 1/2008 1/2010 1/2012 1/2014
0
40
80
120
160
200
2.0.0
1.0.0
0.19.0
0.1.0
Hadoop
Numberofparameters
Release time
MapReduce
HDFS
[Tianyin Xu, et al., “Too Many Knobs…”, FSE’15]
Configurations determine the performance
behavior of systems
void Parrot_setenv(. . . name,. . . value){
#ifdef PARROT_HAS_SETENV
my_setenv(name, value, 1);
#else
int name_len=strlen(name);
int val_len=strlen(value);
char* envs=glob_env;
if(envs==NULL){
return;
}
strcpy(envs,name);
strcpy(envs+name_len,"=");
strcpy(envs+name_len + 1,value);
putenv(envs);
#endif
}
#ifdef LINUX
extern int Parrot_signbit(double x){
endif
else
PARROT_HAS_SETENV
LINUX
Configurations determine the performance
behavior of systems
void Parrot_setenv(. . . name,. . . value){
#ifdef PARROT_HAS_SETENV
my_setenv(name, value, 1);
#else
int name_len=strlen(name);
int val_len=strlen(value);
char* envs=glob_env;
if(envs==NULL){
return;
}
strcpy(envs,name);
strcpy(envs+name_len,"=");
strcpy(envs+name_len + 1,value);
putenv(envs);
#endif
}
#ifdef LINUX
extern int Parrot_signbit(double x){
endif
else
PARROT_HAS_SETENV
LINUX
Configurations determine the performance
behavior of systems
void Parrot_setenv(. . . name,. . . value){
#ifdef PARROT_HAS_SETENV
my_setenv(name, value, 1);
#else
int name_len=strlen(name);
int val_len=strlen(value);
char* envs=glob_env;
if(envs==NULL){
return;
}
strcpy(envs,name);
strcpy(envs+name_len,"=");
strcpy(envs+name_len + 1,value);
putenv(envs);
#endif
}
#ifdef LINUX
extern int Parrot_signbit(double x){
endif
else
PARROT_HAS_SETENV
LINUX
Configurations determine the performance
behavior of systems
void Parrot_setenv(. . . name,. . . value){
#ifdef PARROT_HAS_SETENV
my_setenv(name, value, 1);
#else
int name_len=strlen(name);
int val_len=strlen(value);
char* envs=glob_env;
if(envs==NULL){
return;
}
strcpy(envs,name);
strcpy(envs+name_len,"=");
strcpy(envs+name_len + 1,value);
putenv(envs);
#endif
}
#ifdef LINUX
extern int Parrot_signbit(double x){
endif
else
PARROT_HAS_SETENV
LINUX
Configuration options interact, therefore, creating
a non-linear and complex performance behavior
• Non-linear

• Non-convex

• Multi-modal
number of counters
number of splitters
latency(ms)
100
150
1
200
250
2
300
Cubic Interpolation Over Finer Grid
243 684 10125 14166 18
0 500 1000 1500
Throughput (ops/sec)
0
1000
2000
3000
4000
5000
Averagewritelatency(s)
The default configuration is typically bad and the
optimal configuration is noticeably better than median
better
better
0 500 1000 1500
Throughput (ops/sec)
0
1000
2000
3000
4000
5000
Averagewritelatency(s)
The default configuration is typically bad and the
optimal configuration is noticeably better than median
Default Configurationbetter
better
• Default is bad
0 500 1000 1500
Throughput (ops/sec)
0
1000
2000
3000
4000
5000
Averagewritelatency(s)
The default configuration is typically bad and the
optimal configuration is noticeably better than median
Default Configuration
Optimal
Configuration
better
better
• Default is bad
0 500 1000 1500
Throughput (ops/sec)
0
1000
2000
3000
4000
5000
Averagewritelatency(s)
The default configuration is typically bad and the
optimal configuration is noticeably better than median
Default Configuration
Optimal
Configuration
better
better
• Default is bad
• 2X-10X faster than worst
0 500 1000 1500
Throughput (ops/sec)
0
1000
2000
3000
4000
5000
Averagewritelatency(s)
The default configuration is typically bad and the
optimal configuration is noticeably better than median
Default Configuration
Optimal
Configuration
better
better
• Default is bad
• 2X-10X faster than worst
• Noticeably faster than median
Setting the scene
Setting the scene
Compiler
(e.f., SaC, LLVM)
Setting the scene
ℂ = O1 × O2 × ⋯ × O19 × O20
Compiler
(e.f., SaC, LLVM)
Setting the scene
ℂ = O1 × O2 × ⋯ × O19 × O20
Dead code removal
Compiler
(e.f., SaC, LLVM)
Setting the scene
ℂ = O1 × O2 × ⋯ × O19 × O20
Dead code removal
Constant folding
Loop unrolling
Function inlining
Compiler
(e.f., SaC, LLVM)
Setting the scene
ℂ = O1 × O2 × ⋯ × O19 × O20
Dead code removal
Configuration
Space
Constant folding
Loop unrolling
Function inlining
Compiler
(e.f., SaC, LLVM)
Setting the scene
ℂ = O1 × O2 × ⋯ × O19 × O20
Dead code removal
Configuration
Space
Constant folding
Loop unrolling
Function inlining
Compiler
(e.f., SaC, LLVM)
Program
Setting the scene
ℂ = O1 × O2 × ⋯ × O19 × O20
Dead code removal
Configuration
Space
Constant folding
Loop unrolling
Function inlining
c1 = 0 × 0 × ⋯ × 0 × 1c1 ∈ ℂ
Compiler
(e.f., SaC, LLVM)
Program
Setting the scene
ℂ = O1 × O2 × ⋯ × O19 × O20
Dead code removal
Configuration
Space
Constant folding
Loop unrolling
Function inlining
c1 = 0 × 0 × ⋯ × 0 × 1c1 ∈ ℂ
Compiler
(e.f., SaC, LLVM)
Program Compiled
Code
Compile
Configure
Setting the scene
ℂ = O1 × O2 × ⋯ × O19 × O20
Dead code removal
Configuration
Space
Constant folding
Loop unrolling
Function inlining
c1 = 0 × 0 × ⋯ × 0 × 1c1 ∈ ℂ
Compiler
(e.f., SaC, LLVM)
Program Compiled
Code
Instrumented
Binary
Hardware
Compile Deploy
Configure
Setting the scene
ℂ = O1 × O2 × ⋯ × O19 × O20
Dead code removal
Configuration
Space
Constant folding
Loop unrolling
Function inlining
c1 = 0 × 0 × ⋯ × 0 × 1c1 ∈ ℂ
fc(c1) = 11.1msCompile
time
Execution
time
Energy
Compiler
(e.f., SaC, LLVM)
Program Compiled
Code
Instrumented
Binary
Hardware
Compile Deploy
Configure
fe(c1) = 110.3ms
fen(c1) = 100mwh
Non-functional
measurable/quantifiable
aspect
A typical approach for understanding the
performance behavior is sensitivity analysis
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
y1 = f(c1)
y2 = f(c2)
y3 = f(c3)
yn = f(cn)
⋯
A typical approach for understanding the
performance behavior is sensitivity analysis
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
y1 = f(c1)
y2 = f(c2)
y3 = f(c3)
yn = f(cn)
⋯
Training/Sample
Set
A typical approach for understanding the
performance behavior is sensitivity analysis
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
y1 = f(c1)
y2 = f(c2)
y3 = f(c3)
yn = f(cn)
̂f ∼ f( ⋅ )
⋯ LearnTraining/Sample
Set
Performance model could be in any appropriate
form of black-box models
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
y1 = f(c1)
y2 = f(c2)
y3 = f(c3)
yn = f(cn)
̂f ∼ f( ⋅ )
⋯
Training/Sample
Set
Performance model could be in any appropriate
form of black-box models
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
y1 = f(c1)
y2 = f(c2)
y3 = f(c3)
yn = f(cn)
̂f ∼ f( ⋅ )
⋯
Training/Sample
Set
Performance model could be in any appropriate
form of black-box models
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
y1 = f(c1)
y2 = f(c2)
y3 = f(c3)
yn = f(cn)
̂f ∼ f( ⋅ )
⋯
Training/Sample
Set
Performance model could be in any appropriate
form of black-box models
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
y1 = f(c1)
y2 = f(c2)
y3 = f(c3)
yn = f(cn)
̂f ∼ f( ⋅ )
⋯
Training/Sample
Set
How to sample the
configuration space to
learn a “better”
performance behavior?
How to select the most
informative configurations?
We hypothesized that we can exploit similarities across
environments to learn “cheaper” performance models
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
ys1 = fs(c1)
ys2 = fs(c2)
ys3 = fs(c3)
ysn = fs(cn)
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
yt1 = ft(c1)
yt2 = ft(c2)
yt3 = ft(c3)
ytn = ft(cn)
Source Environment
(Execution time of Program X)
Target Environment
(Execution time of Program Y)
Similarity
[P. Jamshidi, et al., “Transfer learning for performance modeling of configurable systems….”, ASE’17]
We looked at different highly-configurable systems to
gain insights about similarities across environments
[P. Jamshidi, et al., “Transfer learning for performance modeling of configurable systems….”, ASE’17]
SPEAR	(SAT	Solver)
Analysis	time
14	options	
16,384	configurations
SAT	problems
3	hardware
2	versions
X264	(video	encoder)
Encoding	time
16	options	
4,000 configurations
Video	quality/size
2	hardware
3	versions
SQLite	(DB	engine)
Query	time
14	options	
1,000 configurations
DB	Queries
2	hardware
2 versions
SaC (Compiler)
Execution	time
50	options	
71,267 configurations
10	Demo	programs
Observation 1: Not all options and interactions are influential
and interactions degree between options are not high
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
ℂ = O1 × O2 × O3 × O4 × O5 × O6 × O7 × O8 × O9 × O10
Observation 2: Influential options and interactions
are preserved across environments
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
Observation 2: Influential options and interactions
are preserved across environments
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
Observation 2: Influential options and interactions
are preserved across environments
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
Observation 2: Influential options and interactions
are preserved across environments
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
Observation 2: Influential options and interactions
are preserved across environments
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
The similarity across environment is a
rich source of knowledge for
exploration of the configuration space
Without considering this knowledge, many
samples may not provide new information
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
Without considering this knowledge, many
samples may not provide new information
1 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 0 × 0c1
c2 1 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 0 × 1
O1 × O2 × O3 × O4 × O5 × O6 × O7 × O8 × O9 × O10
c3 1 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 1 × 0
1 × 1 × 1 × 1 × 1 × 1 × 1 × 1 × 1 × 1c128
⋯
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
Without considering this knowledge, many
samples may not provide new information
1 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 0 × 0c1
c2 1 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 0 × 1
O1 × O2 × O3 × O4 × O5 × O6 × O7 × O8 × O9 × O10
̂fs(c1) = 14.9
̂fs(c2) = 14.9
c3 1 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 1 × 0 ̂fs(c3) = 14.9
1 × 1 × 1 × 1 × 1 × 1 × 1 × 1 × 1 × 1 ̂fs(c128) = 14.9c128
⋯
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
Without knowing this
knowledge, many blind/
random samples may not
provide any additional
information about
performance of the system
In higher dimensional spaces, the blind samples
even become less informative/effective
Learning to Sample (L2S)
Extracting the knowledge about influential options
and interactions: Step-wise linear regression
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
ys1 = fs(c1)
ys2 = fs(c2)
ys3 = fs(c3)
ysn = fs(cn)
Source
(Execution time of Program X)
Extracting the knowledge about influential options
and interactions: Step-wise linear regression
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
ys1 = fs(c1)
ys2 = fs(c2)
ys3 = fs(c3)
ysn = fs(cn)
Source
(Execution time of Program X)
Learn
performance
model
̂fs ∼ fs( ⋅ )
Extracting the knowledge about influential options
and interactions: Step-wise linear regression
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
ys1 = fs(c1)
ys2 = fs(c2)
ys3 = fs(c3)
ysn = fs(cn)
Source
(Execution time of Program X)
Learn
performance
model
̂fs ∼ fs( ⋅ )
1. Fit an initial model
Extracting the knowledge about influential options
and interactions: Step-wise linear regression
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
ys1 = fs(c1)
ys2 = fs(c2)
ys3 = fs(c3)
ysn = fs(cn)
Source
(Execution time of Program X)
Learn
performance
model
̂fs ∼ fs( ⋅ )
1. Fit an initial model
2. Forward selection: Add
terms iteratively
Extracting the knowledge about influential options
and interactions: Step-wise linear regression
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
ys1 = fs(c1)
ys2 = fs(c2)
ys3 = fs(c3)
ysn = fs(cn)
Source
(Execution time of Program X)
Learn
performance
model
̂fs ∼ fs( ⋅ )
1. Fit an initial model
2. Forward selection: Add
terms iteratively
3. Backward elimination:
Removes terms iteratively
Extracting the knowledge about influential options
and interactions: Step-wise linear regression
O1 × O2 × ⋯ × O19 × O20
0 × 0 × ⋯ × 0 × 1
0 × 0 × ⋯ × 1 × 0
0 × 0 × ⋯ × 1 × 1
1 × 1 × ⋯ × 1 × 0
1 × 1 × ⋯ × 1 × 1
⋯
c1
c2
c3
cn
ys1 = fs(c1)
ys2 = fs(c2)
ys3 = fs(c3)
ysn = fs(cn)
Source
(Execution time of Program X)
Learn
performance
model
̂fs ∼ fs( ⋅ )
1. Fit an initial model
2. Forward selection: Add
terms iteratively
3. Backward elimination:
Removes terms iteratively
4. Terminate: When neither
(2) or (3) improve the model
L2S extracts the knowledge about influential
options and interactions via performance models
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
L2S extracts the knowledge about influential
options and interactions via performance models
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
L2S exploits the knowledge it gained from the
source to sample the target environment
O1 × O2 × O3 × O4 × O5 × O6 × O7 × O8 × O9 × O10
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
L2S exploits the knowledge it gained from the
source to sample the target environment
0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0c1
O1 × O2 × O3 × O4 × O5 × O6 × O7 × O8 × O9 × O10
̂ft(c1) = 10.4
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
L2S exploits the knowledge it gained from the
source to sample the target environment
0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0c1
c2 1 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0
O1 × O2 × O3 × O4 × O5 × O6 × O7 × O8 × O9 × O10
̂ft(c1) = 10.4
̂ft(c2) = 8.1
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
L2S exploits the knowledge it gained from the
source to sample the target environment
0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0c1
c2 1 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0
O1 × O2 × O3 × O4 × O5 × O6 × O7 × O8 × O9 × O10
̂ft(c1) = 10.4
̂ft(c2) = 8.1
c3 0 × 0 × 1 × 0 × 0 × 0 × 0 × 0 × 0 × 0 ̂ft(c3) = 11.6
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
L2S exploits the knowledge it gained from the
source to sample the target environment
0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0c1
c2 1 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0
O1 × O2 × O3 × O4 × O5 × O6 × O7 × O8 × O9 × O10
̂ft(c1) = 10.4
̂ft(c2) = 8.1
c3 0 × 0 × 1 × 0 × 0 × 0 × 0 × 0 × 0 × 0 ̂ft(c3) = 11.6
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
c4 0 × 0 × 0 × 0 × 0 × 0 × 1 × 0 × 0 × 0 ̂ft(c4) = 12.6
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
L2S exploits the knowledge it gained from the
source to sample the target environment
0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0c1
c2 1 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0
O1 × O2 × O3 × O4 × O5 × O6 × O7 × O8 × O9 × O10
̂ft(c1) = 10.4
̂ft(c2) = 8.1
c3 0 × 0 × 1 × 0 × 0 × 0 × 0 × 0 × 0 × 0 ̂ft(c3) = 11.6
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
c4 0 × 0 × 0 × 0 × 0 × 0 × 1 × 0 × 0 × 0 ̂ft(c4) = 12.6
c5 0 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 0 × 0 ̂ft(c5) = 11.7
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
L2S exploits the knowledge it gained from the
source to sample the target environment
0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0c1
c2 1 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0
O1 × O2 × O3 × O4 × O5 × O6 × O7 × O8 × O9 × O10
̂ft(c1) = 10.4
̂ft(c2) = 8.1
c3 0 × 0 × 1 × 0 × 0 × 0 × 0 × 0 × 0 × 0 ̂ft(c3) = 11.6
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
c4 0 × 0 × 0 × 0 × 0 × 0 × 1 × 0 × 0 × 0 ̂ft(c4) = 12.6
c5 0 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 0 × 0 ̂ft(c5) = 11.7
c6 1 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 0 × 0 ̂ft(c6) = 23.7
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
L2S exploits the knowledge it gained from the
source to sample the target environment
0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0c1
c2 1 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0
O1 × O2 × O3 × O4 × O5 × O6 × O7 × O8 × O9 × O10
̂ft(c1) = 10.4
̂ft(c2) = 8.1
c3 0 × 0 × 1 × 0 × 0 × 0 × 0 × 0 × 0 × 0 ̂ft(c3) = 11.6
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
c4 0 × 0 × 0 × 0 × 0 × 0 × 1 × 0 × 0 × 0 ̂ft(c4) = 12.6
c5 0 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 0 × 0 ̂ft(c5) = 11.7
c6 1 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 0 × 0 ̂ft(c6) = 23.7
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
L2S exploits the knowledge it gained from the
source to sample the target environment
0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0c1
c2 1 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0 × 0
O1 × O2 × O3 × O4 × O5 × O6 × O7 × O8 × O9 × O10
̂ft(c1) = 10.4
̂ft(c2) = 8.1
c3 0 × 0 × 1 × 0 × 0 × 0 × 0 × 0 × 0 × 0 ̂ft(c3) = 11.6
̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
c4 0 × 0 × 0 × 0 × 0 × 0 × 1 × 0 × 0 × 0 ̂ft(c4) = 12.6
c5 0 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 0 × 0 ̂ft(c5) = 11.7
c6 1 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 0 × 0 ̂ft(c6) = 23.7
cx 1 × 0 × 1 × 0 × 0 × 0 × 1 × 0 × 0 × 0 ̂ft(cx) = 9.6
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
Exploration vs
Exploitation
We also explore the
configuration space using
pseudo-random sampling to
detect missing interactions
For capturing options and interactions that only appears
in the target, L2S relies on exploration (random sampling)
Performance
Distribution
Influential
Option
Learning
Influential
Interaction
(ii) Sampling
Configurable
System
Performance
Model
ExploitationExploration
Transfer
Knowledge
PerfConf <Conf,Perf>
SourceTarget
(i) Extract
Knowledge
Performance
Influence
Model
L2S transfers knowledge about the structure of performance models
from the source to guide the sampling in the target environment
Target (Learn)Source (Given)
DataModel
Transferable
Knowledge
II. INTUITION
standing the performance behavior of configurable
systems can enable (i) performance debugging, (ii)
ance tuning, (iii) design-time evolution, or (iv) runtime
on [11]. We lack empirical understanding of how the
ance behavior of a system will vary when the environ-
the system changes. Such empirical understanding will
important insights to develop faster and more accurate
techniques that allow us to make predictions and
tions of performance for highly configurable systems
ging environments [10]. For instance, we can learn
ance behavior of a system on a cheap hardware in a
ed lab environment and use that to understand the per-
e behavior of the system on a production server before
to the end user. More specifically, we would like to
hat the relationship is between the performance of a
n a specific environment (characterized by software
ation, hardware, workload, and system version) to the
we vary its environmental conditions.
s research, we aim for an empirical understanding of
ance behavior to improve learning via an informed
A. Preliminary concepts
In this section, we provide formal definitions of
cepts that we use throughout this study. The forma
enable us to concisely convey concept throughout
1) Configuration and environment space: Let F
the i-th feature of a configurable system A whic
enabled or disabled and one of them holds by de
configuration space is mathematically a Cartesian
all the features C = Dom(F1) ⇥ · · · ⇥ Dom(F
Dom(Fi) = {0, 1}. A configuration of a syste
a member of the configuration space (feature spa
all the parameters are assigned to a specific valu
range (i.e., complete instantiations of the system’s pa
We also describe an environment instance by 3
e = [w, h, v] drawn from a given environment s
W ⇥H ⇥V , where they respectively represent sets
values for workload, hardware and system version.
2) Performance model: Given a software syste
configuration space F and environmental instances
formance model is a black-box function f : F ⇥
given some observations of the system performanc
combination of system’s features x 2 F in an en
e 2 E. To construct a performance model for a
ON
behavior of configurable
rformance debugging, (ii)
evolution, or (iv) runtime
understanding of how the
ill vary when the environ-
mpirical understanding will
p faster and more accurate
to make predictions and
ghly configurable systems
r instance, we can learn
on a cheap hardware in a
hat to understand the per-
a production server before
cifically, we would like to
een the performance of a
characterized by software
and system version) to the
conditions.
mpirical understanding of
learning via an informed
A. Preliminary concepts
In this section, we provide formal definitions of four con-
cepts that we use throughout this study. The formal notations
enable us to concisely convey concept throughout the paper.
1) Configuration and environment space: Let Fi indicate
the i-th feature of a configurable system A which is either
enabled or disabled and one of them holds by default. The
configuration space is mathematically a Cartesian product of
all the features C = Dom(F1) ⇥ · · · ⇥ Dom(Fd), where
Dom(Fi) = {0, 1}. A configuration of a system is then
a member of the configuration space (feature space) where
all the parameters are assigned to a specific value in their
range (i.e., complete instantiations of the system’s parameters).
We also describe an environment instance by 3 variables
e = [w, h, v] drawn from a given environment space E =
W ⇥H ⇥V , where they respectively represent sets of possible
values for workload, hardware and system version.
2) Performance model: Given a software system A with
configuration space F and environmental instances E, a per-
formance model is a black-box function f : F ⇥ E ! R
given some observations of the system performance for each
combination of system’s features x 2 F in an environment
e 2 E. To construct a performance model for a system A
vations of the system performance for each
ystem’s features x 2 F in an environment
ruct a performance model for a system A
space F, we run A in environment instance
combinations of configurations xi 2 F, and
g performance values yi = f(xi) + ✏i, xi 2
(0, i). The training data for our regression
mply Dtr = {(xi, yi)}n
i=1. In other words, a
is simply a mapping from the input space to
ormance metric that produces interval-scaled
ume it produces real numbers).
distribution: For the performance model,
associated the performance response to each
w let introduce another concept where we
ment and we measure the performance. An
mance distribution is a stochastic process,
that defines a probability distribution over
sures for each environmental conditions. To
rmance distribution for a system A with
given some observations of the system performance for each
combination of system’s features x 2 F in an environment
e 2 E. To construct a performance model for a system A
with configuration space F, we run A in environment instance
e 2 E on various combinations of configurations xi 2 F, and
record the resulting performance values yi = f(xi) + ✏i, xi 2
F where ✏i ⇠ N (0, i). The training data for our regression
models is then simply Dtr = {(xi, yi)}n
i=1. In other words, a
response function is simply a mapping from the input space to
a measurable performance metric that produces interval-scaled
data (here we assume it produces real numbers).
3) Performance distribution: For the performance model,
we measured and associated the performance response to each
configuration, now let introduce another concept where we
vary the environment and we measure the performance. An
empirical performance distribution is a stochastic process,
pd : E ! (R), that defines a probability distribution over
performance measures for each environmental conditions. To
construct a performance distribution for a system A with
Extract Reuse
Learn Learn
L2S transfers knowledge about the structure of performance models
from the source to guide the sampling in the target environment
Target (Learn)Source (Given)
DataModel
Transferable
Knowledge
II. INTUITION
standing the performance behavior of configurable
systems can enable (i) performance debugging, (ii)
ance tuning, (iii) design-time evolution, or (iv) runtime
on [11]. We lack empirical understanding of how the
ance behavior of a system will vary when the environ-
the system changes. Such empirical understanding will
important insights to develop faster and more accurate
techniques that allow us to make predictions and
tions of performance for highly configurable systems
ging environments [10]. For instance, we can learn
ance behavior of a system on a cheap hardware in a
ed lab environment and use that to understand the per-
e behavior of the system on a production server before
to the end user. More specifically, we would like to
hat the relationship is between the performance of a
n a specific environment (characterized by software
ation, hardware, workload, and system version) to the
we vary its environmental conditions.
s research, we aim for an empirical understanding of
ance behavior to improve learning via an informed
A. Preliminary concepts
In this section, we provide formal definitions of
cepts that we use throughout this study. The forma
enable us to concisely convey concept throughout
1) Configuration and environment space: Let F
the i-th feature of a configurable system A whic
enabled or disabled and one of them holds by de
configuration space is mathematically a Cartesian
all the features C = Dom(F1) ⇥ · · · ⇥ Dom(F
Dom(Fi) = {0, 1}. A configuration of a syste
a member of the configuration space (feature spa
all the parameters are assigned to a specific valu
range (i.e., complete instantiations of the system’s pa
We also describe an environment instance by 3
e = [w, h, v] drawn from a given environment s
W ⇥H ⇥V , where they respectively represent sets
values for workload, hardware and system version.
2) Performance model: Given a software syste
configuration space F and environmental instances
formance model is a black-box function f : F ⇥
given some observations of the system performanc
combination of system’s features x 2 F in an en
e 2 E. To construct a performance model for a
ON
behavior of configurable
rformance debugging, (ii)
evolution, or (iv) runtime
understanding of how the
ill vary when the environ-
mpirical understanding will
p faster and more accurate
to make predictions and
ghly configurable systems
r instance, we can learn
on a cheap hardware in a
hat to understand the per-
a production server before
cifically, we would like to
een the performance of a
characterized by software
and system version) to the
conditions.
mpirical understanding of
learning via an informed
A. Preliminary concepts
In this section, we provide formal definitions of four con-
cepts that we use throughout this study. The formal notations
enable us to concisely convey concept throughout the paper.
1) Configuration and environment space: Let Fi indicate
the i-th feature of a configurable system A which is either
enabled or disabled and one of them holds by default. The
configuration space is mathematically a Cartesian product of
all the features C = Dom(F1) ⇥ · · · ⇥ Dom(Fd), where
Dom(Fi) = {0, 1}. A configuration of a system is then
a member of the configuration space (feature space) where
all the parameters are assigned to a specific value in their
range (i.e., complete instantiations of the system’s parameters).
We also describe an environment instance by 3 variables
e = [w, h, v] drawn from a given environment space E =
W ⇥H ⇥V , where they respectively represent sets of possible
values for workload, hardware and system version.
2) Performance model: Given a software system A with
configuration space F and environmental instances E, a per-
formance model is a black-box function f : F ⇥ E ! R
given some observations of the system performance for each
combination of system’s features x 2 F in an environment
e 2 E. To construct a performance model for a system A
vations of the system performance for each
ystem’s features x 2 F in an environment
ruct a performance model for a system A
space F, we run A in environment instance
combinations of configurations xi 2 F, and
g performance values yi = f(xi) + ✏i, xi 2
(0, i). The training data for our regression
mply Dtr = {(xi, yi)}n
i=1. In other words, a
is simply a mapping from the input space to
ormance metric that produces interval-scaled
ume it produces real numbers).
distribution: For the performance model,
associated the performance response to each
w let introduce another concept where we
ment and we measure the performance. An
mance distribution is a stochastic process,
that defines a probability distribution over
sures for each environmental conditions. To
rmance distribution for a system A with
given some observations of the system performance for each
combination of system’s features x 2 F in an environment
e 2 E. To construct a performance model for a system A
with configuration space F, we run A in environment instance
e 2 E on various combinations of configurations xi 2 F, and
record the resulting performance values yi = f(xi) + ✏i, xi 2
F where ✏i ⇠ N (0, i). The training data for our regression
models is then simply Dtr = {(xi, yi)}n
i=1. In other words, a
response function is simply a mapping from the input space to
a measurable performance metric that produces interval-scaled
data (here we assume it produces real numbers).
3) Performance distribution: For the performance model,
we measured and associated the performance response to each
configuration, now let introduce another concept where we
vary the environment and we measure the performance. An
empirical performance distribution is a stochastic process,
pd : E ! (R), that defines a probability distribution over
performance measures for each environmental conditions. To
construct a performance distribution for a system A with
Extract Reuse
Learn Learn
̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
Evaluation:
Other transfer learning approaches
also exists
“Model shift” builds a model in the source and uses
the shifted model “directly” for predicting the target
[Pavel Valov, et al. “Transferring performance prediction models…”, ICPE’17 ]
log P (θ, Xobs )
Θ
P (θ|Xobs )
log P(θ, Xobs )
Θ
log P(θ, Xobs )
Θ
log P(θ, Xobs )
Θ
P(θ|Xobs ) P(θ|Xobs ) P(θ|Xobs )
Target
Source
Throughput
Machine
twice as fast
“Data reuse” combines the data collected in the source with the
ones in the target in a “multi-task” setting to predict the target
DataData
Data
Measure
Measure
Reuse Learn
TurtleBot
Simulator (Gazebo)
[P. Jamshidi, et al., “Transfer learning for improving model predictions ….”, SEAMS’17]
Configurations
f(o1, o2) = 5 + 3o1 + 15o2 7o1 ⇥ o2
Evaluation:
Learning performance behavior of
Machine Learning Systems
ML system: https://pooyanjamshidi.github.io/mls
Configurations of deep neural networks affect
accuracy and energy consumption
0 500 1000 1500 2000 2500
Energy consumption [J]
0
20
40
60
80
100
Validation(test)error
CNN on CNAE-9 Data Set
72%
22X
Configurations of deep neural networks affect
accuracy and energy consumption
0 500 1000 1500 2000 2500
Energy consumption [J]
0
20
40
60
80
100
Validation(test)error
CNN on CNAE-9 Data Set
72%
22X
the selected cell is plugged into a large model which is trained on the combinatio
validation sub-sets, and the accuracy is reported on the CIFAR-10 test set. We not
is never used for model selection, and it is only used for final model evaluation. We
cells, learned on CIFAR-10, in a large-scale setting on the ImageNet challenge dat
image
sep.conv3x3/2
sep.conv3x3/2
sep.conv3x3
conv3x3
globalpool
linear&softmax
small CIFAR-10 model
cell
cell
cell
sep.conv3x3/2
sep.conv3x3
sep.conv3x3/2
sep.conv3x3
image
conv3x3
large CIFAR-10 model
cell
cell
cell
cell
cell
sep.conv3x3/2
sep.conv3x3
sep.conv3x3/2
sep.conv3x3
sep.conv3x3
sep.conv3x3
image
conv3x3/2
conv3x3/2
sep.conv3x3/2
globalpool
linear&softmax
ImageNet model
cell
cell
cell
cell
cell
cell
cell
Figure 2: Image classification models constructed using the cells optimized with arc
Top-left: small model used during architecture search on CIFAR-10. Top-right:
model used for learned cell evaluation. Bottom: ImageNet model used for learned
For CIFAR-10 experiments we use a model which consists of 3 ⇥ 3 convolution w
Configurations of deep neural networks affect
accuracy and energy consumption
0 500 1000 1500 2000 2500
Energy consumption [J]
0
20
40
60
80
100
Validation(test)error
CNN on CNAE-9 Data Set
72%
22X
the selected cell is plugged into a large model which is trained on the combinatio
validation sub-sets, and the accuracy is reported on the CIFAR-10 test set. We not
is never used for model selection, and it is only used for final model evaluation. We
cells, learned on CIFAR-10, in a large-scale setting on the ImageNet challenge dat
image
sep.conv3x3/2
sep.conv3x3/2
sep.conv3x3
conv3x3
globalpool
linear&softmax
small CIFAR-10 model
cell
cell
cell
sep.conv3x3/2
sep.conv3x3
sep.conv3x3/2
sep.conv3x3
image
conv3x3
large CIFAR-10 model
cell
cell
cell
cell
cell
sep.conv3x3/2
sep.conv3x3
sep.conv3x3/2
sep.conv3x3
sep.conv3x3
sep.conv3x3
image
conv3x3/2
conv3x3/2
sep.conv3x3/2
globalpool
linear&softmax
ImageNet model
cell
cell
cell
cell
cell
cell
cell
Figure 2: Image classification models constructed using the cells optimized with arc
Top-left: small model used during architecture search on CIFAR-10. Top-right:
model used for learned cell evaluation. Bottom: ImageNet model used for learned
For CIFAR-10 experiments we use a model which consists of 3 ⇥ 3 convolution w
the selected cell is plugged into a large model which is trained on the combination of training and
validation sub-sets, and the accuracy is reported on the CIFAR-10 test set. We note that the test set
is never used for model selection, and it is only used for final model evaluation. We also evaluate the
cells, learned on CIFAR-10, in a large-scale setting on the ImageNet challenge dataset (Sect. 4.3).
image
sep.conv3x3/2
sep.conv3x3/2
sep.conv3x3
conv3x3
globalpool
linear&softmax
small CIFAR-10 model
cell
cell
cell
sep.conv3x3/2
sep.conv3x3
sep.conv3x3/2
sep.conv3x3
sep.conv3x3
sep.conv3x3
image
conv3x3
globalpool
linear&softmax
large CIFAR-10 model
cell
cell
cell
cell
cell
cell
sep.conv3x3/2
sep.conv3x3
sep.conv3x3/2
sep.conv3x3
sep.conv3x3
sep.conv3x3
image
conv3x3/2
conv3x3/2
sep.conv3x3/2
globalpool
linear&softmax
ImageNet model
cell
cell
cell
cell
cell
cell
cell
Figure 2: Image classification models constructed using the cells optimized with architecture search.
Top-left: small model used during architecture search on CIFAR-10. Top-right: large CIFAR-10
model used for learned cell evaluation. Bottom: ImageNet model used for learned cell evaluation.
Deep neural
network as a
highly
configurable
system
of top/bottom conf.; M6/M7: Number of influential options; M8/M9: Number of options agree
Correlation btw importance of options; M11/M12: Number of interactions; M13: Number of inte
e↵ects; M14: Correlation btw the coe↵s;
Input #1
Input #2
Input #3
Input #4
Output
Hidden
layer
Input
layer
Output
layer
4 Technical Aims and Research Plan
We will pursue the following technical aims: (1) investigate potential criteria for e↵ectiv
exploration of the design space of DNN architectures (Section 4.2), (2) build analytical m
curately predict the performance of a given architecture configuration given other similar
which either have been measured in the target environments or other similar environm
measuring the network performance directly (Section 4.3), and (3), develop a tunni
that exploit the performance model from previous step to e↵ectively search for optima
(Section 4.4).
4.1 Project Timeline
We plan to complete the proposed project in two years. To mitigate project risks, we
project into three major phases:
8
Network
Design
Model
Compiler
Hybrid
Deployment
OS/
Hardware
Scopeof
thisProject
Neural Search
Hardware
Optimization
Hyper-parameter
DNNsystemdevelopmentstack
Deployment
Topology
DNN measurements
are costly
Each sample cost ~1h

4000 * 1h ~= 6 months
DNN measurements
are costly
Each sample cost ~1h

4000 * 1h ~= 6 months

Yes, that’s the cost we
paid for conducting our
measurements!
L2S enables learning a more accurate model with less
samples exploiting the knowledge from the source
3 10 20 30 40 50 60 70
Sample Size
0
20
40
60
80
100
MeanAbsolutePercentageError
100
200
500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(a) DNN (hard)
3 10 20 30 40
Sample Si
0
20
40
60
80
100
MeanAbsolutePercentageError
L
L
D
M
R
(b) XGBoost (h
Convolutional Neural Network
L2S enables learning a more accurate model with less
samples exploiting the knowledge from the source
3 10 20 30 40 50 60 70
Sample Size
0
20
40
60
80
100
MeanAbsolutePercentageError
100
200
500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(a) DNN (hard)
3 10 20 30 40
Sample Si
0
20
40
60
80
100
MeanAbsolutePercentageError
L
L
D
M
R
(b) XGBoost (h
Convolutional Neural Network
L2S enables learning a more accurate model with less
samples exploiting the knowledge from the source
3 10 20 30 40 50 60 70
Sample Size
0
20
40
60
80
100
MeanAbsolutePercentageError
100
200
500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(a) DNN (hard)
3 10 20 30 40
Sample Si
0
20
40
60
80
100
MeanAbsolutePercentageError
L
L
D
M
R
(b) XGBoost (h
Convolutional Neural Network
L2S may also help data-reuse approach to learn
faster
30 40 50 60 70
Sample Size
100
200
500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(a) DNN (hard)
3 10 20 30 40 50 60 70
Sample Size
0
20
40
60
80
100
MeanAbsolutePercentageError 100
200
500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(b) XGBoost (hard)
3 10 20 30 40
Sample Siz
0
20
40
60
80
100
MeanAbsolutePercentageError
L
L
D
M
R
(c) Storm (ha
XGBoost
L2S may also help data-reuse approach to learn
faster
30 40 50 60 70
Sample Size
100
200
500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(a) DNN (hard)
3 10 20 30 40 50 60 70
Sample Size
0
20
40
60
80
100
MeanAbsolutePercentageError 100
200
500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(b) XGBoost (hard)
3 10 20 30 40
Sample Siz
0
20
40
60
80
100
MeanAbsolutePercentageError
L
L
D
M
R
(c) Storm (ha
XGBoost
L2S may also help data-reuse approach to learn
faster
30 40 50 60 70
Sample Size
100
200
500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(a) DNN (hard)
3 10 20 30 40 50 60 70
Sample Size
0
20
40
60
80
100
MeanAbsolutePercentageError 100
200
500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(b) XGBoost (hard)
3 10 20 30 40
Sample Siz
0
20
40
60
80
100
MeanAbsolutePercentageError
L
L
D
M
R
(c) Storm (ha
XGBoost
Evaluation:
Learning performance behavior of
Big Data Systems
Some environments the similarities across environments
may be too low and this results in “negative transfer”
0 30 40 50 60 70
Sample Size
100
200
500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(b) XGBoost (hard)
3 10 20 30 40 50 60 70
Sample Size
0
20
40
60
80
100
MeanAbsolutePercentageError
100200500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(c) Storm (hard)
Apache Storm
Some environments the similarities across environments
may be too low and this results in “negative transfer”
0 30 40 50 60 70
Sample Size
100
200
500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(b) XGBoost (hard)
3 10 20 30 40 50 60 70
Sample Size
0
20
40
60
80
100
MeanAbsolutePercentageError
100200500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(c) Storm (hard)
Apache Storm
Some environments the similarities across environments
may be too low and this results in “negative transfer”
0 30 40 50 60 70
Sample Size
100
200
500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(b) XGBoost (hard)
3 10 20 30 40 50 60 70
Sample Size
0
20
40
60
80
100
MeanAbsolutePercentageError
100200500
L2S+GP
L2S+DataReuseTL
DataReuseTL
ModelShift
Random+CART
(c) Storm (hard)
Apache Storm
Why performance
models using L2S
sample are more
accurate?
The samples generated by L2S contains more
information… “entropy <-> information gain”
10 20 30 40 50 60 70
sample size
0
1
2
3
4
5
6
entropy[bits]
Max entropy
L2S
Random
DNN
The samples generated by L2S contains more
information… “entropy <-> information gain”
10 20 30 40 50 60 70
sample size
0
1
2
3
4
5
6
entropy[bits]
Max entropy
L2S
Random
10 20 30 40 50 60 70
sample size
0
1
2
3
4
5
6
7
entropy[bits]
Max entropy
L2S
Random
10 20 30 40 50 60 70
sample size
0
1
2
3
4
5
6
7
entropy[bits]
Max entropy
L2S
Random
DNN XGboost Storm
Limitations
• Limited number of systems and environmental changes

• Synthetic models 

• https://github.com/pooyanjamshidi/GenPerf

• Binary options

• Non-binary options -> binary

• Negative transfer
@pooyanjamshidi
@pooyanjamshidi
@pooyanjamshidi
@pooyanjamshidi

Mais conteúdo relacionado

Mais procurados

数学カフェ 確率・統計・機械学習回 「速習 確率・統計」
数学カフェ 確率・統計・機械学習回 「速習 確率・統計」数学カフェ 確率・統計・機械学習回 「速習 確率・統計」
数学カフェ 確率・統計・機械学習回 「速習 確率・統計」Ken'ichi Matsui
 
Computational Linguistics week 10
 Computational Linguistics week 10 Computational Linguistics week 10
Computational Linguistics week 10Mark Chang
 
「ベータ分布の謎に迫る」第6回 プログラマのための数学勉強会 LT資料
「ベータ分布の謎に迫る」第6回 プログラマのための数学勉強会 LT資料「ベータ分布の謎に迫る」第6回 プログラマのための数学勉強会 LT資料
「ベータ分布の謎に迫る」第6回 プログラマのための数学勉強会 LT資料Ken'ichi Matsui
 
Using R Tool for Probability and Statistics
Using R Tool for Probability and Statistics Using R Tool for Probability and Statistics
Using R Tool for Probability and Statistics nazlitemu
 
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」Ken'ichi Matsui
 
Statistics for Economics Midterm 2 Cheat Sheet
Statistics for Economics Midterm 2 Cheat SheetStatistics for Economics Midterm 2 Cheat Sheet
Statistics for Economics Midterm 2 Cheat SheetLaurel Ayuyao
 
Multi dof modal analysis free
Multi dof modal analysis freeMulti dof modal analysis free
Multi dof modal analysis freeMahdiKarimi29
 
El text.life science6.matsubayashi191120
El text.life science6.matsubayashi191120El text.life science6.matsubayashi191120
El text.life science6.matsubayashi191120RCCSRENKEI
 
Statistics for Economics Final Exam Cheat Sheet
Statistics for Economics Final Exam Cheat SheetStatistics for Economics Final Exam Cheat Sheet
Statistics for Economics Final Exam Cheat SheetLaurel Ayuyao
 
Kristhyan kurtlazartezubia evidencia1-metodosnumericos
Kristhyan kurtlazartezubia evidencia1-metodosnumericosKristhyan kurtlazartezubia evidencia1-metodosnumericos
Kristhyan kurtlazartezubia evidencia1-metodosnumericosKristhyanAndreeKurtL
 
The Ring programming language version 1.5.4 book - Part 60 of 185
The Ring programming language version 1.5.4 book - Part 60 of 185The Ring programming language version 1.5.4 book - Part 60 of 185
The Ring programming language version 1.5.4 book - Part 60 of 185Mahmoud Samir Fayed
 
MH prediction modeling and validation in r (2) classification 190709
MH prediction modeling and validation in r (2) classification 190709MH prediction modeling and validation in r (2) classification 190709
MH prediction modeling and validation in r (2) classification 190709Min-hyung Kim
 
Regression_Sample
Regression_SampleRegression_Sample
Regression_SampleJie Huang
 
Application of recursive perturbation approach for multimodal optimization
Application of recursive perturbation approach for multimodal optimizationApplication of recursive perturbation approach for multimodal optimization
Application of recursive perturbation approach for multimodal optimizationPranamesh Chakraborty
 
Calculus B Notes (Notre Dame)
Calculus B Notes (Notre Dame)Calculus B Notes (Notre Dame)
Calculus B Notes (Notre Dame)Laurel Ayuyao
 

Mais procurados (19)

数学カフェ 確率・統計・機械学習回 「速習 確率・統計」
数学カフェ 確率・統計・機械学習回 「速習 確率・統計」数学カフェ 確率・統計・機械学習回 「速習 確率・統計」
数学カフェ 確率・統計・機械学習回 「速習 確率・統計」
 
Computational Linguistics week 10
 Computational Linguistics week 10 Computational Linguistics week 10
Computational Linguistics week 10
 
「ベータ分布の謎に迫る」第6回 プログラマのための数学勉強会 LT資料
「ベータ分布の謎に迫る」第6回 プログラマのための数学勉強会 LT資料「ベータ分布の謎に迫る」第6回 プログラマのための数学勉強会 LT資料
「ベータ分布の謎に迫る」第6回 プログラマのための数学勉強会 LT資料
 
Bessel functionsoffractionalorder1
Bessel functionsoffractionalorder1Bessel functionsoffractionalorder1
Bessel functionsoffractionalorder1
 
Using R Tool for Probability and Statistics
Using R Tool for Probability and Statistics Using R Tool for Probability and Statistics
Using R Tool for Probability and Statistics
 
Corona sdk
Corona sdkCorona sdk
Corona sdk
 
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
第13回数学カフェ「素数!!」二次会 LT資料「乱数!!」
 
Statistics for Economics Midterm 2 Cheat Sheet
Statistics for Economics Midterm 2 Cheat SheetStatistics for Economics Midterm 2 Cheat Sheet
Statistics for Economics Midterm 2 Cheat Sheet
 
Multi dof modal analysis free
Multi dof modal analysis freeMulti dof modal analysis free
Multi dof modal analysis free
 
El text.life science6.matsubayashi191120
El text.life science6.matsubayashi191120El text.life science6.matsubayashi191120
El text.life science6.matsubayashi191120
 
Statistics for Economics Final Exam Cheat Sheet
Statistics for Economics Final Exam Cheat SheetStatistics for Economics Final Exam Cheat Sheet
Statistics for Economics Final Exam Cheat Sheet
 
Kristhyan kurtlazartezubia evidencia1-metodosnumericos
Kristhyan kurtlazartezubia evidencia1-metodosnumericosKristhyan kurtlazartezubia evidencia1-metodosnumericos
Kristhyan kurtlazartezubia evidencia1-metodosnumericos
 
Slides ensae 11bis
Slides ensae 11bisSlides ensae 11bis
Slides ensae 11bis
 
Eulode
EulodeEulode
Eulode
 
The Ring programming language version 1.5.4 book - Part 60 of 185
The Ring programming language version 1.5.4 book - Part 60 of 185The Ring programming language version 1.5.4 book - Part 60 of 185
The Ring programming language version 1.5.4 book - Part 60 of 185
 
MH prediction modeling and validation in r (2) classification 190709
MH prediction modeling and validation in r (2) classification 190709MH prediction modeling and validation in r (2) classification 190709
MH prediction modeling and validation in r (2) classification 190709
 
Regression_Sample
Regression_SampleRegression_Sample
Regression_Sample
 
Application of recursive perturbation approach for multimodal optimization
Application of recursive perturbation approach for multimodal optimizationApplication of recursive perturbation approach for multimodal optimization
Application of recursive perturbation approach for multimodal optimization
 
Calculus B Notes (Notre Dame)
Calculus B Notes (Notre Dame)Calculus B Notes (Notre Dame)
Calculus B Notes (Notre Dame)
 

Semelhante a Learning Performance Models

Introduction to Neural Networks and Deep Learning from Scratch
Introduction to Neural Networks and Deep Learning from ScratchIntroduction to Neural Networks and Deep Learning from Scratch
Introduction to Neural Networks and Deep Learning from ScratchAhmed BESBES
 
Yoyak ScalaDays 2015
Yoyak ScalaDays 2015Yoyak ScalaDays 2015
Yoyak ScalaDays 2015ihji
 
Using Topological Data Analysis on your BigData
Using Topological Data Analysis on your BigDataUsing Topological Data Analysis on your BigData
Using Topological Data Analysis on your BigDataAnalyticsWeek
 
Phân tích kỹ thuật với Ansys WorkBench 13 ( demo)
Phân tích kỹ thuật với Ansys WorkBench 13 ( demo)Phân tích kỹ thuật với Ansys WorkBench 13 ( demo)
Phân tích kỹ thuật với Ansys WorkBench 13 ( demo)Trung tâm Advance Cad
 
JavaScript Design Patterns
JavaScript Design PatternsJavaScript Design Patterns
JavaScript Design PatternsDerek Brown
 
Low-rank tensor methods for stochastic forward and inverse problems
Low-rank tensor methods for stochastic forward and inverse problemsLow-rank tensor methods for stochastic forward and inverse problems
Low-rank tensor methods for stochastic forward and inverse problemsAlexander Litvinenko
 
Unit-1 DAA_Notes.pdf
Unit-1 DAA_Notes.pdfUnit-1 DAA_Notes.pdf
Unit-1 DAA_Notes.pdfAmayJaiswal4
 
Numerical Algorithm for a few Special Functions
Numerical Algorithm for a few Special FunctionsNumerical Algorithm for a few Special Functions
Numerical Algorithm for a few Special FunctionsAmos Tsai
 
Common derivatives integrals
Common derivatives integralsCommon derivatives integrals
Common derivatives integralsKavin Ruk
 
Computer Architecture 3rd Edition by Moris Mano CH 01-CH 02.ppt
Computer Architecture 3rd Edition by Moris Mano CH  01-CH 02.pptComputer Architecture 3rd Edition by Moris Mano CH  01-CH 02.ppt
Computer Architecture 3rd Edition by Moris Mano CH 01-CH 02.pptHowida Youssry
 
Logic gates and boolean algebra.ppt
Logic gates and boolean algebra.pptLogic gates and boolean algebra.ppt
Logic gates and boolean algebra.pptbcanawakadalcollege
 
digital logic circuits, logic gates, boolean algebra
digital logic circuits, logic gates, boolean algebradigital logic circuits, logic gates, boolean algebra
digital logic circuits, logic gates, boolean algebradianaandino4
 

Semelhante a Learning Performance Models (20)

Introduction to Neural Networks and Deep Learning from Scratch
Introduction to Neural Networks and Deep Learning from ScratchIntroduction to Neural Networks and Deep Learning from Scratch
Introduction to Neural Networks and Deep Learning from Scratch
 
Yoyak ScalaDays 2015
Yoyak ScalaDays 2015Yoyak ScalaDays 2015
Yoyak ScalaDays 2015
 
Using Topological Data Analysis on your BigData
Using Topological Data Analysis on your BigDataUsing Topological Data Analysis on your BigData
Using Topological Data Analysis on your BigData
 
Phân tích kỹ thuật với Ansys WorkBench 13 ( demo)
Phân tích kỹ thuật với Ansys WorkBench 13 ( demo)Phân tích kỹ thuật với Ansys WorkBench 13 ( demo)
Phân tích kỹ thuật với Ansys WorkBench 13 ( demo)
 
JavaScript Design Patterns

Learning Performance Models

  • 1. Learning to Sample: Exploiting Similarities across Environments to Learn Performance Models for Configurable Systems (Pooyan Jamshidi, Miguel Velez, Christian Kaestner, Norbert Siegmund)
  • 2–5. Today's most popular systems are built configurable (the same message repeated across four animation-build slides).
  • 6. Systems are becoming increasingly more configurable and, therefore, it is more difficult to understand their performance behavior. [figure: number of configuration parameters over release time for Hadoop (MapReduce, HDFS)] [Tianyin Xu, et al., “Too Many Knobs…”, FSE’15]
  • 7. Same slide, annotated with the resulting growth of the configuration space: a 2^180 / 2^18 = 2^162 increase in the size of the configuration space.
  • 8. Empirical observations confirm that systems are becoming increasingly more configurable. [the slide embeds Tianyin Xu et al.’s FSE’15 paper, “Too Many Knobs…”, with plots of the number of parameters over release time for Storage-A, MySQL, Apache, and Hadoop (MapReduce, HDFS)]
  • 9.
  • 10–13. Configurations determine the performance behavior of systems (four build-up slides over the same C snippet, highlighting the PARROT_HAS_SETENV and LINUX conditional-compilation blocks): void Parrot_setenv(. . . name,. . . value){ #ifdef PARROT_HAS_SETENV my_setenv(name, value, 1); #else int name_len=strlen(name); int val_len=strlen(value); char* envs=glob_env; if(envs==NULL){ return; } strcpy(envs,name); strcpy(envs+name_len,"="); strcpy(envs+name_len + 1,value); putenv(envs); #endif } #ifdef LINUX extern int Parrot_signbit(double x){ … #endif
  • 14. Configuration options interact, therefore creating a non-linear and complex performance behavior: • Non-linear • Non-convex • Multi-modal. [figure: latency (ms) over number of counters × number of splitters, cubic interpolation over a finer grid]
  • 15–19. The default configuration is typically bad, and the optimal configuration is noticeably better than the median (build-up slides). [figure: average write latency (s) vs. throughput (ops/sec); higher throughput and lower latency are better; markers show the default and optimal configurations] • Default is bad • 2X–10X faster than worst • Noticeably faster than median
  • 22–30. Setting the scene (build-up slides): a compiler (e.g., SaC, LLVM) exposes a configuration space ℂ = O1 × O2 × ⋯ × O19 × O20, with options such as dead code removal, constant folding, loop unrolling, and function inlining. A configuration c1 ∈ ℂ, e.g., c1 = 0 × 0 × ⋯ × 0 × 1, is used to configure and compile a program; the compiled code is instrumented and deployed on hardware, and non-functional, measurable/quantifiable aspects are recorded, e.g., compile time fc(c1) = 11.1 ms, execution time fe(c1) = 110.3 ms, and energy fen(c1) = 100 mWh. (A minimal sketch of this setup follows.)
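To make the setup above concrete, here is a minimal Python sketch of the configuration space ℂ and a measurable response such as fe. All names are illustrative, and the fake timing stands in for a real configure-compile-deploy-measure cycle:

import itertools
import random

NUM_OPTIONS = 20            # |C| = 2**20 possible configurations

def all_configurations():
    """Enumerate C = Dom(O1) x ... x Dom(O20), Dom(Oi) = {0, 1}."""
    return itertools.product([0, 1], repeat=NUM_OPTIONS)

def measure_execution_time(config):
    """Stand-in for f_e: in reality, configure, compile, deploy, and time
    the program; here a deterministic fake keeps the sketch runnable."""
    random.seed(hash(config))
    return 100.0 + 10.0 * config[-1] + random.gauss(0, 1)

c1 = (0,) * (NUM_OPTIONS - 1) + (1,)       # c1 = 0 x 0 x ... x 0 x 1
print(measure_execution_time(c1))          # e.g., f_e(c1) around 110 ms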
  • 31–33. A typical approach for understanding the performance behavior is sensitivity analysis: over O1 × O2 × ⋯ × O19 × O20, measure a training/sample set c1, c2, c3, …, cn with y1 = f(c1), y2 = f(c2), …, yn = f(cn), then learn a model ̂f ∼ f(⋅).
  • 34–37. The performance model could be in any appropriate form of black-box model (build-up slides showing alternative model families fitted to the same training/sample set). (A sketch of this sampling-and-learning loop follows.)
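A sketch of the sensitivity-analysis loop above, reusing measure_execution_time from the previous sketch. CART (a regression tree) is one black-box family; the deck's evaluation later uses it as the Random+CART baseline. scikit-learn is an assumed dependency:

import random
from sklearn.tree import DecisionTreeRegressor

def random_configuration(num_options=20):
    return tuple(random.randint(0, 1) for _ in range(num_options))

def learn_performance_model(measure, n_samples=100):
    configs = [random_configuration() for _ in range(n_samples)]
    y = [measure(c) for c in configs]                # y_i = f(c_i)
    return DecisionTreeRegressor().fit(configs, y)   # f_hat ~ f(.)

# model = learn_performance_model(measure_execution_time)
# model.predict([c1]) then approximates f(c1) without a new measurement.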
  • 38. How to sample the configuration space to learn a “better” performance behavior? How to select the most informative configurations?
  • 39. We hypothesized that we can exploit similarities across environments to learn “cheaper” performance models O1 × O2 × ⋯ × O19 × O20 0 × 0 × ⋯ × 0 × 1 0 × 0 × ⋯ × 1 × 0 0 × 0 × ⋯ × 1 × 1 1 × 1 × ⋯ × 1 × 0 1 × 1 × ⋯ × 1 × 1 ⋯ c1 c2 c3 cn ys1 = fs(c1) ys2 = fs(c2) ys3 = fs(c3) ysn = fs(cn) O1 × O2 × ⋯ × O19 × O20 0 × 0 × ⋯ × 0 × 1 0 × 0 × ⋯ × 1 × 0 0 × 0 × ⋯ × 1 × 1 1 × 1 × ⋯ × 1 × 0 1 × 1 × ⋯ × 1 × 1 ⋯ yt1 = ft(c1) yt2 = ft(c2) yt3 = ft(c3) ytn = ft(cn) Source Environment (Execution time of Program X) Target Environment (Execution time of Program Y) Similarity [P. Jamshidi, et al., “Transfer learning for performance modeling of configurable systems….”, ASE’17]
  • 40. We looked at different highly-configurable systems to gain insights about similarities across environments [P. Jamshidi, et al., “Transfer learning for performance modeling of configurable systems….”, ASE’17]: SPEAR (SAT solver): analysis time; 14 options; 16,384 configurations; SAT problems; 3 hardware; 2 versions. X264 (video encoder): encoding time; 16 options; 4,000 configurations; video quality/size; 2 hardware; 3 versions. SQLite (DB engine): query time; 14 options; 1,000 configurations; DB queries; 2 hardware; 2 versions. SaC (compiler): execution time; 50 options; 71,267 configurations; 10 demo programs.
  • 41. Observation 1: Not all options and interactions are influential, and the degree of interaction between options is not high. Example over ℂ = O1 × O2 × ⋯ × O10: ̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
  • 42–46. Observation 2: Influential options and interactions are preserved across environments (build-up slides highlighting the matching terms): ̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7 and ̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7
  • 47. The similarity across environment is a rich source of knowledge for exploration of the configuration space
  • 48–50. Without considering this knowledge, many samples may not provide new information: under ̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7, all 128 configurations c1, c2, …, c128 over O1 × O2 × ⋯ × O10 that keep the influential options o1, o3, o7 enabled and vary only the remaining options yield the same prediction, ̂fs(ci) = 14.9.
  • 51. Without this knowledge, many blind/random samples may not provide any additional information about the performance of the system.
  • 52. In higher-dimensional spaces, blind samples become even less informative/effective. (A tiny numeric check of this point follows.)
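A tiny numeric check of the point above, in pure Python, using the source model from slides 48-50: every one of the 2**7 = 128 configurations that fix o1 = o3 = o7 = 1 and vary the seven non-influential options predicts the same 14.9, so 127 of the 128 samples add no information:

import itertools

def f_hat_s(c):                     # source model from the slide
    o1, o3, o7 = c[0], c[2], c[6]
    return 1.2 + 3*o1 + 5*o3 + 0.9*o7 + 0.8*o3*o7 + 4*o1*o3*o7

preds = set()
for rest in itertools.product([0, 1], repeat=7):
    # o1 = o3 = o7 = 1; the other seven options take all 2**7 combinations
    c = [1, rest[0], 1, rest[1], rest[2], rest[3], 1,
         rest[4], rest[5], rest[6]]
    preds.add(round(f_hat_s(c), 1))

print(preds)                        # {14.9}: 128 samples, no new information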
  • 54–59. Extracting the knowledge about influential options and interactions: step-wise linear regression (build-up slides). Given source measurements ys1 = fs(c1), …, ysn = fs(cn) (execution time of program X), learn a performance model ̂fs ∼ fs( ⋅ ): 1. Fit an initial model. 2. Forward selection: add terms iteratively. 3. Backward elimination: remove terms iteratively. 4. Terminate when neither (2) nor (3) improves the model. (A minimal sketch of this procedure follows.)
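A minimal numpy sketch of this step-wise procedure. The AIC scoring, the candidate-term pool up to 3-way interactions, and the one-move-per-pass greedy loop are assumptions for illustration, not the paper's exact implementation:

import itertools
import numpy as np

def aic(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((np.asarray(y) - X @ beta) ** 2))
    n, k = X.shape
    return n * np.log(rss / n + 1e-12) + 2 * k

def design(configs, terms):
    # each term is a tuple of option indices: (0,) is o1, (0, 2) is o1*o3
    cols = [np.prod(configs[:, list(t)], axis=1) for t in terms]
    return np.column_stack([np.ones(len(configs))] + cols)

def stepwise(configs, y, max_degree=3):
    opts = range(configs.shape[1])
    candidates = [t for d in range(1, max_degree + 1)
                  for t in itertools.combinations(opts, d)]
    terms, best = [], aic(design(configs, []), y)          # 1. initial model
    while True:
        improved = False
        # 2. forward selection: add the single best improving term
        adds = [(aic(design(configs, terms + [t]), y), t)
                for t in candidates if t not in terms]
        if adds and min(adds)[0] < best:
            best, t = min(adds)
            terms, improved = terms + [t], True
        # 3. backward elimination: drop the term whose removal helps most
        drops = [(aic(design(configs, [u for u in terms if u != t]), y), t)
                 for t in terms]
        if drops and min(drops)[0] < best:
            best, t = min(drops)
            terms, improved = [u for u in terms if u != t], True
        if not improved:                                   # 4. terminate
            return terms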
  • 60–61. L2S extracts the knowledge about influential options and interactions via performance models, highlighting the influential terms of ̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
  • 62–70. L2S exploits the knowledge it gained from the source to sample the target environment (build-up slides). Guided by the influential terms of ̂fs, it samples configurations over O1 × O2 × ⋯ × O10 that toggle only o1, o3, o7 and their interactions, keeping the other options at their defaults, and measures the target: c1 (all off) ̂ft(c1) = 10.4; c2 (o1) ̂ft(c2) = 8.1; c3 (o3) ̂ft(c3) = 11.6; c4 (o7) ̂ft(c4) = 12.6; c5 (o3, o7) ̂ft(c5) = 11.7; c6 (o1, o3, o7) ̂ft(c6) = 23.7; …; cx (o1, o3, o7) ̂ft(cx) = 9.6. The learned target model: ̂ft( ⋅ ) = 10.4 − 2.1o1 + 1.2o3 + 2.2o7 + 0.1o1o3 − 2.1o3o7 + 14o1o3o7. (A sketch of this exploitation sampling follows.)
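A sketch of this exploitation step: the influential terms extracted from the source define which options are worth toggling, and every other option stays at its default (here 0). With o1, o3, o7 influential, this enumerates 2**3 = 8 configurations instead of 2**10 = 1024 blind samples. The 0-based indices and the all-zeros default are assumptions:

import itertools

def exploitation_samples(influential_terms, num_options=10):
    """influential_terms: e.g. [(0,), (2,), (6,), (2, 6), (0, 2, 6)] for
    the o1, o3, o7 terms of f_hat_s; option indices are 0-based."""
    influential = sorted({i for t in influential_terms for i in t})
    for bits in itertools.product([0, 1], repeat=len(influential)):
        config = [0] * num_options      # non-influential stay at default
        for opt, bit in zip(influential, bits):
            config[opt] = bit
        yield tuple(config)

for c in exploitation_samples([(0,), (2,), (6,), (2, 6), (0, 2, 6)]):
    print(c)                            # 8 configurations, not 1024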
  • 71. Exploration vs Exploitation We also explore the configuration space using pseudo-random sampling to detect missing interactions
  • 72. For capturing options and interactions that appear only in the target, L2S relies on exploration (random sampling). [diagram: (i) extract knowledge from the source as a performance influence model; (ii) sample the target configurable system, combining exploitation of the transferred knowledge <Conf, Perf> with exploration] (A sketch of mixing exploration into the sample set follows.)
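A sketch of mixing exploration into the sample set, reusing exploitation_samples from the previous sketch; the explore/exploit split and the seed are assumptions:

import random

def l2s_samples(influential_terms, num_options=10, n_explore=16, seed=0):
    rng = random.Random(seed)
    exploit = list(exploitation_samples(influential_terms, num_options))
    explore = [tuple(rng.randint(0, 1) for _ in range(num_options))
               for _ in range(n_explore)]
    return exploit + explore            # exploitation + exploration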
  • 73–74. L2S transfers knowledge about the structure of performance models from the source (given) to guide the sampling in the target environment (learn): extract from the source model and data, reuse, and learn. [the slide embeds an excerpt of the ASE’17 paper’s “Intuition” section, which formalizes the configuration space C = Dom(F1) × ⋯ × Dom(Fd) with Dom(Fi) = {0, 1}, environments e = [w, h, v] drawn from E = W × H × V (workload, hardware, version), the performance model as a black-box function f : F × E → R learned from observations yi = f(xi) + εi with εi ∼ N(0, σi), and the empirical performance distribution pd : E → Δ(R)] ̂fs( ⋅ ) = 1.2 + 3o1 + 5o3 + 0.9o7 + 0.8o3o7 + 4o1o3o7
  • 75. Evaluation: other transfer learning approaches also exist.
  • 76. “Model shift” builds a model in the source and uses the shifted model “directly” for predicting the target [Pavel Valov, et al. “Transferring performance prediction models…”, ICPE’17]. [figure: source vs. target throughput distributions on a machine twice as fast] (A sketch follows.)
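A sketch of this baseline under the simplest assumed form of shift, a linear one: fit the source model on plentiful source data, then calibrate ft ≈ a·̂fs + b from only a handful of target measurements:

import numpy as np

def fit_shift(source_model, target_configs, target_y):
    """source_model: callable learned on the source environment."""
    preds = np.array([source_model(c) for c in target_configs])
    a, b = np.polyfit(preds, np.asarray(target_y, dtype=float), deg=1)
    return lambda c: a * source_model(c) + b    # shifted source model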
  • 77. “Data reuse” combines the data collected in the source with the data collected in the target in a “multi-task” setting to predict the target [P. Jamshidi, et al., “Transfer learning for improving model predictions ….”, SEAMS’17]. [diagram: measure data on a TurtleBot and in the Gazebo simulator, reuse the source data, and learn, e.g., f(o1, o2) = 5 + 3o1 + 15o2 − 7o1 × o2] (A sketch follows.)
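A simplified sketch of data reuse: pool source and target samples and learn jointly, with an indicator feature marking the environment. The actual approach uses a multi-task Gaussian process; a single pooled regressor with an environment flag is a stand-in here:

from sklearn.tree import DecisionTreeRegressor

def fit_data_reuse(src_configs, src_y, tgt_configs, tgt_y):
    # environment indicator: 0 = source, 1 = target
    X = [list(c) + [0] for c in src_configs] + \
        [list(c) + [1] for c in tgt_configs]
    y = list(src_y) + list(tgt_y)
    model = DecisionTreeRegressor().fit(X, y)
    return lambda c: model.predict([list(c) + [1]])[0]  # predict in target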
  • 78. Evaluation: Learning performance behavior of Machine Learning Systems ML system: https://pooyanjamshidi.github.io/mls
  • 79–81. Configurations of deep neural networks affect accuracy and energy consumption: differences of up to 72% in validation (test) error and 22X in energy. [figure: validation (test) error vs. energy consumption [J] for a CNN on the CNAE-9 data set] [the slide also embeds excerpts of a neural-architecture-search paper showing small and large CIFAR-10 models and an ImageNet model constructed from learned cells]
  • 82. Deep neural network as a highly configurable system. [Diagram: a feed-forward network (input, hidden, and output layers) alongside the DNN system development stack (network design, model compiler, hybrid deployment, OS/hardware), with configuration spanning neural search, hyper-parameters, deployment topology, and hardware optimization.]
  • 83. DNN measurements are costly: each sample costs ~1 h, and 4000 samples × 1 h ≈ 6 months.
  • 84. DNN measurements are costly: each sample costs ~1 h, and 4000 samples × 1 h ≈ 6 months. Yes, that's the cost we paid for conducting our measurements!
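A quick sanity check of that arithmetic, as a minimal Python sketch (the 4000-sample count and the ~1 h per sample are the figures from the slide):

# Back-of-the-envelope cost of the measurement campaign (figures from the slide).
samples = 4000          # configurations to measure
hours_per_sample = 1.0  # ~1 h to train and profile one DNN configuration

total_hours = samples * hours_per_sample
print(f"{total_hours:.0f} h = {total_hours / 24:.0f} days "
      f"= {total_hours / (24 * 30):.1f} months")  # ~167 days, ~5.6 months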
  • 85–87. L2S enables learning a more accurate model with fewer samples by exploiting knowledge from the source environment. [Plot: mean absolute percentage error vs. sample size for the convolutional neural network (DNN, hard environmental change), comparing L2S+GP, L2S+DataReuseTL, DataReuseTL, ModelShift, and Random+CART.]
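The y-axis in this plot is mean absolute percentage error (MAPE): a model is trained on n sampled configurations and evaluated on held-out ones. A minimal sketch of that protocol with a Gaussian process regressor, in the spirit of the L2S+GP variant; the synthetic data, kernel choice, and split sizes below are illustrative assumptions, not the paper's exact setup:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def mape(y_true, y_pred):
    # Mean absolute percentage error, as plotted on the y-axis.
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# X: binary configuration options; y: measured performance (synthetic stand-in).
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 10)).astype(float)
y = 1.0 + X @ rng.uniform(0.1, 2.0, size=10)  # strictly positive response

train, test = np.arange(50), np.arange(50, 500)  # n = 50 training samples
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6, normalize_y=True)
gp.fit(X[train], y[train])
print(f"MAPE with n=50: {mape(y[test], gp.predict(X[test])):.1f}%")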
  • 88–90. L2S may also help the data-reuse approach learn faster. [Plot: mean absolute percentage error vs. sample size for XGBoost (hard environmental change), comparing L2S+GP, L2S+DataReuseTL, DataReuseTL, ModelShift, and Random+CART.]
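One plausible reading of the data-reuse baseline, sketched below: pool the plentiful source-environment measurements with the few affordable target measurements and fit a single CART model, using an environment-indicator column so the learner can separate the two environments where they disagree. The data and names are illustrative assumptions:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(600, 10)).astype(float)
y_src = 1.0 + X @ rng.uniform(0.1, 2.0, size=10)         # cheap source environment
y_tgt = 0.5 + 1.3 * y_src + rng.normal(0, 0.1, len(X))   # related target environment

n_tgt = 30  # only a handful of expensive target measurements
X_pool = np.vstack([np.c_[X, np.zeros(len(X))],          # source rows, env = 0
                    np.c_[X[:n_tgt], np.ones(n_tgt)]])   # target rows, env = 1
y_pool = np.concatenate([y_src, y_tgt[:n_tgt]])

cart = DecisionTreeRegressor(min_samples_leaf=5).fit(X_pool, y_pool)
X_test = np.c_[X[n_tgt:], np.ones(len(X) - n_tgt)]       # predict in the target env
err = 100 * np.mean(np.abs((y_tgt[n_tgt:] - cart.predict(X_test)) / y_tgt[n_tgt:]))
print(f"MAPE in target environment: {err:.1f}%")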
  • 92–94. For some environmental changes, the similarity across environments may be too low, which results in "negative transfer". [Plot: mean absolute percentage error vs. sample size for Apache Storm (hard environmental change), comparing L2S+GP, L2S+DataReuseTL, DataReuseTL, ModelShift, and Random+CART.]
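A cheap guard against negative transfer is to measure a small set of configurations in both environments and check how correlated the two responses are before reusing any source data. A minimal sketch; the correlation threshold and the data are illustrative assumptions, not the paper's criterion:

import numpy as np
from scipy.stats import pearsonr

def transferable(y_source, y_target, threshold=0.5):
    # Heuristic check on configurations measured in BOTH environments:
    # low (or negative) correlation hints that reusing source data may
    # hurt the target model, i.e., cause negative transfer.
    r, _ = pearsonr(y_source, y_target)
    return r, r >= threshold

rng = np.random.default_rng(2)
y_src = rng.uniform(1, 10, size=20)
y_easy = 2.0 * y_src + rng.normal(0, 0.2, 20)   # strongly related environment
y_hard = rng.uniform(1, 10, size=20)            # essentially unrelated environment

print(transferable(y_src, y_easy))  # high r  -> transfer likely helps
print(transferable(y_src, y_hard))  # low r   -> risk of negative transfer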
  • 95. Why are performance models learned from L2S samples more accurate?
  • 96–97. The samples generated by L2S contain more information ("entropy <-> information gain"). [Plots: entropy [bits] vs. sample size for DNN, XGBoost, and Storm, comparing Max entropy, L2S, and Random sampling.]
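One way to make the entropy comparison concrete, assuming the plotted entropy is the Shannon entropy of the empirical distribution of the sampled (here: discretized, normalized) values; the binning scheme below is an assumption, not necessarily the paper's exact measure, and "Max entropy" is log2 of the sample size (every sample in its own bin):

import numpy as np

def empirical_entropy(values, bins=128):
    # Shannon entropy (bits) of the discretized empirical distribution of
    # `values` (assumed normalized to [0, 1]); higher = more informative.
    counts, _ = np.histogram(values, bins=bins, range=(0.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(3)
spread_out = rng.uniform(0, 1, 70)         # stand-in for well-spread L2S samples
clustered = rng.normal(0.5, 0.02, 70)      # stand-in for redundant samples

print(f"max entropy: {np.log2(len(spread_out)):.2f} bits")
print(f"spread out:  {empirical_entropy(spread_out):.2f} bits")
print(f"clustered:   {empirical_entropy(clustered):.2f} bits")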
  • 98. Limitations: limited number of subject systems and environmental changes; synthetic models (https://github.com/pooyanjamshidi/GenPerf); binary options only (non-binary options were encoded as binary); negative transfer.