Modern systems (e.g., deep neural networks, big data analytics, and compilers) are highly configurable, which means they expose different performance behavior under different configurations. The fundamental challenge is that one cannot simply measure all configurations due to the sheer size of the configuration space. Transfer learning has been used to reduce measurement effort by transferring knowledge about the performance behavior of systems across environments. Previous research has shown that statistical models are indeed transferable across environments. In this work, we investigate identifiability and transportability of causal effects and statistical relations in highly-configurable systems. Our causal analysis agrees with previous exploratory analysis~\cite{Jamshidi17} and confirms that the causal effects of configuration options can be carried over across environments with high confidence. We expect that the ability to carry over causal relations will enable effective performance analysis of highly-configurable systems.
3. Empirical observations confirm that systems are becoming increasingly configurable

Modern systems:
§ Increasingly configurable with software evolution
§ Deployed in dynamic and uncertain environments

[Tianyin Xu, et al., “Too Many Knobs…”, FSE’15]
[Figure 1 (from Xu et al.): the number of configuration parameters grows with software evolution; the Apache plot spans releases 1.3.14–2.3.4 (1998–2014) with the parameter count rising toward 600. Storage-A is a commercial storage system.]
4. Influence of options is typically significant

[Plot: latency (ms), roughly 150–300, as a function of the number of counters and number of splitters in Apache Storm.]

Only by tweaking 2 options out of 200 in Apache Storm, we observed ~100% change in latency.
5. How does transfer learning come to the scene?

[Diagram: transferable knowledge is extracted from a data model learned on the source (given) and reused when learning on the target.]

§ An ML approach uses the knowledge learned on the source…
§ …to learn a cheaper model for the target

[Pooyan Jamshidi, et al., “Transfer Learning for Performance Analysis…”, ASE’17]
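The extract/reuse idea above can be sketched in a few lines. Everything below is a hypothetical illustration of model-based transfer, not the ASE’17 method itself: the data, the linear source model, and the scale-plus-offset reuse scheme are all our assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Configurations: rows are samples, columns are binary options.
X_source = rng.integers(0, 2, size=(100, 5)).astype(float)  # cheap to measure
X_target = rng.integers(0, 2, size=(20, 5)).astype(float)   # expensive to measure

# Hypothetical responses: the target behaves like a scaled/shifted source.
w = np.array([3.0, -1.0, 0.5, 0.0, 2.0])
y_source = X_source @ w + rng.normal(0, 0.1, size=100)
y_target = 1.5 * (X_target @ w) + 4.0 + rng.normal(0, 0.1, size=20)

# Step 1 (extract): learn the source model with ordinary least squares.
beta_s, *_ = np.linalg.lstsq(X_source, y_source, rcond=None)

# Step 2 (reuse): regress the few target measurements on the source
# model's predictions, so only two parameters are fit on target data.
preds = X_target @ beta_s
A = np.column_stack([preds, np.ones(len(preds))])
(scale, offset), *_ = np.linalg.lstsq(A, y_target, rcond=None)

def predict_target(x):
    """Cheap target-performance prediction for a configuration x."""
    return scale * (x @ beta_s) + offset
```

The point of the sketch: the target model needs only two parameters once the source model is reused, which is why far fewer (costly) target measurements suffice.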
6. (Javidian, Jamshidi, Valtorta. AAAI Spring Symposium 2019, Stanford, CA.)

[Diagram: a causal model as the transferable knowledge between source and target environments.]
II. INTUITION

Understanding the performance behavior of configurable systems can enable (i) performance debugging, (ii) performance tuning, (iii) design-time evolution, or (iv) runtime adaptation. We lack empirical understanding of how the performance behavior of a system will vary when the environment of the system changes. Such empirical understanding will provide important insights to develop faster and more accurate learning techniques that allow us to make predictions and optimizations of performance for highly configurable systems in changing environments [10]. For instance, we can learn the performance behavior of a system on a cheap hardware in a controlled lab environment and use that to understand the performance behavior of the system on a production server before shipping to the end user. More specifically, we would like to know what the relationship is between the performance of a system in a specific environment (characterized by software configuration, hardware, workload, and system version) and the performance of the same system when we vary its environmental conditions.

In this research, we aim for an empirical understanding of performance behavior to improve learning via an informed sampling process. In other words, we aim at learning a performance model in a changed environment based on a well-suited sampling set that has been determined by the knowledge we gained in other environments. Therefore, the main research question is whether there exists a common information (transferable knowledge) that applies to both source and target environments of systems and therefore can be carried over from one to the other.
A. Preliminary concepts
In this section, we provide formal definitions of four concepts that we use throughout this study. The formal notations enable us to concisely convey concepts throughout the paper.
1) Configuration and environment space: Let Fi indicate the i-th feature of a configurable system A, which is either enabled or disabled, and one of them holds by default. The configuration space is mathematically a Cartesian product of all the features C = Dom(F1) × ··· × Dom(Fd), where Dom(Fi) = {0, 1}. A configuration of a system is then a member of the configuration space (feature space) where all the parameters are assigned to a specific value in their range (i.e., complete instantiations of the system’s parameters). We also describe an environment instance by 3 variables e = [w, h, v] drawn from a given environment space E = W × H × V, where they respectively represent sets of possible values for workload, hardware, and system version.
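As a concrete toy instance of these definitions, both spaces can be enumerated directly; the feature count and the workload/hardware/version names below are made up for illustration.

```python
from itertools import product

d = 3  # three binary features F1..F3
C = list(product([0, 1], repeat=d))   # configuration space {0,1}^d

W = ["small-load", "large-load"]      # possible workloads (hypothetical)
H = ["laptop", "server"]              # possible hardware
V = ["v1.0", "v2.0"]                  # possible system versions
E = list(product(W, H, V))            # environment space W x H x V

# A configuration is one member of C, e.g. (1, 0, 1); an environment
# instance is one member of E, e.g. ("small-load", "server", "v2.0").
```

Note that |C| = 2^d grows exponentially in the number of options, which is exactly why measuring every configuration is infeasible for real systems.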
2) Performance model: Given a software system A with configuration space F and environmental instances E, a performance model is a black-box function f : F × E → R given some observations of the system performance for each combination of system’s features x ∈ F in an environment e ∈ E. To construct a performance model for a system A with configuration space F, we run A in environment instance e ∈ E on various combinations of configurations xi ∈ F, and record the resulting performance values yi = f(xi) + εi, xi ∈ F, where εi ∼ N(0, σi). The training data for our regression models is then simply Dtr = {(xi, yi)}_{i=1}^{n}. In other words, a response function is simply a mapping from the input space to a measurable performance metric that produces interval-scaled data (here we assume it produces real numbers).
3) Performance distribution: For the performance model, we measured and associated the performance response to each configuration; now let us introduce another concept where we vary the environment and measure the performance. An empirical performance distribution is a stochastic process, pd : E → Δ(R), that defines a probability distribution over performance measures for each environmental condition. To construct a performance distribution for a system A with configuration space F, similarly to the process of deriving the performance models, we run A on various combinations of configurations xi ∈ F, for a specific environment instance e ∈ E, and record the resulting performance values yi. We then fit a probability distribution to the set of measured performance values.
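The construction just described can be sketched as follows; the measurement function and all numbers are hypothetical stand-ins for actually running the system, and the Gaussian fit is just the simplest choice of distribution.

```python
import random
import statistics

random.seed(1)

def measure(config):
    """Stand-in for running system A once and recording performance."""
    return 100.0 + 30.0 * config[0] - 10.0 * config[1] + random.gauss(0, 2)

# Sample configurations x_i from F for one fixed environment e.
configs = [(random.randint(0, 1), random.randint(0, 1)) for _ in range(200)]
ys = [measure(c) for c in configs]

# Fit a distribution to the measured values (here: a Gaussian).
mu = statistics.fmean(ys)
sigma = statistics.stdev(ys)
```

Repeating this for each environment e ∈ E yields the map from environments to performance distributions that pd denotes.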
[Diagram: causal structure as transferable knowledge — a causal graph over options O1…O5, performance P, and selection S is extracted from observational and interventional data in the source and reused in the target to answer pr(P | do(Ci)) = ?]
[Diagram: combining interventional and observational data to estimate the causal effect of configuration options on performance.]

How do causal inference tools come to the scene?
7. Research Questions

Is it possible to identify causal relations from observational data, and how generalizable are they in highly-configurable systems?

• RQ1 (Identifiability): Is it possible to estimate causal effects of configuration options on performance from observational studies alone?
• RQ2 (Transportability): Is the causal effect of influential configuration options on performance transportable across environments?
• RQ3 (Recoverability): Is it possible to recover conditional probabilities from selection-biased data to the entire population?
8. RQ1 (Identifiability): Is it possible to estimate causal effects of configuration options on performance from observational studies alone?

Example: P(encoding-time | do(visualize = 1)) = P(encoding-time | visualize = 1), with mean 0.37 and variance 0.14.
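RQ1 rests on adjustment results from causal inference: when a valid adjustment set Z exists, P(y | do(x)) = Σ_z P(y | x, z) P(z) is estimable from observational data alone. A self-contained toy example (synthetic data, not the study’s) shows the adjusted estimate recovering the true effect while the naive conditional is biased:

```python
import random

random.seed(0)

# Observational data with a confounder Z -> X and Z -> Y; X itself has
# no effect on Y, so the true P(y=1 | do(x=1)) is 0.5*0.7 + 0.5*0.3 = 0.5.
samples = []
for _ in range(100_000):
    z = random.random() < 0.5
    x = random.random() < (0.8 if z else 0.2)   # option setting, confounded by Z
    y = random.random() < (0.7 if z else 0.3)   # performance, driven by Z only
    samples.append((z, x, y))

def p(pred):
    return sum(1 for s in samples if pred(s)) / len(samples)

def p_cond(pred, cond):
    rows = [s for s in samples if cond(s)]
    return sum(1 for s in rows if pred(s)) / len(rows)

# Naive observational estimate P(y=1 | x=1): biased upward (~0.62).
naive = p_cond(lambda s: s[2], lambda s: s[1])

# Backdoor adjustment: P(y | do(x=1)) = sum_z P(y | x=1, z) P(z) (~0.5).
adjusted = sum(
    p_cond(lambda s: s[2], lambda s, zv=zv: s[1] and s[0] == zv)
    * p(lambda s, zv=zv: s[0] == zv)
    for zv in (False, True)
)
```

In the configurable-systems setting, Z plays the role of other options or environment variables that influence both the option under study and performance.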
9. RQ1: Results and Implications

Results:
§ Small number of influential configuration options
§ P(perf | do(O_i = o')) is estimable in environments with a single performance measurement
§ P(perf | do(O_i = o')) is estimable in environments with multiple performance measurements

Implications:
§ Leading to effective exploration strategies
10. RQ2 (Transportability): Is the causal effect of influential configuration options on performance transportable across environments?
11. RQ2: Results and Implications

Results:
§ Trivial transportability: O_i → Perf ← S
§ Small environmental changes lead to transportability of causal relations
§ With severe environmental changes, transportability of some causal relations is still possible

Implications:
§ Running new costly experiments in the target environment can be avoided
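The implication above follows from the transport formula of Pearl and Bareinboim: when environments differ only through a covariate Z, the target effect composes source experiments with target observations, so no target experiments are needed. A minimal numeric sketch (all probabilities hypothetical):

```python
# From experiments in the SOURCE environment: P(y=1 | do(x=1), z).
p_y_do_x_given_z = {0: 0.30, 1: 0.80}

# From purely observational data in the TARGET environment: P*(z).
p_z_target = {0: 0.25, 1: 0.75}

# Transport formula: P*(y | do(x)) = sum_z P(y | do(x), z) * P*(z).
p_y_do_x_target = sum(p_y_do_x_given_z[z] * p_z_target[z] for z in (0, 1))
# -> 0.30 * 0.25 + 0.80 * 0.75 = 0.675
```

Here Z could stand for a workload or hardware characteristic whose distribution shifts between environments while the z-specific causal mechanism stays the same.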
12. RQ3 (Recoverability): Is it possible to recover conditional probabilities from selection-biased data to the entire population?
13. RQ3: Results and Implications

Results:
§ Recoverability without external data is possible
§ Small sample size may lead to unrecoverable selection bias

Implications:
§ Cost-efficient sampling for performance prediction of configurable systems
§ Avoiding biased estimates of causal/statistical effects
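Recoverability can be illustrated with a toy simulation (our own synthetic setup, not the study’s data): when inclusion in the sample S depends only on X, P(y | x) in the full population equals P(y | x, S = 1), so it is recoverable from the biased sample without any external data.

```python
import random

random.seed(2)

population, biased_sample = [], []
for _ in range(200_000):
    x = random.random() < 0.5
    y = random.random() < (0.9 if x else 0.2)   # Y depends on X
    s = random.random() < (0.8 if x else 0.1)   # selection depends on X only
    population.append((x, y))
    if s:                                       # only selected rows are observed
        biased_sample.append((x, y))

def p_y_given_x(data, xv):
    rows = [r for r in data if r[0] == xv]
    return sum(1 for r in rows if r[1]) / len(rows)

true_p = p_y_given_x(population, True)        # ground truth, ~0.9
recovered = p_y_given_x(biased_sample, True)  # same value, from biased data
```

If selection also depended on Y, the two estimates would diverge and P(y | x) would not be recoverable from the biased sample alone, which mirrors the deck’s caveat about unrecoverable selection bias.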