On the value of Sampling and Pruning for SBSE

On the Value of Sampling and Pruning
for Search-Based Software Engineering
Jianfeng Chen (jchen37@ncsu.edu)
April 20 2018

How to better support SE planning + re-planning?
Plan (what to do) -> Re-plan (how to react to new circumstances)
● What features to include in the project? -> What features to include in v(i+1)?
● Assign software to a cloud environment. How? -> Adjust to cloud environment changes. How?
● What to test first? -> What to test next?

Problem: planning & re-planning can be very slow.
Running time SLOW
[Zhang’17] Yuanyuan Zhang, Mark Harman, and A Mansouri. The sbse repository: A repository and analysis of authors and research articles on search based software engineering. CREST Centre, UCL

Thesis Statement
For the optimization of SE planning and re-planning tasks,
● given appropriate separation operators*,
● then OverSampling And Pruning* (OSAP) is better
● than the mutation-based EVOLutionary* (EVOL) approach
● (where “better” is measured in terms of runtime, number of evaluations, and value of the final result).
* to be defined later in this talk

Roadmap
Introduction
EVOL
GALE
OSAP
├─ TopDown Bi-clustering
├─ Encoding Knowledge
└─ Random Anchors
Roadmap
● What is Search-based SE
● EVOL: Evolutionary algorithms
○ GALE: A geometric learner
● OSAP: Oversampling-and-pruning via Separation Operators

Publications & tools in this PhD program
(Slide labels: FINAL THESIS, THIS TALK; columns: Publications, Tools)
● [CLOUD18 Chen et al.] (Accept rate: 15%) RIOT: workflow scheduling tool
● [TSE18 Chen et al.] Sampling as a baseline for SBSE
● [IST17 Chen et al.] Beyond EA for SBSE
● [SSBSE16 Nair et al.] Accidental exploration for SBSE


SE = making choices among multiple (rival) objectives
● Deployments (improving QoS vs. reducing deployment cost)
○ CLOUD: cloud configuration optimization
● Testing (test cost vs. defects detected)
○ Fuzz testing: fewer test cases to cover more paths
● SE Planning (trading off functionality vs. cost)
○ NRP: next release requirements planning
○ SPL: software product lines, product selection

Search-based Software Engineering (SBSE) converts a software engineering problem into a computational search problem, and solves that.
[Figure: configurations a and b are mapped to their objective values f(a) and f(b).]

Configuration Space -> Objective Space
Dominance: p dominates q if and only if
● on every objective, p performs no worse than q, AND
● on at least one objective, p performs strictly better than q.
[Figure: points f(p), f(q), f(x) in the objective space, with the Pareto frontier marked.]
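The dominance definition can be sketched in a few lines. This is a minimal illustration, assuming every objective is to be minimized; the (time, cost) pairs are made-up values, not results from the talk.

```python
from typing import List, Sequence, Tuple


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """True if p dominates q, assuming every objective is minimized."""
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def pareto_frontier(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (time, cost) objective values for four configurations.
observed = [(1.0, 9.0), (2.0, 4.0), (3.0, 5.0), (6.0, 1.0)]
frontier = pareto_frontier(observed)  # (3.0, 5.0) is dominated by (2.0, 4.0)
```

Note that a point never dominates itself: it is no worse on every objective but strictly better on none.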

Characteristics of SBSE problems
● More than one objective
● Configuration space is huge
● Constrained configurations
● Complex (configurations are not easy to assess)
In the SBSE community: the evolutionary algorithm


Evolutionary algorithm (EVOL)
[Figure: initial configurations (population) -> best configurations]
● Treats the problem as a black box
● Easy to deploy to a new problem
● SLOW
○ Airspace operation model verification -- 7 days [Krall’15]
○ Test suite generation -- weeks [Yoo’12]
○ Software clone evaluation on a PC -- 15 years [Wang’13]
Krall, Joseph, Tim Menzies, and Misty Davies. "Learning the task management space of an aircraft approach model." (2014).
Yoo, Shin, and Mark Harman. "Regression testing minimization, selection and prioritization: a survey." Software Testing, Verification and Reliability 22.2 (2012): 67-120.
Wang, Tiantian, et al. "Searching for better configurations: a rigorous approach to clone evaluation." Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering. ACM, 2013.
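To make the black-box loop concrete, here is a minimal sketch of a generic mutation-based evolutionary optimizer. It is an illustration only, not any specific algorithm from the cited work: the mutation operator, the dominance-count selection, the [0, 1] decision range, and all parameter values are assumptions, and recombination is left out for brevity. Objectives are minimized.

```python
import random


def dominates(p, q):
    """True if p dominates q (all objectives minimized)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def evolve(evaluate, dim, pop_size=20, generations=10, rng=None):
    """Black-box loop: mutate, evaluate, keep the least-dominated candidates."""
    rng = rng or random.Random(0)
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    pop_scores = [evaluate(c) for c in pop]
    evaluations = pop_size
    for _ in range(generations):
        children = []
        for parent in pop:
            child = list(parent)
            i = rng.randrange(dim)  # perturb one decision, clipped to [0, 1]
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        child_scores = [evaluate(c) for c in children]
        evaluations += len(children)  # every child costs one model evaluation
        candidates = pop + children
        scores = pop_scores + child_scores
        # Rank each candidate by how many others dominate it (fewer is better).
        rank = [sum(dominates(s2, s1) for s2 in scores) for s1 in scores]
        order = sorted(range(len(candidates)), key=rank.__getitem__)[:pop_size]
        pop = [candidates[i] for i in order]
        pop_scores = [scores[i] for i in order]
    return pop, evaluations
```

The evaluation count grows with generations times population size, which is where the running time goes when a single model evaluation is expensive.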

Research directions in SBSE
1 Better configuration encoding
E.g. [Chang’08] Time-line based model for software project scheduling with genetic algorithms
Expert knowledge; carefully designed recombination/mutation
2 Combining EAs
E.g. [Tsai’14] A hyper-heuristic scheduling algorithm for cloud
GA+SA+ACO+PSO
Slow^2
3 Re-design objective functions
E.g. [Arcuri’17] Many Independent Objective algorithm for test suite generation
Much more complex model; longer time to evaluate
Chang, C. K., Jiang, H. Y., Di, Y., Zhu, D., & Ge, Y. (2008). Time-line based model for software project scheduling with genetic algorithms. Information and Software Technology, 50(11)
Tsai, Chun-Wei, et al. "A hyper-heuristic scheduling algorithm for cloud." IEEE Transactions on Cloud Computing 2.2 (2014): 236-250.
Arcuri, Andrea. "Many Independent Objective (MIO) Algorithm for Test Suite Generation." International Symposium on Search Based Software Engineering. Springer, Cham, 2017.

Current SBSE solutions are too slow!
Why do we need faster optimizers?
(Save $$$; respond faster to model changes)


GALE = Geometric active learner [Krall’15]
[Figure: configuration space (population) -> objective space (best configurations)]
[Krall’15] Krall, Joseph, Tim Menzies, and Misty Davies. "GALE: Geometric active learning for search-based software engineering." TSE


                EVOL         GALE
Population      N = 100      N = 100
Recombination   ✓            ✓
Mutation        ✓            ✓
Evaluation #    gen# * N     gen# * 2*log(N)
Complexity      O(G·N)   ->  O(G·logN)


Observation: the selected region of the configuration space did not shift much.
● It is not necessary to explore more generations.
● Instead, increase the population size [100 -> 10,000]: over-sampling.

OSAP: Oversampling and pruning
                EVOL         GALE               Over-sampling
Population      N = 100      N = 100            N = 10,000
Recombination   ✓            ✓                  ✘
Mutation        ✓            ✓                  ✘
Evaluation #    gen# * N     gen# * 2log(N)     2log(N)
Complexity      O(G·N)   ->  O(G·logN)      ->  O(logN)
Over-sampling: the population is much larger
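The evaluation counts in the table can be worked through numerically. The sketch below plugs in the population sizes from the slide; the number of generations G = 50 and the base-2 logarithm are assumptions made only for this illustration.

```python
import math

G = 50  # hypothetical number of generations; not stated on the slide


def evol_evaluations(generations: int, population: int) -> float:
    """EVOL: every generation evaluates the whole population."""
    return generations * population


def gale_evaluations(generations: int, population: int) -> float:
    """GALE: about 2*log(N) evaluations per generation (log base 2 assumed)."""
    return generations * 2 * math.log2(population)


def osap_evaluations(population: int) -> float:
    """Over-sampling: one pass, about 2*log(N) evaluations, no generations."""
    return 2 * math.log2(population)


evol = evol_evaluations(G, 100)     # 5000
gale = gale_evaluations(G, 100)     # roughly 664
osap = osap_evaluations(10_000)     # roughly 27
```

Under these assumptions the over-sampled population is 100 times larger, yet the number of model evaluations drops from thousands to a few dozen.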

Roadmap: separation operators
1 Top-down bi-clustering (algorithm, configuration space, study cases)

SWAY = Top-down bi-clustering
● (R): a randomly chosen initial configuration
● (W): the configuration furthest from (R)
● (E): the configuration furthest from (W)
The line from (W) to (E) is the “diameter” of the configuration space.
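The R/W/E construction above can be turned into a recursive pruning procedure. The following is a minimal sketch, not the published SWAY implementation: it assumes continuous configurations compared by Euclidean distance, minimized objectives, binary dominance to pick the better pole, and a hypothetical stopping size `enough`. Only the two poles are evaluated at each split.

```python
import math
import random


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dominates(p, q):
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def split(items, rng):
    """Pick poles W and E, project every item onto the W-E line, cut at the median."""
    r = rng.choice(items)                           # (R) random configuration
    west = max(items, key=lambda c: dist(c, r))     # (W) furthest from R
    east = max(items, key=lambda c: dist(c, west))  # (E) furthest from W
    c = dist(west, east)
    if c == 0:  # all items identical: nothing to separate
        return None

    def project(item):  # cosine rule: position of item along the W-E line
        a, b = dist(item, west), dist(item, east)
        return (a * a + c * c - b * b) / (2 * c)

    ordered = sorted(items, key=project)
    mid = len(ordered) // 2
    return west, east, ordered[:mid], ordered[mid:]


def sway(items, evaluate, rng, enough):
    """Recursively keep the half whose pole is better; evaluate only the poles."""
    if len(items) < enough:
        return list(items)
    parts = split(items, rng)
    if parts is None:
        return list(items)
    west, east, west_half, east_half = parts
    f_west, f_east = evaluate(west), evaluate(east)
    if dominates(f_west, f_east):
        return sway(west_half, evaluate, rng, enough)
    if dominates(f_east, f_west):
        return sway(east_half, evaluate, rng, enough)
    # Neither pole wins: keep exploring both halves.
    return sway(west_half, evaluate, rng, enough) + sway(east_half, evaluate, rng, enough)


# Usage with a made-up model: 1000 random 2-D configurations, one objective.
rng = random.Random(1)
population = [(rng.random(), rng.random()) for _ in range(1000)]
calls = []


def model(config):
    calls.append(config)  # count how often the (expensive) model is run
    return (config[0] ** 2 + config[1] ** 2,)


survivors = sway(population, model, rng, enough=32)
```

Because each split halves the candidates and costs two evaluations, the number of model runs stays logarithmic in the population size.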


Separation operator: 1 Top-down bi-clustering
Algorithm: SWAY
Configuration space: Continuous
Study cases: XOMO, POM3
Chen, Jianfeng, et al. "'Sampling' as a Baseline Optimizer for Search-based Software Engineering." IEEE Transactions on Software Engineering (2018).

Assumption so far: a small region of the configuration space can lead to the frontier.
What if it does not?
[Figure: configuration space -> objective space]

Perform the top-down bi-clustering separately.
[Figure: configuration space -> objective space]

SWAY2: separate via encoding knowledge
● How is the model encoded?
● How can we gather similar configurations?
Encoding: represent the model configuration as vectors, combinations, etc.

Software Product Line optimization
Objectives
Select features to develop such that...
● More features
● Fewer defects
● Lower total cost
● More familiar features

Configuration (feature model)
[Figure: feature model tree with mandatory features, optional features, and cross-tree constraints]

CNF (conjunctive normal form)
● Solvable by SAT solvers.
● Initialization via SAT solver.
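As an illustration of the CNF encoding, the sketch below writes a tiny, made-up feature model as clauses of signed integers (DIMACS-style literals) and enumerates its valid configurations. The four features and their constraints are hypothetical, and brute-force enumeration stands in for a real SAT solver, which is what a model of realistic size would need.

```python
from itertools import product


def satisfies(assignment, clauses):
    """A CNF formula holds when every clause has at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def valid_configurations(n_features, clauses):
    """Brute force over all 2^n assignments (only feasible for tiny models)."""
    valid = []
    for bits in product([False, True], repeat=n_features):
        assignment = {i + 1: bit for i, bit in enumerate(bits)}
        if satisfies(assignment, clauses):
            valid.append(bits)
    return valid


# Hypothetical model: 1 = root, 2 = mandatory child, 3 and 4 = optional children.
clauses = [
    [1],      # the root feature is always selected
    [-1, 2],  # root implies its mandatory child
    [-2, 1],  # a child implies its parent
    [-3, 1],
    [-4, 1],
    [-3, 4],  # cross-tree constraint: feature 3 requires feature 4
]
configs = valid_configurations(4, clauses)  # 3 of the 16 assignments are valid
```

Even in this toy model most assignments are invalid, which hints at why random mutation struggles on highly constrained spaces.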

HIGH DIMENSIONAL
HIGHLY CONSTRAINED

Related work (EVOL)
Timeline: White’08 -> Wu’11 -> Sayyad’13 -> Henard’15
Labels on the timeline: single objective, aggregated objectives, IBEA
White, Jules, Brian Dougherty, and Douglas C. Schmidt. "Filtered Cartesian Flattening: An Approximation Technique for Optimally Selecting Features while Adhering to Resource Constraints." SPLC (2). 2008.
Wu, Zhiqiao, et al. "An optimization model for reuse scenario selection considering reliability and cost in software product line development." International Journal of Information Technology & Decision Making 10.05 (2011): 811-841.
Sayyad, Abdel Salam, Tim Menzies, and Hany Ammar. "On the value of user preferences in search-based software engineering: a case study in software product lines." ICSE’13
Sayyad, Abdel Salam, et al. "Scalable product line configuration: A straw to break the camel's back." Automated Software Engineering (ASE), 2013
Henard, Christopher, et al. "Combining multi-objective search and constraint solving for configuring large software product lines." Software Engineering (ICSE), 2015

How is the model encoded? How can we gather similar configurations?
[Figure: configuration space and objective space as scale increases; shown at scale = 4.]

As scale increases
Radius ∝ scale
● Inner circle: smaller area, less diversity (simple configurations)
● Outer circle: larger area, more diversity (complex configurations)

Radius ∝ scale
● Smaller area: fewer configurations
● Larger area: more configurations
[Figure: configuration space -> objective space]

[Figure: x-axis is the number of constraints, i.e. the complexity of the model; series: state-of-the-art EVOL and SWAY2.]
SWAY2 is (orders of magnitude) faster than EVOL.
This is important when models become complex.

Quality indicators for the obtained frontiers [Wang’16]
● Pareto front size (PFS): # of obtained frontiers
● Hyper-volume (HV)
● Spread (GS)
Models (with the number shown on the slide): Webportal 81, Eshop 506, Fiasco 5228, Freebsd 62138, Linux 343944
Wang, Shuai, et al. "A practical guide to select quality indicators for assessing pareto-based search algorithms in search-based software engineering." Software Engineering (ICSE), 2016 IEEE/ACM 38th International Conference on. IEEE, 2016.
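Two of these indicators are easy to sketch for the two-objective case. The code below is an illustration under stated assumptions: both objectives are minimized, the reference point `ref` is chosen by the user, and the sample points are made up. Spread (GS) is not covered here.

```python
def pareto_front_2d(points):
    """Non-dominated points for two minimized objectives, sorted by the first."""
    front, best_y = [], float("inf")
    for x, y in sorted(set(points)):
        if y < best_y:  # strictly better second objective than all points so far
            front.append((x, y))
            best_y = y
    return front


def hypervolume_2d(points, ref):
    """Area dominated by the front and bounded by the reference point ref."""
    inside = [(x, y) for x, y in points if x < ref[0] and y < ref[1]]
    hv, prev_y = 0.0, ref[1]
    for x, y in pareto_front_2d(inside):
        hv += (ref[0] - x) * (prev_y - y)  # horizontal slab added by this point
        prev_y = y
    return hv


# Hypothetical obtained solutions; (3, 3) is dominated by (2, 2).
points = [(1, 3), (2, 2), (3, 1), (3, 3)]
pfs = len(pareto_front_2d(points))      # Pareto front size
hv = hypervolume_2d(points, ref=(4, 4)) # hyper-volume against reference (4, 4)
```

A larger hyper-volume means the front pushes further toward the ideal corner, so for HV larger is better.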

SWAY(*) vs. state-of-the-art
⬤ Statistically no different from SATIBEA
⬤ Significantly better than SATIBEA
⬤ Significantly worse than SATIBEA
Two result sets count as "not the same" when A12 >= 0.6 (Arcuri and Briand at ICSE’11).
Arcuri, Andrea, and Lionel Briand. "A practical guide for using statistical tests to assess randomized algorithms in software engineering." Software Engineering (ICSE), 2011 33rd International Conference on. IEEE, 2011.
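For reference, the Vargha-Delaney A12 statistic is the probability that a value drawn from one sample is larger than a value drawn from the other, with ties counted as half. The sketch below is a minimal version; the 0.6 threshold comes from the slide, while applying it symmetrically (to both A12 and 1 - A12) is an assumption of this example.

```python
def a12(xs, ys):
    """Vargha-Delaney A12: P(x > y) + 0.5 * P(x == y) over all pairs."""
    greater = sum(1 for x in xs for y in ys if x > y)
    equal = sum(1 for x in xs for y in ys if x == y)
    return (greater + 0.5 * equal) / (len(xs) * len(ys))


def differs(xs, ys, threshold=0.6):
    """Treat two samples as 'not the same' when the effect reaches the threshold."""
    a = a12(xs, ys)
    return max(a, 1 - a) >= threshold


same = a12([1, 2, 3], [1, 2, 3])    # 0.5: no effect
larger = a12([4, 5, 6], [1, 2, 3])  # 1.0: every x beats every y
```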

[Table: GS, PFS and HV for Webportal (81), Eshop (506), Fiasco (5228), Freebsd (62138) and Linux (343944), without and with encoding knowledge; each cell is a colored marker comparing the result against SATIBEA.]

Across all measures, in the majority of cases, SWAY2 is better than SATIBEA (EVOL).

Separation operators   1 Top-down bi-clustering   2 Encoding knowledge
Algorithm              SWAY                       SWAY2
Configuration space    Continuous                 Binary vector, highly constrained
Study cases            XOMO, POM3                 SPL

Q: How to find the complete frontier?
A: Increase the “resolution” of the separation.
However, we can’t evaluate too many configurations!

Select and evaluate a few “representative” configurations -- anchors.
# anchors << # init configurations
Choices of anchors:
★ 1 = the diagonal
★ 2 = random
★ 3 = 1 + 2

Select and evaluate a few “representative” configurations -- anchors.
Then use the evaluated anchors to guess the objectives of the other configurations.
Surrogate model: replace the original complex model with a very simple model/formula.
Similar configurations -> similar objectives.
● c: the configuration whose objectives we want to guess
● N: the anchor nearest to c, at distance p
● F: the anchor furthest from c, at distance Q
● On objective O1, place f(c) between f(N) and f(F) so that x : Y = p : Q, where x and Y are the corresponding distances in objective space.
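One possible reading of the x : Y = p : Q rule is linear interpolation between the nearest and furthest anchors. The sketch below assumes that x is the objective-space distance from f(c) to f(N), Y is the distance from f(c) to f(F), and distances in configuration space are Euclidean; the function name, the anchor coordinates, and the objective values are all hypothetical.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def guess_objectives(c, evaluated):
    """Guess f(c) from evaluated anchors; evaluated maps anchor -> objectives."""
    anchors = list(evaluated)
    nearest = min(anchors, key=lambda a: dist(a, c))
    furthest = max(anchors, key=lambda a: dist(a, c))
    p, q = dist(c, nearest), dist(c, furthest)
    if p + q == 0:  # c coincides with the only anchor
        return evaluated[nearest]
    w = p / (p + q)  # x : Y = p : Q  =>  f(c) sits a fraction p/(p+Q) of the way
    return tuple(
        fn + w * (ff - fn)
        for fn, ff in zip(evaluated[nearest], evaluated[furthest])
    )


# Two hypothetical anchors that were actually evaluated on the real model.
evaluated = {(0.0, 0.0): (0.0,), (10.0, 0.0): (100.0,)}
guess = guess_objectives((2.0, 0.0), evaluated)  # p = 2, Q = 8
```

The guess costs no model evaluation at all, which is what lets the number of evaluated anchors stay far below the number of initial configurations.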

Workflow deployments
Example: MONTAGE, a NASA workflow for generating custom images of the sky
[Figure: a workflow made of tasks]
Objectives
Select proper virtual machines to execute each task so that ...
● the workflow ends earlier
● the cloud service rental cost is lower
[Figure: configuration space]
RIOT: Randomized instance types

Finish time if we deploy the model to AWS using the median $$$
State-of-the-art method [Zhu’16]: EVOL based
Zhu, Zhaomeng, et al. "Evolutionary multi-objective workflow scheduling in cloud." IEEE Transactions on Parallel and Distributed Systems 27.5 (2016): 1344-1357.

[Figure: y = speedup (EVOL/RIOT) as the number of tasks increases; panels: Montage, Epigenomics, Inspiral, Cybershake, Sipht]

RIOT is much faster than the state-of-the-art (EVOL).

[Table: obtained frontiers measured by hyper-volume (HV) and spread (GS); bold blue values mark where RIOT performed as well as or better than state-of-the-art EVOL]
Across all measures, in the majority of cases, RIOT is statistically better than EVOL.

Recap
Separation operators   1 Top-down bi-clustering   2 Encoding knowledge                3 Random anchors
Algorithm              SWAY                       SWAY2                               RIOT
Configuration space    Continuous                 Binary vector, highly constrained   Enumerated
Study cases            XOMO, POM3                 SPL                                 Workflow config

Conclusion
For the optimization of SE planning and re-planning tasks,
● given appropriate separation operators,
● then over-sampling+pruning (OSAP) is better
● than the standard mutation+evolutionary (EVOL) approach