Talk at the 30th CREST Open Workshop on Search Based Software Testing (SBST) and Dynamic Symbolic Execution (DSE) -- http://crest.cs.ucl.ac.uk/cow/30/
University College London, January 2014
Video available at http://youtu.be/i4T2g-mdJ-U
A recent survey conducted among developers of the Apache, Eclipse, and Mozilla projects showed that the ability to recreate field failures is considered of fundamental importance when investigating bug reports. Unfortunately, the information typically contained in a bug report, such as memory dumps or call stacks, is usually insufficient for recreating the problem. Even more advanced approaches for gathering field data and helping in-house debugging tend to collect either too little information, and be ineffective, or too much information, and be inefficient. This talk presents two techniques that address these issues: BugRedux and SBFR. Both techniques aim to support in-house debugging of field failures by synthesizing, from execution data collected in the field, executions that mimic the observed field failures. BugRedux relies on symbolic execution, whereas SBFR leverages genetic programming. The talk discusses the two techniques' complementary strengths and weaknesses.
Field Failure Reproduction Using Symbolic Execution and Genetic Programming
1. FIELD FAILURE REPRODUCTION
USING SYMBOLIC EXECUTION AND
GENETIC PROGRAMMING
Alessandro (Alex) Orso
School of Computer Science – College of Computing
Georgia Institute of Technology
Partially supported by: NSF, IBM, and MSR
2. DSE
SBST
FIELD FAILURE REPRODUCTION
USING SYMBOLIC EXECUTION AND
GENETIC PROGRAMMING
Alessandro (Alex) Orso
School of Computer Science – College of Computing
Georgia Institute of Technology
Partially supported by: NSF, IBM, and MSR
4. DSE
SBST
FIELD FAILURE REPRODUCTION
USING SYMBOLIC EXECUTION AND
GENETIC PROGRAMMING
Field failures are unavoidable!
Alessandro (Alex) Orso
School of Computer Science – College of Computing
Georgia Institute of Technology
Partially supported by: NSF, IBM, and MSR
7. TYPICAL DEBUGGING PROCESS
Recent survey of Apache, Eclipse, and Mozilla developers:
Information on how to reproduce field failures is the most valuable, and most difficult to obtain, piece of information when investigating such failures.
[Zimmermann10]
Bug Repository
Very hard to
(1) reproduce
(2) debug
8. TYPICAL DEBUGGING PROCESS
OVERARCHING GOAL: help developers
(1) investigate field failures,
(2) understand their causes, and
(3) eliminate such causes.
9. OUR WORK SO FAR
Recording and replaying executions
[icsm 2007, icse 2007]
✘
Input minimization
[woda 2006, icse 2007]
Input anonymization
[icse 2011]
Mimicking field failures
[icse 2012, icst 2014]
Explaining field failures
[issta 2013, TR]
10. MIMICKING FIELD FAILURES
User run (R), with failure F: in the field
Mimicked run (R’), with failure F’: in house
• F’ is analogous to F
• R’ is an actual execution
14. BUGREDUX
Joint work with Wei Jin
Crash report (execution data)
Synthesized Executions
15. BUGREDUX
Joint work with Wei Jin
Crash report (execution data)
Test Input
16. BUGREDUX
Joint work with Wei Jin
Pipeline: Crash report (execution data), Input generator, Candidate input, Oracle, Test Input
Execution data:
• Point of failure (POF)
• Failure call stack
• Call sequence
• Complete trace
Input generation technique:
• Guided symbolic execution
17. ALGORITHM (SIMPLIFIED)
Input:
  icfg for P
  goals (list of code locations)
Output:
  I (candidate input)
State:
  statesSet = {<cl, pc, ss, goal>}
Main algorithm:
  init; currGoal = first(goals)
  repeat
    currState = SelNextState()
    if (!currState) backtrack or fail
    if (currState.cl == currGoal)
      if (currGoal == last(goals))
        return solve(currState.pc)
      else
        currGoal = next(goals)
        currState.goal = currGoal
    symbolicallyExec(currState)
SelNextState:
  minDist = ∞
  retState = null
  foreach state in statesSet
    if (state.goal == currGoal)
      if (state.cl can reach currGoal)
        d = |shortest path state.cl, currGoal|
        if (d < minDist)
          minDist = d
          retState = state
  return retState
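The state-selection heuristic above can be sketched in plain Python. This is a minimal illustration, not the actual BugRedux implementation: the ICFG is a toy adjacency map, states are dictionaries with hypothetical `cl` (code location) and `goal` fields, and the symbolic-execution machinery is omitted.

```python
from collections import deque

def shortest_dist(icfg, src, dst):
    """BFS distance from src to dst in the (toy) interprocedural CFG; None if unreachable."""
    if src == dst:
        return 0
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, d = frontier.popleft()
        for succ in icfg.get(node, ()):
            if succ == dst:
                return d + 1
            if succ not in seen:
                seen.add(succ)
                frontier.append((succ, d + 1))
    return None

def sel_next_state(states, curr_goal, icfg):
    """SelNextState: pick the state whose location is closest to the current goal."""
    best, min_dist = None, float("inf")
    for state in states:
        if state["goal"] != curr_goal:
            continue  # state is chasing a different goal
        d = shortest_dist(icfg, state["cl"], curr_goal)
        if d is not None and d < min_dist:  # state.cl can reach currGoal
            min_dist, best = d, state
    return best
```

For example, with `icfg = {"a": ["b"], "b": ["c"], "x": ["c"]}` and two states at locations `"a"` and `"x"` both aiming for goal `"c"`, the state at `"x"` is selected (distance 1 versus 2), which is exactly the guidance that steers symbolic execution toward the next goal in the list.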
18. ALGORITHM (SIMPLIFIED)
Same algorithm, with the key design choices highlighted:
• Heuristics/optimizations to reduce the symbolic input space
• Dynamic analysis: runtime information used to prune the search space
• Program analysis: shortest-path computation
• Some randomness in the state selection
20. BUGREDUX EVALUATION – FAILURES CONSIDERED
Name        Repository   Size(KLOC)   # Faults
sed         SIR          14           2
grep        SIR          10           1
gzip        SIR          5            2
ncompress   BugBench     2            1
polymorph   BugBench     1            1
aeon        exploit-db   3            1
glftpd      exploit-db   6            1
htget       exploit-db   3            1
socat       exploit-db   35           1
tipxd       exploit-db   7            1
aspell      exploit-db   0.5          1
exim        exploit-db   241          1
rsync       exploit-db   67           1
xmail       exploit-db   1            1
None of the faults can be discovered by vanilla KLEE with a timeout of 72 hours.
21. BUGREDUX EVALUATION – RESULTS
Failures: sed #1, sed #2, grep, gzip #1, gzip #2, ncompress, polymorph, aeon, rsync, glftpd, htget, socat, tipxd, aspell, xmail, exim
Execution data: POF, Call Stack, Call Seq., Compl. Trace
One of three outcomes:
✘: fail
∼: synthesize
✔: (synthesize and) mimic
23. BUGREDUX EVALUATION – RESULTS
Name        POF   Call Stack   Call Seq.   Compl. Trace
sed #1      ✘     ✘            ✔           ✘
sed #2      ✘     ✘            ✔           ✘
grep        ✘     ∼            ✔           ✘
gzip #1     ✔     ✔            ✔           ✘
gzip #2     ∼     ∼            ✔           ✘
ncompress   ✔     ✔            ✔           ✘
polymorph   ✔     ✔            ✔           ✘
aeon        ✔     ✔            ✔           ✔
rsync       ✘     ✘            ✔           ✘
glftpd      ✔     ✔            ✔           ✘
htget       ∼     ∼            ✔           ✘
socat       ✘     ✘            ✔           ✘
tipxd       ✔     ✔            ✔           ✘
aspell      ∼     ∼            ✔           ✘
xmail       ✘     ✘            ✔           ✘
exim        ✘     ✘            ✔           ✔
Summary: POF: Synth. 9/16, Mimic 6/16; Call Stack: Synth. 10/16, Mimic 6/16; Call Seq.: Synth. 16/16, Mimic 16/16; Compl. Trace: Synth. 2/16, Mimic 2/16
Observations:
• Faults can be distant from the failure points => POFs and call stacks unlikely to help
• More information is not always better
• Symbolic execution can be a limiting factor
24. BUGREDUX EVALUATION – RESULTS
(Results table as on the previous slide.)
Observations: symbolic execution can be less effective on:
• programs with highly structured inputs
• programs that interact with external libraries
• large, complex programs in general
25. SBFR
Joint work with Kifetew, Jin, Tonella
Crash report (execution data)
Execution data:
• Call sequence
Input generation technique:
• Genetic programming
Test Input
26. SBFR
Joint work with Kifetew, Jin, Tonella
Crash report (execution data)
Grammar: <a> ::= <b> | λ
Test Input
27. SBFR
Joint work with Kifetew, Jin, Tonella
Crash report (execution data)
Grammar: <a> ::= <b> | λ
Derivation Tree
Genetic Programming
Sentence derivation from the grammar:
Random application of grammar rules
• Uniform
• 80/20
• Stochastic (from a corpus)
Test Input
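The random sentence derivation above can be sketched as follows. This is a minimal illustration under stated assumptions: the grammar below is a made-up toy (in the style of the slide's <a> ::= <b> | λ), and `weights` stands in for rule probabilities, whether uniform, 80/20, or estimated from a corpus; it is not the SBFR implementation.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to its alternative
# right-hand sides (lists of symbols).
GRAMMAR = {
    "<expr>": [["<num>"], ["<expr>", "+", "<num>"]],
    "<num>":  [["1"], ["2"], ["3"]],
}

def derive(symbol, grammar, weights=None, depth=0, max_depth=10):
    """Randomly expand `symbol` into a sentence by applying grammar rules.

    `weights` gives per-nonterminal rule probabilities (e.g. uniform, 80/20,
    or stochastic frequencies from a corpus); with no weights the choice is
    uniform. Past `max_depth`, always take the first alternative so the
    derivation terminates (assumes the first alternative is non-recursive).
    """
    if symbol not in grammar:  # terminal symbol: emit as-is
        return symbol
    rules = grammar[symbol]
    if depth >= max_depth:
        rhs = rules[0]
    elif weights:
        rhs = random.choices(rules, weights=weights[symbol])[0]
    else:
        rhs = random.choice(rules)
    return "".join(derive(s, grammar, weights, depth + 1, max_depth) for s in rhs)
```

An 80/20 strategy, for instance, would pass `weights={"<expr>": [80, 20], "<num>": [1, 1, 1]}` to bias the derivation toward short sentences.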
28. SBFR
Joint work with Kifetew, Jin, Tonella
Crash report (execution data)
Grammar: <a> ::= <b> | λ
Derivation Tree
Genetic Programming
Sentence derivation from the grammar:
Random application of grammar rules (uniform, 80/20, or stochastic from a corpus)
Evolution:
Fitness function: distance b/w execution traces (candidate vs. actual failure)
Test Input
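One plausible instantiation of such a fitness function, sketched below, scores a candidate input by how close its execution trace (e.g. its call sequence) is to the trace of the actual field failure, using Levenshtein distance. This is an illustrative assumption, not necessarily the exact distance SBFR uses.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences, via dynamic programming
    over a rolling row of the edit matrix."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,        # deletion
                            curr[j - 1] + 1,    # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]

def fitness(candidate_trace, failure_trace):
    """Fitness of a candidate input: 1.0 when its trace matches the failing
    trace exactly, approaching 0.0 as the traces diverge."""
    n = max(len(candidate_trace), len(failure_trace)) or 1
    return 1.0 - edit_distance(candidate_trace, failure_trace) / n
```

The genetic-programming loop would then keep and recombine the derivation trees whose sentences score highest, driving candidates toward executions that follow the recorded call sequence.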
29. SBFR
Joint work with Kifetew, Jin, Tonella
Crash report (execution data)
Grammar: <a> ::= <b> | λ
Derivation Tree ✔︎
Genetic Programming
Stopping criterion:
• Success
  • The candidate input reaches the point of failure
  • The program fails “in the same way”
• Search budget exhausted
Test Input
30. SBFR EVALUATION – FAILURES CONSIDERED
Name    Language   Size(KLOC)   # Productions   # Faults
calc    Java       2            38              2
bc      C          12           80              1
MSDL    Java       13           140             5
PicoC   C          11           194             1
Lua     C          17           106             2
31. SBFR EVALUATION – FAILURES CONSIDERED
Name    Language   Size(KLOC)   # Productions   # Faults
calc    Java       2            38              2
bc      C          12           80              1
MSDL    Java       13           140             5
PicoC   C          11           194             1
Lua     C          17           106             2
BugRedux was unable to reproduce any of these failures with a timeout of 72 hours.
35. SBFR EVALUATION – RESULTS
Name         FRP (SBFR)   FRP (Random)
calc bug 1   0.6          0.0
calc bug 2   0.8          0.0
bc           1.0          0.0
Lua bug 1    0.0          0.0
Lua bug 2    0.5          0.0
MSDL bug 1   1.0          0.0
MSDL bug 2   1.0          0.0
MSDL bug 3   1.0          0.0
MSDL bug 4   1.0          0.0
MSDL bug 5   1.0          0.0
PicoC        0.8          0.1
Example: the failure in bc is a segmentation fault triggered by an instruction sequence that allocates at least 32 arrays and declares a number of variables higher than the number of allocated arrays.
36. SBFR EVALUATION – RESULTS
(Results table as on the previous slide.)
Observations:
• Search-based approaches can be effective in cases that symbolic execution cannot handle
• Stochastic search is more scalable, but less directed, than symbolic execution
=> SBST and DSE are complementary, rather than alternative, techniques
37. FUTURE WORK / FOOD FOR THOUGHT
• Relevant execution data identification
  • Which types?
  • Which specific ones?
• Failure explanation
  • Reproduction is not enough
  • Can DSE and SBST help?
• Use of different input generation techniques
  • Grammar-based symbolic execution
  • Backward symbolic analysis?
  • Other SBST approaches?
  • SBST targeted at different kinds of programs?
• Combination of techniques