1. Finding Bugs Efficiently with a SAT Solver
Julian Dolby, Mandana Vaziri, Frank Tip
IBM Thomas J. Watson Research Center
FSE 2007—Dubrovnik, Croatia—September 6, 2007
3. Background—Finding Bugs
• Finding bugs must balance coverage and precision
– Tools must find bugs to be useful
– Tools must minimize false positives to be useful
4. Background—Finding Bugs
• Testing approaches find real bugs
– Each test is an actual execution, so all bugs real
– Coverage depends upon test suite, so often spotty
5. Background—Finding Bugs
• Conservative static analysis approaches find all bugs
– Conservative approximation of all executions
– Can find false positives due to approximation
6. Background—Systematic Underapproximation
• Systematic underapproximation blends testing and analysis
– Explores all possible concrete executions within a finite set
– Real bugs and total coverage within that set of executions
7. Background—Systematic Underapproximation
• Key issue is choosing a “good” set of executions
– Set must be tractable
– Set must cover interesting range of all executions
• Use “small scope hypothesis” to bound set of executions
– Hypothesis: most data structure bugs need only a few objects
– E.g., collection bugs can be seen by inserting a few elements
– Explore executions with small heaps (also bound loops)
[Figure labels: “(small scope)” and “(other approximation)”]
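The small-scope idea can be illustrated with a direct enumeration. The sketch below drives a set with every insertion sequence of bounded length over a small element domain and checks one invariant after each sequence. The class name `SmallScopeDemo`, the bounds, and the invariant are illustrative assumptions, and the sketch enumerates executions explicitly rather than encoding them for a SAT solver.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo: explore every insertion sequence within a small bound.
final class SmallScopeDemo {
    // Returns how many sequences were explored; throws if the invariant fails.
    static int explore(int domainSize, int maxLength) {
        int[] explored = {0};
        extend(new ArrayList<>(), domainSize, maxLength, explored);
        return explored[0];
    }

    // Checks the current sequence, then extends it by each element of the domain.
    private static void extend(List<Integer> seq, int domainSize, int maxLength, int[] explored) {
        check(seq);
        explored[0]++;
        if (seq.size() == maxLength) return;
        for (int v = 0; v < domainSize; v++) {
            seq.add(v);
            extend(seq, domainSize, maxLength, explored);
            seq.remove(seq.size() - 1); // remove by index: backtrack
        }
    }

    // Invariant: after inserting seq, the set size equals the number of distinct elements.
    private static void check(List<Integer> seq) {
        Set<Integer> set = new HashSet<>();
        for (int v : seq) set.add(v);
        long distinct = seq.stream().distinct().count();
        if (set.size() != distinct) throw new AssertionError("size mismatch for " + seq);
    }

    public static void main(String[] args) {
        System.out.println("sequences explored: " + explore(3, 3));
    }
}
```

Within the chosen bounds every execution is covered, so any violation found is a real one; executions outside the bounds are simply not examined.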
8. Background—Finding Bugs with SAT
• Given a program and a specification, find a concrete counterexample
• Use relational first-order logic (FOL) (Alloy [Jackson et al])
– Universe of atoms, and bounded relations over atoms
– Relational operators, first-order formulae over relations
– Tool (Kodkod[Torlak et al]) translates to CNF for SAT solver
• Java in Relational FOL [Vaziri et al] [Taghdiri et al] [Dennis et al]
– Heap objects as atoms, types as unary relations
– Fields as binary relations, code as formulae
– Encode integer operations using bit sets
– Solve FOL formula program ∧ ¬(spec)
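As a rough illustration of the relational view of the heap, the sketch below stores fields as sets of pairs over string atoms and implements the relational join used to dereference a field. The class `RelationalHeap`, the atom names, and the representation with `Set<List<String>>` are assumptions made for this example; a real encoding leaves the relations symbolic for the solver instead of fixing them.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: heap objects as atoms, fields as binary relations.
final class RelationalHeap {
    // Relational join of a set with a binary relation:
    // s.r = { b | a in s and <a, b> in r }
    static Set<String> join(Set<String> s, Set<List<String>> r) {
        Set<String> out = new LinkedHashSet<>();
        for (List<String> tuple : r) {
            if (s.contains(tuple.get(0))) out.add(tuple.get(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Field relations for a one-entry heap (atom names are illustrative).
        Set<List<String>> key = Set.of(List.of("E1", "K1"));
        Set<List<String>> next = Set.of(List.of("E1", "Null"));

        // The Java expression e.key corresponds to the join {E1}.key
        System.out.println(join(Set.of("E1"), key));
        // The Java expression e.next corresponds to the join {E1}.next
        System.out.println(join(Set.of("E1"), next));
    }
}
```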
9. Miniatur Contributions
• Supports much larger range of integers than previous work
– Integers represented by one atom per power of 2
– Enables e.g. use of hashCode in updating a collection
• Novel sparse representation of array objects
– Allows subset of array indices to have values
– Enables Miniatur to handle array-based collections
• Novel sliced translation based on control dependence
• General theoretical condition for slicing relational formulae
• Demonstration on real collections and a few real programs
– Arrays, integers enable analyzing all Java collections
– Integers allow checking Java equality contracts
10. Example—HashMap Data
class HashMap ... {
  Entry[] table; ...
  class Entry {
    Object key;
    Object value;
    int hash;
    Entry next;
    ...}}
HashMap ← {H1}
Array ← {A1}
Entry ← {E1}
Object ← {A1, E1, H1, K1, V1}
table ← {⟨H1, A1⟩}
key ← {⟨E1, K1⟩}
value ← {⟨E1, V1⟩}
hash ← {⟨E1, #65⟩}
I ← {⟨A1, N1, #5⟩}
V ← {⟨A1, N1, E1⟩}
11. Example—HashMap Data
class HashMap ... {
  Entry[] table; ...
  class Entry {
    Object key;
    Object value;
    int hash;
    Entry next;
    ...}}
HashMap ← {H1}
Array ← {A1}
Entry ← {E1}
Object ← {A1, E1, H1, K1, V1}
table ← {⟨H1, A1⟩}
key ← {⟨E1, K1⟩}
value ← {⟨E1, V1⟩}
hash ← {⟨E1, #1⟩, ⟨E1, #64⟩}
I ← {⟨A1, N1, #1⟩, ⟨A1, N1, #4⟩}
V ← {⟨A1, N1, E1⟩}
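The integer encoding in this example, where 65 becomes the atoms #1 and #64 and 5 becomes #1 and #4, can be sketched as follows. The class `PowerOfTwoInts` and its method names are hypothetical, and the sketch covers non-negative values only; how negative values are represented is not stated on the slides.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: an integer as a set of power-of-2 atoms.
final class PowerOfTwoInts {
    // Encode a non-negative int as the atoms "#p" for each power of 2 set in it.
    static Set<String> encode(int n) {
        if (n < 0) throw new IllegalArgumentException("sketch handles non-negative values only");
        Set<String> atoms = new TreeSet<>();
        // bit > 0 stops the loop once the shift overflows into the sign bit
        for (int bit = 1; bit > 0 && bit <= n; bit <<= 1) {
            if ((n & bit) != 0) atoms.add("#" + bit);
        }
        return atoms;
    }

    // Decode by summing the value of each atom.
    static int decode(Set<String> atoms) {
        int sum = 0;
        for (String atom : atoms) sum += Integer.parseInt(atom.substring(1));
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(encode(65));
        System.out.println(decode(encode(5)));
    }
}
```

With one atom per power of 2, a universe of k integer atoms covers 2^k values, which is what makes quantities such as hash codes affordable in a bounded universe.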
12. Example—HashMap.put
public Object put(Object k, Object v) {
  int h = k.hashCode();
  int i = h % table.length;
  for (Entry e = table[i]; e != null; e = e.next) {
    ...
  }
  modCount++;
  table[i] = new Entry(h, k, v, table[i]);
  return null;
}
14. Example—HashMap.put
public Object put(Object k, Object v) {
  int h = k.hashCode();
  int i = h % table.length;
  Entry e = table[i];
  if (e != null) { ...
    e = e.next;
    if (e != null) { ...
      e = e.next;
      if (e != null) goto end;
  } }
  modCount++;
  table[i] = new Entry(h, k, v, table[i]);
  return null;
}
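The unrolling step can be mimicked in plain Java, which has no `goto`. The sketch below unrolls a chain traversal twice and returns `null` when a third iteration would be needed, standing in for the jump to `end`. The `UnrollDemo` class, its `Entry` type, and the choice of a lookup loop are assumptions for illustration, not the tool's actual transformation.

```java
// Hypothetical sketch: a bounded (twice-unrolled) version of a chain traversal.
final class UnrollDemo {
    static final class Entry {
        final Object key;
        final Entry next;
        Entry(Object key, Entry next) { this.key = key; this.next = next; }
    }

    // Original loop form: walks the whole chain.
    static boolean containsLoop(Entry head, Object k) {
        for (Entry e = head; e != null; e = e.next) {
            if (k.equals(e.key)) return true;
        }
        return false;
    }

    // Loop unrolled twice. Returns null when the chain is longer than the bound,
    // i.e. the execution falls outside the explored set.
    static Boolean containsUnrolled(Entry head, Object k) {
        Entry e = head;
        if (e != null) {
            if (k.equals(e.key)) return true;
            e = e.next;
            if (e != null) {
                if (k.equals(e.key)) return true;
                e = e.next;
                if (e != null) return null; // bound exceeded
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Entry two = new Entry("a", new Entry("b", null));
        Entry three = new Entry("a", new Entry("b", new Entry("c", null)));
        System.out.println(containsUnrolled(two, "b"));
        System.out.println(containsUnrolled(three, "c"));
    }
}
```

For chains within the bound the unrolled form agrees with the loop; longer chains are excluded rather than answered incorrectly.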
15. Example—HashMap.put
public Object put(Object k, Object v) {
  int h = k.hashCode();
  int i = h % table.length;
  Entry e = table[i]₀;
  if (e != null) { ...
    e₁ = e.next;
    if (e₁ != null) { ...
      e₂ = e₁.next;
      if (e₂ != null) goto end;
  } }
  modCount₁ = modCount₀ + 1;
  table[i]₁ = new Entry(h, k, v, table[i]₀);
  return null;
}
16. Example—HashMap.put
public Object put(Object k, Object v) {
  int h = k.hashCode();
  int i = h % table.length;
  Entry e = table[i]₀;
  if (e != null) { ...
    e₁ = e.next;
    if (e₁ != null) { ...
      e₂ = e₁.next;
      if (e₂ != null) goto end;
  } }
  modCount₁ = modCount₀ + 1;
  table[i]₁ = new Entry(h, k, v, table[i]₀);
  A: assert table[i]₁.next == null; // perfect hashing
  return null;
}
17. Expression for assert table[i]₁.next == null
¬(Guard(A) → E[table[i].next == null])
E[table[i].next == null] ← E[table[i].next] = {Null}
E[table[i].next] ← E[table[i]].E[next]
E[table[i]] ← { E[table].x.E[V1] | sum(E[table].x.E[I1]) = E[i] }
...
sum(x) ← Σ_{x_i ∈ x} int(x_i)
int(j) ← int(#1) ← 1, int(#2) ← 2, int(#4) ← 4, ...
table ← {A1}, next ← {⟨E1, Null⟩}
T1 ← {⟨A1, N1⟩}, i ← 5
I1 ← {⟨A1, N1, #1⟩, ⟨A1, N1, #4⟩}
V1 ← {⟨A1, N1, E1⟩}
19. Expression for assert table[i]₁.next == null
¬(Guard(A) → E[table[i].next == null])
E[table[i].next == null] ← E[table[i].next] = {Null}
E[table[i].next] ← E[table[i]].E[next]
E[table[i]] ← { {A1}.x.E[V1] | sum({A1}.x.E[I1]) = 5 }
...
sum(x) ← Σ_{x_i ∈ x} int(x_i)
int(j) ← int(#1) ← 1, int(#2) ← 2, int(#4) ← 4, ...
table ← {A1}, next ← {⟨E1, Null⟩}
T1 ← {⟨A1, N1⟩}, i ← 5
I1 ← {⟨A1, N1, #1⟩, ⟨A1, N1, #4⟩}
V1 ← {⟨A1, N1, E1⟩}
20. Expression for assert table[i]₁.next == null
¬(Guard(A) → E[table[i].next == null])
E[table[i].next == null] ← E[table[i].next] = {Null}
E[table[i].next] ← E[table[i]].E[next]
E[table[i]] ← { {A1}.{N1}.E[V1] | sum({A1}.{N1}.E[I1]) = 5 }
...
sum(x) ← Σ_{x_i ∈ x} int(x_i)
int(j) ← int(#1) ← 1, int(#2) ← 2, int(#4) ← 4, ...
table ← {A1}, next ← {⟨E1, Null⟩}
T1 ← {⟨A1, N1⟩}, i ← 5
I1 ← {⟨A1, N1, #1⟩, ⟨A1, N1, #4⟩}
V1 ← {⟨A1, N1, E1⟩}
21. Expression for assert table[i]₁.next == null
¬(Guard(A) → E[table[i].next == null])
E[table[i].next == null] ← E[table[i].next] = {Null}
E[table[i].next] ← E[table[i]].E[next]
E[table[i]] ← { {A1}.{N1}.E[V1] | sum({#1, #4}) = 5 }
...
sum(x) ← Σ_{x_i ∈ x} int(x_i)
int(j) ← int(#1) ← 1, int(#2) ← 2, int(#4) ← 4, ...
table ← {A1}, next ← {⟨E1, Null⟩}
T1 ← {⟨A1, N1⟩}, i ← 5
I1 ← {⟨A1, N1, #1⟩, ⟨A1, N1, #4⟩}
V1 ← {⟨A1, N1, E1⟩}
22. Expression for assert table[i]₁.next == null
¬(Guard(A) → E[table[i].next == null])
E[table[i].next == null] ← E[table[i].next] = {Null}
E[table[i].next] ← E[table[i]].E[next]
E[table[i]] ← { {A1}.{N1}.E[V1] | 5 = 5 }
...
sum(x) ← Σ_{x_i ∈ x} int(x_i)
int(j) ← int(#1) ← 1, int(#2) ← 2, int(#4) ← 4, ...
table ← {A1}, next ← {⟨E1, Null⟩}
T1 ← {⟨A1, N1⟩}, i ← 5
I1 ← {⟨A1, N1, #1⟩, ⟨A1, N1, #4⟩}
V1 ← {⟨A1, N1, E1⟩}
23. Expression for assert table[i]₁.next == null
¬(Guard(A) → E[table[i].next == null])
E[table[i].next == null] ← E[table[i].next] = {Null}
E[table[i].next] ← E[table[i]].E[next]
E[table[i]] ← {E1}
...
sum(x) ← Σ_{x_i ∈ x} int(x_i)
int(j) ← int(#1) ← 1, int(#2) ← 2, int(#4) ← 4, ...
table ← {A1}, next ← {⟨E1, Null⟩}
T1 ← {⟨A1, N1⟩}, i ← 5
I1 ← {⟨A1, N1, #1⟩, ⟨A1, N1, #4⟩}
V1 ← {⟨A1, N1, E1⟩}
24. Expression for assert table[i]₁.next == null
¬(Guard(A) → E[table[i].next == null])
E[table[i].next == null] ← E[table[i].next] = {Null}
E[table[i].next] ← {Null}
E[table[i]] ← {E1}
...
sum(x) ← Σ_{x_i ∈ x} int(x_i)
int(j) ← int(#1) ← 1, int(#2) ← 2, int(#4) ← 4, ...
table ← {A1}, next ← {⟨E1, Null⟩}
T1 ← {⟨A1, N1⟩}, i ← 5
I1 ← {⟨A1, N1, #1⟩, ⟨A1, N1, #4⟩}
V1 ← {⟨A1, N1, E1⟩}
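The evaluation walked through on these slides can be replayed on concrete relations. In the sketch below, the index relation holds (array, slot, power-of-2 atom) triples and the value relation holds (array, slot, value) triples; a read selects the slots whose index atoms sum to the requested index. The class `SparseArrayRead`, the tuple representation as `List<String>`, and treating a slot with no index atoms as index 0 are assumptions of this sketch.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: reading table[i] from sparse index/value relations.
final class SparseArrayRead {
    // Returns { v | <array, x, v> in values and sum(array.x.indices) = index }.
    static Set<String> read(String array, int index,
                            List<List<String>> indices, List<List<String>> values) {
        // Sum the power-of-2 atoms attached to each slot of this array.
        Map<String, Integer> slotIndex = new HashMap<>();
        for (List<String> t : indices) {
            if (t.get(0).equals(array)) {
                int power = Integer.parseInt(t.get(2).substring(1)); // "#4" -> 4
                slotIndex.merge(t.get(1), power, Integer::sum);
            }
        }
        // Collect the values stored in slots whose summed index matches.
        Set<String> result = new LinkedHashSet<>();
        for (List<String> t : values) {
            int idx = slotIndex.getOrDefault(t.get(1), 0); // no atoms: index 0
            if (t.get(0).equals(array) && idx == index) result.add(t.get(2));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> i1 = List.of(List.of("A1", "N1", "#1"), List.of("A1", "N1", "#4"));
        List<List<String>> v1 = List.of(List.of("A1", "N1", "E1"));
        System.out.println(read("A1", 5, i1, v1));
    }
}
```

Only slots that actually hold a value need tuples, so the array's full index range never has to be materialized.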
25. Guard for assert table[i]₁.next == null
E[e₂ != null] = False ∨ E[e₁ != null] = False ∨ E[e != null] = False
26. Evaluation
• Miniatur tool embodies our techniques
– Uses WALA for control dependence and other analyses
– Uses Kodkod to generate CNF
– Uses Minisat as backend SAT solver
• Evaluated structural assertions in java.util collections
– Check that size fields accurately reflect data structure
– Use driver that inserts arbitrary objects into collection
• Evaluated Java equality contracts on open-source programs
– (1) reflexive, (2) symmetric, (3) transitive, (4) non-null,
– (5) hashCode, (6-9) compareTo properties
27. Evaluation Results: Testing java.util
class       formula                                               Time (s)  Lines
LinkedList  this.size = |this.header.next* − null|                8.2       111
TreeMap     this.size = |count(this.root.(left + right)* − null|  67.4      12592
TreeSet     this.m.size = |this.m.root.(left + right)* − null|    80.0      12667
HashMap     this.size = |this.table[].next* − null|               30.0      1107
HashSet     this.map.size = |(this.map.table[].next* − null|      34.8      1129
• Encode correctness of size fields
• Check expressive properties against complex collections
• Heap size per type, Loop unrolling
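The HashMap property in the table, this.size = |this.table[].next* − null|, can be read as "the size field equals the number of entries reachable from any bucket by following next". A minimal runtime version of that check is sketched below on a simplified, hypothetical `Entry` type; it is not the java.util.HashMap layout, and it evaluates one concrete heap rather than all heaps in a bounded scope.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical sketch: checking a size field against the reachable entries.
final class SizeInvariant {
    static final class Entry {
        final Entry next;
        Entry(Entry next) { this.next = next; }
    }

    // |table[].next* − null|: distinct non-null entries reachable from any bucket.
    static int reachableEntries(Entry[] table) {
        // Identity-based set, so entries are counted by object identity.
        Set<Entry> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Entry head : table) {
            // seen.add returns false on a revisit, which also stops on cycles.
            for (Entry e = head; e != null && seen.add(e); e = e.next) { }
        }
        return seen.size();
    }

    static boolean sizeIsConsistent(int size, Entry[] table) {
        return size == reachableEntries(table);
    }

    public static void main(String[] args) {
        Entry[] table = { new Entry(new Entry(null)), null, new Entry(null) };
        System.out.println(sizeIsConsistent(3, table));
    }
}
```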
28. Evaluation Results: Testing Equals Contracts
• Test each property of each equals method using harness
public static void equalsTester(Object a, Object b) {
  if (a.equals(b)) assert b.equals(a);
}
• Evaluated several common, open-source benchmarks
– Antlr, BCEL, Hsqldb, java cup
– Found 20 concrete violations (of properties 2, 3, 5, 6, 7)
– Most tests ran within 2 minutes
• Bugs illustrated by concrete counter-examples
argument a                           argument b
ArrayType3                           ArrayType2
type=54304                           type=9181
basic type=UninitializedObjectType3  basic type=anonType3
...                                  ...
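To show the kind of defect the symmetry harness exposes, the sketch below defines a deliberately broken `equals` and a boolean form of the symmetry check. The `Name` class and the `symmetricFor` helper are invented for this illustration and are unrelated to the violations reported in the benchmarks.

```java
import java.util.Locale;

// Hypothetical sketch: an equals method that violates symmetry.
final class EqualsContractDemo {
    static final class Name {
        final String s;
        Name(String s) { this.s = s; }

        @Override public boolean equals(Object o) {
            if (o instanceof Name) return s.equalsIgnoreCase(((Name) o).s);
            // Bug: accepts plain Strings, but String.equals never accepts a Name.
            if (o instanceof String) return s.equalsIgnoreCase((String) o);
            return false;
        }

        @Override public int hashCode() {
            return s.toLowerCase(Locale.ROOT).hashCode();
        }
    }

    // Symmetry for one pair: a.equals(b) implies b.equals(a).
    static boolean symmetricFor(Object a, Object b) {
        return !a.equals(b) || b.equals(a);
    }

    public static void main(String[] args) {
        System.out.println(symmetricFor(new Name("x"), new Name("X")));
        System.out.println(symmetricFor(new Name("x"), "X"));
    }
}
```

A checker searching over argument pairs needs to find only one pair such as a `Name` and a `String` to produce a concrete counterexample.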
29. Related Work
• Most closely related work is checking Java with Alloy
– Vaziri et al, Taghdiri et al, Dennis et al
– Miniatur handles more of Java, scales better
• SATURN checks C using SAT
– Uses manually-tailored summaries
– Checks less-rich properties
– Better scaling than Miniatur
• SMT solvers
– Mix SAT with theories for arithmetic, arrays, etc.
– Can prove properties within particular theories
– Current SMT solvers less expressive
30. Conclusions and Future Work
• Miniatur makes SAT-based checking for Java practical
– Extends prior work to better handle integers, arrays
– New, sound slicing mechanism for scalability
→ Handles real properties on real programs
→ Deep structural properties on real data structures
• Future work
– Refinement-based approach to unrolling loops
– Model concurrency