Unified Theory of Garbage Collection
1. A Unified Theory of Garbage Collection
David F. Bacon Perry Cheng V.T. Rajan
IBM Watson Research Center
Seminar Talk by Yoshimi Takano, ETH Zurich
A Unified Theory of Garbage Collection (Bacon et al.) Seminar Talk by Yoshimi Takano 1
2. “He who loves practice without theory is like the sailor
who boards ship without a rudder and compass and
never knows where he may cast.”
– Leonardo da Vinci
4. Summary
Tracing and reference counting are duals
All high-performance garbage collectors are hybrids of
tracing and reference counting
This taxonomy can be used
To develop a uniform cost-model
As an algorithm design framework
To generate collectors dynamically...
5. Outline
Introduction
Garbage Collection
Motivation
Duality of Tracing and Reference Counting
Qualitative Comparison
Abstract Garbage Collection
Convergence
Collection as Tracing and Reference Counting
Single Heap
Split Heap
Uniform Cost Model
6. Introduction Garbage Collection
Garbage Collection and Liveness (Recap)
Automatic storage reclamation of unreachable objects
Roots:
Globals
Locals in stack frames
[Figure: object graph with roots; objects reachable from the roots are live, all others are dead]
7. Introduction Motivation
Picking a Garbage Collector for your VM
Lots and lots of garbage collector algorithms
State of the Art
Implement n algorithms
Measure and compare for m benchmarks
Use algorithm with best mean performance
Problems
Limited exploration of design space (“no compass”)
Static selection can sacrifice performance
[Slide from OOPSLA presentation]
9. Duality of Tracing and Reference Counting Qualitative Comparison
Two Fundamental Garbage Collection Techniques
Tracing [McCarthy, 1960]
Stop the world
Trace forward from roots
Everything touched is live, all else is garbage
Reference Counting [Collins, 1960]
Each object has count of incoming pointers
Adjust count in case of mutations (write barrier)
When counter reaches zero, object is garbage and count
of all children is decremented
11. Duality of Tracing and Reference Counting Qualitative Comparison
Diametrical Opposites?
                        Tracing    Reference Counting
Collection Style        Batch      Incremental
Pause Times             Long       Short
Real Time?              No         Yes
Delayed Reclamation?    Yes        No
Cost per Mutation       None       High
Collects Cycles?        Yes        No
[Table from paper]
12. Duality of Tracing and Reference Counting Qualitative Comparison
How Different Really?
Both types have been implemented by the authors
Very different starting point
But with optimizations, similarities increase:
Both trace roots
Both are semi-incremental
Both have floating garbage
Both have write barriers
Why?
13. Duality of Tracing and Reference Counting Abstract Garbage Collection
Abstract Garbage Collection
Definition
An object graph is a triple G = (V, E, R) with
V the set of vertices (objects)
E the multiset of directed edges (pointers)
R the multiset of roots
Multiset notation: [a, b] ⊎ [b] = [a, b, b]
Definition
A function ρ : V → ℕ₀ is a reference count function for an object
graph G = (V, E, R) iff
∀ x ∈ V : ρ(x) = |[(u, x) ∈ E : ρ(u) > 0]| + 1_{x ∈ R}
“# in-edges from vertices with a non-zero RC (+1)”
[Figure: object graph with vertex set V and roots R]
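The definition can be checked mechanically. The following is a minimal sketch, assuming a graph encoded as plain Python lists (the names `is_reference_count_function`, `rho_min` and `rho_max` are illustrative and not from the paper). On a graph containing an unreachable cycle, two different functions satisfy the defining equation.

```python
def is_reference_count_function(rho, vertices, edges, roots):
    """Check rho(x) = #in-edges from vertices with rho > 0, plus 1 if x is a root."""
    root_set = set(roots)
    for x in vertices:
        # edges is a list, so duplicate pointers are counted as in a multiset
        in_edges = sum(1 for (u, v) in edges if v == x and rho[u] > 0)
        if rho[x] != in_edges + (1 if x in root_set else 0):
            return False
    return True

# Hypothetical example graph: chain a -> b -> c from root a,
# plus an unreachable cycle d <-> e.
vertices = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("e", "d")]
roots = ["a"]

rho_min = {"a": 1, "b": 1, "c": 1, "d": 0, "e": 0}  # cycle counted as dead
rho_max = {"a": 1, "b": 1, "c": 1, "d": 1, "e": 1}  # cycle keeps itself alive
rho_bad = {"a": 1, "b": 0, "c": 1, "d": 0, "e": 0}  # violates the equation at b

print(is_reference_count_function(rho_min, vertices, edges, roots))
print(is_reference_count_function(rho_max, vertices, edges, roots))
print(is_reference_count_function(rho_bad, vertices, edges, roots))
```

Both `rho_min` and `rho_max` pass the check, which is one way to see that a reference count function for a given graph need not be unique.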
15. Duality of Tracing and Reference Counting Abstract Garbage Collection
Abstract Garbage Collection, cont’d
Definition
A garbage collection algorithm takes an object graph G as input
and computes a reference count function ρ for G.
Objects x with ρ(x) = 0 are then reclaimed.
Common abstract model, where any algorithm computes
reference counts ρ
For a given object graph, there can be many such
functions ρ, as will be seen later
17. Duality of Tracing and Reference Counting Convergence
Tracing Revisited
Let’s consider a version of tracing that computes reference
counts instead of simply setting mark bits:
initialize-for-tracing():
  W ← R
scan-by-tracing():
  while W ≠ ∅
    remove w from W
    ρ(w) ← ρ(w) + 1
    if ρ(w) = 1
      for each x ∈ children(w)
        W ← W ⊎ [x]
[Pseudo-code snippets from paper]
[Figure: example object graph with roots. All counts start at 0; after the scan, live objects carry their reference counts and dead objects remain at 0.]
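The tracing pseudo-code translates almost line by line into Python. This is a sketch under the assumption that the heap is a dictionary mapping each object to the list of its children; the example graph and the name `children` as a dictionary are illustrative choices, not part of the talk.

```python
from collections import defaultdict

def scan_by_tracing(roots, children):
    """Tracing that reconstructs reference counts instead of mark bits."""
    rho = defaultdict(int)       # every count starts at zero
    work = list(roots)           # initialize-for-tracing: W <- R
    while work:                  # while W is not empty
        w = work.pop()
        rho[w] += 1
        if rho[w] == 1:          # first visit: follow outgoing pointers once
            work.extend(children.get(w, []))
    return rho

# Hypothetical heap: a -> b, a -> c, b -> c, and an unreachable cycle d <-> e.
children = {"a": ["b", "c"], "b": ["c"], "c": [], "d": ["e"], "e": ["d"]}
rho = scan_by_tracing(["a"], children)
print(dict(rho))  # {'a': 1, 'c': 2, 'b': 1}
```

Objects that never enter the work list (here `d` and `e`) keep a count of zero and can be reclaimed, including the cycle.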
21. Duality of Tracing and Reference Counting Convergence
Reference Counting Revisited
Let’s consider a version of RC in which the decrement
operations are batched instead of performed immediately:
mutate(old, new):
  W ← W ⊎ [old]
  ρ(new) ← ρ(new) + 1
scan-by-counting():
  while W ≠ ∅
    remove w from W
    ρ(w) ← ρ(w) − 1
    if ρ(w) = 0
      for each x ∈ children(w)
        W ← W ⊎ [x]
[Figure: example object graph with reference counts. The buffered old targets in W act as anti-roots; the scan decrements the dead objects reachable from them to 0, while a dead cyclic structure keeps non-zero counts.]
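A runnable sketch of the batched variant follows. It assumes counts are kept in a dictionary and that `None` stands for a null pointer, a case the slide pseudo-code does not spell out; the example heap is hypothetical.

```python
def mutate(rho, work, old, new):
    """Write barrier: buffer the decrement, apply the increment at once."""
    if old is not None:
        work.append(old)                   # W <- W + [old]
    if new is not None:
        rho[new] = rho.get(new, 0) + 1

def scan_by_counting(rho, work, children):
    """Process buffered decrements; returns the objects whose count hit zero."""
    dead = []
    while work:
        w = work.pop()
        rho[w] -= 1
        if rho[w] == 0:                    # w is garbage: release its children
            dead.append(w)
            work.extend(children.get(w, []))
    return dead

# Hypothetical heap: a -> b, and a cycle c <-> d. Both a and c are also
# referenced from one root slot each, which is included in their counts.
children = {"a": ["b"], "b": [], "c": ["d"], "d": ["c"]}
rho = {"a": 1, "b": 1, "c": 2, "d": 1}
work = []

mutate(rho, work, "a", None)   # the root slot holding a is cleared
mutate(rho, work, "c", None)   # the root slot holding c is cleared
dead = scan_by_counting(rho, work, children)
print(dead)                    # ['a', 'b']
print(rho["c"], rho["d"])      # 1 1
```

The chain `a -> b` is reclaimed, but the cycle `c <-> d` keeps counts of 1 although nothing outside it refers to it any more.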
26. Duality of Tracing and Reference Counting Convergence
Not So Different After All...
initialize-for-tracing():            mutate(old, new):
  W ← R                                W ← W ⊎ [old]
                                       ρ(new) ← ρ(new) + 1
scan-by-tracing():                   scan-by-counting():
  while W ≠ ∅                          while W ≠ ∅
    remove w from W                      remove w from W
    ρ(w) ← ρ(w) + 1                      ρ(w) ← ρ(w) − 1
    if ρ(w) = 1                          if ρ(w) = 0
      for each x ∈ children(w)             for each x ∈ children(w)
        W ← W ⊎ [x]                          W ← W ⊎ [x]
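Since the two scan loops differ only in the sign of the update and the threshold that triggers recursion, they can be written as one function. The sketch below is one possible way to express this; the parameters `delta` and `threshold` are names introduced here for illustration and do not appear in the paper.

```python
def scan(work, rho, children, delta, threshold):
    """Shared loop: add delta to each visited count, recurse when it hits threshold."""
    work = list(work)
    while work:
        w = work.pop()
        rho[w] = rho.get(w, 0) + delta
        if rho[w] == threshold:
            work.extend(children.get(w, []))
    return rho

def scan_by_tracing(roots, children):
    # start from the roots with all counts at zero, count upwards
    return scan(roots, {}, children, +1, 1)

def scan_by_counting(anti_roots, rho, children):
    # start from the anti-roots with the existing (high) counts, count downwards
    return scan(anti_roots, dict(rho), children, -1, 0)

# Hypothetical acyclic heap: r -> x -> y is live (r is a root); g -> h is garbage.
children = {"r": ["x"], "x": ["y"], "y": [], "g": ["h"], "h": []}
traced = scan_by_tracing(["r"], children)

# Counts before the last reference to g was dropped; g is the only anti-root.
rho_before = {"r": 1, "x": 1, "y": 1, "g": 1, "h": 1}
counted = scan_by_counting(["g"], rho_before, children)

print(all(traced.get(v, 0) == counted[v] for v in counted))
```

On this acyclic example both directions arrive at the same counts; on a graph with dead cycles the counting direction would leave them non-zero.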
27. Duality of Tracing and Reference Counting Convergence
Duality
                     Tracing            Reference Counting
Starting Point       Roots              Anti-roots
Graph Traversal      Fwd. from roots    Fwd. from anti-roots
Objects Traversed    Live               Dead
Initial RC           Low (zero)         High
RC Reconstruction    Addition           Subtraction
[Table from paper]
[Figure: the example object graph with its final counts after tracing (left) and after reference counting (right)]
28. Collection as Tracing and Reference Counting
Tracing/Counting Hybrids
Fundamentals
Division of storage:
Single heap (= 1)
Split heap (= 2)
Multi-heap (> 2)
Assignment of either tracing or reference counting to the
different divisions
Trade-offs
Remaining choices are implementation details and
space-time trade-offs
30. Collection as Tracing and Reference Counting Single Heap
Single Heap Algorithms
Root references vs. intra-heap references
[Schematics from paper: roots and heap, with root references drawn separately from intra-heap references]
32. Collection as Tracing and Reference Counting Single Heap
Algorithm 1: Tracing
Both root and intra-heap references are traced
[Schematic: root references T, intra-heap references T (T = traced, C = counted)]
33. Collection as Tracing and Reference Counting Single Heap
Algorithm 2: Reference Counting
Both root and intra-heap references are counted
[Schematic: root references C, intra-heap references C (T = traced, C = counted)]
34. Collection as Tracing and Reference Counting Single Heap
Algo. 3: Deferred Reference Counting [Deutsch/Bobrow, ’76]
To avoid high mutation overhead, root references are not
counted (i.e. the write barrier ignores root pointers)
Objects with reference count 0 are maintained in a zero
count table (ZCT)
Root references are traced at collection time
mutate(old, new):
  if ¬is-root-pointer
    ...
[Schematic: root references T, intra-heap references C, with the ZCT attached to the heap]
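A small sketch may help to see how the ZCT and the collection-time root scan interact. Everything below (the class `DeferredRC`, its method names, and the representation of roots as a plain list) is an assumption made for illustration; the slide leaves the body of the write barrier elided and this code does not claim to fill it in as the authors did.

```python
class DeferredRC:
    """Toy deferred reference counting: heap pointers counted, roots traced."""

    def __init__(self):
        self.rho = {}        # counts of heap-to-heap references only
        self.children = {}   # object -> list of objects it points to
        self.zct = set()     # zero count table
        self.roots = []      # root slots; writes to them use no barrier

    def allocate(self, obj):
        self.rho[obj] = 0
        self.children[obj] = []
        self.zct.add(obj)    # new objects have no counted references yet

    def write_heap(self, src, old, new):
        """Write barrier for heap slots: replace src -> old by src -> new."""
        if new is not None:
            self.children[src].append(new)
            self.rho[new] += 1
            self.zct.discard(new)
        if old is not None:
            self.children[src].remove(old)
            self.rho[old] -= 1
            if self.rho[old] == 0:
                self.zct.add(old)

    def collect(self):
        """Trace the roots, then free ZCT entries that no root refers to."""
        rooted = set(self.roots)
        pending = [o for o in self.zct if o not in rooted]
        freed = []
        while pending:
            obj = pending.pop()
            self.zct.discard(obj)
            for child in self.children.pop(obj):
                self.rho[child] -= 1
                if self.rho[child] == 0:
                    self.zct.add(child)
                    if child not in rooted:
                        pending.append(child)
            del self.rho[obj]
            freed.append(obj)
        return sorted(freed)

gc = DeferredRC()
for name in ("a", "b", "c"):
    gc.allocate(name)
gc.roots = ["a"]                 # root write: no barrier, no count change
gc.write_heap("a", None, "b")    # heap write: b is counted, leaves the ZCT
first = gc.collect()             # c is in the ZCT and unrooted
gc.roots.clear()
second = gc.collect()            # a is now unrooted; freeing it releases b
print(first, second)             # ['c'] ['a', 'b']
```

Note that `a` sits in the ZCT the whole time it is live, because only its root reference keeps it alive; this is why the roots must be consulted before anything in the ZCT is freed.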
37. Collection as Tracing and Reference Counting Single Heap
Single Heap Collector Family
Roots   Heap   Collector
T       T      (Pure) Tracing [McCarthy, 1960]
C       C      (Pure) Reference Counting [Collins, 1960]
C       T      “Partial Tracing”
T       C      Deferred Reference Counting [Deutsch/Bobrow, 1976]
(T = references are traced, C = references are counted)
38. Collection as Tracing and Reference Counting Split Heap
Generational Garbage Collection [Ungar, 1984]
Heap is split up into 2 regions: a nursery and a mature space
Collect nursery independently
Nursery objects pointed to by mature references are
maintained in a remembered set (RS) by a write barrier,
i.e. reference counted
[Schematic: references from the roots and within each region are traced (T), for the nursery as well as for the mature space; references from the mature space into the nursery are counted (C) via the RS]
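The nursery collection described above can be sketched as a trace that starts from the roots plus the remembered set. The data layout (a `region` dictionary tagging each object as `"nursery"` or `"mature"`) and the function names are assumptions chosen for this example, not the design of any particular VM.

```python
def write_barrier(remembered, region, src, new):
    """Record nursery objects that become referenced from the mature space."""
    if region[src] == "mature" and region[new] == "nursery":
        remembered.add(new)

def collect_nursery(roots, remembered, region, children):
    """Trace the nursery only; the mature space is not scanned."""
    work = [o for o in roots if region[o] == "nursery"] + list(remembered)
    live = set()
    while work:
        obj = work.pop()
        if obj in live:
            continue
        live.add(obj)
        # follow pointers, but never leave the nursery
        work.extend(c for c in children.get(obj, []) if region[c] == "nursery")
    nursery = {o for o, r in region.items() if r == "nursery"}
    return live, nursery - live

# Hypothetical heap: mature object m points to n1, n1 points to n2,
# n3 is referenced from a root, n4 is unreferenced.
region = {"m": "mature", "n1": "nursery", "n2": "nursery",
          "n3": "nursery", "n4": "nursery"}
children = {"m": ["n1"], "n1": ["n2"], "n3": [], "n4": []}

remembered = set()
write_barrier(remembered, region, "m", "n1")   # executed when m.field = n1
live, dead = collect_nursery(["m", "n3"], remembered, region, children)
print(sorted(live), sorted(dead))              # ['n1', 'n2', 'n3'] ['n4']
```

The remembered set in this sketch only grows, so it behaves like a conservative (sticky) count: an entry may keep a nursery object alive after the mature reference has gone, until the mature space itself is collected.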
41. Collection as Tracing and Reference Counting Split Heap
Generational Traced-Root Collector Family
                                                         Nursery          Mature →   Mature
Collector                                                (roots, intra)   Nursery    (roots, intra)
Generational [Ungar, 1984]                               T T              C          T T
Ulterior Reference Counting [Blackburn/McKinley, 2003]   T T              C          T C
“Redundant Reference Counting”                           T C              C          T C
“Inferior Reference Counting”                            T C              C          T T
(T = references are traced, C = references are counted)
42. Uniform Cost Model
Enabling Quantitative Comparison
Characterize object graph and program
Number of objects
Allocation rate
Mutation rate
etc.
Develop space/time cost formulas for each collector
Don’t “cheat” by ignoring collector metadata
Coefficients c_i for each parameter are left unspecified
See paper for details
Simple example:
time-per-collection_Tracing = c_1 |R| + c_2 |V_live| + c_3 |E_live| + c_4 |V|
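To make the shape of the formula concrete, it can be written as a function. The talk leaves the coefficients unspecified, so the values used below are made-up placeholders with no empirical meaning, and the parameter names are chosen here for readability.

```python
def tracing_time_per_collection(n_roots, n_live_vertices, n_live_edges,
                                n_vertices, c=(1.0, 1.0, 1.0, 1.0)):
    """c1*|R| + c2*|V_live| + c3*|E_live| + c4*|V| for a tracing collector."""
    c1, c2, c3, c4 = c
    return (c1 * n_roots             # scanning the roots
            + c2 * n_live_vertices   # visiting each live object
            + c3 * n_live_edges      # following each live pointer
            + c4 * n_vertices)       # term proportional to all objects

# Hypothetical heap sizes and hypothetical coefficients.
cost = tracing_time_per_collection(10, 1000, 3000, 5000, c=(1.0, 2.0, 1.0, 0.5))
print(cost)  # 7510.0
```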
44. Conclusion Benefits
Conclusion
Benefits
Deeper theoretical insight into garbage collection
Design of collectors can be made more methodical
May help enable dynamic construction of collectors tuned
to particular applications
45. Conclusion Future Work/Outlook
Conclusion, cont’d
Future Work/Outlook
Refine (unrealistic) assumptions:
Fixed-size objects (no fragmentation)
No concurrent collectors
Application in steady state
Take allocation cost and locality issues into account
Measure coefficients for cost parameters
Theory looks promising, but practical relevance still needs
to emerge
46. “Theory without practice cannot survive and dies as
quickly as it lives.”
– Leonardo da Vinci
47. Sources
Sources
David F. Bacon, Perry Cheng, V.T. Rajan
A Unified Theory of Garbage Collection
(Paper and Presentation at OOPSLA 2004, Vancouver)
Paul R. Wilson
Uniprocessor Garbage Collection Techniques
48. Additional Material Fix-point Formulation
Fix-point Formulation
Definition
A function ρ : V → ℕ₀ is a reference count function for an object
graph G = (V, E, R) iff
∀ x ∈ V : ρ(x) = |[(u, x) ∈ E : ρ(u) > 0]| + 1_{x ∈ R}
“# in-edges from vertices with a non-zero RC + const.”
With F(ρ) := λ x. |[(u, x) ∈ E : ρ(u) > 0]| + 1_{x ∈ R}, the condition reads ρ = F(ρ) ⟹ ρ is a fix-point of F
[Figure: object graph with vertex set V and roots R]
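The fix-point view can be explored by simply iterating F until nothing changes. The sketch below assumes the same list-based graph encoding as the definition and uses two starting points chosen for illustration: all counts zero, and all counts one. The helper names are not from the paper.

```python
def F(rho, vertices, edges, roots):
    """One application of F: recompute every count from the current rho."""
    root_set = set(roots)
    return {x: sum(1 for (u, v) in edges if v == x and rho[u] > 0)
               + (1 if x in root_set else 0)
            for x in vertices}

def iterate_to_fixpoint(rho, vertices, edges, roots):
    """Apply F repeatedly until rho = F(rho)."""
    while True:
        nxt = F(rho, vertices, edges, roots)
        if nxt == rho:
            return rho
        rho = nxt

# Hypothetical graph: a -> b -> c from root a; unreachable cycle d <-> e;
# unreachable acyclic object f pointing into the cycle.
vertices = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("e", "d"), ("f", "d")]
roots = ["a"]

least = iterate_to_fixpoint({v: 0 for v in vertices}, vertices, edges, roots)
greatest = iterate_to_fixpoint({v: 1 for v in vertices}, vertices, edges, roots)
print(least)     # d, e and f all end at 0
print(greatest)  # f ends at 0, but d and e stay at 1
```

Starting from zero yields the smallest solution, in which the dead cycle is reclaimed as tracing would do; starting from "everything is referenced" yields the largest one, in which the acyclic garbage `f` is found but the cycle survives, as with plain reference counting.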
49. Additional Material Partial Tracing
Algorithm 4: Partial Tracing
New (inefficient?) algorithm
Only root references are counted
Intra-heap references are traced, starting from the
dynamically maintained root set R
mutate(old, new):
  if is-root-pointer
    R ← R ⊎ [new]
    R ← R − [old]
[Schematic: root references C, intra-heap references T]
52. Additional Material Train Algorithm
Multi-Heap Collectors: Train Algorithm [Hudson/Moss, 1992]
[Schematic: roots, Train 1 and Train 2, with each class of references labelled T (traced) or C (counted)]
53. Additional Material Trade-Offs
Trade-Offs
Using semi-spaces with a copying collector (linear
space-time trade-off: half the heap space vs. sweep time)
Traversal (recursive or with pointer reversals?)
Memory compaction
Implementation of remembered sets
54. Additional Material Cycle Collection
Cycle Collection
Backup Tracing
Occasionally perform a tracing collection
Trial Deletion
Wanted: vertex set S having no live external in-edges
Candidate vertex x, S := x* (the set of vertices reachable from x)
Subtract internal references and remove vertices with
external count > 0
[Figure: example object graph with reference counts]
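The trial deletion steps can be sketched as follows. This is an illustrative reading of the three bullet points, assuming that `rho` holds ordinary reference counts (including references from roots) and that removing a vertex from S turns its outgoing pointers back into external references; the function and variable names are hypothetical.

```python
def trial_deletion(candidate, rho, children):
    """Return the subset of candidate* that has no external in-edges."""
    # S := candidate* (everything reachable from the candidate)
    S, stack = set(), [candidate]
    while stack:
        v = stack.pop()
        if v not in S:
            S.add(v)
            stack.extend(children.get(v, []))

    # subtract internal references: what remains is the external count
    ext = {v: rho[v] for v in S}
    for u in S:
        for v in children.get(u, []):
            ext[v] -= 1

    # remove vertices with external count > 0; their pointers become external
    stack = [v for v in S if ext[v] > 0]
    while stack:
        v = stack.pop()
        if v not in S:
            continue
        S.remove(v)
        for c in children.get(v, []):
            if c in S:
                ext[c] += 1
                stack.append(c)
    return S

# Hypothetical heap: dead cycle x <-> y with y -> z; cycle p <-> q where q is
# additionally referenced from outside (count 2).
children = {"x": ["y"], "y": ["x", "z"], "z": [], "p": ["q"], "q": ["p"]}
rho = {"x": 1, "y": 1, "z": 1, "p": 1, "q": 2}

garbage = trial_deletion("x", rho, children)
kept = trial_deletion("p", rho, children)
print(sorted(garbage), sorted(kept))   # ['x', 'y', 'z'] []
```

The first candidate yields a set that is entirely self-supporting and can be reclaimed; the second is held alive by one external reference, so the whole set is discarded and nothing is freed.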