This talk covers some improvements to GC in Ruby 2.0.0. For instance, I will introduce the implementation of Bitmap Marking GC, among other changes, and show benchmark results measured after these were implemented.
Animation version is here: https://gumroad.com/l/xWCR (premium version)
10. What is a dead object?
➔A dead object is an object that is
no longer referenced by the
program
➔In GC terms, we say that a dead
object is unreachable from Roots
11. What are Roots?
➔Roots are a set of pointers that
directly reference objects in the
program.
– e.g. Ruby's local variables, etc.
15. CRuby's GC
➔Mark&Sweep
➔Mark phase
– Mark all live (reachable) objects
➔Sweep phase
– Free all dead (unreachable) objects
– Unmark all marked objects
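The two phases above can be sketched in a few lines of C. This is only an illustration under assumptions: the `Obj` layout, the fixed-size `heap` array, and the function names are hypothetical and much simpler than CRuby's real object and heap structures.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 2
#define HEAP_SIZE 8

/* Hypothetical object layout; CRuby's real objects look different. */
typedef struct Obj {
    int in_use;                          /* slot currently holds an object */
    int marked;                          /* set during the mark phase */
    struct Obj *children[MAX_CHILDREN];  /* outgoing references */
} Obj;

static Obj heap[HEAP_SIZE];

/* Mark phase: recursively mark everything reachable from obj. */
static void mark(Obj *obj) {
    if (obj == NULL || obj->marked) return;
    obj->marked = 1;
    for (int i = 0; i < MAX_CHILDREN; i++) mark(obj->children[i]);
}

/* Sweep phase: free unmarked objects and unmark the survivors.
 * Returns the number of objects freed. */
static int sweep(void) {
    int freed = 0;
    for (int i = 0; i < HEAP_SIZE; i++) {
        if (!heap[i].in_use) continue;
        if (heap[i].marked) {
            heap[i].marked = 0;          /* ready for the next GC cycle */
        } else {
            heap[i] = (Obj){0};          /* dead: release the slot */
            freed++;
        }
    }
    return freed;
}

static int collect(Obj **roots, size_t nroots) {
    for (size_t i = 0; i < nroots; i++) mark(roots[i]);
    return sweep();
}

/* Demo: A -> B -> C is reachable from the root; D -> E is not. */
static int demo_mark_sweep(void) {
    for (int i = 0; i < 5; i++) heap[i] = (Obj){ .in_use = 1 };
    heap[0].children[0] = &heap[1];
    heap[1].children[0] = &heap[2];
    heap[3].children[0] = &heap[4];
    Obj *roots[] = { &heap[0] };
    return collect(roots, 1);
}
```

Note that `mark()` is recursive, so a deep object graph consumes machine stack in proportion to its depth.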
29. What's Knuth's Algorithm?
➔To avoid a stack overflow
➔There is a fail-safe system which
consists of two stages.
– Using a marking buffer
– Rescanning all objects
30. Using a marking buffer
[Figure: gc_mark() frames pile up on the machine stack toward its maximum while the object graph A to G is being marked. Instead of recursing further, B and C are pushed onto the marking buffer.]
Avoiding overflow!!
31. Marking all objects of the
marking buffer at the end
of the mark phase
[Figure: with the machine stack unwound to a single gc_mark() frame, the objects in the marking buffer (B, C) are rescanned, which marks D, E, F and G.]
32. How do you deal with
an overflow of the
marking buffer?
33. Rescanning all objects
[Figure: the marking buffer is full and overflows, so the objects that do not fit are ignored at first. All objects are then rescanned to mark D, E, F and G.]
It's very slow!!
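The two-stage fail-safe can be sketched roughly as follows. This is a simplified model under assumptions, not CRuby's actual code: `DEPTH_LIMIT` stands in for the machine stack check, the buffer is deliberately tiny so that it overflows, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEPTH_LIMIT 1   /* stand-in for "machine stack is nearly full" */
#define BUF_SIZE 1      /* tiny marking buffer, to force an overflow */
#define NOBJ 8

typedef struct Node {
    int marked;
    struct Node *child[2];
} Node;

static Node *buf[BUF_SIZE];
static int buf_len;
static int buf_overflow;
static int rescans;     /* how many full-heap rescans were needed */

static void gc_mark(Node *n, int depth);

static void mark_children(Node *n, int depth) {
    for (int i = 0; i < 2; i++) gc_mark(n->child[i], depth + 1);
}

static void gc_mark(Node *n, int depth) {
    if (n == NULL || n->marked) return;
    n->marked = 1;
    if (depth >= DEPTH_LIMIT) {
        if (buf_len < BUF_SIZE) buf[buf_len++] = n; /* stage 1: defer */
        else buf_overflow = 1;                      /* stage 2 needed */
        return;
    }
    mark_children(n, depth);
}

static void mark_from_root(Node *all, size_t n, Node *root) {
    buf_len = 0; buf_overflow = 0; rescans = 0;
    gc_mark(root, 0);
    for (;;) {
        /* Stage 1: mark the deferred objects with a shallow stack. */
        while (buf_len > 0) mark_children(buf[--buf_len], 0);
        if (!buf_overflow) break;
        /* Stage 2: the buffer overflowed, so walk every object and
         * re-visit the children of all marked ones. */
        buf_overflow = 0;
        rescans++;
        for (size_t i = 0; i < n; i++)
            if (all[i].marked) mark_children(&all[i], 0);
    }
}

static Node nodes[NOBJ];

/* Demo: nodes 0..6 form a binary tree; node 7 is unreachable. */
static int demo_fail_safe(void) {
    for (int i = 0; i < NOBJ; i++) nodes[i] = (Node){0};
    for (int i = 0; i < 3; i++) {
        nodes[i].child[0] = &nodes[2 * i + 1];
        nodes[i].child[1] = &nodes[2 * i + 2];
    }
    mark_from_root(nodes, NOBJ, &nodes[0]);
    int marked = 0;
    for (int i = 0; i < NOBJ; i++) marked += nodes[i].marked;
    return marked;
}
```

Every overflow costs a pass over the whole heap, which is why the next slides call this fail-safe slow.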
35. 1. The fail-safe system is slow
➔Rescanning is very slow.
– If you have deep object graphs,
GC may be slow every time because of
rescanning.
36. 2. We can't precisely check
stack overflow
➔There is a trade-off between speed
and precision.
– Marking will be slow if we check stack
overflow in each gc_mark().
– So we checked it at the appropriate
time.
– But, it's not precise.
37. 2. We can't precisely check
stack overflow
➔This causes SEGV in the worst case
scenario
– For instance, Fiber sometimes fails
unexpectedly.
– A Fiber uses a small machine stack (128 KB)
– At times, checking for stack overflows
doesn't work well with Fiber.
42. Allocating a new
stack chunk
[Figure: objects from the graph A to G are pushed onto the marking stack while marking. When the current stack chunk is full, a new stack chunk is allocated.]
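A marking stack that grows by chunks might look like the sketch below. It is an assumption-laden illustration: `CHUNK_CAP`, the struct layouts, and the function names are hypothetical, and a failed allocation simply aborts.

```c
#include <assert.h>
#include <stdlib.h>

#define CHUNK_CAP 4  /* hypothetical; tiny so the demo needs several chunks */

typedef struct Chunk {
    void *slots[CHUNK_CAP];
    struct Chunk *prev;
} Chunk;

typedef struct {
    Chunk *top;   /* newest chunk, NULL when the stack is empty */
    int index;    /* used slots in the newest chunk */
    int chunks;   /* chunks currently allocated */
} MarkStack;

typedef struct GNode {
    int marked;
    struct GNode *child[2];
} GNode;

static int push(MarkStack *s, void *obj) {
    if (s->top == NULL || s->index == CHUNK_CAP) {
        Chunk *c = malloc(sizeof *c);    /* allocate a new stack chunk */
        if (c == NULL) return 0;         /* allocation during GC can fail */
        c->prev = s->top;
        s->top = c;
        s->index = 0;
        s->chunks++;
    }
    s->top->slots[s->index++] = obj;
    return 1;
}

static void *pop(MarkStack *s) {
    if (s->top == NULL) return NULL;
    void *obj = s->top->slots[--s->index];
    if (s->index == 0) {                 /* chunk drained: release it */
        Chunk *c = s->top;
        s->top = c->prev;
        s->index = s->top ? CHUNK_CAP : 0;
        s->chunks--;
        free(c);
    }
    return obj;
}

static void mark_and_push(MarkStack *s, GNode *n) {
    if (n == NULL || n->marked) return;
    n->marked = 1;
    if (!push(s, n)) abort();            /* sketch: no recovery strategy */
}

static int demo_max_chunks;              /* peak chunk count in the demo */

/* Marks everything reachable from root without recursion. */
static int mark_iterative(GNode *root) {
    MarkStack s = { NULL, 0, 0 };
    int marked = 0;
    demo_max_chunks = 0;
    mark_and_push(&s, root);
    while (s.top != NULL) {
        if (s.chunks > demo_max_chunks) demo_max_chunks = s.chunks;
        GNode *n = pop(&s);
        marked++;
        mark_and_push(&s, n->child[0]);
        mark_and_push(&s, n->child[1]);
    }
    return marked;
}

static GNode spine[10], leaves[10];

/* Demo: a spine of 10 nodes, each of which also references one leaf. */
static int demo_mark_stack(void) {
    for (int i = 0; i < 10; i++) {
        spine[i] = (GNode){0};
        leaves[i] = (GNode){0};
    }
    for (int i = 0; i < 10; i++) {
        spine[i].child[0] = &leaves[i];
        spine[i].child[1] = (i < 9) ? &spine[i + 1] : NULL;
    }
    return mark_iterative(&spine[0]);
}
```

Because the pending work lives in heap-allocated chunks instead of machine stack frames, the depth of the object graph no longer affects machine stack usage.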
43. Pros and Cons
➔Pros
– Good-bye complex fail-safe systems
– Good-bye SEGV!
➔Cons
– Fast enough?
– There is a risk of allocating a stack
chunk during GC
49. Bitmap Marking in CRuby
➔Mark-bits separate from object
headers
– To be CoW (copy-on-write) friendly
➔REE has adopted this approach
– Since 2008
– But we can't import this patch
52. If we have many forked processes
[Figure: forked processes 1 to 4 share memory pages. When a process writes to a shared page, that page is copied for it (P1, P2, P3, P4, ...).]
This increases the memory usage of all forked processes
53. Marking in the old way
[Figure: the Ruby heap is made of 16KB heap blocks (HB). Each object in a heap block carries its own mark bit (mb).]
59. e.g. Unicorn
w/ marking in the old way
[Figure: a Unicorn parent process (UP1) and a child process (UP2) each run a Rails app over the shared heap blocks HB1 and HB2. After GC.start, marking writes to pages that were otherwise read only, so those pages are copied.]
62. Finding an appropriate
bit for an object
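[Figure: heap block HB1 is 16KB in size and 16KB aligned, so the low 14 bits of its address must be 0. The header at the start of the block refers to the mark bitmap.]
Allocate each heap block using aligned memory.
Masking an object's address with & ~0x3fff gives the header of its heap block (HB1), and the header leads to the mark bitmap.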
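The address arithmetic can be sketched like this. The `BlockHeader` struct, the `SLOT_SIZE` of 40 bytes, and the function names are assumptions for illustration only. The demo over-allocates with `malloc()` purely to obtain an aligned block in portable C.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define HEAP_ALIGN      0x4000u                       /* 16KB */
#define HEAP_ALIGN_MASK ((uintptr_t)HEAP_ALIGN - 1)   /* 0x3fff */
#define SLOT_SIZE       40u                /* hypothetical object size */

/* Hypothetical header stored at the start of every heap block. */
typedef struct {
    uint8_t *bits;   /* mark bitmap, kept outside the objects */
} BlockHeader;

/* Clearing the low 14 bits of an object address yields its block. */
static BlockHeader *header_of(const void *obj) {
    return (BlockHeader *)((uintptr_t)obj & ~HEAP_ALIGN_MASK);
}

/* The offset inside the block selects the bit for this object. */
static size_t bit_index(const void *obj) {
    return ((uintptr_t)obj & HEAP_ALIGN_MASK) / SLOT_SIZE;
}

static void mark_obj(const void *obj) {
    size_t i = bit_index(obj);
    header_of(obj)->bits[i / 8] |= (uint8_t)(1u << (i % 8));
}

static int obj_marked(const void *obj) {
    size_t i = bit_index(obj);
    return (header_of(obj)->bits[i / 8] >> (i % 8)) & 1;
}

/* Demo: returns 1 if the lookup behaves as expected, -1 on OOM. */
static int demo_bitmap(void) {
    void *raw = malloc(2 * HEAP_ALIGN);
    uint8_t *bitmap = calloc(HEAP_ALIGN / SLOT_SIZE / 8 + 1, 1);
    if (raw == NULL || bitmap == NULL) {
        free(raw);
        free(bitmap);
        return -1;
    }
    /* Round up to the next 16KB boundary inside the 32KB region. */
    uintptr_t base = ((uintptr_t)raw + HEAP_ALIGN_MASK) & ~HEAP_ALIGN_MASK;
    ((BlockHeader *)base)->bits = bitmap;

    /* Slot 0 is reserved for the header in this sketch. */
    char *obj1 = (char *)base + 1 * SLOT_SIZE;
    char *obj2 = (char *)base + 7 * SLOT_SIZE;
    mark_obj(obj2);

    int ok = header_of(obj2) == (BlockHeader *)base
          && !obj_marked(obj1)
          && obj_marked(obj2);
    free(bitmap);
    free(raw);
    return ok;
}
```

Marking now writes only to the bitmap, so the pages that hold the objects themselves stay untouched during the mark phase.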
64. Allocating aligned memory
➔Using posix_memalign()
– For Unix-like OS
➔Using _aligned_malloc()
– For Windows OS
– mingw: __mingw_aligned_malloc()
65. Allocating aligned memory
➔Using malloc()
● Thanks to yugui-san's help!
– For other environments
– For instance, Mac OS X Lion and so on
– It allocates 32KB and returns an address
which is a multiple of 16KB
● 16KB memory space is wasted
● We should use mmap(), but ....
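The two allocation strategies from the last two slides can be compared in a short sketch. The function names are hypothetical, and the Windows path via `_aligned_malloc()` is left out; only the `posix_memalign()` path and the `malloc()` fallback are shown.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign() */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define HEAP_ALIGN 0x4000u        /* 16KB */

static int is_aligned(const void *p) {
    return ((uintptr_t)p & ((uintptr_t)HEAP_ALIGN - 1)) == 0;
}

/* Path 1: ask libc for a 16KB block on a 16KB boundary.
 * The result is released with free(). */
static void *alloc_block_posix(void) {
    void *p = NULL;
    if (posix_memalign(&p, HEAP_ALIGN, HEAP_ALIGN) != 0) return NULL;
    return p;
}

/* Path 2: plain malloc() fallback. Allocate 32KB and return the first
 * address inside it that is a multiple of 16KB. *raw_out receives the
 * pointer that must later be passed to free(). */
static void *alloc_block_fallback(void **raw_out) {
    void *raw = malloc(2 * HEAP_ALIGN);
    if (raw == NULL) return NULL;
    *raw_out = raw;
    uintptr_t a = ((uintptr_t)raw + HEAP_ALIGN - 1)
                & ~((uintptr_t)HEAP_ALIGN - 1);
    return (void *)a;
}

/* Demo: returns 1 if both paths yield a usable aligned 16KB block. */
static int demo_aligned_alloc(void) {
    void *p = alloc_block_posix();
    void *raw = NULL;
    void *q = alloc_block_fallback(&raw);
    int ok = p != NULL && q != NULL
          && is_aligned(p) && is_aligned(q)
          /* the aligned 16KB block lies fully inside the 32KB region */
          && (char *)q >= (char *)raw
          && (char *)q + HEAP_ALIGN <= (char *)raw + 2 * HEAP_ALIGN;
    free(p);
    free(raw);
    return ok;
}
```

The fallback has to remember the raw pointer separately, and half of each 32KB allocation goes unused, which matches the waste noted on the slide.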