2. Copyright 2015 Kirk Pepperdine
About me
- Offer performance tuning services and training
- created jPDM, a performance tuning methodology
- Write and speak about performance tuning
- Co-founder of jClarity
- engaged in building the first generation of
performance diagnostic engines
- Java Champion since 2006
3. Copyright 2015 Kirk Pepperdine
Questions To Be Answered
- How does the G1GC algorithm work
- What does Java heap look like
- What tools can we use to help us
- want to engage in evidence-based tuning
- configure the collector to work better with
our application
- tune our application to work better with the
collector
- How does it compare to the other collectors
4. Copyright 2015 Kirk Pepperdine
Generational heap is
- 1 large contiguous reserved space
- specified with -mx (not a typo)
- split into 5 (or 4) memory pools
- Eden
- Survivor (from, to)
- Old
- Perm (prior to Java 8)
5. Copyright 2015 Kirk Pepperdine
Generational Garbage Collection
- Mark-Sweep Copy (evacuation) for Young
- eden and survivor spaces
- both serial and parallel implementations
- Mark-Sweep (in-place) for Old space
- Serial and Parallel with compaction
- (mostly) Concurrent Mark-Sweep
- incremental mode
6. Copyright 2015 Kirk Pepperdine
Why another collector
- Scalability
- pause time tends to be a function of heap size
- Difficult to tune
- dozens of parameters, some of which are very difficult
to understand
- -XX:PLABSize=????
- Completely unpredictable
- well, maybe but that is a different talk
7. Copyright 2015 Kirk Pepperdine
G1GC
- Designed to scale
- break the dependency of pause time on heap size
- Easier to tune
- fewer configuration options
- Predictable
- offer pause time goals and have the collector tune
itself
- Does it work?
- let's see!
8. Copyright 2015 Kirk Pepperdine
G1GC heap is
- 1 large contiguous reserved space
- specified with -mx
- split into ~2048 regions
- size is 1, 2, 4, 8, 16, or 32m
For -mx10G,
Region size = 10240M/2048
= 5m, rounded down to a power of 2
= 4m
Number of regions = 10G/4m
= 2560
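The sizing arithmetic for -mx10G can be sketched in Java. This is a simplified illustration, not HotSpot's code: the class and method names are hypothetical, the region size bounds are passed in by the caller, and the sketch assumes the raw quotient is rounded down to a power of two.

```java
// Simplified sketch of G1 region sizing; names are hypothetical, not HotSpot's.
final class G1RegionSizing {
    static final long MB = 1024L * 1024L;
    static final long TARGET_REGION_COUNT = 2048; // from the slide: ~2048 regions

    // Divide the heap by the target count, round down to a power of two,
    // then clamp the result to the caller-supplied bounds.
    static long regionSizeBytes(long heapBytes, long minRegionBytes, long maxRegionBytes) {
        long raw = Math.max(heapBytes / TARGET_REGION_COUNT, 1);
        long pow2 = Long.highestOneBit(raw); // largest power of two <= raw
        return Math.min(Math.max(pow2, minRegionBytes), maxRegionBytes);
    }

    // Number of whole regions that fit in the heap.
    static long regionCount(long heapBytes, long regionBytes) {
        return heapBytes / regionBytes;
    }
}
```

For a 10G heap this yields 4m regions and 2560 of them, matching the slide.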
9. Copyright 2015 Kirk Pepperdine
Regions
- Placed in a free list
- When used, tagged as
- Eden, Survivor, Old, or Humongous
- Returned to free list after being swept
10. Copyright 2015 Kirk Pepperdine
Allocation
- mutator threads get a region from region free list
- tag region as Eden
- allocate object into region
- when region is full, get a new region from free list
[Figure: heap region grid with four regions tagged Eden]
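The allocation steps on this slide can be sketched as a toy model. Everything here is hypothetical (class name, fields, byte counts): it only shows a region being taken from the free list, tagged Eden, and filled until a new region is needed. It assumes every request fits in one region, so humongous allocations are out of scope.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of region-based allocation; not how HotSpot implements it.
final class RegionAllocatorSketch {
    enum Tag { FREE, EDEN, SURVIVOR, OLD, HUMONGOUS }

    static final class Region {
        Tag tag = Tag.FREE;
        long used;
        final long capacity;
        Region(long capacity) { this.capacity = capacity; }
    }

    private final Deque<Region> freeList = new ArrayDeque<>();
    private Region current;   // region currently being allocated into
    int edenRegions;          // how many regions have been tagged Eden

    RegionAllocatorSketch(int regionCount, long regionBytes) {
        for (int i = 0; i < regionCount; i++) freeList.add(new Region(regionBytes));
    }

    // Returns the region the object landed in, or null when the free list
    // is empty (the point at which a real collector would have to collect).
    Region allocate(long bytes) {
        if (current == null || current.used + bytes > current.capacity) {
            Region next = freeList.poll();
            if (next == null) return null;
            next.tag = Tag.EDEN;
            current = next;
            edenRegions++;
        }
        current.used += bytes;
        return current;
    }

    int freeRegions() { return freeList.size(); }
}
```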
11. Copyright 2015 Kirk Pepperdine
Humongous Allocation
- allocation is larger than 1/2 a region's size
- size of a region defines what is humongous
- allocate into a humongous region
- created from a set of contiguous regions
[Figure: heap region grid with four regions tagged Eden and one tagged Humongous]
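The humongous rule can be written as two small checks. This is a minimal sketch with hypothetical names; it takes the slide's "larger than 1/2 a region" rule literally and assumes a humongous object occupies a whole number of contiguous regions.

```java
// Sketch of the humongous-allocation rule; names are hypothetical.
final class HumongousCheck {
    // An allocation is humongous when it is larger than half a region.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    // Contiguous regions needed to hold the object (ceiling division).
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }
}
```

With 4m regions, a 2m array is a normal allocation, while a 9m array is humongous and needs three contiguous regions.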
12. Copyright 2015 Kirk Pepperdine
Garbage Collection Triggers
- Allotted number of Eden regions have been consumed
- Unable to satisfy a Humongous allocation
- Heap is full triggering a Full GC
- Metaspace threshold is reached
- full discussion beyond the scope of this talk
13. Copyright 2015 Kirk Pepperdine
Garbage Collection
- Mark-Sweep/Mark GC combination
- Mark-Sweep for young
- (mostly) Concurrent-Mark for Old
- Sweep evacuates live objects in a region to another
region
- automatic compaction
- no need for fine-grained free lists (expensive)
- Marked Old regions are swept by the Young collector
- Most phases require threads to be at a safepoint
14. Copyright 2015 Kirk Pepperdine
Safepoint
- A point in a thread's execution where it can
safely stop for JVM maintenance
- Cooperative effort
- JVM signals it must perform some maintenance
- mutator threads stop when they reach a
safepoint
- JVM carries out maintenance
- mutator threads are restarted
- Measure time to safepoint (TTSP) vs time to perform maintenance
15. Copyright 2015 Kirk Pepperdine
Heap after a Mark/Sweep
- all surviving objects are copied into (to) Survivor regions
- Eden and (from) Survivor regions are returned to free
regions list
[Figure: heap region grid after collection, with one Humongous and one Survivor region]
16. Copyright 2015 Kirk Pepperdine
Promotion to Old
- Data is promoted to old
- from survivor when it reaches tenuring threshold
- to prevent survivor from being overrun
- pre-emptive or reactive
[Figure: heap region grid with regions tagged Humongous, Survivor, and Old]
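One way to picture how a tenuring threshold protects the survivor space is the age-table calculation below. It is a sketch under assumptions: the names are hypothetical, and the loop is modelled on the commonly described approach of summing surviving bytes by age until the desired survivor size would be exceeded. It is not taken from these slides.

```java
// Sketch of an age-table based tenuring threshold; names are hypothetical.
final class TenuringSketch {
    // bytesByAge[a] holds the bytes of surviving objects of age a (index 0 unused).
    // The threshold is the first age at which the running total exceeds the
    // desired survivor size, capped at maxThreshold.
    static int computeThreshold(long[] bytesByAge, long desiredSurvivorBytes, int maxThreshold) {
        long total = 0;
        int age = 1;
        while (age < bytesByAge.length) {
            total += bytesByAge[age];
            if (total > desiredSurvivorBytes) break;
            age++;
        }
        return Math.min(age, maxThreshold);
    }

    // An object is promoted to Old once its age reaches the threshold.
    static boolean shouldPromote(int objectAge, int threshold) {
        return objectAge >= threshold;
    }
}
```

Lowering the threshold when survivors would overflow is the pre-emptive case; promoting because the survivor regions are already full is the reactive one.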
23. Copyright 2015 Kirk Pepperdine
What is this 1 record telling us?
- Initial state of the Java heap
- what our application did to the heap
- Final state of Java heap
- what the GC algorithm did to the heap
- how long did the whole process take
- breakdown of phases
- Need to use 100s of these records to make decisions
24. Copyright 2015 Kirk Pepperdine
Scan for Roots
- Find all pointers external to the memory pool in
question
- Use RSet to track all incoming pointers from one
region
- prevent a heap scan for roots
- uses cards, card marking
- card is 1 byte that tracks references into 512 bytes of heap
- mark card dirty when reference is updated
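Card marking can be illustrated with a small table indexed by heap offset. This is a sketch with hypothetical names and encodings; it assumes one byte-sized entry per 512-byte card and models the heap as plain offsets rather than real addresses.

```java
// Toy card table: one entry per 512 bytes of heap. Names and the
// CLEAN/DIRTY encodings are assumptions made for illustration.
final class CardTableSketch {
    static final int CARD_SHIFT = 9;        // 2^9 = 512 bytes of heap per card
    static final byte CLEAN = 0, DIRTY = 1;

    private final byte[] cards;

    CardTableSketch(long heapBytes) {
        long cardBytes = 1L << CARD_SHIFT;
        cards = new byte[(int) ((heapBytes + cardBytes - 1) >>> CARD_SHIFT)];
    }

    static int cardIndex(long heapOffset) {
        return (int) (heapOffset >>> CARD_SHIFT);
    }

    // Write barrier: called after a reference field at heapOffset is updated.
    void recordReferenceWrite(long heapOffset) {
        cards[cardIndex(heapOffset)] = DIRTY;
    }

    boolean isDirty(long heapOffset) {
        return cards[cardIndex(heapOffset)] == DIRTY;
    }

    int dirtyCount() {
        int n = 0;
        for (byte c : cards) if (c == DIRTY) n++;
        return n;
    }
}
```

Scanning only the dirty cards, instead of the whole heap, is what keeps the root scan cheap.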
25. Copyright 2015 Kirk Pepperdine
RSet
- Collection of card tables
- one card table for pointers
originating from each region
- Dirty cards are placed in a
concurrent refinement queue
- refinement threads will build the
RSet
- aim is to reduce the cost of scan
for roots
26. Copyright 2015 Kirk Pepperdine
RSet Refinement
- Refinement queue is divided into 4 zones
- White: no refinement threads are working
- Green: number of cards that can be processed
without exceeding 10% of pause time
- Yellow: all refinement threads are working to keep
up
- Red: Application threads are involved in refinement
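The four colours can be read as thresholds on the number of pending dirty cards. The sketch below is only one interpretation of the slide: the class name, the threshold parameters, and the exact boundary handling are all assumptions.

```java
// Classifies refinement queue depth into the four colours from the slide.
// Thresholds are hypothetical inputs, not values taken from the JVM.
final class RefinementZones {
    enum Zone { WHITE, GREEN, YELLOW, RED }

    static Zone classify(int pendingCards, int green, int yellow, int red) {
        if (pendingCards >= red) return Zone.RED;       // application threads help out
        if (pendingCards >= yellow) return Zone.YELLOW; // all refinement threads busy
        if (pendingCards >= green) return Zone.GREEN;   // some refinement threads active
        return Zone.WHITE;                              // no refinement threads working
    }
}
```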
27. Copyright 2015 Kirk Pepperdine
CSets
- Set of all regions to be swept
- Goal is to keep pauses under MaxGCPauseMillis
- controls the size of the CSet
- CSet contains
- all Young regions
- selected Old regions during mixed collections
- number / mixed GC ratio
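CSet construction under a pause target can be sketched as a greedy selection. This is an illustrative simplification with hypothetical names and made-up cost predictions: all Young regions always go in, then Old regions are added, least live first, while the predicted pause stays within the target.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Greedy CSet selection sketch; the cost model is an assumption.
final class CSetSketch {
    static final class Region {
        final boolean young;
        final double liveFraction; // 0.0 = all garbage, 1.0 = all live
        final double predictedMs;  // predicted evacuation cost
        Region(boolean young, double liveFraction, double predictedMs) {
            this.young = young;
            this.liveFraction = liveFraction;
            this.predictedMs = predictedMs;
        }
    }

    static List<Region> choose(List<Region> candidates, double baseMs, double targetMs) {
        List<Region> cset = new ArrayList<>();
        List<Region> old = new ArrayList<>();
        double predicted = baseMs;
        for (Region r : candidates) {
            if (r.young) {            // Young regions are always collected
                cset.add(r);
                predicted += r.predictedMs;
            } else {
                old.add(r);
            }
        }
        // Cheapest wins first: regions with the least live data.
        old.sort(Comparator.comparingDouble((Region r) -> r.liveFraction));
        for (Region r : old) {
            if (predicted + r.predictedMs > targetMs) break;
            cset.add(r);
            predicted += r.predictedMs;
        }
        return cset;
    }
}
```

Note that the Young regions alone can already exceed the target, which is the situation the G1Ergonomics log lines later in the deck show (predicted 4.88 ms against a 1.00 ms target).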
28. Copyright 2015 Kirk Pepperdine
Reclaiming Memory (detailed)
- Mark Sweep Copy (Evacuating) Garbage Collection
- Capture all mutator threads at a safepoint
- Scan for GC Roots
- Trace all references from GC roots
- mark all data reached during tracing
- Copy all marked data into a “to space”
- Reset supporting structures
- Release all mutator threads
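The trace-and-copy steps in the middle of that list can be modelled on a tiny object graph. This is a toy sketch with hypothetical names: a forwarding map doubles as the mark set, reachable objects are copied into a list standing in for the "to space", and references are rewired to the copies. Safepoints and supporting structures are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy evacuating collector over an in-memory object graph.
final class EvacuationSketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Returns copies of every object reachable from the roots.
    static List<Obj> evacuate(List<Obj> roots) {
        Map<Obj, Obj> forwarding = new IdentityHashMap<Obj, Obj>(); // original -> copy
        List<Obj> toSpace = new ArrayList<>();
        Deque<Obj> pending = new ArrayDeque<>();
        for (Obj root : roots) copy(root, forwarding, toSpace, pending);
        while (!pending.isEmpty()) {
            Obj from = pending.poll();
            Obj to = forwarding.get(from);
            for (Obj ref : from.refs) {
                to.refs.add(copy(ref, forwarding, toSpace, pending));
            }
        }
        return toSpace;
    }

    // Copies an object once; later visits return the existing copy.
    private static Obj copy(Obj from, Map<Obj, Obj> forwarding,
                            List<Obj> toSpace, Deque<Obj> pending) {
        Obj existing = forwarding.get(from);
        if (existing != null) return existing;
        Obj to = new Obj(from.name);
        forwarding.put(from, to);
        toSpace.add(to);
        pending.add(from);
        return to;
    }
}
```

Anything not reached from the roots is never copied, so its space is reclaimed when the source regions go back to the free list.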
34. Copyright 2015 Kirk Pepperdine
Starting a (mostly) Concurrent Cycle
- Scheduled when heap occupancy reaches 45% (the default for -XX:InitiatingHeapOccupancyPercent)
- initial-mark runs inside a Young collection
- mark calculates liveness
- used for CSet inclusion decisions
[Figure: heap region grid with four Eden regions, one Humongous region, two Survivor regions, and the remaining regions tagged Old]
35. Copyright 2015 Kirk Pepperdine
Starting a (mostly) Concurrent Mark
4167.445: [G1Ergonomics (Concurrent Cycles) initiate concurrent cycle, reason: concurrent cycle initiation requested]
2015-10-17T21:13:17.981-0400: 4167.445: [GC pause (G1 Evacuation Pause) (young) (initial-mark)
Desired survivor size 4194304 bytes, new threshold 1 (max 1)
- age 1: 1585104 bytes, 1585104 total
4167.459: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 984, predicted base time: 3.98 ms, remaining time: 0.00 ms, target pause time: 1.00 ms]
4167.459: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 1 regions, survivors: 1 regions, predicted young region time: 0.90 ms]
4167.459: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 1 regions, survivors: 1 regions, old: 0 regions, predicted pause time: 4.88 ms, target pause time: 1.00 ms]
, 0.0347452 secs]
No need to squint, a bigger version is coming
-XX:+PrintAdaptiveSizePolicy
42. Copyright 2015 Kirk Pepperdine
Common Failure Conditions
[GC pause (young) (to-space exhausted), 0.1709670 secs]
- Collection ran out of reserved space
- protect against temporary overflows
- -XX:G1ReservePercent=10
- Heap is too small
[GC concurrent-mark-reset-for-overflow]
- Global marking stack filled
- Collection started too late
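Both failure signatures are easy to count in a GC log. The scanner below is a minimal sketch with hypothetical names; it assumes log lines shaped like the two records shown on this slide and makes no attempt to handle other log formats.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Counts the two failure conditions from the slide in a list of log lines.
final class GcLogScan {
    private static final Pattern TO_SPACE_EXHAUSTED =
            Pattern.compile("\\(to-space exhausted\\), (\\d+\\.\\d+) secs\\]");
    private static final String MARK_OVERFLOW = "concurrent-mark-reset-for-overflow";

    static int countToSpaceExhausted(List<String> lines) {
        int n = 0;
        for (String line : lines) {
            if (TO_SPACE_EXHAUSTED.matcher(line).find()) n++;
        }
        return n;
    }

    // Total pause time, in seconds, spent in to-space exhausted collections.
    static double toSpaceExhaustedSecs(List<String> lines) {
        double total = 0;
        for (String line : lines) {
            Matcher m = TO_SPACE_EXHAUSTED.matcher(line);
            if (m.find()) total += Double.parseDouble(m.group(1));
        }
        return total;
    }

    static int countMarkOverflow(List<String> lines) {
        int n = 0;
        for (String line : lines) {
            if (line.contains(MARK_OVERFLOW)) n++;
        }
        return n;
    }
}
```

A single hit is rarely meaningful on its own; the counts become useful when compared across hundreds of records.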
43. Copyright 2015 Kirk Pepperdine
Tools for evidence based tuning
- GC logs
- Log file parser/visualization tool
- HPJMeter
- GCViewer
- Censum
- Flags
- dozens of flags
- most of them you don’t want to touch!!!!
46. Copyright 2015 Kirk Pepperdine
Flags (you should think twice about using)
-XX:G1MixedGCCountTarget=8
-XX:+UnlockExperimentalVMOptions
-XX:G1MixedGCLiveThresholdPercent=85/65
47. Copyright 2015 Kirk Pepperdine
Flags (you should never use)
-XX:+UnlockExperimentalVMOptions
-XX:G1OldCSetRegionThresholdPercent=10
-XX:G1MaxNewSizePercent=60
-XX:G1HeapWastePercent=10
-XX:G1RSetUpdatingPauseTimePercent=10
48. Copyright 2015 Kirk Pepperdine
Tuning Cassandra (benchmark)
- Out of the box, tuned for CMS
- exceptionally complex set of configurations
- Reconfigured
- to run G1
- given fixed unit of work which should ideally be
cleared in 15 minutes
Goal: Configure G1 to maximize MMU (minimum mutator utilization)
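Assuming MMU here means minimum mutator utilization, it can be computed from a list of pause intervals: slide a fixed window over the run, find the window with the most pause time, and report the fraction of that window left to the application. The sketch below uses hypothetical names, assumes non-overlapping pauses and a window no longer than the run, and only evaluates windows aligned to pause boundaries, which is where the worst case occurs.

```java
// Minimum mutator utilization over a fixed window; names are hypothetical.
final class MmuSketch {
    // pauses[i] = {startSec, endSec}. Returns a value in [0, 1].
    static double mmu(double[][] pauses, double runSec, double windowSec) {
        double worst = 0; // most pause time found in any single window
        for (double[] pause : pauses) {
            for (double edge : pause) {
                // Try windows that start or end exactly on this pause boundary.
                for (double start : new double[] {edge, edge - windowSec}) {
                    double s = Math.max(0, Math.min(start, runSec - windowSec));
                    worst = Math.max(worst, pausedWithin(pauses, s, s + windowSec));
                }
            }
        }
        return (windowSec - worst) / windowSec;
    }

    // Total pause time overlapping the interval [from, to].
    static double pausedWithin(double[][] pauses, double from, double to) {
        double sum = 0;
        for (double[] p : pauses) {
            sum += Math.max(0, Math.min(p[1], to) - Math.max(p[0], from));
        }
        return sum;
    }
}
```

Two 0.5 s pauses that fall inside the same 2 s window give an MMU of 0.5 for that window size, even if the rest of a 15 minute run is pause free.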