Performance tuning jvm


  1. 1. <ul><li>Performance Tuning JVM - a practical approach </li></ul><ul><li>Prem Kuppumani </li></ul><ul><li>August 2011 </li></ul>
  2. 2. JAVA JVM BASICS <ul><li>JDK vs. JRE </li></ul><ul><ul><li>The JRE runs compiled Java programs. It has a small footprint and is recommended in production. </li></ul></ul><ul><ul><li>JDK = JRE + javac + tools + debuggers + dev libraries. </li></ul></ul><ul><ul><li>JRE main components: JVM + Java API </li></ul></ul><ul><ul><li>JVM components: class loader + bytecode verifier + GC + security manager + execution engine + JIT code generator </li></ul></ul>
  3. 3. JAVA object <ul><li>What is an object? </li></ul><ul><ul><li>An object bundles properties and behavior. </li></ul></ul><ul><ul><li>Unique properties (state or data) + behavior (methods) + the benefit of reuse. </li></ul></ul>
  4. 4. Fundamentals and Terminology <ul><li>The GC's task is to find unreachable objects and reclaim their memory. </li></ul><ul><ul><li>LIVE, GARBAGE and ROOT </li></ul></ul><ul><ul><li>Garbage is anything not reachable from the application roots (local variables on the stack, thread stacks, registers, static objects, static fields and class variable refs). </li></ul></ul><ul><ul><li>Anything not visited is unreachable and is therefore GARBAGE </li></ul></ul><ul><ul><ul><li>Advantage: more reliable, no leaks from manual memory management. </li></ul></ul></ul><ul><ul><ul><li>Disadvantage: stops and pauses. Consumes resources. </li></ul></ul></ul>
  5. 5. GC Algorithms. <ul><li>Different methods, algorithms and technical terms. </li></ul><ul><ul><li>Mark & Sweep, Mark & Compact, Copying. </li></ul></ul><ul><ul><ul><li>Mark & Sweep GC </li></ul></ul></ul><ul><ul><ul><ul><li>The mark phase does a depth-first search (DFS) from every root and marks all live objects. </li></ul></ul></ul></ul><ul><ul><ul><ul><li>In the sweep phase, each unmarked object has its memory reclaimed. </li></ul></ul></ul></ul><ul><ul><ul><li>Mark & Compact </li></ul></ul></ul><ul><ul><ul><ul><li>Additionally performs compaction. </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Avoids fragmentation. </li></ul></ul></ul></ul><ul><ul><ul><ul><li>These algorithms are improved in three ways: concurrency, parallelization and generational collection. </li></ul></ul></ul></ul>
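The mark and sweep phases described on this slide can be illustrated with a toy model. This is only a sketch under stated assumptions: `Node`, `mark` and `sweep` are hypothetical names invented for the illustration, the "heap" is an ordinary list, and HotSpot's real collectors do not work on Java-level structures like this.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Toy model only: names and structure are assumptions made for illustration.
class MarkSweepSketch {
    // A hypothetical heap object that holds references to other objects.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Mark phase: depth-first search from every root, recording each reachable object.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (marked.add(n)) {               // first visit only
                for (Node ref : n.refs) stack.push(ref);
            }
        }
        return marked;
    }

    // Sweep phase: remove every object that was not marked; returns the number reclaimed.
    static int sweep(List<Node> heap, Set<Node> marked) {
        int reclaimed = 0;
        for (Iterator<Node> it = heap.iterator(); it.hasNext();) {
            if (!marked.contains(it.next())) {
                it.remove();
                reclaimed++;
            }
        }
        return reclaimed;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);   // a -> b, and a is a root
        c.refs.add(d);   // c and d reference each other but no root reaches them
        d.refs.add(c);
        List<Node> heap = new ArrayList<>(List.of(a, b, c, d));
        int reclaimed = sweep(heap, mark(List.of(a)));
        System.out.println("reclaimed=" + reclaimed + " live=" + heap.size());
    }
}
```

In the `main` example, `c` and `d` are reclaimed even though they reference each other, because reachability is decided from the roots rather than by counting references.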
  6. 6. Generational GC <ul><ul><ul><li>Copying GC </li></ul></ul></ul><ul><ul><ul><ul><li>Faster than Mark & Sweep because it has only one phase. </li></ul></ul></ul></ul><ul><ul><ul><li>Generational GC. </li></ul></ul></ul><ul><ul><ul><ul><li>Young (short-lived) and old (long-lived) objects are kept in separate locations. </li></ul></ul></ul></ul><ul><ul><ul><ul><li>Most (80% to 90%) instantiated objects are short-lived, and there are few references from long-lived objects to short-lived objects. </li></ul></ul></ul></ul>
  7. 7. Minor and Major GC <ul><li>Minor Garbage Collection (scavenge) </li></ul><ul><ul><li>Invoked when the eden space fills up. Frequent. </li></ul></ul><ul><li>Major Garbage Collection. </li></ul><ul><ul><li>A Full GC is invoked when the tenured space fills up. Mark & Sweep method. Infrequent. </li></ul></ul><ul><li>Different generations: </li></ul><ul><ul><li>Young: Eden and survivor spaces (S0 and S1), plus virtual space </li></ul></ul><ul><ul><li>Tenured: old, plus virtual space </li></ul></ul><ul><ul><li>Permanent, plus virtual space </li></ul></ul>
  8. 8. JVM GC Tuning <ul><li>Why performance tuning? </li></ul><ul><ul><ul><li>Wide and diverse range of apps from applets to web services on large servers. </li></ul></ul></ul><ul><ul><ul><li>There are multiple garbage collectors designed for different requirements. </li></ul></ul></ul><ul><ul><li>Ergonomics </li></ul></ul><ul><ul><ul><li>Introduced in java 5.0 </li></ul></ul></ul><ul><ul><ul><li>Automatic choosing of GC algorithm. </li></ul></ul></ul><ul><ul><ul><li>little or no tuning of command line options needed, by choosing GC, heap size and runtime compiler. </li></ul></ul></ul>
  9. 9. JVM GC Tuning <ul><li>Generations </li></ul><ul><ul><li>Primitive GCs examine every live object. </li></ul></ul><ul><ul><li>Generational collection exploits several empirically observed properties of applications to minimize the work required to reclaim memory. </li></ul></ul><ul><ul><li>The weak generational hypothesis says that most objects are short-lived. </li></ul></ul><ul><li>Performance Considerations </li></ul><ul><ul><li>Throughput: % of time not spent in GC. </li></ul></ul><ul><ul><li>Pauses: times when the application is not responding. </li></ul></ul><ul><li>Sizing the Generations </li></ul><ul><ul><li>Total heap: set -Xms equal to -Xmx, or not? </li></ul></ul><ul><ul><li>Young gen: -XX:NewRatio=3, or -XX:NewSize and -XX:MaxNewSize </li></ul></ul><ul><ul><li>Survivor space: -XX:SurvivorRatio=6; use -XX:+PrintTenuringDistribution </li></ul></ul>
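As a rough illustration of what the sizing flags on this slide mean, the sketch below turns `-XX:NewRatio` and `-XX:SurvivorRatio` into approximate region sizes. It assumes the usual reading of the two ratios (tenured:young and eden:one survivor space) and ignores the alignment and rounding a real JVM applies, so the numbers are estimates only. The class and method names are hypothetical.

```java
// Back-of-the-envelope sizing; a real JVM aligns and rounds these values.
class GenerationSizingSketch {
    static final long MB = 1024L * 1024L;

    // -XX:NewRatio=N means tenured:young = N:1, so young is 1/(N+1) of the heap.
    static long youngSize(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    // -XX:SurvivorRatio=N means eden:survivor = N:1 for EACH of the two survivor
    // spaces, so one survivor space is 1/(N+2) of the young generation.
    static long survivorSize(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    // Eden is what remains of the young generation after both survivor spaces.
    static long edenSize(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorSize(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long heap = 1024 * MB;                    // e.g. -Xms1024m -Xmx1024m
        long young = youngSize(heap, 3);          // -XX:NewRatio=3
        long survivor = survivorSize(young, 6);   // -XX:SurvivorRatio=6
        long eden = edenSize(young, 6);
        System.out.println("young=" + young / MB + "MB eden=" + eden / MB
                + "MB survivor(each)=" + survivor / MB + "MB tenured=" + (heap - young) / MB + "MB");
    }
}
```

With the slide's values and a 1024 MB heap, this gives a 256 MB young generation split into a 192 MB eden and two 32 MB survivor spaces.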
  10. 10. JVM GC Tuning <ul><li>Different collector options, and choosing the right one. </li></ul><ul><ul><li>Give the JVM a chance first; adjust the heap size to improve. </li></ul></ul><ul><ul><ul><li>-XX:+UseSerialGC: single-threaded, relatively efficient, suited to small data sets. </li></ul></ul></ul><ul><ul><ul><li>-XX:+UseParallelGC (throughput collector): multithreaded, for medium to large data sets. </li></ul></ul></ul><ul><ul><ul><li>-XX:+UseParallelOldGC: parallel compaction in the old space. Scales better. </li></ul></ul></ul><ul><ul><ul><li>-XX:+UseConcMarkSweepGC (low pause): comparatively lower throughput, risk of fragmentation. On one or two cores, use incremental mode. </li></ul></ul></ul><ul><ul><ul><li>-XX:+UseTrainGC: train collector, low pause. No longer in development. </li></ul></ul></ul><ul><ul><ul><li>G1: introduced recently in 1.6. It uses page densities, picks sparse pages and collects them, and moves popular objects (those referenced by many other objects). The goal is to need 0 flags. </li></ul></ul></ul><ul><li>Parallel Collector. </li></ul><ul><ul><li>Characteristics: throughput-oriented, generational, multithreaded. Incurs synchronization overhead. </li></ul></ul><ul><ul><li>-XX:ParallelGCThreads=<N> . Too many threads may cause fragmentation. </li></ul></ul><ul><ul><li>Ergonomics: auto-tunes toward the following goals, in this order </li></ul></ul><ul><ul><ul><li>Max GC pause time: -XX:MaxGCPauseMillis=<N> </li></ul></ul></ul><ul><ul><ul><li>Throughput: -XX:GCTimeRatio=<N> sets the target fraction of time spent in GC to 1/(1+ <N>) </li></ul></ul></ul><ul><ul><ul><li>Heap size: -Xmx<N> </li></ul></ul></ul>
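The throughput goal's 1/(1+N) formula is easy to misread, so here is a small worked sketch. The class and method names are hypothetical; only the formula comes from the slide.

```java
// Worked arithmetic for the -XX:GCTimeRatio=<N> throughput goal.
class GcTimeRatioSketch {
    // Target upper bound on the fraction of total time spent in GC: 1 / (1 + N).
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    // Fraction of total time actually spent in GC, from measured times.
    static double measuredGcFraction(double gcSeconds, double appSeconds) {
        return gcSeconds / (gcSeconds + appSeconds);
    }

    // True when the measured GC fraction is within the goal implied by N.
    static boolean meetsGoal(double gcSeconds, double appSeconds, int gcTimeRatio) {
        return measuredGcFraction(gcSeconds, appSeconds) <= maxGcFraction(gcTimeRatio);
    }

    public static void main(String[] args) {
        // N=19 allows up to 1/20 (5%) of time in GC; N=99 allows up to 1/100 (1%).
        System.out.println("N=19 -> " + maxGcFraction(19));
        System.out.println("N=99 -> " + maxGcFraction(99));
        // 2 s of GC against 98 s of application time is 2% of total time.
        System.out.println("meets N=19: " + meetsGoal(2.0, 98.0, 19));
        System.out.println("meets N=99: " + meetsGoal(2.0, 98.0, 99));
    }
}
```

A larger N is therefore a stricter throughput goal, not a looser one.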
  11. 11. JVM GC Tuning <ul><ul><li>Young and old gen adjustments. </li></ul></ul><ul><ul><ul><li>-XX:YoungGenerationSizeIncrement=<Y> -XX:TenuredGenerationSizeIncrement=<T> -XX:AdaptiveSizeDecrementScaleFactor=<D> </li></ul></ul></ul><ul><ul><ul><li>If the growth increment is X%, the shrinking decrement is X/D%. </li></ul></ul></ul><ul><ul><ul><li>For the max pause time goal, only one generation is shrunk at a time. </li></ul></ul></ul><ul><ul><ul><li>For the throughput goal, both generations are increased in size. </li></ul></ul></ul><ul><li>Parallel Compaction. </li></ul><ul><ul><li>Done in three phases: marking, summary and compaction. </li></ul></ul><ul><ul><li>Objects in the dense prefix region are not moved. </li></ul></ul><ul><li>The Concurrent Collector. </li></ul><ul><ul><li>Characteristics: low pause, generational, multithreaded. </li></ul></ul><ul><ul><li>Uses separate GC threads to trace live objects concurrently with the application. </li></ul></ul><ul><ul><li>1st phase: initial mark. A single thread marks the first level of objects. STW. </li></ul></ul><ul><ul><li>2nd phase: concurrent mark. Drills deeper and takes longer; multithreaded (trace), single-threaded (retrace). No STW. </li></ul></ul><ul><ul><li>3rd phase: remark. Retraces the bulk of live objects that changed during the concurrent mark; multithreaded. STW. </li></ul></ul><ul><ul><li>4th phase: concurrent sweep. The application keeps running. Single-threaded. No compaction. No STW. </li></ul></ul><ul><ul><li>Maintains multiple pointers to free memory, categorized by size. Keeps count of the requested lengths to determine popular object sizes. </li></ul></ul><ul><ul><li>Minor collections can interleave with an ongoing major collection. STW. </li></ul></ul>
  12. 12. JVM GC Tuning <ul><li>Cont … </li></ul><ul><ul><li>The tradeoff is processing time that would otherwise be used by the application. </li></ul></ul><ul><ul><li>Concurrent mode failure: inability to complete a concurrent collection. </li></ul></ul><ul><ul><li>Floating garbage: new garbage created while the collector is running. </li></ul></ul><ul><ul><li>Tuning options for CMS: </li></ul></ul><ul><ul><ul><li>-XX:CMSInitiatingOccupancyFraction=<N> </li></ul></ul></ul><ul><ul><li>Scheduling pauses: the concurrent collector attempts to schedule the remark pause between the previous and next young gen pauses. </li></ul></ul><ul><ul><li>Incremental mode: divides the work done concurrently by the collector into small chunks of time, which are scheduled between young generation collections. </li></ul></ul><ul><ul><li>Duty cycle and auto pacing: the duty cycle controls the amount of work the collector is allowed to do. Auto pacing adjusts it based on collected stats. </li></ul></ul><ul><ul><ul><li>-XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=10 </li></ul></ul></ul>
  13. 13. JVM OS related tuning <ul><li>Other tuning areas. </li></ul><ul><ul><li>Network TCP tuning </li></ul></ul><ul><ul><ul><li>net.core.rmem_max = 33554432 net.core.wmem_max = 33554432 </li></ul></ul></ul><ul><ul><ul><li>ifconfig ethX mtu 9000 (test first!) </li></ul></ul></ul><ul><ul><li>OS memory tuning: -XX:+UseLargePages -XX:LargePageSizeInBytes=<x>m </li></ul></ul><ul><ul><li>Filesystem tuning: noatime, nodiratime </li></ul></ul><ul><ul><li>I/O scheduler: noop or deadline </li></ul></ul><ul><ul><li>CPU affinity and OS stack size. </li></ul></ul><ul><li>Reading a GC log. </li></ul><ul><ul><li>Turn on -verbose:gc -XX:+PrintGCDetails </li></ul></ul><ul><ul><li>Look at the live example. Next slide for explanation. </li></ul></ul>
  14. 14. GC Log <ul><li>GC log reading. </li></ul><ul><ul><li>Sample: 2011-08-15T14:03:59.324-0400: 13.572: [GC [1 CMS-initial-mark: 199434K(3481600K)] 203546K(3686336K), 0.0027280 secs] [Times: user=0.00 sys=0.00, real=0.01 secs] </li></ul></ul><ul><ul><li>2011-08-15T14:03:59.327-0400: 13.575: [CMS-concurrent-mark-start] </li></ul></ul><ul><ul><li>2011-08-15T14:03:59.508-0400: 13.757: [CMS-concurrent-mark: 0.112/0.181 secs] [Times: user=0.69 sys=0.22, real=0.18 secs] </li></ul></ul><ul><ul><li>2011-08-15T14:03:59.508-0400: 13.757: [CMS-concurrent-preclean-start] </li></ul></ul><ul><ul><li>2011-08-15T14:03:59.522-0400: 13.771: [CMS-concurrent-preclean: 0.014/0.014 secs] [Times: user=0.02 sys=0.00, real=0.01 secs] </li></ul></ul><ul><ul><li>2011-08-15T14:03:59.522-0400: 13.771: [CMS-concurrent-abortable-preclean-start] </li></ul></ul><ul><ul><li>Total time for which application threads were stopped: 0.0004990 seconds </li></ul></ul><ul><ul><li>2011-08-15T14:04:00.234-0400: 14.483: [CMS-concurrent-abortable-preclean: 0.113/0.712 secs] [Times: user=1.36 sys=0.20, real=0.71 secs] </li></ul></ul><ul><ul><li>2011-08-15T14:04:00.236-0400: 14.484: [GC[YG occupancy: 106943 K (204736 K)]14.484: [Rescan (parallel) , 0.0269510 secs]14.511: [weak refs processing, 0.0858930 secs]14.597: [class unloading, 0.0061610 secs]14.604: [scrub symbol & string tables, 0.0060430 secs] [1 CMS-remark: 199434K(3481600K)] 306377K(3686336K), 0.1282460 secs] [Times: user=0.29 sys=0.00, real=0.13 secs] </li></ul></ul><ul><ul><li>Total time for which application threads were stopped: 0.1296990 seconds </li></ul></ul><ul><ul><li>2011-08-15T14:04:00.365-0400: 14.613: [CMS-concurrent-sweep-start] </li></ul></ul><ul><ul><li>Total time for which application threads were stopped: 0.0013380 seconds </li></ul></ul><ul><ul><li>2011-08-15T14:04:00.610-0400: 14.858: [CMS-concurrent-sweep: 0.221/0.245 secs] [Times: user=0.63 sys=0.06, real=0.24 secs] </li></ul></ul><ul><ul><li>2011-08-15T14:04:00.610-0400: 14.858: [CMS-concurrent-reset-start] </li></ul></ul><ul><ul><li>2011-08-15T14:04:00.689-0400: 14.937: [CMS-concurrent-reset: 0.079/0.079 secs] [Times: user=0.09 sys=0.08, real=0.08 secs] </li></ul></ul><ul><ul><li>Total time for which application threads were stopped: 0.0008140 seconds </li></ul></ul><ul><ul><li>2011-08-15T14:04:00.872-0400: 15.120: [GC [1 CMS-initial-mark: 191277K(3481600K)] 388680K(3686336K), 0.2749030 secs] [Times: user=0.27 sys=0.00, real=0.28 secs] </li></ul></ul><ul><ul><li>Total time for which application threads were stopped: 0.2757330 seconds </li></ul></ul><ul><ul><li>2011-08-15T14:04:01.147-0400: 15.395: [CMS-concurrent-mark-start] </li></ul></ul><ul><li>Callouts on the slide: STW, STW, CMSScheduleRemarkEdenSizeThreshold, CMSScheduleRemarkEdenPenetration </li></ul>
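One way to get a quick number out of a log like this sample is to add up the "Total time for which application threads were stopped" entries, which HotSpot prints when `-XX:+PrintGCApplicationStoppedTime` is enabled. The sketch below is a minimal, assumption-laden example: the class and method names are hypothetical, and the regular expression is fitted only to the line format shown in the sample.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sums and ranks stop-the-world times found in GC log lines.
class StoppedTimeSketch {
    // Matches the "stopped" entries as they appear in the sample log.
    private static final Pattern STOPPED = Pattern.compile(
            "Total time for which application threads were stopped: ([0-9]+\\.[0-9]+) seconds");

    // Total seconds the application threads were stopped across all lines.
    static double totalStoppedSeconds(List<String> logLines) {
        double total = 0.0;
        for (String line : logLines) {
            Matcher m = STOPPED.matcher(line);
            while (m.find()) {                 // find() also copes with entries fused onto one line
                total += Double.parseDouble(m.group(1));
            }
        }
        return total;
    }

    // Longest single stop, or 0.0 when no entry is found.
    static double longestStopSeconds(List<String> logLines) {
        double longest = 0.0;
        for (String line : logLines) {
            Matcher m = STOPPED.matcher(line);
            while (m.find()) {
                longest = Math.max(longest, Double.parseDouble(m.group(1)));
            }
        }
        return longest;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "Total time for which application threads were stopped: 0.0004990 seconds",
                "Total time for which application threads were stopped: 0.1296990 seconds",
                "2011-08-15T14:04:00.365-0400: 14.613: [CMS-concurrent-sweep-start]",
                "Total time for which application threads were stopped: 0.2757330 seconds");
        System.out.println("total stopped (s): " + totalStoppedSeconds(lines));
        System.out.println("longest stop (s):  " + longestStopSeconds(lines));
    }
}
```

In the sample, the two longest stops line up with the CMS remark and the second initial mark, which are the STW phases of the concurrent collector.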
  15. 15. Diagnostic approach <ul><li>Tenure distribution </li></ul><ul><ul><li>-XX:+PrintTenuringDistribution -XX:TargetSurvivorRatio=<x> -XX:MaxTenuringThreshold=<x> </li></ul></ul><ul><ul><li>The threshold is the number of times an object is copied before it is tenured. </li></ul></ul><ul><ul><ul><li>Desired survivor size = (survivor_capacity * TargetSurvivorRatio) / 100 * sizeof(a pointer) </li></ul></ul></ul><ul><ul><li>Example: 1125.353: [GC 1125.353: [ParNew </li></ul></ul><ul><ul><li>Desired survivor size 86232268 bytes, new threshold 6 (max 15) </li></ul></ul><ul><ul><li>- age 1: 50754696 bytes, 50754696 total </li></ul></ul><ul><ul><li>- age 2: 12147696 bytes, 62902392 total </li></ul></ul><ul><ul><li>- age 3: 12295552 bytes, 75197944 total </li></ul></ul><ul><ul><li>- age 4: 6537136 bytes, 81735080 total </li></ul></ul><ul><ul><li>- age 5: 2435944 bytes, 84171024 total </li></ul></ul><ul><ul><li>- age 6: 3013488 bytes, 87184512 total </li></ul></ul><ul><ul><li>- age 7: 627368 bytes, 87811880 total </li></ul></ul><ul><ul><li>- age 8: 999536 bytes, 88811416 total </li></ul></ul><ul><ul><li>- age 9: 924656 bytes, 89736072 total </li></ul></ul><ul><ul><li>- age 10: 1811480 bytes, 91547552 total </li></ul></ul><ul><ul><li>: 554848K->89528K(561792K), 0.5317388 secs] 607743K->146164K(1217152K) icms_dc=18 , 0.5326526 secs] </li></ul></ul>
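The "new threshold 6" in the sample output can be reproduced by hand. The sketch below assumes the threshold is the first age at which the running total of surviving bytes exceeds the desired survivor size, capped at the maximum tenuring threshold; that reading matches the sample, but treat it as an approximation of the JVM's behavior rather than its definition. All names are hypothetical.

```java
// Recomputes the tenuring threshold from a PrintTenuringDistribution age table.
class TenuringThresholdSketch {
    // bytesByAge[0] holds the bytes at age 1, bytesByAge[1] at age 2, and so on.
    static int computeThreshold(long[] bytesByAge, long desiredSurvivorBytes, int maxTenuringThreshold) {
        long total = 0;
        int age = 1;
        for (long bytes : bytesByAge) {
            total += bytes;
            if (total > desiredSurvivorBytes) {
                break;                          // survivors up to this age no longer fit
            }
            age++;
        }
        return Math.min(age, maxTenuringThreshold);
    }

    public static void main(String[] args) {
        // Age table copied from the sample output on the slide.
        long[] bytesByAge = {
            50754696L, 12147696L, 12295552L, 6537136L, 2435944L,
            3013488L, 627368L, 999536L, 924656L, 1811480L
        };
        int threshold = computeThreshold(bytesByAge, 86232268L, 15);
        System.out.println("new threshold " + threshold + " (max 15)");
    }
}
```

The running total first exceeds 86232268 bytes at age 6 (87184512 total), which is why the log reports a new threshold of 6.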
  16. 16. Diagnostic approach (cont…) <ul><li>Sizing the young generation: increase or decrease -Xmn<x>m. Example (courtesy www.oracle.com). Before sizing the new gen: </li></ul><ul><li>[GC [DefNew: 4032K->64K(4032K), 0.0429742 secs] 9350K->7748K(32704K), 0.0431096 secs] </li></ul><ul><li>[GC [DefNew: 4032K->64K(4032K), 0.0403446 secs] 11716K->10121K(32704K), 0.0404867 secs] </li></ul><ul><li>[GC [DefNew: 4032K->64K(4032K), 0.0443969 secs] 14089K->12562K(32704K), 0.0445251 secs] </li></ul><ul><li>======================================================== </li></ul><ul><li>After sizing the new gen: </li></ul><ul><li>[GC [DefNew: 8128K->64K(8128K), 0.0453670 secs] 13000K->7427K(32704K), 0.0454906 secs] </li></ul><ul><li>[GC [DefNew: 8128K->64K(8128K), 0.0388632 secs] 15491K->9663K(32704K), 0.0390013 secs] </li></ul><ul><li>[GC [DefNew: 8128K->64K(8128K), 0.0388610 secs] 17727K->11829K(32704K), 0.0389919 secs] </li></ul><ul><li>======================================================== </li></ul><ul><li>Gone overboard: </li></ul><ul><li>[GC [DefNew: 16000K->16000K(16192K), 0.0000574 secs][Tenured: 2973K->2704K(16384K), 0.1012650 secs] 18973K->2704K(32576K), 0.1015066 secs] </li></ul><ul><li>[GC [DefNew: 16000K->16000K(16192K), 0.0000518 secs][Tenured: 2704K->2535K(16384K), 0.0931034 secs] 18704K->2535K(32576K), 0.0933519 secs] </li></ul><ul><li>[GC [DefNew: 16000K->16000K(16192K), 0.0000498 secs][Tenured: 2535K->2319K(16384K), 0.0860925 secs] 18535K->2319K(32576K), 0.0863350 secs] </li></ul><ul><li>In each entry, the DefNew figures describe the young generation and the figures after the bracketed sections describe the entire heap. </li></ul>
  17. 17. Diagnostic approach (cont…) <ul><li>How to determine whether the OLD gen is too big or too small? </li></ul><ul><li>Example (courtesy www.oracle.com ): </li></ul><ul><li>For the 32 MB heap, major collections happen 10 s to 11 s apart. </li></ul><ul><li>111.042: [GC 111.042: [DefNew: 8128K->8128K(8128K), 0.0000505 secs]111.042: [Tenured: 18154K->2311K(24576K), 0.1290354 secs] 26282K->2311K(32704K), 0.1293306 secs] </li></ul><ul><li>122.463: [GC 122.463: [DefNew: 8128K->8128K(8128K), 0.0000560 secs]122.463: [Tenured: 18630K->2366K(24576K), 0.1322560 secs] 26758K->2366K(32704K), 0.1325284 secs] </li></ul><ul><li>133.896: [GC 133.897: [DefNew: 8128K->8128K(8128K), 0.0000443 secs]133.897: [Tenured: 18240K->2573K(24576K), 0.1340199 secs] 26368K->2573K(32704K), 0.1343218 secs] </li></ul><ul><li>144.112: [GC 144.112: [DefNew: 8128K->8128K(8128K), 0.0000544 secs]144.112: [Tenured: 16564K->2304K(24576K), 0.1246831 secs] 24692K->2304K(32704K), 0.1249602 secs] </li></ul><ul><li>======================================================== </li></ul><ul><li>For the 64 MB heap, major collections occur about every 30 seconds. </li></ul><ul><li>90.597: [GC 90.597: [DefNew: 8128K->8128K(8128K), 0.0000542 secs]90.597: [Tenured: 49841K->5141K(57344K), 0.2129882 secs] 57969K->5141K(65472K), 0.2133274 secs] </li></ul><ul><li>120.899: [GC 120.899: [DefNew: 8128K->8128K(8128K), 0.0000550 secs]120.899: [Tenured: 50384K->2430K(57344K), 0.2216590 secs] 58512K->2430K(65472K), 0.2219384 secs] </li></ul><ul><li>153.968: [GC 153.968: [DefNew: 8128K->8128K(8128K), 0.0000511 secs]153.968: [Tenured: 51164K->2309K(57344K), 0.2193906 secs] 59292K->2309K(65472K), 0.2196372 secs] </li></ul><ul><li>Conclusion: a bigger heap gives better throughput; a smaller heap gives lower pause times. </li></ul>
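The conclusion of this slide can be checked with a little arithmetic on the two log excerpts. The sketch below estimates major-GC overhead as mean pause divided by mean interval between collections. This is a simplification chosen for illustration (it ignores minor collections entirely), and the class and method names are hypothetical.

```java
// Estimates major-GC overhead from collection start times and pause durations.
class GcOverheadSketch {
    // Mean pause divided by mean spacing between collection starts.
    static double overhead(double[] startSeconds, double[] pauseSeconds) {
        double pauseSum = 0.0;
        for (double p : pauseSeconds) {
            pauseSum += p;
        }
        double meanPause = pauseSum / pauseSeconds.length;
        double meanInterval = (startSeconds[startSeconds.length - 1] - startSeconds[0])
                / (startSeconds.length - 1);
        return meanPause / meanInterval;
    }

    public static void main(String[] args) {
        // Timestamps and total pause times copied from the 32 MB excerpt.
        double[] start32 = {111.042, 122.463, 133.896, 144.112};
        double[] pause32 = {0.1293306, 0.1325284, 0.1343218, 0.1249602};
        // Timestamps and total pause times copied from the 64 MB excerpt.
        double[] start64 = {90.597, 120.899, 153.968};
        double[] pause64 = {0.2133274, 0.2219384, 0.2196372};

        System.out.printf("32 MB heap: overhead %.2f%%%n", 100 * overhead(start32, pause32));
        System.out.printf("64 MB heap: overhead %.2f%%%n", 100 * overhead(start64, pause64));
    }
}
```

On these figures the 64 MB heap spends a smaller share of time in major collections (roughly 0.7% against roughly 1.2%), while each individual pause is longer (about 0.22 s against about 0.13 s).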
  18. 18. Diagnostic approach (cont…) <ul><li>Now make the YOUNG gen bigger, by increasing the heap to 256 MB with a 64 MB young gen. </li></ul><ul><li>[GC [DefNew: 64575K->959K(64576K), 0.0457646 secs] 196016K->133633K(261184K), 0.0459067 secs] </li></ul><ul><li>[GC [DefNew: 64575K->64575K(64576K), 0.0000573 secs][Tenured: 132673K->5437K(196608K), 0.4959855 secs] 197249K->5437K(261184K), 0.4962533 secs] </li></ul><ul><li>[GC [DefNew: 63616K->959K(64576K), 0.0360258 secs] 69053K->7600K(261184K), 0.0361663 secs] </li></ul><ul><li>If minor GC pauses are still high after tuning, try -XX:+UseParallelGC followed by -XX:+UseAdaptiveSizePolicy. Alternatively, try -XX:+UseParNewGC. </li></ul><ul><li>To address scalability, use -XX:+UseParallelOldGC. </li></ul><ul><li>If major GC pauses are still high after tuning, try -XX:+UseConcMarkSweepGC with and without -XX:+UseParNewGC. </li></ul><ul><li>To reduce pause times further (especially on 1 or 2 core boxes), try adding i-cms: -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+CMSIncrementalPacing -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=10 </li></ul>
  19. 19. Tuning guidelines <ul><li>Dos: </li></ul><ul><ul><li>Start with a benchmark or baseline. </li></ul></ul><ul><ul><li>Find and tune only the bottlenecks. </li></ul></ul><ul><ul><li>Change one variable at a time, then run a test and record the result. </li></ul></ul><ul><li>Don’ts: </li></ul><ul><ul><li>Don’t tune without baselining or benchmarking. </li></ul></ul><ul><li>Performance trade-off </li></ul><ul><ul><li>Tuning one parameter may cause another bottleneck </li></ul></ul><ul><li>Performance metrics </li></ul><ul><ul><li>Throughput: % of total time not spent in garbage collection </li></ul></ul><ul><ul><li>Overhead: % of time spent in GC. </li></ul></ul><ul><ul><li>Pause time: how long the app is unresponsive during GC </li></ul></ul><ul><ul><li>GC frequency: how often GC is initiated. </li></ul></ul><ul><ul><li>Footprint: heap size </li></ul></ul>
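For a baseline of the throughput and overhead metrics listed on this slide, the standard `java.lang.management` API can be queried from inside a running JVM. The sketch below is an approximation: the collection time reported by the beans is accumulated elapsed time as the JVM defines it, which for concurrent collectors is not the same as pause time. The class and method names are hypothetical; the `ManagementFactory` calls are the standard ones.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Reads coarse GC metrics from the running JVM for baselining.
class RuntimeGcMetricsSketch {
    // Approximate accumulated collection time in milliseconds across all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();   // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // GC frequency input: total number of collections so far.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long c = gc.getCollectionCount();  // -1 when undefined
            if (c > 0) {
                total += c;
            }
        }
        return total;
    }

    // Overhead: fraction of JVM uptime attributed to GC. Throughput is 1 - overhead.
    static double overhead() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        double overhead = overhead();
        System.out.println("collections: " + totalGcCount());
        System.out.println("overhead:    " + overhead);
        System.out.println("throughput:  " + (1.0 - overhead));
    }
}
```

Recording these values before and after each single-variable change gives the benchmark-first workflow from the Dos list something concrete to compare.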
  20. 20. Coarse tuning shortcuts and tips <ul><li>Serial GC is suitable for small data sets. </li></ul><ul><li>The throughput and low-pause collectors are meant for medium to large data sets. </li></ul><ul><li>General rule for sizing. </li></ul><ul><ul><li>Allocate 20% to 35% for young space. </li></ul></ul><ul><ul><li>Stateless applications need more new gen space. </li></ul></ul><ul><ul><li>Stateful applications need more tenured space. </li></ul></ul><ul><li>If you see Full GC (tenured space) happening too frequently, adjust -Xmn to a smaller value. </li></ul><ul><li>Adjust the heap space: bigger for throughput, smaller for low pause. </li></ul>
  21. 21. Brain dump Tuning Tips <ul><li>Sizing </li></ul><ul><li>-Xmx == -Xms or not? </li></ul><ul><li>Young gen: use -Xmn for more controlled and predictable performance </li></ul><ul><li>Choose a GC </li></ul><ul><li>Serial: new gen and old gen both use the serial algorithm. </li></ul><ul><li>Parallel GC (default): parallel scavenge + serial old gen algorithm. </li></ul><ul><li>UseParallelOldGC: parallel scavenge + parallel old </li></ul><ul><li>UseConcMarkSweepGC: parallel new gen, CMS old gen, serial old </li></ul><ul><li>G1: introduced recently in 1.6. It uses page densities, picks sparse pages and collects them, and moves popular objects (those referenced by many other objects). The goal is to need 0 flags. </li></ul><ul><li>How to read GC logs. </li></ul><ul><li>When you see &quot;Full GC&quot;, it is STW. </li></ul><ul><li>Initial mark and Rescan/WeakRef/Remark trigger STW </li></ul><ul><li>Promotion failures and concurrent mode failures (CMF) </li></ul><ul><li>Tuning CMS </li></ul><ul><li>Avoid promoting too frequently, to avoid fragmentation. </li></ul><ul><li>Use TenuringThreshold, but avoid a setting where objects never tenure. </li></ul><ul><li>Size the generations </li></ul><ul><li>Minor GC times are a function of the live set </li></ul><ul><li>The old gen should host long-lived state comfortably. </li></ul><ul><li>Avoid the CMS initiating heuristic: -XX:+UseCMSInitiatingOccupancyOnly </li></ul><ul><li>Use Concurrent </li></ul><ul><li>GC Threads </li></ul><ul><li>Parallelize on multicore processors. </li></ul><ul><li>-XX:ParallelGCThreads=6 </li></ul><ul><li>Strategy A: tune minor GCs and let application data die in eden </li></ul><ul><li>Fragmentation </li></ul><ul><li>Performance degrades over time </li></ul>
  22. 22. Summary & Refs & Resources <ul><li>Remember that any option we introduce in JVM tuning is only a suggestion; the JVM is not guaranteed to follow it. </li></ul><ul><li>Some tools for evaluation </li></ul><ul><ul><li>jmap (Solaris and Linux only): prints memory-related stats for a running JVM or a core file. </li></ul></ul><ul><ul><li>jstat: reports performance and resource consumption of a running application, particularly heap sizing and garbage collection. </li></ul></ul><ul><ul><li>HPROF: Heap Profiler. Presents CPU usage, heap stats, and dumps the states of monitors and threads. Useful for analyzing performance, lock contention and memory leaks. </li></ul></ul><ul><ul><li>HAT: Heap Analysis Tool, for debugging unintentional object retention. </li></ul></ul><ul><li>This presentation explains GC the way I understood it. If anything needs correcting, please send email to [email_address] </li></ul><ul><li>Refs and Resources. </li></ul><ul><ul><li>https://java.sun.com/j2se/reference/whitepapers </li></ul></ul><ul><ul><li>http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html </li></ul></ul>
