SlideShare uma empresa Scribd logo
1 de 27
Baixar para ler offline
@aragozin#Devoxx #WhyJavaSlow
I Know Why Your Java is Slow
Alexey Ragozin
@aragozin#Devoxx #WhyJavaSlow
After months of hard work
with tons of new features
well tested and polished
your application is live!
Everything runs smoothly until …
@aragozin#Devoxx #WhyJavaSlow
Users start complaining about performance
application has become unusable
stakeholders are getting nervous
You have to fix it!
ASAP
@aragozin#Devoxx #WhyJavaSlow
Server
Browser
Application
Database
@aragozin#Devoxx #WhyJavaSlow
Server
Browser
Application
Database
Caching
JavaScript
HTTP
Network Network
SQL performance
CPU Memory / Swapping
Disk IO Virtualization
@aragozin#Devoxx #WhyJavaSlow
What exactly slow means – clarify your KPIs
 Business transaction ≠ Page
 Business transaction ≠ SQL transaction
 Page ≠ HTTP request
 HTTP request ≠ SQL transaction
Your system is a black box
 Do you think you know how it works?
 You do not!
Profiling produces three types of data
 Lies – incorrectly interpreted data
 Darn lies – incorrectly measured data
 Statistics – data which will help to fix your system
@aragozin#Devoxx #WhyJavaSlow
What types of bottlenecks with can find in Java?
CPU bound
 Single core bottleneck
 CPU starvation
Thread contention
Memory and GC related
 Frequent young GC
 Abnormally long GC pauses
 Frequent full GC
@aragozin#Devoxx #WhyJavaSlow
Single core bottleneck
 Certain singleton thread consumes 100% CPU
CPU starvation
 Number of threads compete for physical cores
 Thread nor sleeps neither waits but CPU usage is far below 100%
Thread contention
 Thread spends considerable time in synchronization (accruing, waiting)
Frequent young GC
 Frequent young GC – consume handful of CPU budget
 Caused by intensive memory allocation in application code
@aragozin#Devoxx #WhyJavaSlow
Thread CPU usage
 Number CPU cycles spend on this thread (translated into time or percentage)
 Calculated by OS (User + Kernel)
 Single thread can consume 100% at max
Java thread in RUNNABLE state
 Not BLOCKED, WAITING, SLEEPING or PARKED
 Thread in blocking socket read is RUNNABLE
 Java RUNNABLE
 may not be runnable from OS prospective
 may be runnable but not on CPU
 may actually be running on CPU (counted to thread CPU usage)
@aragozin#Devoxx #WhyJavaSlow
Bird eye view on Java process
using JVisualVM
@aragozin#Devoxx #WhyJavaSlow
Bird eye view on Java process
using JProfiler
@aragozin#Devoxx #WhyJavaSlow
Bird eye view on Java process
using SJK ttop
> java –jar sjk.jar ttop -p 12345 -n 20 -o CPU
2016-11-07T00:09:50.091+0400 Process summary
process cpu=268.35%
application cpu=244.42% (user=222.50% sys=21.93%)
other: cpu=23.93%
GC cpu=6.28% (young=1.62%, old=4.66%)
heap allocation rate 1218mb/s
safe point rate: 1.5 (events/s) avg. safe point pause: 43.24ms
safe point sync time: 0.03% processing time: 6.39% (wallclock time)
[000056] user=17.45% sys= 0.62% alloc= 106mb/s - hz._hzInstance_2_dev.generic-operation.thread-0
[000094] user=17.76% sys= 0.31% alloc= 113mb/s - hz._hzInstance_3_dev.generic-operation.thread-1
[000093] user=16.83% sys= 0.00% alloc= 111mb/s - hz._hzInstance_3_dev.generic-operation.thread-0
[000020] user=16.06% sys= 0.15% alloc= 108mb/s - hz._hzInstance_1_dev.generic-operation.thread-0
[000021] user=15.44% sys= 0.15% alloc= 110mb/s - hz._hzInstance_1_dev.generic-operation.thread-1
[000057] user=14.36% sys= 0.00% alloc= 110mb/s - hz._hzInstance_2_dev.generic-operation.thread-1
[000105] user=13.59% sys= 0.00% alloc= 72mb/s - hz._hzInstance_3_dev.cached.thread-1
[000079] user=13.43% sys= 0.15% alloc= 67mb/s - hz._hzInstance_2_dev.cached.thread-3
[000042] user=10.96% sys= 0.62% alloc= 65mb/s - hz._hzInstance_1_dev.cached.thread-2
[000174] user=10.65% sys= 0.31% alloc= 66mb/s - hz._hzInstance_3_dev.cached.thread-7
[000123] user= 8.96% sys= 0.00% alloc= 55mb/s - hz._hzInstance_4_dev.response
[000129] user= 7.72% sys= 0.31% alloc= 21mb/s - hz._hzInstance_4_dev.generic-operation.thread-0
[000168] user= 6.95% sys= 0.62% alloc= 33mb/s - hz._hzInstance_1_dev.cached.thread-6
[000178] user= 7.57% sys= 0.00% alloc= 32mb/s - hz._hzInstance_2_dev.cached.thread-9
[000166] user= 6.48% sys= 0.46% alloc= 33mb/s - hz._hzInstance_1_dev.cached.thread-5
[000130] user= 6.02% sys= 0.62% alloc= 20mb/s - hz._hzInstance_4_dev.generic-operation.thread-1
[000181] user= 5.56% sys= 0.00% alloc= 34mb/s - hz._hzInstance_2_dev.cached.thread-12
[000014] user= 2.32% sys= 0.15% alloc= 7345kb/s - hz._hzInstance_1_dev.response
https://github.com/aragozin/jvm-tools
@aragozin#Devoxx #WhyJavaSlow
Tracking CPU hogs using sampling
using JProfiler
@aragozin#Devoxx #WhyJavaSlow
Tracking CPU hogs using sampling
flame graph with SJK
@aragozin#Devoxx #WhyJavaSlow
JEE world example – JBoss + Seam + Hibernate
@aragozin#Devoxx #WhyJavaSlow
JEE world example – JBoss + Seam + Hibernate
Command
sjk ssa -f tracedump.std --categorize -tf **.CoyoteAdapter.service -nc
JDBC=**.jdbc
Hibernate=org.hibernate
"Facelets compile=com.sun.faces.facelets.compiler.Compiler.compile"
"Seam bijection=org.jboss.seam.**.aroundInvoke/!**.proceed"
JSF.execute=com.sun.faces.lifecycle.LifecycleImpl.execute
JSF.render=com.sun.faces.lifecycle.LifecycleImpl.render
Other=**
Report
Total samples 2732050 100.00%
JDBC 405439 14.84%
Hibernate 802932 29.39%
Facelets compile 395784 14.49%
Seam bijection 385491 14.11%
JSF.execute 290355 10.63%
JSF.render 297868 10.90%
Other 154181 5.64%
0.00%
10.00%
20.00%
30.00%
40.00%
50.00%
60.00%
70.00%
80.00%
90.00%
100.00%
Time
Other
JSF.render
JSF.execute
Seam bijection
Facelets compile
Hibernate
JDBC
Excel
@aragozin#Devoxx #WhyJavaSlow
Tracking CPU hot spots using thread sampling
PRO
 Low overhead
 No upfront configuration
 Can identify, CPU hot spots, IO hot spots, contention
 Data are comparable across runs
CON
 Limited precision
 No real method execution time measured
 Safe point bias
 Destabilization is limited to static program structure – methods
@aragozin#Devoxx #WhyJavaSlow
Instrumentation profiling
 Modifies byte code at runtime
 Modifies behavior of program
 Affects JIT compilation
 Probe can access method arguments
When instrumentation should/can be used?
 You narrowed down problem area
 You need more context to find root cause
@aragozin#Devoxx #WhyJavaSlow
BTrace
 Open Source
 Byte code instrumentation
 Probes are coded Java
 CLI and API
BTrace script
@Property
Profiler prof = Profiling.newProfiler();
@OnMethod(clazz = "org.jboss.seam.Component",
method = "/(inject|disinject|outject)/")
void entryByMethod2(@ProbeClassName String className,
@ProbeMethodName String methodName, @Self Object component) {
if (component != null) {
Field nameField = field(classOf(component), "name", true);
if (nameField != null) {
String name = (String)get(nameField, component);
Profiling.recordEntry(prof, concat("org.jboss.seam.Component.",
concat(methodName, concat(":", name))));
}
}
}
@OnMethod(clazz = "org.jboss.seam.Component",
method = "/(inject|disinject|outject)/",
location = @Location(value = Kind.RETURN))
void exitByMthd2(@ProbeClassName String className,
@ProbeMethodName String methodName, @Self Object component,
@Duration long duration) {
if (component != null) {
Field nameField = field(classOf(component), "name", true);
if (nameField != null) {
String name = (String)get(nameField, component);
Profiling.recordExit(prof, concat("org.jboss.seam.Component.",
concat(methodName, concat(":", name))), duration);
}
}
}
https://github.com/jbachorik/btrace2
@aragozin#Devoxx #WhyJavaSlow
Garbage Collection is one to blame?
 Enable GC logs to see whole picture
 -XX:+PrintGCDetails
 -XX:+PrintReferenceGC
Common problems
 JVM has not enough memory
 Intensive allocation of short live object by application
 Reference problems
 Large object resurrection by finalizes
 Too many references
@aragozin#Devoxx #WhyJavaSlow
-XX:InitialTenuringThreshold=8
Initial value for tenuring threshold (number of collections
before object will be promoted to old space)
-XX:+UseTLAB Use thread local allocation blocks in eden
-XX:MaxTenuringThreshold=15
Max value for tenuring threshold
-XX:PretenureSizeThreshold=2m Max object size
allowed to be allocated in young space (large objects will be
allocated directly in old space). Thread local allocation
bypasses this check, so if TLAB is large enough object
exciding size threshold still may be allocated in young space.
-XX:+AlwaysTenure Promote all objects surviving
young collection immediately to tenured space
(equivalent of -XX:MaxTenuringThreshold=0)
-XX:+NeverTenure Objects from young space
will never get promoted to tenured space unless survivor
space is not enough to keep them
-XX:+ResizeTLAB Let JVM resize TLABs per thread
-XX:TLABSize=1m Initial size of thread’s TLAB
-XX:MinTLABSize=64k Min size of TLAB
-XX:+UseCMSInitiatingOccupancyOnly
Only use predefined occupancy as only criterion for starting
a CMS collection (disable adaptive behaviour)
-XX:CMSInitiatingOccupancyFraction=70
Percentage CMS generation occupancy to start a CMS cycle.
A negative value means that CMSTriggerRatio is used.
-XX:CMSBootstrapOccupancy=50
Percentage CMS generation occupancy at which to initiate
CMS collection for bootstrapping collection stats.
-XX:CMSTriggerRatio=70
Percentage of MinHeapFreeRatio in CMS generation that is
allocated before a CMS collection cycle commences.
-XX:CMSWaitDuration=30000
Once CMS collection is triggered, it will wait for next young
collection to perform initial mark right after. This parameter
specifies how long CMS can wait for young collection
-XX:+CMSScavengeBeforeRemark
Force young collection before remark phase
-XX:+CMSScheduleRemarkEdenSizeThreshold
If Eden used is below this value, don't try to schedule remark
-XX:CMSScheduleRemarkEdenPenetration=20
Eden occupancy % at which to try and schedule remark pause
-XX:CMSScheduleRemarkSamplingRatio=4
StartsamplingEdentopatleastbeforeyounggenerationoccupancy
reaches1/ofthesizeatwhichweplantoscheduleremark
-XX:+CMSIncrementalMode
Enable incremental CMS mode. Incremental mode was meant for
severs with small number of CPU, but may be used on multicore
servers to benefit from more conservative initiation strategy.
-XX:+CMSClassUnloadingEnabled
If not enabled, CMS will not clean permanent space. You
may need to enable it for containers such as JEE or OSGi.
-XX:ConcGCThreads=2
Number of parallel threads used for concurrent phase.
-XX:ParallelGCThreads=16
Number of parallel threads used for stop-the-world phases.
-XX:+DisableExplicitGC
JVM will ignore application calls to System.gc()
-XX:+ExplicitGCInvokesConcurrent
Let System.gc() trigger concurrent collection instead of full GC
-XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses
Same as above but also triggers permanent space collection.
-XX:PrintCMSStatistics=1
Print additional CMS statistics. Very verbose if n=2.
-XX:+PrintCMSInitiationStatistics
Print CMS initiation details
-XX:+CMSDumpAtPromotionFailure
Dump useful information about the state of the CMS old
generation upon a promotion failure
-XX:+CMSPrintChunksInDump (with optin above)
Add more detailed information about the free chunks
-XX:+CMSPrintObjectsInDump (with optin above)
Add more detailed information about the allocated objects
by Alexey Ragozin – http://blog.ragozin.info
HotSpot JVM options cheatsheet
Young space tenuring
Thread local allocation
Parallel processing
-XX:+CMSOldPLABMin=16 -XX:+CMSOldPLABMax=1024
Min and max size of CMS gen PLAB caches per worker per block size
CMS initiating options
CMS Stop-the-World pauses tuning
Misc CMS options
CMS Diagnostic options
- Options for “deterministic” CMS, they disable some heuristics and
require careful validation
Concurrent Mark Sweep (CMS)
All concrete numbers in JVM options in this card are for illustrational purposes only!
-XX:+ParallelRefProcEnabled Enable parallel
processing of references during GC pause
-XX:SoftRefLRUPolicyMSPerMB=1000 Factor
for calculating soft reference TTL based on free heap size
-XX:OnOutOfMemoryError=… Command to be executed
in case of out of memory.
E.g. “kill -9 %p” on Unix or “taskkill /F /PID %p” on Windows.
-XX:G1HeapRegionSize=32m Size of heap region
-XX:MaxGCPauseMillis=500 Target GC pause duration.
G1 is not deterministic, so no guaranties for GC pause to satisfy this limit.
-XX:G1ReservePercent=10 Percentage of heap to keep free.
Reserved memory is used as last resort to avoid promotion failure.
-XX:G1ConfidencePercent=50 Confidence level
for MMU/pause prediction
-XX:G1HeapWastePercent=10 If garbage level is below
threshold, G1 will not attempt to reclaim memory further
-XX:G1MixedGCCountTarget=8 Target number of mixed
collections after a marking cycle
-XX:InitiatingHeapOccupancyPercent=45
Percentage of (entire) heap occupancy to trigger concurrent GC
Garbage First (G1)
-XX:+CMSParallelRemarkEnabled
Whether parallel remark is enabled (enabled by default)
-XX:+CMSParallelSurvivorRemarkEnabled
Whether parallel remark of survivor space enabled,
effective only with option above (enabled by default)
-XX:+CMSConcurrentMTEnabled
Use multiple threads for concurrent phases.
CMS Concurrency options
-XX:+CMSParallelInitialMarkEnabled
Whether parallel initial mark is enabled (enabled by default)
-XX:CMSTriggerInterval=60000 Periodically triggers_
CMS collection. Useful for deterministic object finalization.
GC options cheat sheet download at
http://blog.ragozin.info/2016/10/hotspot-jvm-garbage-collection-options.html
Young collector Old collectior JVM Flags
Serial (DefNew)
Parallel scavenge (PSYoungGen)
Parallel scavenge (PSYoungGen)
Parallel (ParNew)
Serial (DefNew)
Parallel (ParNew)
Serial Mark Sweep Compact
Serial Mark Sweep Compact (PSOldGen)
Parallel Mark Sweep Compact (ParOldGen)
Concurrent Mark Sweep
Concurrent Mark Sweep
Serial Mark Sweep Compact
-XX:+UseSerialGC
-XX:+UseParallelGC
-XX:+UseParallelOldGC
-XX:+UseParNewGC
-XX:-UseParNewGC1
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC
-XX:+UseG1GCGarbage First (G1)
1 - Notice minus before UseParNewGC, which is explicitly disables parallel mode
-verbose:gc or -XX:+PrintGC Print basic GC info
-XX:+PrintGCDetails Print more details GC info
-XX:+PrintGCTimeStamps Print timestamps for each GC
event (seconds count from start of JVM)
-XX:+PrintGCDateStamps Print date stamps at garbage
collection events: 2011-09-08T14:20:29.557+0400: [GC...
-Xloggc:<file> RedirectsGCoutputtoafileinsteadofconsole
-XX:+PrintTLAB Print TLAB allocation statistics
-XX:+PrintReferenceGC Print times for special
(weak, JNI, etc) reference processing during STW pause
-XX:+PrintJNIGCStalls Reports if GC is waiting for
native code to unpin object in memory
-XX:+PrintClassHistogramAfterFullGC
Prints class histogram after full GC
-XX:+PrintClassHistogramBeforeFullGC
Prints class histogram before full GC
-XX:+UseGCLogFileRotation Enable GC log rotation
-XX:GCLogFileSize=512m Size threshold for GC log file
-XX:NumberOfGCLogFiles=5 Number GC log files
-XX:+PrintGCCause Add cause of GC in log
-XX:+PrintHeapAtGC Print heap details on GC
-XX:+PrintAdaptiveSizePolicy
Print young space sizing decisions
-XX:+PrintHeapAtSIGBREAK Print heap details on signal
-XX:+PrintPromotionFailure
Print additional information for promotion failure
-XX:+PrintPLAB Print survivor PLAB details
-XX:+PrintOldPLAB Print old space PLAB details
-XX:+PrintGCTaskTimeStamps Print timestamps for
individual GC worker thread tasks (very verbose)
-XX:+PrintGCApplicationStoppedTime
Print summary after each JVM safepoint (including non-GC)
-XX:+PrintGCApplicationConcurrentTime
Print time for each concurrent phase of GC
GC Log rotation
More logging options
-XX:+PrintTenuringDistribution Print detailed
demography of young space after each collection
GC log detail options
-Xms256m or -XX:InitialHeapSize=256m
Initial size of JVM heap (young + old)
-Xmx2g or -XX:MaxHeapSize=2g
Max size of JVM heap (young + old)
-XX:NewSize=64m
-XX:MaxNewSize=64m
Absolute (initial and max) size of
young space (Eden + 2 Survivours)
-XX:NewRatio=3 Alternative way to specify size
of young space. Sets ratio of young vs old space
(e.g. -XX:NewRatio=2 means that young space will be 2 time
smaller than old space, i.e. 1/3 of heap size).
-XX:SurvivorRatio=15 Sets size of single survivor space
relative to Eden space size
(e.g. -XX:NewSize=64m -XX:SurvivorRatio=6 means that each
Survivor space will be 8m and Eden will be 48m).
-XX:MetaspaceSize=512m
-XX:MaxMetaspaceSize=1g
Initial and max size of
JVM’s metaspace space
-Xss256k (size in bytes) or
-XX:ThreadStackSize=256 (size in Kbytes)
Thread stack size
-XX:MaxDirectMemorySize=2g Maximum amount
of memory available for NIO off-heap byte buffers
- Highly recommended option
Memory sizing options
by Alexey Ragozin – http://blog.ragozin.info
Available combinations of garbage collection algorithms in HotSpot JVM
- Highly recommended option
HotSpot JVM options cheatsheetAll concrete numbers in JVM options in this card are for illustrational purposes only!
Java Process Memory
JVM Memory
Java Heap
Non-JVMMemory
(nativelibraries)
Non-Heap
Young Gen
OldGen
Eden
Survivor0
Survivor1
ThreadStacks
Metaspace
CompressedClassSpace
CodeCache
NIODirectBuffers
OtherJVMMemmory
-Xms/-Xmx
-XX:CompressedClassSpaceSize=1g Memory reserved
for compressed class space (64bit only)
-XX:InitialCodeCacheSize=256m
-XX:ReservedCodeCacheSize=512m
Initial size and max
size of code cache area
@aragozin#Devoxx #WhyJavaSlow
Visualize GC Dynamics
https://github.com/chewiebug/GCViewer
with GC Viewer
@aragozin#Devoxx #WhyJavaSlow
Visualize GC Dynamics
with Mission Control
@aragozin#Devoxx #WhyJavaSlow
Tracing object allocation hot spots
Traditional approach – instrumentation
Instrumenting each allocation site
Intrusive and expensive
Flight recorder approach – out of TLAB allocation tracing
Low overhead
No byte code transformation
Biased sampling
@aragozin#Devoxx #WhyJavaSlow
Tracking allocation hotspots
with Mission Control
@aragozin#Devoxx #WhyJavaSlow
Conclusion
 Bottleneck could be everywhere in system
 Move from top to bottom
 Use right tools on each level
 Focus on application KPI
 There is no silver bullet tool
@aragozin#Devoxx #WhyJavaSlow
KEEP CALM
AND MAKE YOUR JAVA FAST
Alexey Ragozin
alexey.ragozin@gmai.com
http://blog.ragozin.info

Mais conteúdo relacionado

Mais procurados

77739818 troubleshooting-web-logic-103
77739818 troubleshooting-web-logic-10377739818 troubleshooting-web-logic-103
77739818 troubleshooting-web-logic-103
shashank_ibm
 
Nanocloud cloud scale jvm
Nanocloud   cloud scale jvmNanocloud   cloud scale jvm
Nanocloud cloud scale jvm
aragozin
 
Ruby/rails performance and profiling
Ruby/rails performance and profilingRuby/rails performance and profiling
Ruby/rails performance and profiling
Danny Guinther
 
Web Sphere Problem Determination Ext
Web Sphere Problem Determination ExtWeb Sphere Problem Determination Ext
Web Sphere Problem Determination Ext
Rohit Kelapure
 

Mais procurados (20)

자바 성능 강의
자바 성능 강의자바 성능 강의
자바 성능 강의
 
Diagnosing Your Application on the JVM
Diagnosing Your Application on the JVMDiagnosing Your Application on the JVM
Diagnosing Your Application on the JVM
 
Find bottleneck and tuning in Java Application
Find bottleneck and tuning in Java ApplicationFind bottleneck and tuning in Java Application
Find bottleneck and tuning in Java Application
 
Jdk Tools For Performance Diagnostics
Jdk Tools For Performance DiagnosticsJdk Tools For Performance Diagnostics
Jdk Tools For Performance Diagnostics
 
Ch10.애플리케이션 서버의 병목_발견_방법
Ch10.애플리케이션 서버의 병목_발견_방법Ch10.애플리케이션 서버의 병목_발견_방법
Ch10.애플리케이션 서버의 병목_발견_방법
 
77739818 troubleshooting-web-logic-103
77739818 troubleshooting-web-logic-10377739818 troubleshooting-web-logic-103
77739818 troubleshooting-web-logic-103
 
Performance Test Driven Development with Oracle Coherence
Performance Test Driven Development with Oracle CoherencePerformance Test Driven Development with Oracle Coherence
Performance Test Driven Development with Oracle Coherence
 
Java Tools and Techniques for Solving Tricky Problem
Java Tools and Techniques for Solving Tricky ProblemJava Tools and Techniques for Solving Tricky Problem
Java Tools and Techniques for Solving Tricky Problem
 
Integrated Cache on Netscaler
Integrated Cache on NetscalerIntegrated Cache on Netscaler
Integrated Cache on Netscaler
 
Shootout at the AWS Corral
Shootout at the AWS CorralShootout at the AWS Corral
Shootout at the AWS Corral
 
Oracle Latch and Mutex Contention Troubleshooting
Oracle Latch and Mutex Contention TroubleshootingOracle Latch and Mutex Contention Troubleshooting
Oracle Latch and Mutex Contention Troubleshooting
 
Nanocloud cloud scale jvm
Nanocloud   cloud scale jvmNanocloud   cloud scale jvm
Nanocloud cloud scale jvm
 
Ruby/rails performance and profiling
Ruby/rails performance and profilingRuby/rails performance and profiling
Ruby/rails performance and profiling
 
Web Sphere Problem Determination Ext
Web Sphere Problem Determination ExtWeb Sphere Problem Determination Ext
Web Sphere Problem Determination Ext
 
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
Tanel Poder - Troubleshooting Complex Oracle Performance Issues - Part 2
 
Advanced Oracle Troubleshooting
Advanced Oracle TroubleshootingAdvanced Oracle Troubleshooting
Advanced Oracle Troubleshooting
 
Performance tests with Gatling
Performance tests with GatlingPerformance tests with Gatling
Performance tests with Gatling
 
Fail over fail_back
Fail over fail_backFail over fail_back
Fail over fail_back
 
Performance tests with Gatling (extended)
Performance tests with Gatling (extended)Performance tests with Gatling (extended)
Performance tests with Gatling (extended)
 
Cassandra - lesson learned
Cassandra  - lesson learnedCassandra  - lesson learned
Cassandra - lesson learned
 

Semelhante a I know why your Java is slow

SXSW 2012 JavaScript MythBusters
SXSW 2012 JavaScript MythBustersSXSW 2012 JavaScript MythBusters
SXSW 2012 JavaScript MythBusters
Elena-Oana Tabaranu
 
Slide 1
Slide 1Slide 1
Slide 1
butest
 
Capacity Management for Web Operations
Capacity Management for Web OperationsCapacity Management for Web Operations
Capacity Management for Web Operations
John Allspaw
 
Optimizing your java applications for multi core hardware
Optimizing your java applications for multi core hardwareOptimizing your java applications for multi core hardware
Optimizing your java applications for multi core hardware
IndicThreads
 
Best Practices for performance evaluation and diagnosis of Java Applications ...
Best Practices for performance evaluation and diagnosis of Java Applications ...Best Practices for performance evaluation and diagnosis of Java Applications ...
Best Practices for performance evaluation and diagnosis of Java Applications ...
IndicThreads
 

Semelhante a I know why your Java is slow (20)

The Cloud-natives are RESTless @ JavaOne
The Cloud-natives are RESTless @ JavaOneThe Cloud-natives are RESTless @ JavaOne
The Cloud-natives are RESTless @ JavaOne
 
Analysis bottleneck in J2EE application
Analysis bottleneck in J2EE applicationAnalysis bottleneck in J2EE application
Analysis bottleneck in J2EE application
 
Java/Spring과 Node.js의 공존 시즌2
Java/Spring과 Node.js의 공존 시즌2Java/Spring과 Node.js의 공존 시즌2
Java/Spring과 Node.js의 공존 시즌2
 
SXSW 2012 JavaScript MythBusters
SXSW 2012 JavaScript MythBustersSXSW 2012 JavaScript MythBusters
SXSW 2012 JavaScript MythBusters
 
Introduction to Node.js
Introduction to Node.jsIntroduction to Node.js
Introduction to Node.js
 
Slide 1
Slide 1Slide 1
Slide 1
 
Complex Made Simple: Sleep Better with TorqueBox
Complex Made Simple: Sleep Better with TorqueBoxComplex Made Simple: Sleep Better with TorqueBox
Complex Made Simple: Sleep Better with TorqueBox
 
Capacity Management for Web Operations
Capacity Management for Web OperationsCapacity Management for Web Operations
Capacity Management for Web Operations
 
Looming Marvelous - Virtual Threads in Java Javaland.pdf
Looming Marvelous - Virtual Threads in Java Javaland.pdfLooming Marvelous - Virtual Threads in Java Javaland.pdf
Looming Marvelous - Virtual Threads in Java Javaland.pdf
 
Java On Speed
Java On SpeedJava On Speed
Java On Speed
 
Devignition 2011
Devignition 2011Devignition 2011
Devignition 2011
 
Lightweight Grids With Terracotta
Lightweight Grids With TerracottaLightweight Grids With Terracotta
Lightweight Grids With Terracotta
 
Understanding DSE Search by Matt Stump
Understanding DSE Search by Matt StumpUnderstanding DSE Search by Matt Stump
Understanding DSE Search by Matt Stump
 
Optimizing your java applications for multi core hardware
Optimizing your java applications for multi core hardwareOptimizing your java applications for multi core hardware
Optimizing your java applications for multi core hardware
 
Zepto and the rise of the JavaScript Micro-Frameworks
Zepto and the rise of the JavaScript Micro-FrameworksZepto and the rise of the JavaScript Micro-Frameworks
Zepto and the rise of the JavaScript Micro-Frameworks
 
Jsp and jstl
Jsp and jstlJsp and jstl
Jsp and jstl
 
Apache Wicket: 10 jaar en verder - Martijn Dashorst
Apache Wicket: 10 jaar en verder - Martijn DashorstApache Wicket: 10 jaar en verder - Martijn Dashorst
Apache Wicket: 10 jaar en verder - Martijn Dashorst
 
Best Practices for performance evaluation and diagnosis of Java Applications ...
Best Practices for performance evaluation and diagnosis of Java Applications ...Best Practices for performance evaluation and diagnosis of Java Applications ...
Best Practices for performance evaluation and diagnosis of Java Applications ...
 
Ruby on Rails survival guide of an aged Java developer
Ruby on Rails survival guide of an aged Java developerRuby on Rails survival guide of an aged Java developer
Ruby on Rails survival guide of an aged Java developer
 
Jvm performance tuning
Jvm performance tuningJvm performance tuning
Jvm performance tuning
 

Mais de aragozin

Блеск и нищета распределённых кэшей
Блеск и нищета распределённых кэшейБлеск и нищета распределённых кэшей
Блеск и нищета распределённых кэшей
aragozin
 
Распределённый кэш или хранилище данных. Что выбрать?
Распределённый кэш или хранилище данных. Что выбрать?Распределённый кэш или хранилище данных. Что выбрать?
Распределённый кэш или хранилище данных. Что выбрать?
aragozin
 
Tech talk network - friend or foe
Tech talk   network - friend or foeTech talk   network - friend or foe
Tech talk network - friend or foe
aragozin
 
Поиск на своем сайте, обзор open source решений
Поиск на своем сайте, обзор open source решенийПоиск на своем сайте, обзор open source решений
Поиск на своем сайте, обзор open source решений
aragozin
 

Mais de aragozin (20)

Распределённое нагрузочное тестирование на Java
Распределённое нагрузочное тестирование на JavaРаспределённое нагрузочное тестирование на Java
Распределённое нагрузочное тестирование на Java
 
Java black box profiling
Java black box profilingJava black box profiling
Java black box profiling
 
Блеск и нищета распределённых кэшей
Блеск и нищета распределённых кэшейБлеск и нищета распределённых кэшей
Блеск и нищета распределённых кэшей
 
JIT compilation in modern platforms – challenges and solutions
JIT compilation in modern platforms – challenges and solutionsJIT compilation in modern platforms – challenges and solutions
JIT compilation in modern platforms – challenges and solutions
 
Casual mass parallel computing
Casual mass parallel computingCasual mass parallel computing
Casual mass parallel computing
 
Java GC tuning and monitoring (by Alexander Ashitkin)
Java GC tuning and monitoring (by Alexander Ashitkin)Java GC tuning and monitoring (by Alexander Ashitkin)
Java GC tuning and monitoring (by Alexander Ashitkin)
 
Garbage collection in JVM
Garbage collection in JVMGarbage collection in JVM
Garbage collection in JVM
 
Filtering 100M objects in Coherence cache. What can go wrong?
Filtering 100M objects in Coherence cache. What can go wrong?Filtering 100M objects in Coherence cache. What can go wrong?
Filtering 100M objects in Coherence cache. What can go wrong?
 
Cборка мусора в Java без пауз (HighLoad++ 2013)
Cборка мусора в Java без пауз  (HighLoad++ 2013)Cборка мусора в Java без пауз  (HighLoad++ 2013)
Cборка мусора в Java без пауз (HighLoad++ 2013)
 
JIT-компиляция в виртуальной машине Java (HighLoad++ 2013)
JIT-компиляция в виртуальной машине Java (HighLoad++ 2013)JIT-компиляция в виртуальной машине Java (HighLoad++ 2013)
JIT-компиляция в виртуальной машине Java (HighLoad++ 2013)
 
Performance Test Driven Development (CEE SERC 2013 Moscow)
Performance Test Driven Development (CEE SERC 2013 Moscow)Performance Test Driven Development (CEE SERC 2013 Moscow)
Performance Test Driven Development (CEE SERC 2013 Moscow)
 
Борьба с GС паузами в JVM
Борьба с GС паузами в JVMБорьба с GС паузами в JVM
Борьба с GС паузами в JVM
 
Распределённый кэш или хранилище данных. Что выбрать?
Распределённый кэш или хранилище данных. Что выбрать?Распределённый кэш или хранилище данных. Что выбрать?
Распределённый кэш или хранилище данных. Что выбрать?
 
Devirtualization of method calls
Devirtualization of method callsDevirtualization of method calls
Devirtualization of method calls
 
Tech talk network - friend or foe
Tech talk   network - friend or foeTech talk   network - friend or foe
Tech talk network - friend or foe
 
Database backed coherence cache
Database backed coherence cacheDatabase backed coherence cache
Database backed coherence cache
 
ORM and distributed caching
ORM and distributed cachingORM and distributed caching
ORM and distributed caching
 
Секреты сборки мусора в Java [DUMP-IT 2012]
Секреты сборки мусора в Java [DUMP-IT 2012]Секреты сборки мусора в Java [DUMP-IT 2012]
Секреты сборки мусора в Java [DUMP-IT 2012]
 
Поиск на своем сайте, обзор open source решений
Поиск на своем сайте, обзор open source решенийПоиск на своем сайте, обзор open source решений
Поиск на своем сайте, обзор open source решений
 
High Performance Computing - Cloud Point of View
High Performance Computing - Cloud Point of ViewHigh Performance Computing - Cloud Point of View
High Performance Computing - Cloud Point of View
 

Último

%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
masabamasaba
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
masabamasaba
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
VictoriaMetrics
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
masabamasaba
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 

Último (20)

Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
tonesoftg
tonesoftgtonesoftg
tonesoftg
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
 
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
Large-scale Logging Made Easy: Meetup at Deutsche Bank 2024
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With SimplicityWSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
WSO2Con2024 - Enabling Transactional System's Exponential Growth With Simplicity
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
Architecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the pastArchitecture decision records - How not to get lost in the past
Architecture decision records - How not to get lost in the past
 
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
%in Stilfontein+277-882-255-28 abortion pills for sale in Stilfontein
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 

I know why your Java is slow

  • 1. @aragozin#Devoxx #WhyJavaSlow I Know Why Your Java is Slow Alexey Ragozin
  • 2. @aragozin#Devoxx #WhyJavaSlow After months of hard work with tons of new features well tested and polished your application is live! Everything runs smoothly until …
  • 3. @aragozin#Devoxx #WhyJavaSlow Users start complaining about performance application has become unusable stakeholders are getting nervous You have to fix it! ASAP
  • 6. @aragozin#Devoxx #WhyJavaSlow What exactly slow means – clarify your KPIs  Business transaction ≠ Page  Business transaction ≠ SQL transaction  Page ≠ HTTP request  HTTP request ≠ SQL transaction Your system is a black box  Do you think you know how it works?  You do not! Profiling produces three types of data  Lies – incorrectly interpreted data  Darn lies – incorrectly measured data  Statistics – data which will help to fix your system
  • 7. @aragozin#Devoxx #WhyJavaSlow What types of bottlenecks with can find in Java? CPU bound  Single core bottleneck  CPU starvation Thread contention Memory and GC related  Frequent young GC  Abnormally long GC pauses  Frequent full GC
  • 8. @aragozin#Devoxx #WhyJavaSlow Single core bottleneck  Certain singleton thread consumes 100% CPU CPU starvation  Number of threads compete for physical cores  Thread nor sleeps neither waits but CPU usage is far below 100% Thread contention  Thread spends considerable time in synchronization (accruing, waiting) Frequent young GC  Frequent young GC – consume handful of CPU budget  Caused by intensive memory allocation in application code
  • 9. @aragozin#Devoxx #WhyJavaSlow Thread CPU usage  Number CPU cycles spend on this thread (translated into time or percentage)  Calculated by OS (User + Kernel)  Single thread can consume 100% at max Java thread in RUNNABLE state  Not BLOCKED, WAITING, SLEEPING or PARKED  Thread in blocking socket read is RUNNABLE  Java RUNNABLE  may not be runnable from OS prospective  may be runnable but not on CPU  may actually be running on CPU (counted to thread CPU usage)
  • 10. @aragozin#Devoxx #WhyJavaSlow Bird eye view on Java process using JVisualVM
  • 11. @aragozin#Devoxx #WhyJavaSlow Bird eye view on Java process using JProfiler
  • 12. @aragozin#Devoxx #WhyJavaSlow Bird eye view on Java process using SJK ttop > java –jar sjk.jar ttop -p 12345 -n 20 -o CPU 2016-11-07T00:09:50.091+0400 Process summary process cpu=268.35% application cpu=244.42% (user=222.50% sys=21.93%) other: cpu=23.93% GC cpu=6.28% (young=1.62%, old=4.66%) heap allocation rate 1218mb/s safe point rate: 1.5 (events/s) avg. safe point pause: 43.24ms safe point sync time: 0.03% processing time: 6.39% (wallclock time) [000056] user=17.45% sys= 0.62% alloc= 106mb/s - hz._hzInstance_2_dev.generic-operation.thread-0 [000094] user=17.76% sys= 0.31% alloc= 113mb/s - hz._hzInstance_3_dev.generic-operation.thread-1 [000093] user=16.83% sys= 0.00% alloc= 111mb/s - hz._hzInstance_3_dev.generic-operation.thread-0 [000020] user=16.06% sys= 0.15% alloc= 108mb/s - hz._hzInstance_1_dev.generic-operation.thread-0 [000021] user=15.44% sys= 0.15% alloc= 110mb/s - hz._hzInstance_1_dev.generic-operation.thread-1 [000057] user=14.36% sys= 0.00% alloc= 110mb/s - hz._hzInstance_2_dev.generic-operation.thread-1 [000105] user=13.59% sys= 0.00% alloc= 72mb/s - hz._hzInstance_3_dev.cached.thread-1 [000079] user=13.43% sys= 0.15% alloc= 67mb/s - hz._hzInstance_2_dev.cached.thread-3 [000042] user=10.96% sys= 0.62% alloc= 65mb/s - hz._hzInstance_1_dev.cached.thread-2 [000174] user=10.65% sys= 0.31% alloc= 66mb/s - hz._hzInstance_3_dev.cached.thread-7 [000123] user= 8.96% sys= 0.00% alloc= 55mb/s - hz._hzInstance_4_dev.response [000129] user= 7.72% sys= 0.31% alloc= 21mb/s - hz._hzInstance_4_dev.generic-operation.thread-0 [000168] user= 6.95% sys= 0.62% alloc= 33mb/s - hz._hzInstance_1_dev.cached.thread-6 [000178] user= 7.57% sys= 0.00% alloc= 32mb/s - hz._hzInstance_2_dev.cached.thread-9 [000166] user= 6.48% sys= 0.46% alloc= 33mb/s - hz._hzInstance_1_dev.cached.thread-5 [000130] user= 6.02% sys= 0.62% alloc= 20mb/s - hz._hzInstance_4_dev.generic-operation.thread-1 [000181] user= 5.56% sys= 0.00% alloc= 34mb/s - hz._hzInstance_2_dev.cached.thread-12 [000014] user= 2.32% sys= 0.15% alloc= 7345kb/s - hz._hzInstance_1_dev.response https://github.com/aragozin/jvm-tools
  • 13. @aragozin#Devoxx #WhyJavaSlow Tracking CPU hogs using sampling using JProfiler
  • 14. @aragozin#Devoxx #WhyJavaSlow Tracking CPU hogs using sampling flame graph with SJK
  • 15. @aragozin#Devoxx #WhyJavaSlow JEE world example – JBoss + Seam + Hibernate
  • 16. @aragozin#Devoxx #WhyJavaSlow JEE world example – JBoss + Seam + Hibernate Command sjk ssa -f tracedump.std --categorize -tf **.CoyoteAdapter.service -nc JDBC=**.jdbc Hibernate=org.hibernate "Facelets compile=com.sun.faces.facelets.compiler.Compiler.compile" "Seam bijection=org.jboss.seam.**.aroundInvoke/!**.proceed" JSF.execute=com.sun.faces.lifecycle.LifecycleImpl.execute JSF.render=com.sun.faces.lifecycle.LifecycleImpl.render Other=** Report Total samples 2732050 100.00% JDBC 405439 14.84% Hibernate 802932 29.39% Facelets compile 395784 14.49% Seam bijection 385491 14.11% JSF.execute 290355 10.63% JSF.render 297868 10.90% Other 154181 5.64% 0.00% 10.00% 20.00% 30.00% 40.00% 50.00% 60.00% 70.00% 80.00% 90.00% 100.00% Time Other JSF.render JSF.execute Seam bijection Facelets compile Hibernate JDBC Excel
  • 17. @aragozin#Devoxx #WhyJavaSlow Tracking CPU hot spots using thread sampling PRO  Low overhead  No upfront configuration  Can identify, CPU hot spots, IO hot spots, contention  Data are comparable across runs CON  Limited precision  No real method execution time measured  Safe point bias  Destabilization is limited to static program structure – methods
  • 18. @aragozin#Devoxx #WhyJavaSlow Instrumentation profiling  Modifies byte code at runtime  Modifies behavior of program  Affects JIT compilation  Probe can access method arguments When instrumentation should/can be used?  You narrowed down problem area  You need more context to find root cause
  • 19. @aragozin#Devoxx #WhyJavaSlow BTrace  Open Source  Byte code instrumentation  Probes are coded Java  CLI and API BTrace script @Property Profiler prof = Profiling.newProfiler(); @OnMethod(clazz = "org.jboss.seam.Component", method = "/(inject|disinject|outject)/") void entryByMethod2(@ProbeClassName String className, @ProbeMethodName String methodName, @Self Object component) { if (component != null) { Field nameField = field(classOf(component), "name", true); if (nameField != null) { String name = (String)get(nameField, component); Profiling.recordEntry(prof, concat("org.jboss.seam.Component.", concat(methodName, concat(":", name)))); } } } @OnMethod(clazz = "org.jboss.seam.Component", method = "/(inject|disinject|outject)/", location = @Location(value = Kind.RETURN)) void exitByMthd2(@ProbeClassName String className, @ProbeMethodName String methodName, @Self Object component, @Duration long duration) { if (component != null) { Field nameField = field(classOf(component), "name", true); if (nameField != null) { String name = (String)get(nameField, component); Profiling.recordExit(prof, concat("org.jboss.seam.Component.", concat(methodName, concat(":", name))), duration); } } } https://github.com/jbachorik/btrace2
  • 20. @aragozin#Devoxx #WhyJavaSlow Garbage Collection is one to blame?  Enable GC logs to see whole picture  -XX:+PrintGCDetails  -XX:+PrintReferenceGC Common problems  JVM has not enough memory  Intensive allocation of short live object by application  Reference problems  Large object resurrection by finalizes  Too many references
  • 21. @aragozin#Devoxx #WhyJavaSlow -XX:InitialTenuringThreshold=8 Initial value for tenuring threshold (number of collections before object will be promoted to old space) -XX:+UseTLAB Use thread local allocation blocks in eden -XX:MaxTenuringThreshold=15 Max value for tenuring threshold -XX:PretenureSizeThreshold=2m Max object size allowed to be allocated in young space (large objects will be allocated directly in old space). Thread local allocation bypasses this check, so if TLAB is large enough object exciding size threshold still may be allocated in young space. -XX:+AlwaysTenure Promote all objects surviving young collection immediately to tenured space (equivalent of -XX:MaxTenuringThreshold=0) -XX:+NeverTenure Objects from young space will never get promoted to tenured space unless survivor space is not enough to keep them -XX:+ResizeTLAB Let JVM resize TLABs per thread -XX:TLABSize=1m Initial size of thread’s TLAB -XX:MinTLABSize=64k Min size of TLAB -XX:+UseCMSInitiatingOccupancyOnly Only use predefined occupancy as only criterion for starting a CMS collection (disable adaptive behaviour) -XX:CMSInitiatingOccupancyFraction=70 Percentage CMS generation occupancy to start a CMS cycle. A negative value means that CMSTriggerRatio is used. -XX:CMSBootstrapOccupancy=50 Percentage CMS generation occupancy at which to initiate CMS collection for bootstrapping collection stats. -XX:CMSTriggerRatio=70 Percentage of MinHeapFreeRatio in CMS generation that is allocated before a CMS collection cycle commences. -XX:CMSWaitDuration=30000 Once CMS collection is triggered, it will wait for next young collection to perform initial mark right after. This parameter specifies how long CMS can wait for young collection -XX:+CMSScavengeBeforeRemark Force young collection before remark phase -XX:+CMSScheduleRemarkEdenSizeThreshold If Eden used is below this value, don't try to schedule remark -XX:CMSScheduleRemarkEdenPenetration=20 Eden occupancy % at which to try and schedule remark pause -XX:CMSScheduleRemarkSamplingRatio=4 StartsamplingEdentopatleastbeforeyounggenerationoccupancy reaches1/ofthesizeatwhichweplantoscheduleremark -XX:+CMSIncrementalMode Enable incremental CMS mode. Incremental mode was meant for severs with small number of CPU, but may be used on multicore servers to benefit from more conservative initiation strategy. -XX:+CMSClassUnloadingEnabled If not enabled, CMS will not clean permanent space. You may need to enable it for containers such as JEE or OSGi. -XX:ConcGCThreads=2 Number of parallel threads used for concurrent phase. -XX:ParallelGCThreads=16 Number of parallel threads used for stop-the-world phases. -XX:+DisableExplicitGC JVM will ignore application calls to System.gc() -XX:+ExplicitGCInvokesConcurrent Let System.gc() trigger concurrent collection instead of full GC -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses Same as above but also triggers permanent space collection. -XX:PrintCMSStatistics=1 Print additional CMS statistics. Very verbose if n=2. -XX:+PrintCMSInitiationStatistics Print CMS initiation details -XX:+CMSDumpAtPromotionFailure Dump useful information about the state of the CMS old generation upon a promotion failure -XX:+CMSPrintChunksInDump (with optin above) Add more detailed information about the free chunks -XX:+CMSPrintObjectsInDump (with optin above) Add more detailed information about the allocated objects by Alexey Ragozin – http://blog.ragozin.info HotSpot JVM options cheatsheet Young space tenuring Thread local allocation Parallel processing -XX:+CMSOldPLABMin=16 -XX:+CMSOldPLABMax=1024 Min and max size of CMS gen PLAB caches per worker per block size CMS initiating options CMS Stop-the-World pauses tuning Misc CMS options CMS Diagnostic options - Options for “deterministic” CMS, they disable some heuristics and require careful validation Concurrent Mark Sweep (CMS) All concrete numbers in JVM options in this card are for illustrational purposes only! -XX:+ParallelRefProcEnabled Enable parallel processing of references during GC pause -XX:SoftRefLRUPolicyMSPerMB=1000 Factor for calculating soft reference TTL based on free heap size -XX:OnOutOfMemoryError=… Command to be executed in case of out of memory. E.g. “kill -9 %p” on Unix or “taskkill /F /PID %p” on Windows. -XX:G1HeapRegionSize=32m Size of heap region -XX:MaxGCPauseMillis=500 Target GC pause duration. G1 is not deterministic, so no guaranties for GC pause to satisfy this limit. -XX:G1ReservePercent=10 Percentage of heap to keep free. Reserved memory is used as last resort to avoid promotion failure. -XX:G1ConfidencePercent=50 Confidence level for MMU/pause prediction -XX:G1HeapWastePercent=10 If garbage level is below threshold, G1 will not attempt to reclaim memory further -XX:G1MixedGCCountTarget=8 Target number of mixed collections after a marking cycle -XX:InitiatingHeapOccupancyPercent=45 Percentage of (entire) heap occupancy to trigger concurrent GC Garbage First (G1) -XX:+CMSParallelRemarkEnabled Whether parallel remark is enabled (enabled by default) -XX:+CMSParallelSurvivorRemarkEnabled Whether parallel remark of survivor space enabled, effective only with option above (enabled by default) -XX:+CMSConcurrentMTEnabled Use multiple threads for concurrent phases. CMS Concurrency options -XX:+CMSParallelInitialMarkEnabled Whether parallel initial mark is enabled (enabled by default) -XX:CMSTriggerInterval=60000 Periodically triggers_ CMS collection. Useful for deterministic object finalization. GC options cheat sheet download at http://blog.ragozin.info/2016/10/hotspot-jvm-garbage-collection-options.html Young collector Old collectior JVM Flags Serial (DefNew) Parallel scavenge (PSYoungGen) Parallel scavenge (PSYoungGen) Parallel (ParNew) Serial (DefNew) Parallel (ParNew) Serial Mark Sweep Compact Serial Mark Sweep Compact (PSOldGen) Parallel Mark Sweep Compact (ParOldGen) Concurrent Mark Sweep Concurrent Mark Sweep Serial Mark Sweep Compact -XX:+UseSerialGC -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:+UseParNewGC -XX:-UseParNewGC1 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+UseG1GCGarbage First (G1) 1 - Notice minus before UseParNewGC, which is explicitly disables parallel mode -verbose:gc or -XX:+PrintGC Print basic GC info -XX:+PrintGCDetails Print more details GC info -XX:+PrintGCTimeStamps Print timestamps for each GC event (seconds count from start of JVM) -XX:+PrintGCDateStamps Print date stamps at garbage collection events: 2011-09-08T14:20:29.557+0400: [GC... -Xloggc:<file> RedirectsGCoutputtoafileinsteadofconsole -XX:+PrintTLAB Print TLAB allocation statistics -XX:+PrintReferenceGC Print times for special (weak, JNI, etc) reference processing during STW pause -XX:+PrintJNIGCStalls Reports if GC is waiting for native code to unpin object in memory -XX:+PrintClassHistogramAfterFullGC Prints class histogram after full GC -XX:+PrintClassHistogramBeforeFullGC Prints class histogram before full GC -XX:+UseGCLogFileRotation Enable GC log rotation -XX:GCLogFileSize=512m Size threshold for GC log file -XX:NumberOfGCLogFiles=5 Number GC log files -XX:+PrintGCCause Add cause of GC in log -XX:+PrintHeapAtGC Print heap details on GC -XX:+PrintAdaptiveSizePolicy Print young space sizing decisions -XX:+PrintHeapAtSIGBREAK Print heap details on signal -XX:+PrintPromotionFailure Print additional information for promotion failure -XX:+PrintPLAB Print survivor PLAB details -XX:+PrintOldPLAB Print old space PLAB details -XX:+PrintGCTaskTimeStamps Print timestamps for individual GC worker thread tasks (very verbose) -XX:+PrintGCApplicationStoppedTime Print summary after each JVM safepoint (including non-GC) -XX:+PrintGCApplicationConcurrentTime Print time for each concurrent phase of GC GC Log rotation More logging options -XX:+PrintTenuringDistribution Print detailed demography of young space after each collection GC log detail options -Xms256m or -XX:InitialHeapSize=256m Initial size of JVM heap (young + old) -Xmx2g or -XX:MaxHeapSize=2g Max size of JVM heap (young + old) -XX:NewSize=64m -XX:MaxNewSize=64m Absolute (initial and max) size of young space (Eden + 2 Survivours) -XX:NewRatio=3 Alternative way to specify size of young space. Sets ratio of young vs old space (e.g. -XX:NewRatio=2 means that young space will be 2 time smaller than old space, i.e. 1/3 of heap size). -XX:SurvivorRatio=15 Sets size of single survivor space relative to Eden space size (e.g. -XX:NewSize=64m -XX:SurvivorRatio=6 means that each Survivor space will be 8m and Eden will be 48m). -XX:MetaspaceSize=512m -XX:MaxMetaspaceSize=1g Initial and max size of JVM’s metaspace space -Xss256k (size in bytes) or -XX:ThreadStackSize=256 (size in Kbytes) Thread stack size -XX:MaxDirectMemorySize=2g Maximum amount of memory available for NIO off-heap byte buffers - Highly recommended option Memory sizing options by Alexey Ragozin – http://blog.ragozin.info Available combinations of garbage collection algorithms in HotSpot JVM - Highly recommended option HotSpot JVM options cheatsheetAll concrete numbers in JVM options in this card are for illustrational purposes only! Java Process Memory JVM Memory Java Heap Non-JVMMemory (nativelibraries) Non-Heap Young Gen OldGen Eden Survivor0 Survivor1 ThreadStacks Metaspace CompressedClassSpace CodeCache NIODirectBuffers OtherJVMMemmory -Xms/-Xmx -XX:CompressedClassSpaceSize=1g Memory reserved for compressed class space (64bit only) -XX:InitialCodeCacheSize=256m -XX:ReservedCodeCacheSize=512m Initial size and max size of code cache area
  • 22. @aragozin#Devoxx #WhyJavaSlow Visualize GC Dynamics https://github.com/chewiebug/GCViewer with GC Viewer
  • 23. @aragozin#Devoxx #WhyJavaSlow Visualize GC Dynamics with Mission Control
  • 24. @aragozin#Devoxx #WhyJavaSlow Tracing object allocation hot spots Traditional approach – instrumentation Instrumenting each allocation site Intrusive and expensive Flight recorder approach – out of TLAB allocation tracing Low overhead No byte code transformation Biased sampling
  • 25. @aragozin#Devoxx #WhyJavaSlow Tracking allocation hotspots with Mission Control
  • 26. @aragozin#Devoxx #WhyJavaSlow Conclusion  Bottleneck could be everywhere in system  Move from top to bottom  Use right tools on each level  Focus on application KPI  There is no silver bullet tool
  • 27. @aragozin#Devoxx #WhyJavaSlow KEEP CALM AND MAKE YOUR JAVA FAST Alexey Ragozin alexey.ragozin@gmai.com http://blog.ragozin.info