SlideShare uma empresa Scribd logo
Finding Java's
Hidden Performance Traps
@victorrentea ♦ VictorRentea.ro
Git: h2ps://github.com/victorrentea/performance-profiling.git
Branch: devoxx-uk-24
Slides: TODO
👋 I'm Victor Rentea 🇷🇴 Java Champion, PhD(CS)
18 years of coding
10 years of training & consul2ng at 130+ companies on:
❤ Refactoring, Architecture, Unit Tes4ng
🛠 Java: Spring, Hibernate, Performance, Reac7ve
Lead of European So8ware Cra8ers (7K developers)
Join for free monthly online mee7ngs from 1700 CET
Channel: YouTube.com/vrentea
Life += 👰 + 👧 + 🙇 + 🐈 @victorrentea
hBps://victorrentea.ro
🇷🇴
past events
3 VictorRentea.ro
a training by
The response *me of your endpoint is too long 😱
🎉 Congratula*ons! 🎉
You made it to produc.on, and
You're taking serious load!
How to start?
4 VictorRentea.ro
a training by
Agenda
2. Metrics
1. Distributed Time Sampling
3. Execu*on Profiling
5 VictorRentea.ro
a training by
1st challenge: trace requests
A B
D E F
C
Performance in a Distributed System
6 VictorRentea.ro
a training by
order-service.log:
2024-03-20T20:59:24.473 INFO [order,65fb320c74e6dd352b07535cf26506a8,776ae9fa7074b2c5]
60179 --- [nio-8088-exec-8] ...PlaceOrderRest : Placing order for products: [1]
Distributed Request Tracing
catalog-service.log:
2024-03-20T20:59:24.482 INFO [catalog,65fb320c74e6dd352b07535cf26506a8,9d3b54c924ab886d]
60762 --- [nio-8081-exec-7] ...GetPriceRest : Returning prices for products: [1]
TraceId
7 VictorRentea.ro
a training by
= All systems involved in a flow share the same TraceId
§TraceId is generated by the first system in a chain (eg API Gateway) (1)
§TraceId is received as a header in the incoming HTTP Request or Message(2)
- Standardized by OTEL (h3ps://opentelemetry.io)
§TraceId is prefixed to every log line (prev. slide)
§TraceId propagates magically with method calls (3)
- via ThreadLocal => Submit all async work to Spring-injected executors ⚠
- via ReactorContext in WebFlux => Spring must subscribe to your reacLve chains ⚠
§TraceId is sent as a header in any outgoing Request or Message (4)
- ⚠ By injected beans: RestTemplate, WebClient, KaPaSender...
Distributed Request Tracing
h"ps://spring.io/blog/2022/10/12/observability-with-spring-boot-3/
A B C D
no TraceId
1: TraceId = 🆕
3 4 4
4
<<HTTP Header>>
TraceId
<<Message Header>>
TraceId
<<HTTP Header>>
TraceId
2 2 2
8 VictorRentea.ro
a training by
§Every system reports the .me spent (span) working on a TraceId
§...to a central system that tracks where .me was spent
§Only a sample of traces are reported (eg 1%), for performance
Distributed Time Budge4ng
9 VictorRentea.ro
a training by
span
trace
Distributed Time Budge4ng
10 VictorRentea.ro
a training by
-details/456
/456
/456
independent calls
executed sequen0aly
Distributed Time Budge4ng
11 VictorRentea.ro
a training by
§Iden%cal repeated API calls by accident 😱
§Calls in a loop: GET /items/1, /2, /3... è batch fetch
§System-in-the-middle: A à B(just a proxy) à C è AàC
§Long chains of REST calls: A à B ... à E, paying network latency
§Independent API calls è call in parallel
§Long spans not wai.ng for other systems? 🤔 => Zoom in 🔎
Performance An*-PaIerns in a Distributed System
12 VictorRentea.ro
a training by
Main Suspect:
One System
13 VictorRentea.ro
a training by
Log timestamps
log.info("Before call"); !// 2023-05-23 00:01:50.000 !!...
!// suspect code
log.info("After call"); !// 2023-05-23 00:01:53.100 !!...
Measuring Method Run*me – High-School Style
t1-t0
long t0 = currentTimeMillis();
List<CommentDto> comments = client.fetchComments(loanId); !// suspect code
long t1 = currentTimeMillis();
System.out.println("Took millis: " + (t1 - t0));
=> Log Clu.er
😵💫
14 VictorRentea.ro
a training by
Micrometer
💖 (OpenTelemetry.io) 💖
@Timed !// an AOP proxy intercepts the call and monitors its execution time
List<CommentDto> fetchComments(Long id) {
!// suspect: API call, heavy DB query, CPU-intensive section, !!...
}
GET /actuator/prometheus (in a Spring Boot app) :
method_timed_seconds_sum{method="fetchComments"} = 100 seconds
method_timed_seconds_count{method="fetchComments"} = 400 calls
5 seconds later:
method_timed_seconds_sum{...} = 150 – 100 = 50 seconds
method_timed_seconds_count{...} = 500 – 400 = 100 calls
4me
average[ms]
250
500
≃ 250 millis/call
≃ 500 millis/call
(⚠double over the last 5 seconds)
15 VictorRentea.ro
a training by
Micrometer
h&ps://www.baeldung.com/micrometer
!// eg. "95% requests took ≤ 100 ms"
@Timed(percentiles = {0.95}, histogram = true)
⚠Usual AOP Pi:alls: @Timed works on methods of beans injected by Spring,
and method is NOT final & NOT sta7c; can add CPU overhead to simple methods.
!// programatic alternative to @Timed:
meterRegistry.timer("name").record(() -> { !/*suspect code!*/ });
meterRegistry.timer("name").record(t1 - t0, MILLISECONDS);
!// accumulator
meterRegistry.counter("earnings").increment(purchase);
!// current value
meterRegistry.gauge("player-count", currentPlayers.size());
16 VictorRentea.ro
a training by
§Micrometer exposes metrics
- your own (prev. slide)
- criLcal framework metrics to track common issues (next slide)
- opLonal: Tomcat, executors, ...
§...that are pulled and stored by Prometheus every 5 seconds
§Grafana can query them later to:
- Display impressive charts
- Aggregate metrics across Lme/node/app
- Raise alerts
§Other (paid) soluPons:
- NewRelic, DynaTrace, DataDog, Splunk, AppDynamics
Metrics
17 VictorRentea.ro
a training by
§ h3p_server_requests_seconds_sum{method="GET",uri="/profile/showcase/{id}",} 165.04
h3p_server_requests_seconds_count{method="GET",uri="/profile/showcase/{id}",} 542.0
§ tomcat_connecLons_current_connecLons{...} 298 ⚠ 98 requests waiLng in queue
tomcat_threads_busy_threads{...} 200 ⚠ = using 100% of the threads => thread pool starvaLon?
§ jvm_gc_memory_allocated_bytes_total 1.8874368E9
jvm_memory_usage_aher_gc_percent{...} 45.15 ⚠ used>30% => increase max heap?
§ hikaricp_connecLons_acquire_seconds_sum{pool="HikariPool-1",} 80.37 ⚠ conn pool starvaLon?
hikaricp_connecLons_acquire_seconds_count{pool="HikariPool-1",} 551.0 è 145 ms waiLng
§ cache_gets_total{cache="cache1",result="miss",} 0.99 ⚠ 99% miss = bad config? wrong key?
cache_gets_total{cache="cache1",result="hit",} 0.01
§ logback_events_total{level="error",} 1255.0 ⚠ + 1000 errors over the last 5 seconds => 😱
§ my_own_custom_metric xyz
Example Metrics
Exposed automaMcally at h2p://localhost:8080/actuator/prometheus
18 VictorRentea.ro
a training by
Finding common performance problems
should be done now.
Best Prac%ce
Capture and Monitor metrics for any recurrent performance issue
19 VictorRentea.ro
a training by
Bo.leneck in code that does not expose metrics?
(library or unknown legacy code)
💡
Let's measure execu<on <me of *all* method?
Naive: instrument your JVM with a -javaagent
that measures the execuFon any executed method
🚫 OMG, NO! 🚫
Adds huge CPU overhead!
20 VictorRentea.ro
a training by
Agenda
2. Metrics
1. Distributed Time Sampling
3. Execu*on Profiling
21 VictorRentea.ro
a training by
Java Flight Recorder
The JVM Profiler
28 VictorRentea.ro
a training by
Java Flight Recorder (JFR)
1. Captures stack traces of running threads every second = samples
2. Records internal JVM events about:
- Locks, pinned Carrier Threads (Java 21)
- File I/O
- Network Sockets
- GC
- Your own custom events
3. Very-low overhead (<2%) => Can be used in produc%on💖
4. Free since Java 11 (link)
29 VictorRentea.ro
a training by
Using JFR
§Local JVM, eg via GlowRoot, Java Mission Control or IntelliJ ⤴
§Remote JVM: ⏺ ... ⏹ => analyze generated .jfr file in JMC/IntelliJ
First, enable JFR via -XX:+FlightRecorder
- Auto-start on startup: -XX:StartFlightRecording=....
- Start via command-line (SSH): jcmd <JAVAPID> JFR.start
- Start via endpoints: JMX or Spring Boot Actuator
- Start via tools: DynaTrace, DataDog, Glowroot (in our experiment)
30 VictorRentea.ro
a training by
Flame Graph
Visualize the samples captured by an execu5on profiler
31 VictorRentea.ro
a training by
Thread.sleep(Native Method)
Flame.f(Flame.java:21)
Flame.entry(Flame.java:17)
Flame.dummy(Flame.java:14)
Thread.sleep(Native Method)
Flame.g(Flame.java:24)
Flame.entry(Flame.java:18)
Flame.dummy(Flame.java:14)
2 samples (33%)
4 samples (66%)
JFR captures stack traces of running threads once every second
33%
66%
32 VictorRentea.ro
a training by
Thread.sleep(Native Method)
Flame.g(Flame.java:24)
Flame.entry(Flame.java:18)
Flame.dummy(Flame.java:14)
Thread.sleep(Native Method)
Flame.f(Flame.java:21)
Flame.entry(Flame.java:17)
Flame.dummy(Flame.java:14)
2 samples (33%) 4 samples (66%)
Length of each bar is proporFonal to the number of samples (≈Fme)
Anatomy of a Flame Graph
JFR captures stack traces of running threads once every second
Flame.dummy(Flame.java:14)
33 VictorRentea.ro
a training by
Thread.sleep(Native Method) Thread.sleep(Native Method)
Flame.f(Flame.java:21) Flame.g(Flame.java:24)
Flame.entry(Flame.java:17) Flame.entry(Flame.java:18)
Flame.dummy(Flame.java:14)
IdenFcal bars are merged
2 samples (33%) 4 samples (66%)
Length of each bar is proporFonal to the number of samples (≈Fme)
Anatomy of a Flame Graph
JFR captures stack traces of running threads once every second
Flame.dummy(Flame.java:14)
FOCUS ON SPLIT POINTS
34 VictorRentea.ro
a training by
Thread.sleep(Native Method) Thread.sleep(Native Method)
Flame.f(Flame.java:21) Flame.g(Flame.java:24)
Flame.entry(Flame.java:17) Flame.entry(Flame.java:18)
Flame.dummy(Flame.java:14)
2 samples (33%) 4 samples (66%)
IN REAL LIFE
Unsafe.park(Native Method)
......50 more lines↕......
SocketInputStream.socketRead0(Native Method)
......40 more lines 😵💫 ......
423
210
100 unknown library methods 😱 ...
IdenFcal bars are merged
Length of each bar is proportional to the number of samples (≈time)
Anatomy of a Flame Graph
JFR samples the stack traces of running threads every second
THREAD BLOCKED NETWORK
More samples
improve accuracy
(heavy load test or 1 week in prod)
35 VictorRentea.ro
a training by
ProfiledApp
Spring Boot 3+
http://localhost:8080
Load Tests
Java app
(Gatling.io)
23 parallel requests
(closed world model)
PostgreSQL
in a Local Docker
localhost:5432
WireMock
in a Local Docker
http://localhost:9999
remote API responses
with 40ms delay
Glowroot.org
-javaagent:glowroot.jar
h:p://localhost:4000
display captured flamegraph
Experiment Environment
ToxiProxy
github.com/Shopify/toxiproxy
in a Local Docker
localhost:55432
+3ms network latency
JFR
or jMeter, K6..
36 VictorRentea.ro
a training by
EXPERIMENT
38 VictorRentea.ro
a training by
Glowroot.org
§Simple, didacPc profiler
- Add VM arg: -javaagent:/path/to/glowroot.jar
- Open h3p://localhost:4000
§Monitors
- Methods Profiling, displayed as Flame Graphs
- SQLs
- API calls
- Typical bo3lenecks
- Data per endpoint
§Free
- Probably not producLon-ready ⚠
40 VictorRentea.ro
a training by
Expensive Aspects
TransactionInterceptor
acquires a JDBC connection before entering the @Transactional method
and releases that connection after the end of the method
:382
:388
41 VictorRentea.ro
a training by
JDBC ConnecDon Pool StarvaDon
ß 85% of the query endpoint is spent acquiring a JDBC Connection à
Waiting for
SQL server
42 VictorRentea.ro
a training by
False Split by JIT OpDmizaDon
Fix:
a) Warmup JVM: eg. repeat load test (already optimized)
b) Record more samples (all future calls calls go to #2 Optimized)
Raw Method (#1)
Optimized Method (#2)
Same method is called
via 2 different paths
43 VictorRentea.ro
a training by
Lock ContenDon
Automatic analysis of the .jfr output
by JDK Mission Control (JMC)
https://wiki.openjdk.org/display/jmc/Releases
ß Wai7ng to enter a synchronized method à
(naBve C code does not run in JVM => is not recorded by JFR)
44 VictorRentea.ro
a training by
hashSet.removeAll(list);
100K times ArrayList.contains !!
= 100K x 100K ops
🔥Hot (CPU) Method
set.size() list.size() Δ Fme
100.000 100 1 millis
100.000 99.999 50 millis
100.000 100.000 2.5 seconds (50x more) 😱
45 VictorRentea.ro
a training by
War Story: Expor4ng from Mongo
Export took 6 days!!?!!#$%$@
The team was concern about misusing an obscure library (jsonl)
JFR showed that MongoItemReader loaded data in pages, not streaming (link)
Switched to MongoCursorItemReader!
6 days è 3 minutes
46 VictorRentea.ro
a training by
War Story: Impor4ng in MsSQL
INSERT #1: 11.000 rows ... 1 second – OK.
INSERT #2: 5.000 rows ... 100 seconds !!?😬
Profiler showed that #2 was NOT using "doInsertBulk" as #1
StackOverflow revealed an issue inser7ng in DATE columns.
privacy
privacy
privacy
privacy
privacy
privacy
privacy
100 seconds è <1 second
50 VictorRentea.ro
a training by
@TransacIonal triggers a proxy
to execute before entering the method
(a @Transac?onal query method??!)
RestTemplate.getForObject()
Network request launched
outside of the controller method!!
(by an unknown @Aspect)
Acquire a JDBC connecLon
54% of total ?me => 🤔
connec?on pool starva?on issue?
Time waited outside of JVM
(in C na?ve code, not seen by JFR)
Controller method
com.sun.proxy.$Proxy230.findById()
Spring Data Repo call
SessionImpl.initializeCollection
ORM lazy-loading
AbstractConnPool.getPoolEntryBlocking
Feign client uses Apache H"p Client,
which blocks to acquire a HTTP connec?on
(default: 2 pooled connec>ons/des>na>on host)
100% of this h"p request
(165 samples)
Spring proxy👻
Spring proxy👻 in front of
the controller method
My actual Service method
51 VictorRentea.ro
a training by
Flagship Feature of Java 21?
🎉 Released on 19 Sep 2023 🎉
Virtual Threads
(Project Loom)
JVM can reuse the 1MB OS Pla@orm Thread
for other tasks when you block in a call to an API/DB
Blocking is free!! 🥳
52 VictorRentea.ro
a training by
JFR can record
❤Virtual Threads❤
(a Spring Boot 3 app running on Java 21)
53 VictorRentea.ro
a training by
Structured Concurrency
(Java 25 LTS🤞)
Thread Dump of code using Virtual Threads + Structured Concurrency (Java 25)
captures the parent thread
{
"container": "jdk.internal.misc.ThreadFlock@6373f7e2",
"parent": "java.util.concurrent.ThreadPerTaskExecutor@634c5b98",
"owner": "83",
"threads": [
{
"tid": "84",
"name": "",
"stack": [
"java.base/jdk.internal.vm.Continuation.yield(Continuation.java:357)",
"java.base/java.lang.VirtualThread.yieldContinuation(VirtualThread.java:370)",
"java.base/java.lang.VirtualThread.park(VirtualThread.java:499)",
....
"org.springframework.web.client.RestTemplate.getForObject(RestTemplate.java:378)",
"victor.training.java.loom.StructuredConcurrency.fetchBeer(StructuredConcurrency.java:46
....
"java.base/jdk.internal.vm.Continuation.enter(Continuation.java:320)"
]
}
55 VictorRentea.ro
a training by
JFR Key Points
✅ Sees inside unknown code / libraries
❌ CANNOT see into na.ve C++ code (like synchronized keyword)
⚠ Sample-based: requires load tests or record in produc.on
✅ Can be used in produc.on thanks to extra-low overhead <2%
⚠ Tuning requires pa.ence, expecta.ons, and library awareness
⚠ JIT op.miza.ons can bias profile of CPU-intensive code (link)
❌ CANNOT follow thread hops (eg. thread pools) or reac.ve chains
✅ Records Virtual Threads and Structured Concurrency
56 VictorRentea.ro
a training by
Finding the BoNleneck
§Identify slow system (eg w/ Zipkin)
§Study its metrics + expose more (eg w/ Grafana)
§Profile the execution in production / load-tests
- Load the .jfr in IntelliJ (code navigation) and in JMC (auto-analysis)
§Optimize code/config, balancing gain vs. risk vs. code mess
Tracing Java's Hidden
Performance Traps
Network Calls
Resource Starva7on
Concurrency Control
CPU work
Aspects, JPA, Lombok ..
Naive @Transactional
Large synchronized
Library misuse/misconfig
Thank You!
victor.rentea@gmail.com ♦ ♦ @victorrentea ♦ VictorRentea.ro
Git: https://github.com/victorrentea/performance-profiling.git
Branch: devoxx-uk-24
Meet me online at:
I was Victor Rentea,
trainer & coach for experienced teams

Mais conteúdo relacionado

Mais procurados

[CB19] アンチウイルスをオラクルとしたWindows Defenderに対する新しい攻撃手法 by 市川遼
[CB19] アンチウイルスをオラクルとしたWindows Defenderに対する新しい攻撃手法 by 市川遼 [CB19] アンチウイルスをオラクルとしたWindows Defenderに対する新しい攻撃手法 by 市川遼
[CB19] アンチウイルスをオラクルとしたWindows Defenderに対する新しい攻撃手法 by 市川遼
CODE BLUE
 
10分で分かるLinuxブロックレイヤ
10分で分かるLinuxブロックレイヤ10分で分かるLinuxブロックレイヤ
10分で分かるLinuxブロックレイヤ
Takashi Hoshino
 

Mais procurados (20)

Pegasus In Depth (2018/10)
Pegasus In Depth (2018/10)Pegasus In Depth (2018/10)
Pegasus In Depth (2018/10)
 
[DockerCon 2019] Hardening Docker daemon with Rootless mode
[DockerCon 2019] Hardening Docker daemon with Rootless mode[DockerCon 2019] Hardening Docker daemon with Rootless mode
[DockerCon 2019] Hardening Docker daemon with Rootless mode
 
Bowling Game Kata by Robert C. Martin
Bowling Game Kata by Robert C. MartinBowling Game Kata by Robert C. Martin
Bowling Game Kata by Robert C. Martin
 
OPcacheの新機能ファイルベースキャッシュの内部実装を読んでみた
OPcacheの新機能ファイルベースキャッシュの内部実装を読んでみたOPcacheの新機能ファイルベースキャッシュの内部実装を読んでみた
OPcacheの新機能ファイルベースキャッシュの内部実装を読んでみた
 
【学習メモ#1st】12ステップで作る組込みOS自作入門
【学習メモ#1st】12ステップで作る組込みOS自作入門【学習メモ#1st】12ステップで作る組込みOS自作入門
【学習メモ#1st】12ステップで作る組込みOS自作入門
 
Gatekeeper Exposed
Gatekeeper ExposedGatekeeper Exposed
Gatekeeper Exposed
 
PHP AST 徹底解説
PHP AST 徹底解説PHP AST 徹底解説
PHP AST 徹底解説
 
[CB19] アンチウイルスをオラクルとしたWindows Defenderに対する新しい攻撃手法 by 市川遼
[CB19] アンチウイルスをオラクルとしたWindows Defenderに対する新しい攻撃手法 by 市川遼 [CB19] アンチウイルスをオラクルとしたWindows Defenderに対する新しい攻撃手法 by 市川遼
[CB19] アンチウイルスをオラクルとしたWindows Defenderに対する新しい攻撃手法 by 市川遼
 
JVM @ Taobao - QCon Hangzhou 2011
JVM @ Taobao - QCon Hangzhou 2011JVM @ Taobao - QCon Hangzhou 2011
JVM @ Taobao - QCon Hangzhou 2011
 
10分で分かるLinuxブロックレイヤ
10分で分かるLinuxブロックレイヤ10分で分かるLinuxブロックレイヤ
10分で分かるLinuxブロックレイヤ
 
Linux PV on HVM
Linux PV on HVMLinux PV on HVM
Linux PV on HVM
 
OPcache の最適化器の今
OPcache の最適化器の今OPcache の最適化器の今
OPcache の最適化器の今
 
Trucs et astuces PHP et MySQL
Trucs et astuces PHP et MySQLTrucs et astuces PHP et MySQL
Trucs et astuces PHP et MySQL
 
Quick tour of PHP from inside
Quick tour of PHP from insideQuick tour of PHP from inside
Quick tour of PHP from inside
 
さいきんの InnoDB Adaptive Flushing (仮)
さいきんの InnoDB Adaptive Flushing (仮)さいきんの InnoDB Adaptive Flushing (仮)
さいきんの InnoDB Adaptive Flushing (仮)
 
Java Crash分析(2012-05-10)
Java Crash分析(2012-05-10)Java Crash分析(2012-05-10)
Java Crash分析(2012-05-10)
 
より速く より運用しやすく 進化し続けるJVM(Java Developers Summit Online 2023 発表資料)
より速く より運用しやすく 進化し続けるJVM(Java Developers Summit Online 2023 発表資料)より速く より運用しやすく 進化し続けるJVM(Java Developers Summit Online 2023 発表資料)
より速く より運用しやすく 進化し続けるJVM(Java Developers Summit Online 2023 発表資料)
 
cloudpackサーバ仕様書(サンプル)
cloudpackサーバ仕様書(サンプル)cloudpackサーバ仕様書(サンプル)
cloudpackサーバ仕様書(サンプル)
 
モダン PHP テクニック 12 選 ―PsalmとPHP 8.1で今はこんなこともできる!―
モダン PHP テクニック 12 選 ―PsalmとPHP 8.1で今はこんなこともできる!―モダン PHP テクニック 12 選 ―PsalmとPHP 8.1で今はこんなこともできる!―
モダン PHP テクニック 12 選 ―PsalmとPHP 8.1で今はこんなこともできる!―
 
お金が無いときのMySQL Cluster頼み
お金が無いときのMySQL Cluster頼みお金が無いときのMySQL Cluster頼み
お金が無いときのMySQL Cluster頼み
 

Semelhante a Finding Java's Hidden Performance Traps @ DevoxxUK 2024

Jdk 7 4-forkjoin
Jdk 7 4-forkjoinJdk 7 4-forkjoin
Jdk 7 4-forkjoin
knight1128
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Android Boot Time Optimization
Android Boot Time OptimizationAndroid Boot Time Optimization
Android Boot Time Optimization
Kan-Ru Chen
 

Semelhante a Finding Java's Hidden Performance Traps @ DevoxxUK 2024 (20)

Profiling your Java Application
Profiling your Java ApplicationProfiling your Java Application
Profiling your Java Application
 
Monitoring using Prometheus and Grafana
Monitoring using Prometheus and GrafanaMonitoring using Prometheus and Grafana
Monitoring using Prometheus and Grafana
 
Native Java with GraalVM
Native Java with GraalVMNative Java with GraalVM
Native Java with GraalVM
 
What the CRaC - Superfast JVM startup
What the CRaC - Superfast JVM startupWhat the CRaC - Superfast JVM startup
What the CRaC - Superfast JVM startup
 
Ob1k presentation at Java.IL
Ob1k presentation at Java.ILOb1k presentation at Java.IL
Ob1k presentation at Java.IL
 
Jdk 7 4-forkjoin
Jdk 7 4-forkjoinJdk 7 4-forkjoin
Jdk 7 4-forkjoin
 
Testing Microservices @DevoxxBE 23.pdf
Testing Microservices @DevoxxBE 23.pdfTesting Microservices @DevoxxBE 23.pdf
Testing Microservices @DevoxxBE 23.pdf
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
ASML_FlightRecorderMeetsJava.pdf
ASML_FlightRecorderMeetsJava.pdfASML_FlightRecorderMeetsJava.pdf
ASML_FlightRecorderMeetsJava.pdf
 
Bsides final
Bsides finalBsides final
Bsides final
 
DevDays: Profiling With Java Flight Recorder
DevDays: Profiling With Java Flight RecorderDevDays: Profiling With Java Flight Recorder
DevDays: Profiling With Java Flight Recorder
 
Curator intro
Curator introCurator intro
Curator intro
 
Jdk Tools For Performance Diagnostics
Jdk Tools For Performance DiagnosticsJdk Tools For Performance Diagnostics
Jdk Tools For Performance Diagnostics
 
JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?
 
Debugging & profiling node.js
Debugging & profiling node.jsDebugging & profiling node.js
Debugging & profiling node.js
 
My name is Trinidad
My name is TrinidadMy name is Trinidad
My name is Trinidad
 
Android Boot Time Optimization
Android Boot Time OptimizationAndroid Boot Time Optimization
Android Boot Time Optimization
 
Definitive Guide to Working With Exceptions in Java
Definitive Guide to Working With Exceptions in JavaDefinitive Guide to Working With Exceptions in Java
Definitive Guide to Working With Exceptions in Java
 
P4 Introduction
P4 Introduction P4 Introduction
P4 Introduction
 
Adopting GraalVM - NE Scala 2019
Adopting GraalVM - NE Scala 2019Adopting GraalVM - NE Scala 2019
Adopting GraalVM - NE Scala 2019
 

Mais de Victor Rentea

Vertical Slicing Architectures
Vertical Slicing ArchitecturesVertical Slicing Architectures
Vertical Slicing Architectures
Victor Rentea
 

Mais de Victor Rentea (20)

Microservice Resilience Patterns @VoxxedCern'24
Microservice Resilience Patterns @VoxxedCern'24Microservice Resilience Patterns @VoxxedCern'24
Microservice Resilience Patterns @VoxxedCern'24
 
Distributed Consistency.pdf
Distributed Consistency.pdfDistributed Consistency.pdf
Distributed Consistency.pdf
 
Clean Code @Voxxed Days Cluj 2023 - opening Keynote
Clean Code @Voxxed Days Cluj 2023 - opening KeynoteClean Code @Voxxed Days Cluj 2023 - opening Keynote
Clean Code @Voxxed Days Cluj 2023 - opening Keynote
 
From Web to Flux @DevoxxBE 2023.pptx
From Web to Flux @DevoxxBE 2023.pptxFrom Web to Flux @DevoxxBE 2023.pptx
From Web to Flux @DevoxxBE 2023.pptx
 
Test-Driven Design Insights@DevoxxBE 2023.pptx
Test-Driven Design Insights@DevoxxBE 2023.pptxTest-Driven Design Insights@DevoxxBE 2023.pptx
Test-Driven Design Insights@DevoxxBE 2023.pptx
 
OAuth in the Wild
OAuth in the WildOAuth in the Wild
OAuth in the Wild
 
The tests are trying to tell you something@VoxxedBucharest.pptx
The tests are trying to tell you something@VoxxedBucharest.pptxThe tests are trying to tell you something@VoxxedBucharest.pptx
The tests are trying to tell you something@VoxxedBucharest.pptx
 
Vertical Slicing Architectures
Vertical Slicing ArchitecturesVertical Slicing Architectures
Vertical Slicing Architectures
 
Software Craftsmanship @Code Camp Festival 2022.pdf
Software Craftsmanship @Code Camp Festival 2022.pdfSoftware Craftsmanship @Code Camp Festival 2022.pdf
Software Craftsmanship @Code Camp Festival 2022.pdf
 
Unit testing - 9 design hints
Unit testing - 9 design hintsUnit testing - 9 design hints
Unit testing - 9 design hints
 
Clean pragmatic architecture @ devflix
Clean pragmatic architecture @ devflixClean pragmatic architecture @ devflix
Clean pragmatic architecture @ devflix
 
Extreme Professionalism - Software Craftsmanship
Extreme Professionalism - Software CraftsmanshipExtreme Professionalism - Software Craftsmanship
Extreme Professionalism - Software Craftsmanship
 
Clean architecture - Protecting the Domain
Clean architecture - Protecting the DomainClean architecture - Protecting the Domain
Clean architecture - Protecting the Domain
 
Refactoring blockers and code smells @jNation 2021
Refactoring   blockers and code smells @jNation 2021Refactoring   blockers and code smells @jNation 2021
Refactoring blockers and code smells @jNation 2021
 
Hibernate and Spring - Unleash the Magic
Hibernate and Spring - Unleash the MagicHibernate and Spring - Unleash the Magic
Hibernate and Spring - Unleash the Magic
 
Integration testing with spring @JAX Mainz
Integration testing with spring @JAX MainzIntegration testing with spring @JAX Mainz
Integration testing with spring @JAX Mainz
 
The Proxy Fairy and the Magic of Spring @JAX Mainz 2021
The Proxy Fairy and the Magic of Spring @JAX Mainz 2021The Proxy Fairy and the Magic of Spring @JAX Mainz 2021
The Proxy Fairy and the Magic of Spring @JAX Mainz 2021
 
Integration testing with spring @snow one
Integration testing with spring @snow oneIntegration testing with spring @snow one
Integration testing with spring @snow one
 
Pure functions and immutable objects @dev nexus 2021
Pure functions and immutable objects @dev nexus 2021Pure functions and immutable objects @dev nexus 2021
Pure functions and immutable objects @dev nexus 2021
 
TDD Mantra
TDD MantraTDD Mantra
TDD Mantra
 

Último

Último (20)

Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
Measures in SQL (a talk at SF Distributed Systems meetup, 2024-05-22)
 
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone KomSalesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
Salesforce Adoption – Metrics, Methods, and Motivation, Antone Kom
 
In-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT ProfessionalsIn-Depth Performance Testing Guide for IT Professionals
In-Depth Performance Testing Guide for IT Professionals
 
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
AI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří KarpíšekAI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří Karpíšek
 
What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024What's New in Teams Calling, Meetings and Devices April 2024
What's New in Teams Calling, Meetings and Devices April 2024
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
The architecture of Generative AI for enterprises.pdf
The architecture of Generative AI for enterprises.pdfThe architecture of Generative AI for enterprises.pdf
The architecture of Generative AI for enterprises.pdf
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Agentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdfAgentic RAG What it is its types applications and implementation.pdf
Agentic RAG What it is its types applications and implementation.pdf
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024IoT Analytics Company Presentation May 2024
IoT Analytics Company Presentation May 2024
 
UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1
 

Finding Java's Hidden Performance Traps @ DevoxxUK 2024

  • 1. Finding Java's Hidden Performance Traps @victorrentea ♦ VictorRentea.ro Git: h2ps://github.com/victorrentea/performance-profiling.git Branch: devoxx-uk-24 Slides: TODO
  • 2. 👋 I'm Victor Rentea 🇷🇴 Java Champion, PhD(CS) 18 years of coding 10 years of training & consul2ng at 130+ companies on: ❤ Refactoring, Architecture, Unit Tes4ng 🛠 Java: Spring, Hibernate, Performance, Reac7ve Lead of European So8ware Cra8ers (7K developers) Join for free monthly online mee7ngs from 1700 CET Channel: YouTube.com/vrentea Life += 👰 + 👧 + 🙇 + 🐈 @victorrentea hBps://victorrentea.ro 🇷🇴 past events
  • 3. 3 VictorRentea.ro a training by The response *me of your endpoint is too long 😱 🎉 Congratula*ons! 🎉 You made it to produc.on, and You're taking serious load! How to start?
  • 4. 4 VictorRentea.ro a training by Agenda 2. Metrics 1. Distributed Time Sampling 3. Execu*on Profiling
  • 5. 5 VictorRentea.ro a training by 1st challenge: trace requests A B D E F C Performance in a Distributed System
  • 6. 6 VictorRentea.ro a training by order-service.log: 2024-03-20T20:59:24.473 INFO [order,65fb320c74e6dd352b07535cf26506a8,776ae9fa7074b2c5] 60179 --- [nio-8088-exec-8] ...PlaceOrderRest : Placing order for products: [1] Distributed Request Tracing catalog-service.log: 2024-03-20T20:59:24.482 INFO [catalog,65fb320c74e6dd352b07535cf26506a8,9d3b54c924ab886d] 60762 --- [nio-8081-exec-7] ...GetPriceRest : Returning prices for products: [1] TraceId
  • 7. 7 VictorRentea.ro a training by = All systems involved in a flow share the same TraceId §TraceId is generated by the first system in a chain (eg API Gateway) (1) §TraceId is received as a header in the incoming HTTP Request or Message(2) - Standardized by OTEL (h3ps://opentelemetry.io) §TraceId is prefixed to every log line (prev. slide) §TraceId propagates magically with method calls (3) - via ThreadLocal => Submit all async work to Spring-injected executors ⚠ - via ReactorContext in WebFlux => Spring must subscribe to your reacLve chains ⚠ §TraceId is sent as a header in any outgoing Request or Message (4) - ⚠ By injected beans: RestTemplate, WebClient, KaPaSender... Distributed Request Tracing h"ps://spring.io/blog/2022/10/12/observability-with-spring-boot-3/ A B C D no TraceId 1: TraceId = 🆕 3 4 4 4 <<HTTP Header>> TraceId <<Message Header>> TraceId <<HTTP Header>> TraceId 2 2 2
  • 8. 8 VictorRentea.ro a training by §Every system reports the .me spent (span) working on a TraceId §...to a central system that tracks where .me was spent §Only a sample of traces are reported (eg 1%), for performance Distributed Time Budge4ng
  • 9. 9 VictorRentea.ro a training by span trace Distributed Time Budge4ng
  • 10. 10 VictorRentea.ro a training by -details/456 /456 /456 independent calls executed sequen0aly Distributed Time Budge4ng
  • 11. 11 VictorRentea.ro a training by §Iden%cal repeated API calls by accident 😱 §Calls in a loop: GET /items/1, /2, /3... è batch fetch §System-in-the-middle: A à B(just a proxy) à C è AàC §Long chains of REST calls: A à B ... à E, paying network latency §Independent API calls è call in parallel §Long spans not wai.ng for other systems? 🤔 => Zoom in 🔎 Performance An*-PaIerns in a Distributed System
  • 12. 12 VictorRentea.ro a training by Main Suspect: One System
  • 13. 13 VictorRentea.ro a training by Log timestamps log.info("Before call"); !// 2023-05-23 00:01:50.000 !!... !// suspect code log.info("After call"); !// 2023-05-23 00:01:53.100 !!... Measuring Method Run*me – High-School Style t1-t0 long t0 = currentTimeMillis(); List<CommentDto> comments = client.fetchComments(loanId); !// suspect code long t1 = currentTimeMillis(); System.out.println("Took millis: " + (t1 - t0)); => Log Clu.er 😵💫
  • 14. 14 VictorRentea.ro a training by Micrometer 💖 (OpenTelemetry.io) 💖 @Timed !// an AOP proxy intercepts the call and monitors its execution time List<CommentDto> fetchComments(Long id) { !// suspect: API call, heavy DB query, CPU-intensive section, !!... } GET /actuator/prometheus (in a Spring Boot app) : method_timed_seconds_sum{method="fetchComments"} = 100 seconds method_timed_seconds_count{method="fetchComments"} = 400 calls 5 seconds later: method_timed_seconds_sum{...} = 150 – 100 = 50 seconds method_timed_seconds_count{...} = 500 – 400 = 100 calls 4me average[ms] 250 500 ≃ 250 millis/call ≃ 500 millis/call (⚠double over the last 5 seconds)
  • 15. 15 VictorRentea.ro a training by Micrometer h&ps://www.baeldung.com/micrometer !// eg. "95% requests took ≤ 100 ms" @Timed(percentiles = {0.95}, histogram = true) ⚠Usual AOP Pi:alls: @Timed works on methods of beans injected by Spring, and method is NOT final & NOT sta7c; can add CPU overhead to simple methods. !// programatic alternative to @Timed: meterRegistry.timer("name").record(() -> { !/*suspect code!*/ }); meterRegistry.timer("name").record(t1 - t0, MILLISECONDS); !// accumulator meterRegistry.counter("earnings").increment(purchase); !// current value meterRegistry.gauge("player-count", currentPlayers.size());
  • 16. 16 VictorRentea.ro a training by §Micrometer exposes metrics - your own (prev. slide) - criLcal framework metrics to track common issues (next slide) - opLonal: Tomcat, executors, ... §...that are pulled and stored by Prometheus every 5 seconds §Grafana can query them later to: - Display impressive charts - Aggregate metrics across Lme/node/app - Raise alerts §Other (paid) soluPons: - NewRelic, DynaTrace, DataDog, Splunk, AppDynamics Metrics
  • 17. 17 VictorRentea.ro a training by § h3p_server_requests_seconds_sum{method="GET",uri="/profile/showcase/{id}",} 165.04 h3p_server_requests_seconds_count{method="GET",uri="/profile/showcase/{id}",} 542.0 § tomcat_connecLons_current_connecLons{...} 298 ⚠ 98 requests waiLng in queue tomcat_threads_busy_threads{...} 200 ⚠ = using 100% of the threads => thread pool starvaLon? § jvm_gc_memory_allocated_bytes_total 1.8874368E9 jvm_memory_usage_aher_gc_percent{...} 45.15 ⚠ used>30% => increase max heap? § hikaricp_connecLons_acquire_seconds_sum{pool="HikariPool-1",} 80.37 ⚠ conn pool starvaLon? hikaricp_connecLons_acquire_seconds_count{pool="HikariPool-1",} 551.0 è 145 ms waiLng § cache_gets_total{cache="cache1",result="miss",} 0.99 ⚠ 99% miss = bad config? wrong key? cache_gets_total{cache="cache1",result="hit",} 0.01 § logback_events_total{level="error",} 1255.0 ⚠ + 1000 errors over the last 5 seconds => 😱 § my_own_custom_metric xyz Example Metrics Exposed automaMcally at h2p://localhost:8080/actuator/prometheus
  • 18. 18 VictorRentea.ro a training by Finding common performance problems should be done now. Best Prac%ce Capture and Monitor metrics for any recurrent performance issue
  • 19. 19 VictorRentea.ro a training by Bo.leneck in code that does not expose metrics? (library or unknown legacy code) 💡 Let's measure execu<on <me of *all* method? Naive: instrument your JVM with a -javaagent that measures the execuFon any executed method 🚫 OMG, NO! 🚫 Adds huge CPU overhead!
  • 20. 20 VictorRentea.ro a training by Agenda 2. Metrics 1. Distributed Time Sampling 3. Execu*on Profiling
  • 21. 21 VictorRentea.ro a training by Java Flight Recorder The JVM Profiler
  • 22. 28 VictorRentea.ro a training by Java Flight Recorder (JFR) 1. Captures stack traces of running threads every second = samples 2. Records internal JVM events about: - Locks, pinned Carrier Threads (Java 21) - File I/O - Network Sockets - GC - Your own custom events 3. Very-low overhead (<2%) => Can be used in produc%on💖 4. Free since Java 11 (link)
  • 23. 29 VictorRentea.ro a training by Using JFR §Local JVM, eg via GlowRoot, Java Mission Control or IntelliJ ⤴ §Remote JVM: ⏺ ... ⏹ => analyze generated .jfr file in JMC/IntelliJ First, enable JFR via -XX:+FlightRecorder - Auto-start on startup: -XX:StartFlightRecording=.... - Start via command-line (SSH): jcmd <JAVAPID> JFR.start - Start via endpoints: JMX or Spring Boot Actuator - Start via tools: DynaTrace, DataDog, Glowroot (in our experiment)
  • 24. 30 VictorRentea.ro a training by Flame Graph Visualize the samples captured by an execu5on profiler
  • 25. 31 VictorRentea.ro a training by Thread.sleep(Native Method) Flame.f(Flame.java:21) Flame.entry(Flame.java:17) Flame.dummy(Flame.java:14) Thread.sleep(Native Method) Flame.g(Flame.java:24) Flame.entry(Flame.java:18) Flame.dummy(Flame.java:14) 2 samples (33%) 4 samples (66%) JFR captures stack traces of running threads once every second 33% 66%
  • 26. 32 VictorRentea.ro a training by Thread.sleep(Native Method) Flame.g(Flame.java:24) Flame.entry(Flame.java:18) Flame.dummy(Flame.java:14) Thread.sleep(Native Method) Flame.f(Flame.java:21) Flame.entry(Flame.java:17) Flame.dummy(Flame.java:14) 2 samples (33%) 4 samples (66%) Length of each bar is proporFonal to the number of samples (≈Fme) Anatomy of a Flame Graph JFR captures stack traces of running threads once every second Flame.dummy(Flame.java:14)
  • 27. 33 VictorRentea.ro a training by Thread.sleep(Native Method) Thread.sleep(Native Method) Flame.f(Flame.java:21) Flame.g(Flame.java:24) Flame.entry(Flame.java:17) Flame.entry(Flame.java:18) Flame.dummy(Flame.java:14) IdenFcal bars are merged 2 samples (33%) 4 samples (66%) Length of each bar is proporFonal to the number of samples (≈Fme) Anatomy of a Flame Graph JFR captures stack traces of running threads once every second Flame.dummy(Flame.java:14) FOCUS ON SPLIT POINTS
  • 28. 34 VictorRentea.ro a training by Thread.sleep(Native Method) Thread.sleep(Native Method) Flame.f(Flame.java:21) Flame.g(Flame.java:24) Flame.entry(Flame.java:17) Flame.entry(Flame.java:18) Flame.dummy(Flame.java:14) 2 samples (33%) 4 samples (66%) IN REAL LIFE Unsafe.park(Native Method) ......50 more lines↕...... SocketInputStream.socketRead0(Native Method) ......40 more lines 😵💫 ...... 423 210 100 unknown library methods 😱 ... IdenFcal bars are merged Length of each bar is proportional to the number of samples (≈time) Anatomy of a Flame Graph JFR samples the stack traces of running threads every second THREAD BLOCKED NETWORK More samples improve accuracy (heavy load test or 1 week in prod)
  • 29. 35 VictorRentea.ro a training by ProfiledApp Spring Boot 3+ http://localhost:8080 Load Tests Java app (Gatling.io) 23 parallel requests (closed world model) PostgreSQL in a Local Docker localhost:5432 WireMock in a Local Docker http://localhost:9999 remote API responses with 40ms delay Glowroot.org -javaagent:glowroot.jar h:p://localhost:4000 display captured flamegraph Experiment Environment ToxiProxy github.com/Shopify/toxiproxy in a Local Docker localhost:55432 +3ms network latency JFR or jMeter, K6..
  • 31. 38 VictorRentea.ro a training by Glowroot.org §Simple, didacPc profiler - Add VM arg: -javaagent:/path/to/glowroot.jar - Open h3p://localhost:4000 §Monitors - Methods Profiling, displayed as Flame Graphs - SQLs - API calls - Typical bo3lenecks - Data per endpoint §Free - Probably not producLon-ready ⚠
  • 32. 40 VictorRentea.ro a training by Expensive Aspects TransactionInterceptor acquires a JDBC connection before entering the @Transactional method and releases that connection after the end of the method :382 :388
  • 33. 41 VictorRentea.ro a training by JDBC ConnecDon Pool StarvaDon ß 85% of the query endpoint is spent acquiring a JDBC Connection à Waiting for SQL server
  • 34. 42 VictorRentea.ro a training by False Split by JIT OpDmizaDon Fix: a) Warmup JVM: eg. repeat load test (already optimized) b) Record more samples (all future calls calls go to #2 Optimized) Raw Method (#1) Optimized Method (#2) Same method is called via 2 different paths
  • 35. 43 VictorRentea.ro a training by Lock ContenDon Automatic analysis of the .jfr output by JDK Mission Control (JMC) https://wiki.openjdk.org/display/jmc/Releases ß Wai7ng to enter a synchronized method à (naBve C code does not run in JVM => is not recorded by JFR)
  • 36. 44 VictorRentea.ro a training by hashSet.removeAll(list); 100K times ArrayList.contains !! = 100K x 100K ops 🔥Hot (CPU) Method set.size() list.size() Δ Fme 100.000 100 1 millis 100.000 99.999 50 millis 100.000 100.000 2.5 seconds (50x more) 😱
  • 37. 45 VictorRentea.ro a training by War Story: Expor4ng from Mongo Export took 6 days!!?!!#$%$@ The team was concern about misusing an obscure library (jsonl) JFR showed that MongoItemReader loaded data in pages, not streaming (link) Switched to MongoCursorItemReader! 6 days è 3 minutes
  • 38. 46 VictorRentea.ro a training by War Story: Impor4ng in MsSQL INSERT #1: 11.000 rows ... 1 second – OK. INSERT #2: 5.000 rows ... 100 seconds !!?😬 Profiler showed that #2 was NOT using "doInsertBulk" as #1 StackOverflow revealed an issue inser7ng in DATE columns. privacy privacy privacy privacy privacy privacy privacy 100 seconds è <1 second
  • 39. 50 VictorRentea.ro a training by @TransacIonal triggers a proxy to execute before entering the method (a @Transac?onal query method??!) RestTemplate.getForObject() Network request launched outside of the controller method!! (by an unknown @Aspect) Acquire a JDBC connecLon 54% of total ?me => 🤔 connec?on pool starva?on issue? Time waited outside of JVM (in C na?ve code, not seen by JFR) Controller method com.sun.proxy.$Proxy230.findById() Spring Data Repo call SessionImpl.initializeCollection ORM lazy-loading AbstractConnPool.getPoolEntryBlocking Feign client uses Apache H"p Client, which blocks to acquire a HTTP connec?on (default: 2 pooled connec>ons/des>na>on host) 100% of this h"p request (165 samples) Spring proxy👻 Spring proxy👻 in front of the controller method My actual Service method
  • 40. 51 VictorRentea.ro a training by Flagship Feature of Java 21? 🎉 Released on 19 Sep 2023 🎉 Virtual Threads (Project Loom) JVM can reuse the 1MB OS Pla@orm Thread for other tasks when you block in a call to an API/DB Blocking is free!! 🥳
  • 41. 52 VictorRentea.ro a training by JFR can record ❤Virtual Threads❤ (a Spring Boot 3 app running on Java 21)
  • 42. 53 VictorRentea.ro a training by Structured Concurrency (Java 25 LTS🤞) Thread Dump of code using Virtual Threads + Structured Concurrency (Java 25) captures the parent thread { "container": "jdk.internal.misc.ThreadFlock@6373f7e2", "parent": "java.util.concurrent.ThreadPerTaskExecutor@634c5b98", "owner": "83", "threads": [ { "tid": "84", "name": "", "stack": [ "java.base/jdk.internal.vm.Continuation.yield(Continuation.java:357)", "java.base/java.lang.VirtualThread.yieldContinuation(VirtualThread.java:370)", "java.base/java.lang.VirtualThread.park(VirtualThread.java:499)", .... "org.springframework.web.client.RestTemplate.getForObject(RestTemplate.java:378)", "victor.training.java.loom.StructuredConcurrency.fetchBeer(StructuredConcurrency.java:46 .... "java.base/jdk.internal.vm.Continuation.enter(Continuation.java:320)" ] }
  • 43. 55 VictorRentea.ro a training by JFR Key Points ✅ Sees inside unknown code / libraries ❌ CANNOT see into na.ve C++ code (like synchronized keyword) ⚠ Sample-based: requires load tests or record in produc.on ✅ Can be used in produc.on thanks to extra-low overhead <2% ⚠ Tuning requires pa.ence, expecta.ons, and library awareness ⚠ JIT op.miza.ons can bias profile of CPU-intensive code (link) ❌ CANNOT follow thread hops (eg. thread pools) or reac.ve chains ✅ Records Virtual Threads and Structured Concurrency
  • 44. 56 VictorRentea.ro a training by Finding the BoNleneck §Identify slow system (eg w/ Zipkin) §Study its metrics + expose more (eg w/ Grafana) §Profile the execution in production / load-tests - Load the .jfr in IntelliJ (code navigation) and in JMC (auto-analysis) §Optimize code/config, balancing gain vs. risk vs. code mess
  • 45. Tracing Java's Hidden Performance Traps Network Calls Resource Starva7on Concurrency Control CPU work Aspects, JPA, Lombok .. Naive @Transactional Large synchronized Library misuse/misconfig
  • 46. Thank You! victor.rentea@gmail.com ♦ ♦ @victorrentea ♦ VictorRentea.ro Git: https://github.com/victorrentea/performance-profiling.git Branch: devoxx-uk-24 Meet me online at: I was Victor Rentea, trainer & coach for experienced teams