SlideShare uma empresa Scribd logo
1 de 31
Brought to you by
Why User-Mode Threads
Are Good for Performance
Ron Pressler
Architect, Java Platform Group, Oracle
2
Why?
• Why do anything?
• Why do this rather than that?
2 Copyright © 2022, Oracle and/or its affiliates
3 Copyright © 2022, Oracle and/or its affiliates
Concurrency
4
Concurrency and Parallelism
• Parallelism:
• Speed up a task by splitting it into cooperating subtasks
scheduled onto multiple available computing resources.
• Performance measure: latency (time duration)
• Concurrency:
• Schedule available computing resources to multiple largely
independent tasks that compete over them.
• Performance measure: throughput (task/time unit)
4 Copyright © 2022, Oracle and/or its affiliates
5
Little’s Law
5 Copyright © 2022, Oracle and/or its affiliates
6
Little’s Law
6 Copyright © 2022, Oracle and/or its affiliates
In any stable system
7
Little’s Law
7 Copyright © 2022, Oracle and/or its affiliates
In any stable system
with long term averages:
W
λ
λ
}L
λ — arrival rate = exit rate
= throughput
W — duration inside
L — no. items inside
8
Little’s Law
8 Copyright © 2022, Oracle and/or its affiliates
In any stable system
with long term averages:
L = λW
W
λ
λ
}L
λ — arrival rate = exit rate
= throughput
W — duration inside
L — no. items inside
9
Little’s Law
9 Copyright © 2022, Oracle and/or its affiliates
W
L̄
¯
λ
}
L̄ = ¯
λW
L̄
— Capacity
¯
λ
(long-term average)
(long-term average)
10
Little’s Law
10 Copyright © 2022, Oracle and/or its affiliates
W
L̄
}
L̄ = ¯
λW
L̄
— Capacity
¯
λ
> ¯
λ (long-term average)
(momentary)
11
Little’s Law
11 Copyright © 2022, Oracle and/or its affiliates
W
L̄
}
L̄ = ¯
λW
L̄
— Capacity
¯
λ
< ¯
λ (long-term average)
(momentary)
12
Little’s Law — subsystems
12 Copyright © 2022, Oracle and/or its affiliates
L = λW
S
S′

L′

= λ′

W′

13
Little’s Law — subsystems and queues
13 Copyright © 2022, Oracle and/or its affiliates
— Average queue length
}
n
L = λW
n
S
S′

L′

= λ′

W′

14
Little’s Law — subsystems and queues
14 Copyright © 2022, Oracle and/or its affiliates
— Average queue length
}
n
L = λW
n
S
S′

L′

= λ′

W′

λ′

= λ
L′

= L + n
W′

= W + n
W
L
15
Little’s Law — subsystems and queues
15 Copyright © 2022, Oracle and/or its affiliates
— Average queue length
}
n
L = λW
n
S
S′

L′

= λ′

W′

λ′

= λ
L′

= L + n
W′

= W + n
W
L
L′

= λ′

W′

L + n = λ(W + n
W
L
)
L(1 +
n
L
) = λW(1 +
n
L
)
L = λW
16
Little’s Law — other applications
16 Copyright © 2022, Oracle and/or its affiliates
— Interference coefficient
L1 = pλW1
x
L = L1 + L2
L = λ(1 + xL)W
L =
λW
1 − xλW
L2 = (1 − p)λW2
= λ(pW1 + (1 − p)W2)
P(cache hit) = p
P(cache miss) = 1 − p
Caching Interference
= λW + (xλW)2
+ (xλW)3
+ . . .
W
17
Capacity and throughput
17 Copyright © 2022, Oracle and/or its affiliates
¯
λ =
L̄
W
L̄
¯
λ
— Capacity
— Maximum throughput
W — Average latency
18
CPU capacity
18 Copyright © 2022, Oracle and/or its affiliates
¯
λ =
L̄CPU
WCPU
L̄CPU
¯
λ
— #cores
— Maximum throughput
WCPU
— Average CPU consumption
WCPU = 100μs = 1 × 10−4
s (> 1500 cache misses)
L̄ = 30 cores
¯
λ = 30,000 req/s
19
Capacity and throughput
19 Copyright © 2022, Oracle and/or its affiliates
¯
λ =
L̄
W
L̄
¯
λ
— Capacity
— Maximum throughput
W — Average latency
λ/3
λ/3
λ/3
Load
balancer
20
20 Copyright © 2022, Oracle and/or its affiliates
Thread-per-Request — Thread Capacity
Thread-per-Request: A request consumes a thread
(either new or borrowed from a pool) for its duration, W
Request: The domain’s unit of concurrency
Thread: The software unit of concurrency
21
21 Copyright © 2022, Oracle and/or its affiliates
Thread-per-Request — Thread Capacity
#threads = L = λW
Thread-per-Request: A request consumes a thread
(either new or borrowed from a pool) for its duration, W
Request: The domain’s unit of concurrency
Thread: The software unit of concurrency
22
22 Copyright © 2022, Oracle and/or its affiliates
Thread-per-Request — Thread Capacity
with parallel fanout
⇒ For the purpose of estimating #threads, we can consider W to be the sum of all fanout latencies,
even if they’re done in parallel.
#threads = cL = λ(
W
c
) = λW
c — average fanout, i.e. average number of threads consumed by a request
#threads = L = λW
Thread-per-Request: A request consumes a thread
(either new or borrowed from a pool) for its duration, W
Request: The domain’s unit of concurrency
Thread: The software unit of concurrency
23
23 Copyright © 2022, Oracle and/or its affiliates
CPU vs. Thread Capacity with thread/req
WCPU = 100μs
(> 1500 cache misses)
⇒
L̄CPU = #cores
L̄ = ¯
λW
W = 10ms
= (
L̄CPU
WCPU
)W
= L̄CPU(
W
WCPU
)
100 threads per core
WCPU = 50μs
(> 800 cache misses)
⇒
W = 50ms
1000 threads per core
WCPU = 10μs
(> 150 cache misses)
⇒
W = 100ms
10,000 threads per core
24
• Exceptions
• Thread Locals
• Debugger
• Profiler (JFR)
Java Is Made of Threads
Copyright © 2022, Oracle and/or its affiliates
25
simple
less scalable
scalable,
complex,
non-interoperable,
hard to debug/profile
OR
SYNC
Java Blue
ASYNC
Programmer 😀
Hardware ☹
Copyright © 2022, Oracle and/or its affiliates
Programmer ☹
Hardware 😀
26
App
Connections
26
Copyright © 2022, Oracle and/or its affiliates
Virtual Threads
Programmer 😀
Hardware 😀
27
27 Copyright © 2022, Oracle and/or its affiliates
¯
λ =
L̄
W
28
28 Copyright © 2022, Oracle and/or its affiliates
The Impact of Context Switching
(
λ1
λ2
=
W2
W1
)
W = n(μ + t)
n(μ + t)
nμ
= 1 +
t
μ
• Context-switching affects throughput by means of duration, not capacity
• Virtual threads have a faster context-switch than OS threads
• Structured concurrency allows waiting for a set of operations with one context-switch
where
n
t
μ
avg. #operations
avg. wait duration
avg. context-switch duration
μ = 20μs
t = 1μs
⇒ 5% impact
(quite fast)
(quite slow)
⇒
29
Not cooperative, but no time-sharing yet
29 Copyright © 2022, Oracle and/or its affiliates
• Non-cooperative scheduling is more composable
• … but people overestimate the importance of time-sharing in servers
• Effective time-sharing effective when #threads is #cores×105
requires study
30
Summary
30 Copyright © 2022, Oracle and/or its affiliates
• Virtual threads allow higher throughput for the
thread-per-request style — the style harmonious
with the platform — by drastically increasing the
request capacity of the server.
• We can juggle more balls not by adding hands
but by enlarging the arch.
• Context-switching cost could be important, but
aren’t the main reason for the throughput
increase.
Brought to you by
Ron Pressler
ron.pressler@oracle.com
@pressron
• JEP 425: Virtual Threads (Preview)
• The Design of User Mode Threads in Java [video] (why not async/await)

Mais conteúdo relacionado

Semelhante a Why User-Mode Threads Are Good for Performance

Pushing Java EE outside of the Enterprise: Home Automation and IoT - David De...
Pushing Java EE outside of the Enterprise: Home Automation and IoT - David De...Pushing Java EE outside of the Enterprise: Home Automation and IoT - David De...
Pushing Java EE outside of the Enterprise: Home Automation and IoT - David De...
JAXLondon2014
 
4392091081755796971 emea10 zero_downtimeoperations
4392091081755796971 emea10 zero_downtimeoperations4392091081755796971 emea10 zero_downtimeoperations
4392091081755796971 emea10 zero_downtimeoperations
Locuto Riorama
 

Semelhante a Why User-Mode Threads Are Good for Performance (20)

Oracle CloudWorld 2023 - A High-Speed Data Ingestion Service in Java Using MQ...
Oracle CloudWorld 2023 - A High-Speed Data Ingestion Service in Java Using MQ...Oracle CloudWorld 2023 - A High-Speed Data Ingestion Service in Java Using MQ...
Oracle CloudWorld 2023 - A High-Speed Data Ingestion Service in Java Using MQ...
 
HTTP/2 comes to Java. What Servlet 4.0 means to you. DevNexus 2015
HTTP/2 comes to Java.  What Servlet 4.0 means to you. DevNexus 2015HTTP/2 comes to Java.  What Servlet 4.0 means to you. DevNexus 2015
HTTP/2 comes to Java. What Servlet 4.0 means to you. DevNexus 2015
 
ADBA (Asynchronous Database Access)
ADBA (Asynchronous Database Access)ADBA (Asynchronous Database Access)
ADBA (Asynchronous Database Access)
 
Functional Programming With Lambdas and Streams in JDK8
 Functional Programming With Lambdas and Streams in JDK8 Functional Programming With Lambdas and Streams in JDK8
Functional Programming With Lambdas and Streams in JDK8
 
New Generation Oracle RAC Performance
New Generation Oracle RAC PerformanceNew Generation Oracle RAC Performance
New Generation Oracle RAC Performance
 
Continuous Performance Monitoring of a Distributed Application [CON4730]
Continuous Performance Monitoring of a Distributed Application [CON4730]Continuous Performance Monitoring of a Distributed Application [CON4730]
Continuous Performance Monitoring of a Distributed Application [CON4730]
 
Oracle WebLogic Server 12c: Seamless Oracle Database Integration (with NEC, O...
Oracle WebLogic Server 12c: Seamless Oracle Database Integration (with NEC, O...Oracle WebLogic Server 12c: Seamless Oracle Database Integration (with NEC, O...
Oracle WebLogic Server 12c: Seamless Oracle Database Integration (with NEC, O...
 
Pushing Java EE outside of the Enterprise - Home Automation
Pushing Java EE outside of the Enterprise - Home AutomationPushing Java EE outside of the Enterprise - Home Automation
Pushing Java EE outside of the Enterprise - Home Automation
 
Oracle ADF Architecture TV - Design - Task Flow Navigation Options
Oracle ADF Architecture TV - Design - Task Flow Navigation OptionsOracle ADF Architecture TV - Design - Task Flow Navigation Options
Oracle ADF Architecture TV - Design - Task Flow Navigation Options
 
Loom Virtual Threads in the JDK 19
Loom Virtual Threads in the JDK 19Loom Virtual Threads in the JDK 19
Loom Virtual Threads in the JDK 19
 
Cloud based database
Cloud based databaseCloud based database
Cloud based database
 
Achieving Continuous Availability for Your Applications with Oracle MAA
Achieving Continuous Availability for Your Applications with Oracle MAAAchieving Continuous Availability for Your Applications with Oracle MAA
Achieving Continuous Availability for Your Applications with Oracle MAA
 
Boosting your HTML Apps – Overview of OpenCL and Hello World of WebCL
Boosting your HTML Apps – Overview of OpenCL and Hello World of WebCLBoosting your HTML Apps – Overview of OpenCL and Hello World of WebCL
Boosting your HTML Apps – Overview of OpenCL and Hello World of WebCL
 
MySQL Replication Performance in the Cloud
MySQL Replication Performance in the CloudMySQL Replication Performance in the Cloud
MySQL Replication Performance in the Cloud
 
MySQL Shell/AdminAPI - MySQL Architectures Made Easy For All!
MySQL Shell/AdminAPI - MySQL Architectures Made Easy For All!MySQL Shell/AdminAPI - MySQL Architectures Made Easy For All!
MySQL Shell/AdminAPI - MySQL Architectures Made Easy For All!
 
Javantura v6 - JDK 11 & JDK 12 - Dalibor Topic
Javantura v6 - JDK 11 & JDK 12 - Dalibor TopicJavantura v6 - JDK 11 & JDK 12 - Dalibor Topic
Javantura v6 - JDK 11 & JDK 12 - Dalibor Topic
 
Pushing Java EE outside of the Enterprise: Home Automation and IoT - David De...
Pushing Java EE outside of the Enterprise: Home Automation and IoT - David De...Pushing Java EE outside of the Enterprise: Home Automation and IoT - David De...
Pushing Java EE outside of the Enterprise: Home Automation and IoT - David De...
 
CON5898 What Servlet 4.0 Means To You
CON5898 What Servlet 4.0 Means To YouCON5898 What Servlet 4.0 Means To You
CON5898 What Servlet 4.0 Means To You
 
Oracle Database Migration to Oracle Cloud Infrastructure
Oracle Database Migration to Oracle Cloud InfrastructureOracle Database Migration to Oracle Cloud Infrastructure
Oracle Database Migration to Oracle Cloud Infrastructure
 
4392091081755796971 emea10 zero_downtimeoperations
4392091081755796971 emea10 zero_downtimeoperations4392091081755796971 emea10 zero_downtimeoperations
4392091081755796971 emea10 zero_downtimeoperations
 

Mais de ScyllaDB

Mais de ScyllaDB (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & Tradeoffs
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101
 

Último

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

Why User-Mode Threads Are Good for Performance

  • 1. Brought to you by Why User-Mode Threads Are Good for Performance Ron Pressler Architect, Java Platform Group, Oracle
  • 2. 2 Why? • Why do anything? • Why do this rather than that? 2 Copyright © 2022, Oracle and/or its affiliates
  • 3. 3 Copyright © 2022, Oracle and/or its affiliates Concurrency
  • 4. 4 Concurrency and Parallelism • Parallelism: • Speed up a task by splitting it into cooperating subtasks scheduled onto multiple available computing resources. • Performance measure: latency (time duration) • Concurrency: • Schedule available computing resources to multiple largely independent tasks that compete over them. • Performance measure: throughput (task/time unit) 4 Copyright © 2022, Oracle and/or its affiliates
  • 5. 5 Little’s Law 5 Copyright © 2022, Oracle and/or its affiliates
  • 6. 6 Little’s Law 6 Copyright © 2022, Oracle and/or its affiliates In any stable system
  • 7. 7 Little’s Law 7 Copyright © 2022, Oracle and/or its affiliates In any stable system with long term averages: W λ λ }L λ — arrival rate = exit rate = throughput W — duration inside L — no. items inside
  • 8. 8 Little’s Law 8 Copyright © 2022, Oracle and/or its affiliates In any stable system with long term averages: L = λW W λ λ }L λ — arrival rate = exit rate = throughput W — duration inside L — no. items inside
  • 9. 9 Little’s Law 9 Copyright © 2022, Oracle and/or its affiliates W L̄ ¯ λ } L̄ = ¯ λW L̄ — Capacity ¯ λ (long-term average) (long-term average)
  • 10. 10 Little’s Law 10 Copyright © 2022, Oracle and/or its affiliates W L̄ } L̄ = ¯ λW L̄ — Capacity ¯ λ > ¯ λ (long-term average) (momentary)
  • 11. 11 Little’s Law 11 Copyright © 2022, Oracle and/or its affiliates W L̄ } L̄ = ¯ λW L̄ — Capacity ¯ λ < ¯ λ (long-term average) (momentary)
  • 12. 12 Little’s Law — subsystems 12 Copyright © 2022, Oracle and/or its affiliates L = λW S S′  L′  = λ′  W′ 
  • 13. 13 Little’s Law — subsystems and queues 13 Copyright © 2022, Oracle and/or its affiliates — Average queue length } n L = λW n S S′  L′  = λ′  W′ 
  • 14. 14 Little’s Law — subsystems and queues 14 Copyright © 2022, Oracle and/or its affiliates — Average queue length } n L = λW n S S′  L′  = λ′  W′  λ′  = λ L′  = L + n W′  = W + n W L
  • 15. 15 Little’s Law — subsystems and queues 15 Copyright © 2022, Oracle and/or its affiliates — Average queue length } n L = λW n S S′  L′  = λ′  W′  λ′  = λ L′  = L + n W′  = W + n W L L′  = λ′  W′  L + n = λ(W + n W L ) L(1 + n L ) = λW(1 + n L ) L = λW
  • 16. 16 Little’s Law — other applications 16 Copyright © 2022, Oracle and/or its affiliates — Interference coefficient L1 = pλW1 x L = L1 + L2 L = λ(1 + xL)W L = λW 1 − xλW L2 = (1 − p)λW2 = λ(pW1 + (1 − p)W2) P(cache hit) = p P(cache miss) = 1 − p Caching Interference = λW + (xλW)2 + (xλW)3 + . . . W
  • 17. 17 Capacity and throughput 17 Copyright © 2022, Oracle and/or its affiliates ¯ λ = L̄ W L̄ ¯ λ — Capacity — Maximum throughput W — Average latency
  • 18. 18 CPU capacity 18 Copyright © 2022, Oracle and/or its affiliates ¯ λ = L̄CPU WCPU L̄CPU ¯ λ — #cores — Maximum throughput WCPU — Average CPU consumption WCPU = 100μs = 1 × 10−4 s (> 1500 cache misses) L̄ = 30 cores ¯ λ = 30,000 req/s
  • 19. 19 Capacity and throughput 19 Copyright © 2022, Oracle and/or its affiliates ¯ λ = L̄ W L̄ ¯ λ — Capacity — Maximum throughput W — Average latency λ/3 λ/3 λ/3 Load balancer
  • 20. 20 20 Copyright © 2022, Oracle and/or its affiliates Thread-per-Request — Thread Capacity Thread-per-Request: A request consumes a thread (either new or borrowed from a pool) for its duration, W Request: The domain’s unit of concurrency Thread: The software unit of concurrency
  • 21. 21 21 Copyright © 2022, Oracle and/or its affiliates Thread-per-Request — Thread Capacity #threads = L = λW Thread-per-Request: A request consumes a thread (either new or borrowed from a pool) for its duration, W Request: The domain’s unit of concurrency Thread: The software unit of concurrency
  • 22. 22 22 Copyright © 2022, Oracle and/or its affiliates Thread-per-Request — Thread Capacity with parallel fanout ⇒ For the purpose of estimating #threads, we can consider W to be the sum of all fanout latencies, even if they’re done in parallel. #threads = cL = λ( W c ) = λW c — average fanout, i.e. average number of threads consumed by a request #threads = L = λW Thread-per-Request: A request consumes a thread (either new or borrowed from a pool) for its duration, W Request: The domain’s unit of concurrency Thread: The software unit of concurrency
  • 23. 23 23 Copyright © 2022, Oracle and/or its affiliates CPU vs. Thread Capacity with thread/req WCPU = 100μs (> 1500 cache misses) ⇒ L̄CPU = #cores L̄ = ¯ λW W = 10ms = ( L̄CPU WCPU )W = L̄CPU( W WCPU ) 100 threads per core WCPU = 50μs (> 800 cache misses) ⇒ W = 50ms 1000 threads per core WCPU = 10μs (> 150 cache misses) ⇒ W = 100ms 10,000 threads per core
  • 24. 24 • Exceptions • Thread Locals • Debugger • Profiler (JFR) Java Is Made of Threads Copyright © 2022, Oracle and/or its affiliates
  • 25. 25 simple less scalable scalable, complex, non-interoperable, hard to debug/profile OR SYNC Java Blue ASYNC Programmer 😀 Hardware ☹ Copyright © 2022, Oracle and/or its affiliates Programmer ☹ Hardware 😀
  • 26. 26 App Connections 26 Copyright © 2022, Oracle and/or its affiliates Virtual Threads Programmer 😀 Hardware 😀
  • 27. 27 27 Copyright © 2022, Oracle and/or its affiliates ¯ λ = L̄ W
  • 28. 28 28 Copyright © 2022, Oracle and/or its affiliates The Impact of Context Switching ( λ1 λ2 = W2 W1 ) W = n(μ + t) n(μ + t) nμ = 1 + t μ • Context-switching affects throughput by means of duration, not capacity • Virtual threads have a faster context-switch than OS threads • Structured concurrency allows waiting for a set of operations with one context-switch where n t μ avg. #operations avg. wait duration avg. context-switch duration μ = 20μs t = 1μs ⇒ 5% impact (quite fast) (quite slow) ⇒
  • 29. 29 Not cooperative, but no time-sharing yet 29 Copyright © 2022, Oracle and/or its affiliates • Non-cooperative scheduling is more composable • … but people overestimate the importance of time-sharing in servers • Effective time-sharing effective when #threads is #cores×105 requires study
  • 30. 30 Summary 30 Copyright © 2022, Oracle and/or its affiliates • Virtual threads allow higher throughput for the thread-per-request style — the style harmonious with the platform — by drastically increasing the request capacity of the server. • We can juggle more balls not by adding hands but by enlarging the arch. • Context-switching cost could be important, but aren’t the main reason for the throughput increase.
  • 31. Brought to you by Ron Pressler ron.pressler@oracle.com @pressron • JEP 425: Virtual Threads (Preview) • The Design of User Mode Threads in Java [video] (why not async/await)