SlideShare a Scribd company logo
1 of 14
A study of the Scalability of Stop-
the-World Garbage Collectors on
            Multicores
          Aliya Ibragimova
        University of Fribourg
Agenda
•   Overview
•   Problem Statement
•   Parallel Scavenge description
•   Identifying bottlenecks
•   Methods and solutions
•   Results
•   Conclusion
Overview
• A Stop-the-World Collector performs garbage
  collection while the application is completely
  stopped
• A Parallel Collection uses multiple threads to
  perform Garbage Collection

Parallel Scavenge example available in
OpenJDK7
Problem Statement
   Stop-the-world (STW) algorithm degrades badly beyond
8 – cores on a 48-core NUMA-machine with OpenJDK 7:

  – Does the Stop-the-World design has intrinsic
    limitations?
  – If no what are the limitations of the STW approach?
  – How we can improve the current design?
Parallel Scavenge
Contended locks: GC monitor’s lock
Beginning of parallel phase

                 GC monitor’s lock




                         GC task queue



   GC threads

     Solution: use Michael-Scott lock-free queue
Contended locks: GC monitor’s lock
The end of parallel phase

                      GC monitor’s lock

                      Global
                      counter




Solution: remove redundant synchronization
          use timestamps to avoid race conditions
Contended locks: GC monitor’s lock
Idea: remove GC monitor’s lock

1. Task queue
     Use lock-free task queue

2. Barrier at the end of parallel phase
     Remove redundant synchronization

3. Conditional variable of the GC monitor
     Replace conditional variable with Linux’s
     futex_wait calls.
Lack of NUMA-awareness

        Memory           Memory

       CPU   CPU        CPU   CPU



     NUMA – Non-Uniform Memory access

• Memory access imbalance
• Memory locality
Lack of NUMA-awareness
 • Interleaved spaces
     – map pages from different nodes with round robin
       policy
 • Fragmented spaces
     – thread allocates memory from the fragment
       associated with the node where it is executing
 • Segregated spaces
     – Fragmented space that is restricted to being
       accessed by GC threads running on the same node
Best performance: fragmented spaces in the young space interleaved
in others
Results
Resulting GC, NAPS for NUMA-Aware Parallel Scavenge

Look at the effect of the optimization on 3
benchmarks:
      • SPECjbb2005
      • SPECjvm2008
      • DeCapo
8 memory nodes, 48 cores, 96 GB RAM, Linux 3.0 64-bit
Results
• NAPS improves performance and scalability over
  Parallel Scavenge all most in all cases
• NAPS performance continue to increase up to 48
  cores
• NAPS reduces pause time up to 2.8 times in the best
  case
• NAPS improves responsiveness of applications
Conclusion

• This slide is about next steps…
Questions
If you have any questions you are welcome to ask.

More Related Content

What's hot

Memory Bandwidth QoS
Memory Bandwidth QoSMemory Bandwidth QoS
Memory Bandwidth QoSRohit Jnagal
 
Java one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMsJava one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMsSpeedment, Inc.
 
Yet another introduction to Linux RCU
Yet another introduction to Linux RCUYet another introduction to Linux RCU
Yet another introduction to Linux RCUViller Hsiao
 

What's hot (7)

Memory Bandwidth QoS
Memory Bandwidth QoSMemory Bandwidth QoS
Memory Bandwidth QoS
 
Zk meetup talk
Zk meetup talkZk meetup talk
Zk meetup talk
 
Java one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMsJava one2015 - Work With Hundreds of Hot Terabytes in JVMs
Java one2015 - Work With Hundreds of Hot Terabytes in JVMs
 
CNN Dataflow Implementation on FPGAs
CNN Dataflow Implementation on FPGAsCNN Dataflow Implementation on FPGAs
CNN Dataflow Implementation on FPGAs
 
Memory models
Memory modelsMemory models
Memory models
 
Yet another introduction to Linux RCU
Yet another introduction to Linux RCUYet another introduction to Linux RCU
Yet another introduction to Linux RCU
 
Stack Frame Protection
Stack Frame ProtectionStack Frame Protection
Stack Frame Protection
 

Viewers also liked

Viewers also liked (7)

Cat orgin ofmfr
Cat orgin ofmfrCat orgin ofmfr
Cat orgin ofmfr
 
Realism of image composits
Realism of image compositsRealism of image composits
Realism of image composits
 
Starting a company in Qatar
Starting a company in QatarStarting a company in Qatar
Starting a company in Qatar
 
Sport tandem
Sport tandemSport tandem
Sport tandem
 
Guidelines toupload
Guidelines touploadGuidelines toupload
Guidelines toupload
 
manfaat dan kandungan alpukat
manfaat dan kandungan alpukatmanfaat dan kandungan alpukat
manfaat dan kandungan alpukat
 
Ah, lord god
Ah, lord godAh, lord god
Ah, lord god
 

Similar to Stop-the-world GCs on milticores

Theta and the Future of Accelerator Programming
Theta and the Future of Accelerator ProgrammingTheta and the Future of Accelerator Programming
Theta and the Future of Accelerator Programminginside-BigData.com
 
Performance Optimization of CGYRO for Multiscale Turbulence Simulations
Performance Optimization of CGYRO for Multiscale Turbulence SimulationsPerformance Optimization of CGYRO for Multiscale Turbulence Simulations
Performance Optimization of CGYRO for Multiscale Turbulence SimulationsIgor Sfiligoi
 
Jvm problem diagnostics
Jvm problem diagnosticsJvm problem diagnostics
Jvm problem diagnosticsDanijel Mitar
 
Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems
Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time SystemsTaming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems
Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time SystemsHeechul Yun
 
Kvm performance optimization for ubuntu
Kvm performance optimization for ubuntuKvm performance optimization for ubuntu
Kvm performance optimization for ubuntuSim Janghoon
 
High Speed Design Closure Techniques-Balachander Krishnamurthy
High Speed Design Closure Techniques-Balachander KrishnamurthyHigh Speed Design Closure Techniques-Balachander Krishnamurthy
High Speed Design Closure Techniques-Balachander KrishnamurthyMassimo Talia
 
Preparing Codes for Intel Knights Landing (KNL)
Preparing Codes for Intel Knights Landing (KNL)Preparing Codes for Intel Knights Landing (KNL)
Preparing Codes for Intel Knights Landing (KNL)AllineaSoftware
 
A Buffering Approach to Manage I/O in a Normalized Cross-Correlation Earthqua...
A Buffering Approach to Manage I/O in a Normalized Cross-Correlation Earthqua...A Buffering Approach to Manage I/O in a Normalized Cross-Correlation Earthqua...
A Buffering Approach to Manage I/O in a Normalized Cross-Correlation Earthqua...Dawei Mu
 
Project Slides for Website 2020-22.pptx
Project Slides for Website 2020-22.pptxProject Slides for Website 2020-22.pptx
Project Slides for Website 2020-22.pptxAkshitAgiwal1
 
Harnessing OpenCL in Modern Coprocessors
Harnessing OpenCL in Modern CoprocessorsHarnessing OpenCL in Modern Coprocessors
Harnessing OpenCL in Modern CoprocessorsUnai Lopez-Novoa
 
Вячеслав Блинов «Java Garbage Collection: A Performance Impact»
Вячеслав Блинов «Java Garbage Collection: A Performance Impact»Вячеслав Блинов «Java Garbage Collection: A Performance Impact»
Вячеслав Блинов «Java Garbage Collection: A Performance Impact»Anna Shymchenko
 
Demystifying Garbage Collection in Java
Demystifying Garbage Collection in JavaDemystifying Garbage Collection in Java
Demystifying Garbage Collection in JavaIgor Braga
 
CUG2011 Introduction to GPU Computing
CUG2011 Introduction to GPU ComputingCUG2011 Introduction to GPU Computing
CUG2011 Introduction to GPU ComputingJeff Larkin
 
Ways to reduce misses
Ways to reduce missesWays to reduce misses
Ways to reduce missesnellins
 
Talk at dnGASP workshop, April 5, 2011
Talk at dnGASP workshop, April 5, 2011Talk at dnGASP workshop, April 5, 2011
Talk at dnGASP workshop, April 5, 2011Fedor Tsarev
 
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...Edge AI and Vision Alliance
 

Similar to Stop-the-world GCs on milticores (20)

Chap2 slides
Chap2 slidesChap2 slides
Chap2 slides
 
Theta and the Future of Accelerator Programming
Theta and the Future of Accelerator ProgrammingTheta and the Future of Accelerator Programming
Theta and the Future of Accelerator Programming
 
Performance Optimization of CGYRO for Multiscale Turbulence Simulations
Performance Optimization of CGYRO for Multiscale Turbulence SimulationsPerformance Optimization of CGYRO for Multiscale Turbulence Simulations
Performance Optimization of CGYRO for Multiscale Turbulence Simulations
 
Jvm problem diagnostics
Jvm problem diagnosticsJvm problem diagnostics
Jvm problem diagnostics
 
Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems
Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time SystemsTaming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems
Taming Non-blocking Caches to Improve Isolation in Multicore Real-Time Systems
 
Kvm performance optimization for ubuntu
Kvm performance optimization for ubuntuKvm performance optimization for ubuntu
Kvm performance optimization for ubuntu
 
High Speed Design Closure Techniques-Balachander Krishnamurthy
High Speed Design Closure Techniques-Balachander KrishnamurthyHigh Speed Design Closure Techniques-Balachander Krishnamurthy
High Speed Design Closure Techniques-Balachander Krishnamurthy
 
Preparing Codes for Intel Knights Landing (KNL)
Preparing Codes for Intel Knights Landing (KNL)Preparing Codes for Intel Knights Landing (KNL)
Preparing Codes for Intel Knights Landing (KNL)
 
Javantura v6 - On the Aspects of Polyglot Programming and Memory Management i...
Javantura v6 - On the Aspects of Polyglot Programming and Memory Management i...Javantura v6 - On the Aspects of Polyglot Programming and Memory Management i...
Javantura v6 - On the Aspects of Polyglot Programming and Memory Management i...
 
A Buffering Approach to Manage I/O in a Normalized Cross-Correlation Earthqua...
A Buffering Approach to Manage I/O in a Normalized Cross-Correlation Earthqua...A Buffering Approach to Manage I/O in a Normalized Cross-Correlation Earthqua...
A Buffering Approach to Manage I/O in a Normalized Cross-Correlation Earthqua...
 
Project Slides for Website 2020-22.pptx
Project Slides for Website 2020-22.pptxProject Slides for Website 2020-22.pptx
Project Slides for Website 2020-22.pptx
 
Java Performance Tuning
Java Performance TuningJava Performance Tuning
Java Performance Tuning
 
Harnessing OpenCL in Modern Coprocessors
Harnessing OpenCL in Modern CoprocessorsHarnessing OpenCL in Modern Coprocessors
Harnessing OpenCL in Modern Coprocessors
 
Вячеслав Блинов «Java Garbage Collection: A Performance Impact»
Вячеслав Блинов «Java Garbage Collection: A Performance Impact»Вячеслав Блинов «Java Garbage Collection: A Performance Impact»
Вячеслав Блинов «Java Garbage Collection: A Performance Impact»
 
Js on-microcontrollers
Js on-microcontrollersJs on-microcontrollers
Js on-microcontrollers
 
Demystifying Garbage Collection in Java
Demystifying Garbage Collection in JavaDemystifying Garbage Collection in Java
Demystifying Garbage Collection in Java
 
CUG2011 Introduction to GPU Computing
CUG2011 Introduction to GPU ComputingCUG2011 Introduction to GPU Computing
CUG2011 Introduction to GPU Computing
 
Ways to reduce misses
Ways to reduce missesWays to reduce misses
Ways to reduce misses
 
Talk at dnGASP workshop, April 5, 2011
Talk at dnGASP workshop, April 5, 2011Talk at dnGASP workshop, April 5, 2011
Talk at dnGASP workshop, April 5, 2011
 
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
“Efficiently Map AI and Vision Applications onto Multi-core AI Processors Usi...
 

Recently uploaded

SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 

Recently uploaded (20)

SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 

Stop-the-world GCs on milticores

  • 1. A study of the Scalability of Stop- the-World Garbage Collectors on Multicores Aliya Ibragimova University of Fribourg
  • 2. Agenda • Overview • Problem Statement • Parallel Scavenge description • Identifying bottlenecks • Methods and solutions • Results • Conclusion
  • 3. Overview • A Stop-the-World Collector performs garbage collection while the application is completely stopped • A Parallel Collection uses multiple threads to perform Garbage Collection Parallel Scavenge example available in OpenJDK7
  • 4. Problem Statement Stop-the-world (STW) algorithm degrades badly beyond 8 – cores on a 48-core NUMA-machine with OpenJDK 7: – Does the Stop-the-World design has intrinsic limitations? – If no what are the limitations of the STW approach? – How we can improve the current design?
  • 6. Contended locks: GC monitor’s lock Beginning of parallel phase GC monitor’s lock GC task queue GC threads Solution: use Michael-Scott lock-free queue
  • 7. Contended locks: GC monitor’s lock The end of parallel phase GC monitor’s lock Global counter Solution: remove redundant synchronization use timestamps to avoid race conditions
  • 8. Contended locks: GC monitor’s lock Idea: remove GC monitor’s lock 1. Task queue Use lock-free task queue 2. Barrier at the end of parallel phase Remove redundant synchronization 3. Conditional variable of the GC monitor Replace conditional variable with Linux’s futex_wait calls.
  • 9. Lack of NUMA-awareness Memory Memory CPU CPU CPU CPU NUMA – Non-Uniform Memory access • Memory access imbalance • Memory locality
  • 10. Lack of NUMA-awareness • Interleaved spaces – map pages from different nodes with round robin policy • Fragmented spaces – thread allocates memory from the fragment associated with the node where it is executing • Segregated spaces – Fragmented space that is restricted to being accessed by GC threads running on the same node Best performance: fragmented spaces in the young space interleaved in others
  • 11. Results Resulting GC, NAPS for NUMA-Aware Parallel Scavenge Look at the effect of the optimization on 3 benchmarks: • SPECjbb2005 • SPECjvm2008 • DeCapo 8 memory nodes, 48 cores, 96 GB RAM, Linux 3.0 64-bit
  • 12. Results • NAPS improves performance and scalability over Parallel Scavenge all most in all cases • NAPS performance continue to increase up to 48 cores • NAPS reduces pause time up to 2.8 times in the best case • NAPS improves responsiveness of applications
  • 13. Conclusion • This slide is about next steps…
  • 14. Questions If you have any questions you are welcome to ask.