SlideShare uma empresa Scribd logo
1 de 33
Devoxx UK 2015.
Peter Lawrey
Higher Frequency Trading Ltd
Responding rapidly when you have
100+ GB data sets in Java
InfoQ.com: News & Community Site
• 750,000 unique visitors/month
• Published in 4 languages (English, Chinese, Japanese and Brazilian
Portuguese)
• Post content from our QCon conferences
• News 15-20 / week
• Articles 3-4 / week
• Presentations (videos) 12-15 / week
• Interviews 2-3 / week
• Books 1 / month
Watch the video with slide
synchronization on InfoQ.com!
http://www.infoq.com/presentations
/reactive-large-data-java
Presented at QCon London
www.qconlondon.com
Purpose of QCon
- to empower software development by facilitating the spread of
knowledge and innovation
Strategy
- practitioner-driven conference designed for YOU: influencers of
change and innovation in your teams
- speakers and topics driving the evolution and innovation
- connecting and catalyzing the influencers and innovators
Highlights
- attended by more than 12,000 delegates since 2007
- held in 9 cities worldwide
Reactive system design
• Responsive (predictable and sufficient response time)
• Resilient (can cope with failures)
• Elastic (can employ more hardware resources and grow cost
effectively).
Reactive system design
Java responds *much* faster if you can keep your data in memory
Standard JVMs scale well of a specific range of memory sizes. From
• 100 MB – it is hard to make a JVM use less memory than this
(with libraries etc)
• 32 GB – above this the memory efficiency drops and you see
significant GC pauses times.
With larger data sizes, you have to be more aware of how memory is
being used, the lifecycle of your objects and which tools will best
help you utilise more memory effectively.
Reactive system design (Resilient)
The larger the node, the longer the recovery time. It if can
gracefully rebuild at a rate of 100 MB/s (without impact production
too much) the recover time can look like this.
Data Set to recover Time to replicate at 100 MB/s
10 GB < 2 minutes
100 GB 17 minutes
1 TB 3 hours
10 TB 28 hours
100 TB 12 days
1 PB 4 months
Reactive Memory Design
Knowing the limitations of your address space.
• 31–bit memory sizes (up to 2 GB)
• 35-bit memory sizes (up to 32 GB)
• 36-bit memory sizes (up to 64 GB)
• 40-bit memory sizes (up to 1 TB)
• 48-bit memory sizes (up to 256 TB)
• Beyond.
Speedment SQL Relector
• Incrementally replicates your SQL database into Java.
• Access to SQL data with in memory speeds
32 bit operating system (31-bit heap)
Java 7 memory layout
Compress Oops in Java 7 (35-bit)
Using the default of
–XX:+UseCompressedOops
• In a 64-bit JVM, it can use “compressed” memory references.
• This allows the heap to be up to 32 GB without the overhead of 64-
bit object references. The Oracle/OpenJDK JVM still uses 64-bit
class references by default.
• As all object must be 8-byte aligned, the lower 3 bits of the
address are always 000 and don’t need to be stored. This allows
the heap to reference 4 billion * 8-bytes or 32 GB.
• Uses 32-bit references.
Compressed Oops with 8 byte alignment.
Java 8 memory layout
Compress Oops in Java 8 (36 bits)
Using the default of
–XX:+UseCompressedOops
–XX:ObjectAlignmentInBytes=16
• In a 64-bit JVM, it can use “compressed” memory references.
• This allows the heap to be up to 64 GB without the overhead of 64-
bit object references. The Oracle/OpenJDK JVM still uses 64-bit
class references by default.
• As all object must be 8 or 16-byte aligned, the lower 3 or 4 bits of
the address are always zeros and don’t need to be stored. This
allows the heap to reference 4 billion * 16-bytes or 64 GB.
• Uses 32-bit references.
64-bit references in Java (100 GB?)
• A small but significant overhead on main memory use.
• Reduces the efficiency of CPU caches as less objects can fit in.
• Can address up to the limit of the free memory. Limited to main
memory.
• GC pauses become a real concern and can take tens of second or
many minutes.
Concurrent Mark Sweep (100 GB?)
64-bit references in Java with a Concurrent Collector
• A small but significant overhead on every memory access to
support concurrent collection.
• Azul Zing with a fully concurrent collector. They can get worst case
pauses down to 1 to 10 milli-seconds.
• RedHat is developing a most concurrent collector (….) which can
support heap sizes around 100 GB with sub-second pauses.
Azul Zing Concurrent Collector (100s GB)
Azul Zing Concurrent Collector (100s GB)
• Can dramatically reduce GC pauses.
• Makes it easier to see and solve non GC pauses.
• Reduced development time to get a system to meet your SLAs.
• Can have lower throughputs.
Terracotta BigMemory
• Scale to multiple machines. Can utilize more smaller machines.
Terracotta BigMemory
• Uses off heap memory to store the bulk of the data.
Hazelcast High Density Memory Store
• Advanced memory caching
solution.
• Utilizes all main memory.
• Supports a wide range of
distributed collections.
NUMA Regions (~40 bits)
• Large machine are limited in how large a single bank of memory
can be. This varies based on the architecture.
• Ivy and Sandy bridge Xeon processors are limited to addressing 40
bits of real memory.
• In Haswell this has been lifted to 46-bits.
• Each Socket has “local” access to a bank of memory, however to
access other bank it may need to use a bus. This is much slower.
• The GC of a JVM can perform very poorly if it doesn’t sit within one
NUMA region. Ideally you want a JVM to use just one NUMA
region.
NUMA Regions (~40 bits)
Virtual address space (48-bit)
Virtual address space (48-bit)
Memory Mapped files (48+ bits)
• Memory mappings are not limited to main memory size.
• 64-bit OS support 128 TiB to 256 TiB virtual memory at once.
• For larger data sizes, memory mapping need to be managed and
cached manually.
• Can be shared between processes.
• A library can hide the 48-bit limitation by caching memory
mapping.
Memory Mapped files.
• No need to have a monolithic JVM just because you want in
memory access to all the data.
Peta Byte JVMs (50+ bits)
• If you are receiving 1 GB/s down a 10 Gig-E line in two weeks you
will have received over 1 PB.
• Managing this much data in large servers is more complex than
your standard JVM.
• Replication is critical. Large complex systems, are more likely to
fail and take longer to recover.
• You can have systems which cannot be recovered in the normal
way. i.e. Unless you recover faster than new data is added, you will
never catch up.
Peta Byte JVMs (50+ bits)
Peta Byte JVMs (50+ bits)
Questions and Answers
peter.lawrey@higherfrequencytrading.com
@PeterLawrey
http://higherfrequencytrading.com
Watch the video with slide synchronization on
InfoQ.com!
http://www.infoq.com/presentations/reactive-
large-data-java

Mais conteúdo relacionado

Mais de C4Media

Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaC4Media
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideC4Media
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDC4Media
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine LearningC4Media
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at SpeedC4Media
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsC4Media
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsC4Media
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerC4Media
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleC4Media
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeC4Media
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereC4Media
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing ForC4Media
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data EngineeringC4Media
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreC4Media
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsC4Media
 
High Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in AdtechHigh Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in AdtechC4Media
 
Rust's Journey to Async/await
Rust's Journey to Async/awaitRust's Journey to Async/await
Rust's Journey to Async/awaitC4Media
 
Opportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven UtopiaOpportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven UtopiaC4Media
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayC4Media
 
Are We Really Cloud-Native?
Are We Really Cloud-Native?Are We Really Cloud-Native?
Are We Really Cloud-Native?C4Media
 

Mais de C4Media (20)

Does Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to JavaDoes Java Need Inline Types? What Project Valhalla Can Bring to Java
Does Java Need Inline Types? What Project Valhalla Can Bring to Java
 
Service Meshes- The Ultimate Guide
Service Meshes- The Ultimate GuideService Meshes- The Ultimate Guide
Service Meshes- The Ultimate Guide
 
Shifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CDShifting Left with Cloud Native CI/CD
Shifting Left with Cloud Native CI/CD
 
CI/CD for Machine Learning
CI/CD for Machine LearningCI/CD for Machine Learning
CI/CD for Machine Learning
 
Fault Tolerance at Speed
Fault Tolerance at SpeedFault Tolerance at Speed
Fault Tolerance at Speed
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
ML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.jsML in the Browser: Interactive Experiences with Tensorflow.js
ML in the Browser: Interactive Experiences with Tensorflow.js
 
Build Your Own WebAssembly Compiler
Build Your Own WebAssembly CompilerBuild Your Own WebAssembly Compiler
Build Your Own WebAssembly Compiler
 
User & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix ScaleUser & Device Identity for Microservices @ Netflix Scale
User & Device Identity for Microservices @ Netflix Scale
 
Scaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's EdgeScaling Patterns for Netflix's Edge
Scaling Patterns for Netflix's Edge
 
Make Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home EverywhereMake Your Electron App Feel at Home Everywhere
Make Your Electron App Feel at Home Everywhere
 
The Talk You've Been Await-ing For
The Talk You've Been Await-ing ForThe Talk You've Been Await-ing For
The Talk You've Been Await-ing For
 
Future of Data Engineering
Future of Data EngineeringFuture of Data Engineering
Future of Data Engineering
 
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and MoreAutomated Testing for Terraform, Docker, Packer, Kubernetes, and More
Automated Testing for Terraform, Docker, Packer, Kubernetes, and More
 
Navigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery TeamsNavigating Complexity: High-performance Delivery and Discovery Teams
Navigating Complexity: High-performance Delivery and Discovery Teams
 
High Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in AdtechHigh Performance Cooperative Distributed Systems in Adtech
High Performance Cooperative Distributed Systems in Adtech
 
Rust's Journey to Async/await
Rust's Journey to Async/awaitRust's Journey to Async/await
Rust's Journey to Async/await
 
Opportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven UtopiaOpportunities and Pitfalls of Event-Driven Utopia
Opportunities and Pitfalls of Event-Driven Utopia
 
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/DayDatadog: a Real-Time Metrics Database for One Quadrillion Points/Day
Datadog: a Real-Time Metrics Database for One Quadrillion Points/Day
 
Are We Really Cloud-Native?
Are We Really Cloud-Native?Are We Really Cloud-Native?
Are We Really Cloud-Native?
 

Último

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 

Último (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

Responding Rapidly When You Have 100GB+ Data Sets in Java

  • 1. Devoxx UK 2015. Peter Lawrey Higher Frequency Trading Ltd Responding rapidly when you have 100+ GB data sets in Java
  • 2. InfoQ.com: News & Community Site • 750,000 unique visitors/month • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • News 15-20 / week • Articles 3-4 / week • Presentations (videos) 12-15 / week • Interviews 2-3 / week • Books 1 / month Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations /reactive-large-data-java
  • 3. Presented at QCon London www.qconlondon.com Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide
  • 4. Reactive system design • Responsive (predictable and sufficient response time) • Resilient (can cope with failures) • Elastic (can employ more hardware resources and grow cost effectively).
  • 5. Reactive system design Java responds *much* faster if you can keep your data in memory Standard JVMs scale well of a specific range of memory sizes. From • 100 MB – it is hard to make a JVM use less memory than this (with libraries etc) • 32 GB – above this the memory efficiency drops and you see significant GC pauses times. With larger data sizes, you have to be more aware of how memory is being used, the lifecycle of your objects and which tools will best help you utilise more memory effectively.
  • 6. Reactive system design (Resilient) The larger the node, the longer the recovery time. It if can gracefully rebuild at a rate of 100 MB/s (without impact production too much) the recover time can look like this. Data Set to recover Time to replicate at 100 MB/s 10 GB < 2 minutes 100 GB 17 minutes 1 TB 3 hours 10 TB 28 hours 100 TB 12 days 1 PB 4 months
  • 7. Reactive Memory Design Knowing the limitations of your address space. • 31–bit memory sizes (up to 2 GB) • 35-bit memory sizes (up to 32 GB) • 36-bit memory sizes (up to 64 GB) • 40-bit memory sizes (up to 1 TB) • 48-bit memory sizes (up to 256 TB) • Beyond.
  • 8. Speedment SQL Relector • Incrementally replicates your SQL database into Java. • Access to SQL data with in memory speeds
  • 9. 32 bit operating system (31-bit heap)
  • 10. Java 7 memory layout
  • 11. Compress Oops in Java 7 (35-bit) Using the default of –XX:+UseCompressedOops • In a 64-bit JVM, it can use “compressed” memory references. • This allows the heap to be up to 32 GB without the overhead of 64- bit object references. The Oracle/OpenJDK JVM still uses 64-bit class references by default. • As all object must be 8-byte aligned, the lower 3 bits of the address are always 000 and don’t need to be stored. This allows the heap to reference 4 billion * 8-bytes or 32 GB. • Uses 32-bit references.
  • 12. Compressed Oops with 8 byte alignment.
  • 13. Java 8 memory layout
  • 14. Compress Oops in Java 8 (36 bits) Using the default of –XX:+UseCompressedOops –XX:ObjectAlignmentInBytes=16 • In a 64-bit JVM, it can use “compressed” memory references. • This allows the heap to be up to 64 GB without the overhead of 64- bit object references. The Oracle/OpenJDK JVM still uses 64-bit class references by default. • As all object must be 8 or 16-byte aligned, the lower 3 or 4 bits of the address are always zeros and don’t need to be stored. This allows the heap to reference 4 billion * 16-bytes or 64 GB. • Uses 32-bit references.
  • 15. 64-bit references in Java (100 GB?) • A small but significant overhead on main memory use. • Reduces the efficiency of CPU caches as less objects can fit in. • Can address up to the limit of the free memory. Limited to main memory. • GC pauses become a real concern and can take tens of second or many minutes.
  • 17. 64-bit references in Java with a Concurrent Collector • A small but significant overhead on every memory access to support concurrent collection. • Azul Zing with a fully concurrent collector. They can get worst case pauses down to 1 to 10 milli-seconds. • RedHat is developing a most concurrent collector (….) which can support heap sizes around 100 GB with sub-second pauses.
  • 18. Azul Zing Concurrent Collector (100s GB)
  • 19. Azul Zing Concurrent Collector (100s GB) • Can dramatically reduce GC pauses. • Makes it easier to see and solve non GC pauses. • Reduced development time to get a system to meet your SLAs. • Can have lower throughputs.
  • 20. Terracotta BigMemory • Scale to multiple machines. Can utilize more smaller machines.
  • 21. Terracotta BigMemory • Uses off heap memory to store the bulk of the data.
  • 22. Hazelcast High Density Memory Store • Advanced memory caching solution. • Utilizes all main memory. • Supports a wide range of distributed collections.
  • 23. NUMA Regions (~40 bits) • Large machine are limited in how large a single bank of memory can be. This varies based on the architecture. • Ivy and Sandy bridge Xeon processors are limited to addressing 40 bits of real memory. • In Haswell this has been lifted to 46-bits. • Each Socket has “local” access to a bank of memory, however to access other bank it may need to use a bus. This is much slower. • The GC of a JVM can perform very poorly if it doesn’t sit within one NUMA region. Ideally you want a JVM to use just one NUMA region.
  • 27. Memory Mapped files (48+ bits) • Memory mappings are not limited to main memory size. • 64-bit OS support 128 TiB to 256 TiB virtual memory at once. • For larger data sizes, memory mapping need to be managed and cached manually. • Can be shared between processes. • A library can hide the 48-bit limitation by caching memory mapping.
  • 28. Memory Mapped files. • No need to have a monolithic JVM just because you want in memory access to all the data.
  • 29. Peta Byte JVMs (50+ bits) • If you are receiving 1 GB/s down a 10 Gig-E line in two weeks you will have received over 1 PB. • Managing this much data in large servers is more complex than your standard JVM. • Replication is critical. Large complex systems, are more likely to fail and take longer to recover. • You can have systems which cannot be recovered in the normal way. i.e. Unless you recover faster than new data is added, you will never catch up.
  • 30. Peta Byte JVMs (50+ bits)
  • 31. Peta Byte JVMs (50+ bits)
  • 33. Watch the video with slide synchronization on InfoQ.com! http://www.infoq.com/presentations/reactive- large-data-java