Avoiding Full GCs with
MemStore-Local Allocation Buffers
                 Todd Lipcon
              todd@cloudera.com
Twitter: @tlipcon      #hbase IRC: tlipcon




            February 22, 2011
Outline

  Background

  HBase and GC

  A solution

  Summary
Intro / who am I?
     Been working on data stuff for a few years
     HBase, HDFS, MR committer
     Cloudera engineer since March ’09
Motivation
     HBase users want to use large heaps
         Bigger block caches make for better hit rates
         Bigger memstores make for larger and more
         efficient flushes
         Machines come with 24G-48G RAM
     But bigger heaps mean longer GC pauses
         Around 10 seconds/GB on my boxes.
         Several minute GC pauses wreak havoc
GC Disasters
   1. Client requests stalled
           1 minute “latency” is just as bad as unavailability
   2. ZooKeeper sessions stop pinging
           The dreaded “Juliet Pause” scenario
   3. Triggers all kinds of other nasty bugs
Yo Concurrent
   Mark-and-Sweep (CMS)!

What part of Concurrent didn’t
      you understand?
Java GC Background
     Java’s GC is generational
         Generational hypothesis: most objects either die
         young or stick around for quite a long time
         Split the heap into two “generations” - young (aka
         new) and old (aka tenured)
     Use different algorithms for the two generations
     We usually recommend -XX:+UseParNewGC
     -XX:+UseConcMarkSweepGC
         Young generation: Parallel New collector
         Old generation: Concurrent-mark-sweep
The Parallel New collector in 60 seconds
     Divide the young generation into eden,
     survivor-0, and survivor-1
     One survivor space is from-space and the other
     is to-space
     Allocate all objects in eden
     When eden fills up, stop the world and copy
     live objects from eden and from-space into
     to-space, swap from and to
         Once an object has been copied back and forth N
         times, copy it to the old generation
         N is the “Tenuring Threshold” (tunable)
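A toy model of the young-generation steps above, as a sketch only. The class and method names (YoungGenSketch, collectYoung) are made up for illustration, liveness is a flag set by hand, and HotSpot's real aging and promotion rules are more involved than this.

```java
import java.util.ArrayList;
import java.util.List;

final class YoungGenSketch {
    /** A fake heap object; "live" stands in for reachability. */
    static final class Obj {
        final String name;
        int copies;            // times copied between survivor spaces
        boolean live = true;
        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold;               // the "N" from the slide
    final List<Obj> eden = new ArrayList<>();
    List<Obj> fromSpace = new ArrayList<>();
    final List<Obj> oldGen = new ArrayList<>();

    YoungGenSketch(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    /** All new objects start in eden. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Stop-the-world copy of live objects out of eden and from-space. */
    void collectYoung() {
        List<Obj> toSpace = new ArrayList<>();
        List<Obj> scanned = new ArrayList<>(eden);
        scanned.addAll(fromSpace);
        for (Obj o : scanned) {
            if (!o.live) continue;             // dead objects are simply not copied
            if (o.copies >= tenuringThreshold) {
                oldGen.add(o);                 // copied N times already: promote
            } else {
                o.copies++;
                toSpace.add(o);
            }
        }
        eden.clear();
        fromSpace = toSpace;                   // swap from and to
    }

    public static void main(String[] args) {
        YoungGenSketch gen = new YoungGenSketch(2);
        gen.allocate("long-lived");
        gen.allocate("short-lived").live = false;
        for (int i = 0; i < 3; i++) gen.collectYoung();
        System.out.println("old gen size: " + gen.oldGen.size());
    }
}
```

The short-lived object never leaves the young generation; only the survivor of N copies is promoted, which is the generational hypothesis at work.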
The CMS collector in 60 seconds
A bit simplified, sorry...
            Several phases:
               1. initial-mark (stop-the-world) - marks roots (eg
                  thread stacks)
               2. concurrent-mark - traverse references starting at
                  roots, marking what’s live
               3. concurrent-preclean - another pass of the same
                  (catch new objects)
               4. remark (stop-the-world) - any last changed/new
                  objects
               5. concurrent-sweep - clean up dead objects to
                  update free space tracking
            Note: dead objects free up space, but it’s not
            contiguous. We’ll come back to this later!
CMS failure modes
   1. When young generation collection happens, it
      needs space in the old gen. What if CMS is
      already in the middle of concurrent work, but
      there’s no space?
          The dreaded concurrent mode failure! Stop
          the world and collect.
          Solution: lower value of
          -XX:CMSInitiatingOccupancyFraction so
          CMS starts working earlier
   2. What if there’s space in the old generation, but
      not enough contiguous space to promote a
      large object?
          We need to compact the old generation (move all
          free space to be contiguous)
          This is also stop-the-world! Kaboom!
OK... so life sucks.

What can we do about it?
Step 1. Hypothesize
     Setting the initiating occupancy fraction low
     puts off the full GC, but it eventually happens
     no matter what
     We see “promotion failed” followed by a long
     GC pause, even when 30% of the heap is free.
     Why? Must be fragmentation!
Step 2. Measure
     Let’s make some graphs:
     -XX:PrintFLSStatistics=1
     -XX:PrintCMSStatistics=1
     -XX:+PrintGCDetails
     -XX:+PrintGCDateStamps -verbose:gc
     -Xloggc:/.../logs/gc-$(hostname).log
     FLS Statistics: verbose information about the
     state of the free space inside the old generation

         Free space - total amount of free space
         Num blocks - number of fragments it’s spread into
         Max chunk size
     parse-fls-statistics.py → R and
     ggplot2
3 YCSB workloads, graphed
Workload 1
Insert-only
Workload 2
Read-only with cache churn
Workload 3
Read-only with no cache churn




          So boring I didn’t make a graph!
          All allocations are short lived → stay in young
          gen
Recap
What have we learned?

             Fragmentation is what causes long GC pauses
             Write load seems to cause fragmentation
             Read load (LRU cache churn) isn’t nearly so
             bad (at least for my test workloads)
Taking a step back
Why does write load cause fragmentation?

          Imagine we have 5 regions, A through E
          We take writes in the following order into an
          empty old generation:
          ABCDEABCEDDAECBACEBCED
          Now B’s memstore fills up and flushes. We’re
          left with:
          A CDEA CEDDAEC ACE CED
          Looks like fragmentation!
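The picture above can be reproduced with a few lines of Java. This is a sketch only: one character stands for one equal-sized allocation, which real writes are not, and the class and method names are made up for illustration.

```java
final class FragmentationSketch {
    /** Frees every slot owned by the given region, as a memstore flush would. */
    static String flush(String heap, char region) {
        return heap.replace(region, ' ');
    }

    /** Counts free slots (spaces) in the heap picture. */
    static int freeSlots(String heap) {
        int free = 0;
        for (char c : heap.toCharArray()) {
            if (c == ' ') free++;
        }
        return free;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(String heap) {
        int best = 0;
        int run = 0;
        for (char c : heap.toCharArray()) {
            run = (c == ' ') ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String heap = "ABCDEABCEDDAECBACEBCED";
        String after = flush(heap, 'B');
        System.out.println("[" + after + "]");
        System.out.println("free slots: " + freeSlots(after));
        System.out.println("largest free run: " + largestFreeRun(after));
    }
}
```

Flushing B frees 4 of the 22 slots, but no two of them are adjacent, so anything bigger than one slot cannot be placed without compacting.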
Also known as swiss cheese




If every write is exactly the same size, it’s fine -
we’ll fill in those holes. But this is seldom true.
A solution
     Crucial issue is that memory allocations for a
     given memstore aren’t next to each other in
     the old generation.
     When we free an entire memstore we only get
     tiny blocks of free space
     What if we ensure that the memory for a
     memstore is made of large blocks?
     Enter the MemStore Local Allocation Buffer
     (MSLAB)
What’s an MSLAB?
    Each MemStore has an instance of
    MemStoreLAB.
    MemStoreLAB has a 2MB curChunk with
    nextFreeOffset starting at 0.
    Before inserting a KeyValue that points to
    some byte[], copy the data into curChunk
    and increment nextFreeOffset by data.length
    Insert a KeyValue pointing inside curChunk
    instead of the original data.
    If a chunk fills up, just make a new one.
    This is all lock-free, using atomic
    compare-and-swap instructions.
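A minimal sketch of the idea, not HBase's actual MemStoreLAB code. The names MslabSketch, Chunk, Slice, reserve and copyInto are assumptions made up for this example; only curChunk, nextFreeOffset and the 2MB chunk size come from the slide. Values larger than a chunk are simply rejected here.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

final class MslabSketch {
    /** A fixed-size chunk with a bump pointer. */
    static final class Chunk {
        final byte[] data;
        final AtomicInteger nextFreeOffset = new AtomicInteger(0);

        Chunk(int size) { this.data = new byte[size]; }

        /** Reserves size bytes; returns their start offset, or -1 if they do not fit. */
        int reserve(int size) {
            while (true) {
                int old = nextFreeOffset.get();
                if (old + size > data.length) return -1;
                if (nextFreeOffset.compareAndSet(old, old + size)) return old;
                // lost the race to another writer: retry with the new offset
            }
        }
    }

    /** Where a copied value now lives inside a chunk. */
    static final class Slice {
        final byte[] chunk;
        final int offset;
        final int length;

        Slice(byte[] chunk, int offset, int length) {
            this.chunk = chunk;
            this.offset = offset;
            this.length = length;
        }
    }

    final int chunkSize;
    final AtomicReference<Chunk> curChunk = new AtomicReference<>();

    MslabSketch(int chunkSize) { this.chunkSize = chunkSize; }

    /** Copies value into the current chunk and returns a pointer into it. */
    Slice copyInto(byte[] value) {
        if (value.length > chunkSize) {
            throw new IllegalArgumentException("value larger than a chunk");
        }
        while (true) {
            Chunk c = curChunk.get();
            if (c == null) {
                c = new Chunk(chunkSize);
                if (!curChunk.compareAndSet(null, c)) continue; // someone else installed one
            }
            int offset = c.reserve(value.length);
            if (offset >= 0) {
                System.arraycopy(value, 0, c.data, offset, value.length);
                return new Slice(c.data, offset, value.length);
            }
            curChunk.compareAndSet(c, null); // chunk is full: retire it and loop
        }
    }

    public static void main(String[] args) {
        MslabSketch lab = new MslabSketch(2 * 1024 * 1024);
        Slice a = lab.copyInto("row1/value".getBytes());
        Slice b = lab.copyInto("row2/value".getBytes());
        System.out.println(a.offset + " " + b.offset + " same chunk: " + (a.chunk == b.chunk));
    }
}
```

No locks are taken: writers race on compareAndSet, and a loser just retries. The tail of a retired chunk is wasted, which is the price of keeping every old-generation allocation the same size.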
How does this help?
     The original data to be inserted becomes very
     short-lived, and dies in the young generation.
     The only data in the old generation is made of
     2MB chunks
     Each chunk only belongs to one memstore.
     When we flush, we always free up 2MB chunks,
     and avoid the swiss cheese effect.
     Next time we allocate, we need exactly 2MB
     chunks again, and there will definitely be space.
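To see the effect, here is the same toy write sequence from the fragmentation slides, laid out in per-region chunks. It is a sketch under loud assumptions: a chunk holds 2 slots instead of 2MB, each character is one write, and the names ChunkedLayoutSketch, layout and smallestFreeRun are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class ChunkedLayoutSketch {
    /** Lays out writes so that each region fills its own fixed-size chunks. */
    static String layout(String writes, int chunkSlots) {
        StringBuilder heap = new StringBuilder();
        Map<Character, Integer> usedInCurrentChunk = new HashMap<>();
        for (char region : writes.toCharArray()) {
            int used = usedInCurrentChunk.getOrDefault(region, chunkSlots);
            if (used == chunkSlots) {
                // no chunk yet, or the current one is full: reserve a whole new chunk
                for (int i = 0; i < chunkSlots; i++) heap.append(region);
                used = 0;
            }
            usedInCurrentChunk.put(region, used + 1);
        }
        return heap.toString();
    }

    /** Length of the shortest run of adjacent free slots, or 0 if nothing is free. */
    static int smallestFreeRun(String heap) {
        int smallest = Integer.MAX_VALUE;
        int run = 0;
        for (int i = 0; i <= heap.length(); i++) {
            boolean free = i < heap.length() && heap.charAt(i) == ' ';
            if (free) {
                run++;
            } else {
                if (run > 0) smallest = Math.min(smallest, run);
                run = 0;
            }
        }
        return smallest == Integer.MAX_VALUE ? 0 : smallest;
    }

    public static void main(String[] args) {
        String heap = layout("ABCDEABCEDDAECBACEBCED", 2);
        String afterFlush = heap.replace('B', ' ');   // B's memstore flushes
        System.out.println("[" + heap + "]");
        System.out.println("[" + afterFlush + "]");
        System.out.println("smallest free run: " + smallestFreeRun(afterFlush));
    }
}
```

Every hole left by the flush is at least one chunk wide, so the next chunk allocation always fits.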
Does it work?
It works!



    Have seen basically zero full
    GCs with MSLAB enabled,
     after days of load testing
Summary
    Most long GC pauses are caused by fragmentation
    in the old generation.
    The CMS collector doesn’t compact, so the
    only way it can fight fragmentation is to pause.
    The MSLAB moves all MemStore allocations
    into contiguous 2MB chunks in the old
    generation.
    Basically no more full GC pauses!
How to try it
   1. Upgrade to HBase 0.90.1 (included in
      CDH3b4)
   2. Set hbase.hregion.memstore.mslab.enabled to
      true
          Also tunable:
          hbase.hregion.memstore.mslab.chunksize
          (in bytes, default 2M)

          hbase.hregion.memstore.mslab.max.allocation
          (in bytes, default 256K)
   3. Report back your results!
Future work
     Flat 2MB chunk per region → 2GB RAM
     minimum usage for 1000 regions
     incrementColumnValue currently bypasses
     MSLAB for subtle reasons
     We’re doing an extra memory copy into
     MSLAB chunk - we can optimize this out
     Maybe we can relax
     CMSInitiatingOccupancyFraction back up
     a bit?
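The arithmetic behind the first bullet, as a small sketch. It assumes one memstore (one column family) per region and binary megabytes; both are assumptions for the example, and the class name is made up.

```java
final class MslabFootprintSketch {
    /** Minimum bytes held by MSLAB chunks if every memstore has one chunk. */
    static long minimumBytes(long chunkBytes, long memstores) {
        return chunkBytes * memstores;
    }

    public static void main(String[] args) {
        long chunkBytes = 2L * 1024 * 1024;   // default chunk size from the slides
        long regions = 1000;                  // assumed: one memstore per region
        long bytes = minimumBytes(chunkBytes, regions);
        double gib = bytes / (1024.0 * 1024.0 * 1024.0);
        System.out.println(bytes + " bytes, about " + gib + " GiB");
    }
}
```

Tables with several column families have several memstores per region, so the floor scales with the number of memstores, not just regions.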
So I don’t forget...
Corporate shill time




    Cloudera offering HBase training on March 10th.

    15 percent off with hbase meetup code.
todd@cloudera.com
  Twitter: @tlipcon
#hbase IRC: tlipcon

   P.S. we’re hiring!

Mais conteúdo relacionado

Mais procurados

Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm Chandler Huang
 
Tuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBaseTuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBaseAnil Gupta
 
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, UberDemystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, UberFlink Forward
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & FeaturesDataStax Academy
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanVerverica
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseenissoz
 
HBase Advanced - Lars George
HBase Advanced - Lars GeorgeHBase Advanced - Lars George
HBase Advanced - Lars GeorgeJAX London
 
HBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
HBaseCon 2012 | HBase Schema Design - Ian Varley, SalesforceHBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
HBaseCon 2012 | HBase Schema Design - Ian Varley, SalesforceCloudera, Inc.
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compactionMIJIN AN
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detailMIJIN AN
 
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, ClouderaHadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, ClouderaCloudera, Inc.
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Flink Forward
 
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...HostedbyConfluent
 
Kafka tiered-storage-meetup-2022-final-presented
Kafka tiered-storage-meetup-2022-final-presentedKafka tiered-storage-meetup-2022-final-presented
Kafka tiered-storage-meetup-2022-final-presentedSumant Tambe
 
Filesystem Comparison: NFS vs GFS2 vs OCFS2
Filesystem Comparison: NFS vs GFS2 vs OCFS2Filesystem Comparison: NFS vs GFS2 vs OCFS2
Filesystem Comparison: NFS vs GFS2 vs OCFS2Giuseppe Paterno'
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedInGuozhang Wang
 
Application Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and FutureApplication Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and FutureVARUN SAXENA
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkDataWorks Summit
 
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan EwenAdvanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewenconfluent
 

Mais procurados (20)

Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Tuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBaseTuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBase
 
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, UberDemystifying flink memory allocation and tuning - Roshan Naik, Uber
Demystifying flink memory allocation and tuning - Roshan Naik, Uber
 
Cassandra Introduction & Features
Cassandra Introduction & FeaturesCassandra Introduction & Features
Cassandra Introduction & Features
 
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
 
HBase Advanced - Lars George
HBase Advanced - Lars GeorgeHBase Advanced - Lars George
HBase Advanced - Lars George
 
HBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
HBaseCon 2012 | HBase Schema Design - Ian Varley, SalesforceHBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
HBaseCon 2012 | HBase Schema Design - Ian Varley, Salesforce
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compaction
 
RocksDB detail
RocksDB detailRocksDB detail
RocksDB detail
 
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, ClouderaHadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
Hadoop World 2011: Advanced HBase Schema Design - Lars George, Cloudera
 
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
Introducing BinarySortedMultiMap - A new Flink state primitive to boost your ...
 
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
What is the State of my Kafka Streams Application? Unleashing Metrics. | Neil...
 
Kafka tiered-storage-meetup-2022-final-presented
Kafka tiered-storage-meetup-2022-final-presentedKafka tiered-storage-meetup-2022-final-presented
Kafka tiered-storage-meetup-2022-final-presented
 
Filesystem Comparison: NFS vs GFS2 vs OCFS2
Filesystem Comparison: NFS vs GFS2 vs OCFS2Filesystem Comparison: NFS vs GFS2 vs OCFS2
Filesystem Comparison: NFS vs GFS2 vs OCFS2
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedIn
 
HBase Accelerated: In-Memory Flush and Compaction
HBase Accelerated: In-Memory Flush and CompactionHBase Accelerated: In-Memory Flush and Compaction
HBase Accelerated: In-Memory Flush and Compaction
 
Application Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and FutureApplication Timeline Server - Past, Present and Future
Application Timeline Server - Past, Present and Future
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan EwenAdvanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
Advanced Streaming Analytics with Apache Flink and Apache Kafka, Stephan Ewen
 

Destaque

Chicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An IntroductionChicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An IntroductionCloudera, Inc.
 
Hourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on HadoopHourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on HadoopMatthew Hayes
 
Hw09 Practical HBase Getting The Most From Your H Base Install
Hw09   Practical HBase  Getting The Most From Your H Base InstallHw09   Practical HBase  Getting The Most From Your H Base Install
Hw09 Practical HBase Getting The Most From Your H Base InstallCloudera, Inc.
 
Hbase运维碎碎念
Hbase运维碎碎念Hbase运维碎碎念
Hbase运维碎碎念haiyuan ning
 
Hourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on HadoopHourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on HadoopMatthew Hayes
 
Apache Mesos at Twitter (Texas LinuxFest 2014)
Apache Mesos at Twitter (Texas LinuxFest 2014)Apache Mesos at Twitter (Texas LinuxFest 2014)
Apache Mesos at Twitter (Texas LinuxFest 2014)Chris Aniszczyk
 
Keynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleKeynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleHBaseCon
 
HBase Read High Availability Using Timeline Consistent Region Replicas
HBase  Read High Availability Using Timeline Consistent Region ReplicasHBase  Read High Availability Using Timeline Consistent Region Replicas
HBase Read High Availability Using Timeline Consistent Region Replicasenissoz
 
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance ImprovementBiju Nair
 
HBase Blockcache 101
HBase Blockcache 101HBase Blockcache 101
HBase Blockcache 101Nick Dimiduk
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringTuri, Inc.
 
Nail the First 60 Seconds of Your Presentation
Nail the First 60 Seconds of Your PresentationNail the First 60 Seconds of Your Presentation
Nail the First 60 Seconds of Your PresentationBruce Kasanoff
 
The Growth Hacker Wake Up Call
The Growth Hacker Wake Up CallThe Growth Hacker Wake Up Call
The Growth Hacker Wake Up CallRyan Holiday
 
16 Unique & Innovative Ways to Market your Business
16 Unique & Innovative Ways to Market your Business16 Unique & Innovative Ways to Market your Business
16 Unique & Innovative Ways to Market your BusinessNicoleElmore.com
 
PSFK Future of Work Report 2013
PSFK Future of Work Report 2013PSFK Future of Work Report 2013
PSFK Future of Work Report 2013PSFK
 
5 Secrets to Killer Lead Generation Using SlideShare
5 Secrets to Killer Lead Generation Using SlideShare5 Secrets to Killer Lead Generation Using SlideShare
5 Secrets to Killer Lead Generation Using SlideShareEugene Cheng
 
The Evolution of Film Editing
The Evolution of Film EditingThe Evolution of Film Editing
The Evolution of Film EditingAdobe
 

Destaque (20)

Chicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An IntroductionChicago Data Summit: Apache HBase: An Introduction
Chicago Data Summit: Apache HBase: An Introduction
 
Hourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on HadoopHourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on Hadoop
 
Hw09 Practical HBase Getting The Most From Your H Base Install
Hw09   Practical HBase  Getting The Most From Your H Base InstallHw09   Practical HBase  Getting The Most From Your H Base Install
Hw09 Practical HBase Getting The Most From Your H Base Install
 
Hbase运维碎碎念
Hbase运维碎碎念Hbase运维碎碎念
Hbase运维碎碎念
 
Hourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on HadoopHourglass: a Library for Incremental Processing on Hadoop
Hourglass: a Library for Incremental Processing on Hadoop
 
Apache Mesos at Twitter (Texas LinuxFest 2014)
Apache Mesos at Twitter (Texas LinuxFest 2014)Apache Mesos at Twitter (Texas LinuxFest 2014)
Apache Mesos at Twitter (Texas LinuxFest 2014)
 
Keynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! ScaleKeynote: Apache HBase at Yahoo! Scale
Keynote: Apache HBase at Yahoo! Scale
 
HBase Read High Availability Using Timeline Consistent Region Replicas
HBase  Read High Availability Using Timeline Consistent Region ReplicasHBase  Read High Availability Using Timeline Consistent Region Replicas
HBase Read High Availability Using Timeline Consistent Region Replicas
 
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance Improvement
 
HBase Blockcache 101
HBase Blockcache 101HBase Blockcache 101
HBase Blockcache 101
 
Overview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature EngineeringOverview of Machine Learning and Feature Engineering
Overview of Machine Learning and Feature Engineering
 
Nail the First 60 Seconds of Your Presentation
Nail the First 60 Seconds of Your PresentationNail the First 60 Seconds of Your Presentation
Nail the First 60 Seconds of Your Presentation
 
The Growth Hacker Wake Up Call
The Growth Hacker Wake Up CallThe Growth Hacker Wake Up Call
The Growth Hacker Wake Up Call
 
16 Unique & Innovative Ways to Market your Business
16 Unique & Innovative Ways to Market your Business16 Unique & Innovative Ways to Market your Business
16 Unique & Innovative Ways to Market your Business
 
Slide Wars- The Force Sleeps
Slide Wars- The Force SleepsSlide Wars- The Force Sleeps
Slide Wars- The Force Sleeps
 
PSFK Future of Work Report 2013
PSFK Future of Work Report 2013PSFK Future of Work Report 2013
PSFK Future of Work Report 2013
 
5 Secrets to Killer Lead Generation Using SlideShare
5 Secrets to Killer Lead Generation Using SlideShare5 Secrets to Killer Lead Generation Using SlideShare
5 Secrets to Killer Lead Generation Using SlideShare
 
The Evolution of Film Editing
The Evolution of Film EditingThe Evolution of Film Editing
The Evolution of Film Editing
 
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
 
99 Facts on the Future of Business
99 Facts on the Future of Business99 Facts on the Future of Business
99 Facts on the Future of Business
 

Semelhante a HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers

The JVM is your friend
The JVM is your friendThe JVM is your friend
The JVM is your friendKai Koenig
 
Trigger maxl from fdmee
Trigger maxl from fdmeeTrigger maxl from fdmee
Trigger maxl from fdmeeBernard Ash
 
A quick view about Java Virtual Machine
A quick view about Java Virtual MachineA quick view about Java Virtual Machine
A quick view about Java Virtual MachineJoão Santana
 
Hug Hbase Presentation.
Hug Hbase Presentation.Hug Hbase Presentation.
Hug Hbase Presentation.Jack Levin
 
Clustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityClustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityConSanFrancisco123
 
[Jbcn 2016] Garbage Collectors WTF!?
[Jbcn 2016] Garbage Collectors WTF!?[Jbcn 2016] Garbage Collectors WTF!?
[Jbcn 2016] Garbage Collectors WTF!?Alonso Torres
 
Solr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the UglySolr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the UglySematext Group, Inc.
 
Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...
Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...
Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...Lucidworks
 
006 performance tuningandclusteradmin
006 performance tuningandclusteradmin006 performance tuningandclusteradmin
006 performance tuningandclusteradminScott Miao
 
Low pause GC in HotSpot
Low pause GC in HotSpotLow pause GC in HotSpot
Low pause GC in HotSpotjClarity
 
Lessons learnt on a 2000-core cluster
Lessons learnt on a 2000-core clusterLessons learnt on a 2000-core cluster
Lessons learnt on a 2000-core clusterEugene Kirpichov
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at Localyticsandrew311
 
Taming Go's Memory Usage — and Avoiding a Rust Rewrite
Taming Go's Memory Usage — and Avoiding a Rust RewriteTaming Go's Memory Usage — and Avoiding a Rust Rewrite
Taming Go's Memory Usage — and Avoiding a Rust RewriteScyllaDB
 
HBase: Extreme Makeover
HBase: Extreme MakeoverHBase: Extreme Makeover
HBase: Extreme MakeoverHBaseCon
 
2009 Eclipse Con
2009 Eclipse Con2009 Eclipse Con
2009 Eclipse Conguest29922
 
CMF: a pain in the F @ PHPDay 05-14-2011
CMF: a pain in the F @ PHPDay 05-14-2011CMF: a pain in the F @ PHPDay 05-14-2011
CMF: a pain in the F @ PHPDay 05-14-2011Alessandro Nadalin
 
Cassandra Anti-Patterns
Cassandra Anti-PatternsCassandra Anti-Patterns
Cassandra Anti-PatternsMatthew Dennis
 

Semelhante a HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers (20)

JVM Magic
JVM MagicJVM Magic
JVM Magic
 
Jvm is-your-friend
Jvm is-your-friendJvm is-your-friend
Jvm is-your-friend
 
The JVM is your friend
The JVM is your friendThe JVM is your friend
The JVM is your friend
 
Trigger maxl from fdmee
Trigger maxl from fdmeeTrigger maxl from fdmee
Trigger maxl from fdmee
 
A quick view about Java Virtual Machine
A quick view about Java Virtual MachineA quick view about Java Virtual Machine
A quick view about Java Virtual Machine
 
Hug Hbase Presentation.
Hug Hbase Presentation.Hug Hbase Presentation.
Hug Hbase Presentation.
 
Clustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And AvailabilityClustered Architecture Patterns Delivering Scalability And Availability
Clustered Architecture Patterns Delivering Scalability And Availability
 
[Jbcn 2016] Garbage Collectors WTF!?
[Jbcn 2016] Garbage Collectors WTF!?[Jbcn 2016] Garbage Collectors WTF!?
[Jbcn 2016] Garbage Collectors WTF!?
 
Solr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the UglySolr on Docker - the Good, the Bad and the Ugly
Solr on Docker - the Good, the Bad and the Ugly
 
Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...
Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...
Solr on Docker: the Good, the Bad, and the Ugly - Radu Gheorghe, Sematext Gro...
 
006 performance tuningandclusteradmin
006 performance tuningandclusteradmin006 performance tuningandclusteradmin
006 performance tuningandclusteradmin
 
Low pause GC in HotSpot
Low pause GC in HotSpotLow pause GC in HotSpot
Low pause GC in HotSpot
 
Lessons learnt on a 2000-core cluster
Lessons learnt on a 2000-core clusterLessons learnt on a 2000-core cluster
Lessons learnt on a 2000-core cluster
 
Optimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at LocalyticsOptimizing MongoDB: Lessons Learned at Localytics
Optimizing MongoDB: Lessons Learned at Localytics
 
Taming Go's Memory Usage — and Avoiding a Rust Rewrite
Taming Go's Memory Usage — and Avoiding a Rust RewriteTaming Go's Memory Usage — and Avoiding a Rust Rewrite
Taming Go's Memory Usage — and Avoiding a Rust Rewrite
 
HBase: Extreme Makeover
HBase: Extreme MakeoverHBase: Extreme Makeover
HBase: Extreme Makeover
 
2009 Eclipse Con
2009 Eclipse Con2009 Eclipse Con
2009 Eclipse Con
 
CMF: a pain in the F @ PHPDay 05-14-2011
CMF: a pain in the F @ PHPDay 05-14-2011CMF: a pain in the F @ PHPDay 05-14-2011
CMF: a pain in the F @ PHPDay 05-14-2011
 
Java8 bench gc
Java8 bench gcJava8 bench gc
Java8 bench gc
 
Cassandra Anti-Patterns
Cassandra Anti-PatternsCassandra Anti-Patterns
Cassandra Anti-Patterns
 

HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers

  • 1. Avoiding Full GCs with MemStore-Local Allocation Buffers Todd Lipcon todd@cloudera.com Twitter: @tlipcon #hbase IRC: tlipcon February 22, 2011
  • 2. Outline Background HBase and GC A solution Summary
  • 3. Intro / who am I? Been working on data stuff for a few years HBase, HDFS, MR committer Cloudera engineer since March ’09
  • 4. Motivation HBase users want to use large heaps Bigger block caches make for better hit rates Bigger memstores make for larger and more efficient flushes Machines come with 24G-48G RAM But bigger heaps mean longer GC pauses Around 10 seconds/GB on my boxes. Several minute GC pauses wreak havoc
  • 5. GC Disasters 1. Client requests stalled 1 minute “latency” is just as bad as unavailability 2. ZooKeeper sessions stop pinging The dreaded “Juliet Pause” scenario 3. Triggers all kinds of other nasty bugs
  • 6. Yo Concurrent Mark-and-Sweep (CMS)! What part of Concurrent didn’t you understand?
  • 7. Java GC Background Java’s GC is generational Generational hypothesis: most objects either die young or stick around for quite a long time Split the heap into two “generations” - young (aka new) and old (aka tenured) Use different algorithms for the two generations We usually recommend -XX:+UseParNewGC -XX:+UseConcMarkSweepGC Young generation: Parallel New collector Old generation: Concurrent-mark-sweep
  • 8. The Parallel New collector in 60 seconds Divide the young generation into eden, survivor-0, and survivor-1 One survivor space is from-space and the other is to-space Allocate all objects in eden When eden fills up, stop the world and copy live objects from eden and from-space into to-space, swap from and to Once an object has been copied back and forth N times, copy it to the old generation N is the “Tenuring Threshold” (tunable)
  • 9. The CMS collector in 60 seconds A bit simplified, sorry... Several phases:
        1. initial-mark (stop-the-world) - marks roots (e.g. thread stacks)
        2. concurrent-mark - traverse references starting at roots, marking what’s live
        3. concurrent-preclean - another pass of the same (catch new objects)
        4. remark (stop-the-world) - any last changed/new objects
        5. concurrent-sweep - clean up dead objects to update free space tracking
      Note: dead objects free up space, but it’s not contiguous. We’ll come back to this later!
  • 10. CMS failure modes
        1. When young generation collection happens, it needs space in the old gen. What if CMS is already in the middle of concurrent work, but there’s no space? The dreaded concurrent mode failure! Stop the world and collect. Solution: lower value of -XX:CMSInitiatingOccupancyFraction so CMS starts working earlier
        2. What if there’s space in the old generation, but not enough contiguous space to promote a large object? We need to compact the old generation (move all free space to be contiguous). This is also stop-the-world! Kaboom!
  • 11. OK... so life sucks. What can we do about it?
  • 12. Step 1. Hypothesize Setting the initiating occupancy fraction low puts off GC, but it eventually happens no matter what We see promotion failed followed by long GC pause, even when 30% of the heap is free. Why? Must be fragmentation!
  • 13. Step 2. Measure Let’s make some graphs:
        -XX:PrintFLSStatistics=1 -XX:PrintCMSStatistics=1
        -XX:+PrintGCDetails -XX:+PrintGCDateStamps -verbose:gc
        -Xloggc:/.../logs/gc-$(hostname).log
      FLS Statistics: verbose information about the state of the free space inside the old generation
        Free space - total amount of free space
        Num blocks - number of fragments it’s spread into
        Max chunk size
      parse-fls-statistics.py → R and ggplot2
  • 14. 3 YCSB workloads, graphed
  • 17. Workload 3 Read-only with no cache churn So boring I didn’t make a graph! All allocations are short lived → stay in young gen
  • 18. Recap What have we learned? Fragmentation is what causes long GC pauses. Write load seems to cause fragmentation. Read load (LRU cache churn) isn’t nearly so bad, at least for my test workloads.
  • 28. Taking a step back Why does write load cause fragmentation? Imagine we have 5 regions, A through E We take writes in the following order into an empty old generation: ABCDEABCEDDAECBACEBCED Now B’s memstore fills up and flushes. We’re left with: A CDEA CEDDAEC ACE CED Looks like fragmentation!
  • 29. Also known as swiss cheese If every write is exactly the same size, it’s fine - we’ll fill in those holes. But this is seldom true.
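The swiss cheese effect can be reproduced with a toy model. The sketch below is illustrative only: it assumes every write occupies exactly one cell of a character array standing in for the old generation, and the class and method names (`FragmentationSketch`, `flush`, `largestFreeRun`) are hypothetical. It frees region B from the interleaved write order shown on the slide and then from a layout where each region's writes sit together, and compares the largest contiguous free run in each case.

```java
import java.util.Arrays;

class FragmentationSketch {
    // Frees every cell owned by the given region, as a memstore flush would.
    static char[] flush(String heap, char region) {
        return heap.replace(region, ' ').toCharArray();
    }

    // Length of the longest run of free cells (' ') in the simulated heap.
    static int largestFreeRun(char[] heap) {
        int best = 0;
        int run = 0;
        for (char c : heap) {
            run = (c == ' ') ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    // Same writes as the input, but laid out so each region's data is contiguous.
    static String grouped(String heap) {
        char[] cells = heap.toCharArray();
        Arrays.sort(cells);
        return new String(cells);
    }

    public static void main(String[] args) {
        String interleaved = "ABCDEABCEDDAECBACEBCED"; // write order from the slide

        char[] holes = flush(interleaved, 'B');
        System.out.println("[" + new String(holes) + "] largest free run = "
                + largestFreeRun(holes));

        char[] block = flush(grouped(interleaved), 'B');
        System.out.println("[" + new String(block) + "] largest free run = "
                + largestFreeRun(block));
    }
}
```

Both layouts free the same four cells, but only the grouped layout leaves them as one block that a larger allocation could reuse.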
  • 30. A solution Crucial issue is that memory allocations for a given memstore aren’t next to each other in the old generation. When we free an entire memstore we only get tiny blocks of free space What if we ensure that the memory for a memstore is made of large blocks? Enter the MemStore Local Allocation Buffer (MSLAB)
  • 31. What’s an MSLAB? Each MemStore has an instance of MemStoreLAB. MemStoreLAB has a 2MB curChunk with nextFreeOffset starting at 0. Before inserting a KeyValue that points to some byte[], copy the data into curChunk and increment nextFreeOffset by data.length Insert a KeyValue pointing inside curChunk instead of the original data. If a chunk fills up, just make a new one. This is all lock-free, using atomic compare-and-swap instructions.
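The allocation scheme described on this slide can be sketched roughly as follows. This is a simplified illustration, not the HBase source: the class names (`MslabSketch`, `Chunk`, `Allocation`) and the `copyIn` method are hypothetical, and the 256K cutoff above which data is left in its original array is an assumption based on the max.allocation setting listed later in the deck. The field names `curChunk` and `nextFreeOffset` follow the slide.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

class MslabSketch {
    static final int CHUNK_SIZE = 2 * 1024 * 1024; // 2MB chunks, as on the slide
    static final int MAX_ALLOC = 256 * 1024;       // assumed cutoff for copying

    // One large byte[] plus a bump pointer into it.
    static final class Chunk {
        final byte[] data = new byte[CHUNK_SIZE];
        final AtomicInteger nextFreeOffset = new AtomicInteger(0);

        // Reserves size bytes with compare-and-swap; returns -1 if the chunk is full.
        int alloc(int size) {
            while (true) {
                int old = nextFreeOffset.get();
                if (old + size > data.length) {
                    return -1;
                }
                if (nextFreeOffset.compareAndSet(old, old + size)) {
                    return old;
                }
                // Another thread moved the offset first; retry.
            }
        }
    }

    // Where the copied bytes now live; a KeyValue would point here.
    static final class Allocation {
        final byte[] chunk;
        final int offset;
        final int length;

        Allocation(byte[] chunk, int offset, int length) {
            this.chunk = chunk;
            this.offset = offset;
            this.length = length;
        }
    }

    private final AtomicReference<Chunk> curChunk = new AtomicReference<>(new Chunk());

    // Copies src into the current chunk, or returns null if src is too large to copy.
    Allocation copyIn(byte[] src) {
        if (src.length > MAX_ALLOC) {
            return null; // caller keeps pointing at the original array
        }
        while (true) {
            Chunk c = curChunk.get();
            int offset = c.alloc(src.length);
            if (offset >= 0) {
                System.arraycopy(src, 0, c.data, offset, src.length);
                return new Allocation(c.data, offset, src.length);
            }
            // Chunk is full: try to install a fresh one. If another thread
            // already did, the next loop iteration picks up its chunk.
            curChunk.compareAndSet(c, new Chunk());
        }
    }

    public static void main(String[] args) {
        MslabSketch lab = new MslabSketch();
        Allocation a = lab.copyIn(new byte[] {1, 2, 3});
        Allocation b = lab.copyIn(new byte[] {4, 5});
        System.out.println("a at offset " + a.offset + ", b at offset " + b.offset
                + ", same chunk: " + (a.chunk == b.chunk));
    }
}
```

Because the offset is only ever advanced by a successful compare-and-swap, two writers can never be handed overlapping ranges, and no lock is held while the bytes are copied.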
  • 32. How does this help? The original data to be inserted becomes very short-lived, and dies in the young generation. The only data in the old generation is made of 2MB chunks Each chunk only belongs to one memstore. When we flush, we always free up 2MB chunks, and avoid the swiss cheese effect. Next time we allocate, we need exactly 2MB chunks again, and there will definitely be space.
  • 34. It works! Have seen basically zero full GCs with MSLAB enabled, after days of load testing
  • 35. Summary Most GC pauses are caused by fragmentation in the old generation. The CMS collector doesn’t compact, so the only way it can fight fragmentation is to pause. The MSLAB moves all MemStore allocations into contiguous 2MB chunks in the old generation. No more GC pauses!
  • 36. How to try it
        1. Upgrade to HBase 0.90.1 (included in CDH3b4)
        2. Set hbase.hregion.memstore.mslab.enabled to true
           Also tunable:
             hbase.hregion.memstore.mslab.chunksize (in bytes, default 2M)
             hbase.hregion.memstore.mslab.max.allocation (in bytes, default 256K)
        3. Report back your results!
  • 37. Future work Flat 2MB chunk per region → 2GB RAM minimum usage for 1000 regions incrementColumnValue currently bypasses MSLAB for subtle reasons We’re doing an extra memory copy into MSLAB chunk - we can optimize this out Maybe we can relax CMSInitiatingOccupancyFraction back up a bit?
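The minimum-usage figure in the first point is simple arithmetic, sketched below. The sketch assumes one memstore per region and that each memstore holds at least one chunk even when nearly empty; `MslabFootprintSketch` and `minimumBytes` are hypothetical names.

```java
class MslabFootprintSketch {
    // Lower bound on heap held by MSLAB chunks: one chunk per memstore.
    static long minimumBytes(int memstores, long chunkBytes) {
        return memstores * chunkBytes;
    }

    public static void main(String[] args) {
        long chunkBytes = 2L * 1024 * 1024; // default 2MB chunk size
        long total = minimumBytes(1000, chunkBytes);
        // 1000 regions * 2MB = 2000MB, i.e. roughly 2GB
        System.out.println(total / (1024 * 1024) + " MB");
    }
}
```

Halving the chunk size would halve this floor, at the cost of smaller contiguous blocks being freed on each flush.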
  • 38. So I don’t forget... Corporate shill time Cloudera offering HBase training on March 10th. 15 percent off with hbase meetup code.
  • 39. todd@cloudera.com Twitter: @tlipcon #hbase IRC: tlipcon P.S. we’re hiring!