2013-08-20
Dave Latham
• History
• Stats
• How We Store Data
• Challenges
• Mistakes We Made
• Tips / Patterns
• Future
• Moral of the Story
 2008 –Flurry Analytics for MobileApps
 Sharded MySQL, or
 HBase!
 Launched on 0.18.1 with a 3 node cluster
 Great community
 Now running 0.94.5 (+ patches)
 2 data centers with 2 clusters each
 Bidirectional replication
• 1000 slave nodes per cluster
• 32 GB RAM, 4 drives (1 or 2 TB), 1 GigE, dual quad-core * 2 HT = 16 procs
• DataNode, TaskTracker, RegionServer (11 GB), 5 Mappers, 2 Reducers
• ~30 tables, 250k regions, 430 TB (after LZO)
• 2 big tables are about 90% of that
▪ 1 wide table: 3 CF, 4 billion rows, up to 1MM cells per row
▪ 1 tall table: 1 CF, 1 trillion rows, most 1 cell per row
• 12 physical nodes
• 5 region servers with 20 GB heaps on each
• 1 table - 8 billion small rows - 500 GB (LZO)
• All in block cache (after 20 minute warmup)
• 100k-1MM QPS - 99.9% reads
• 2 ms mean, 99% < 10 ms
• 25 ms GC pause every 40 seconds
• Slow after compaction
• DAO for Java apps
• Requires:
▪ writeRowIndex / readRowIndex
▪ readKeyValue / writeRowContents
• Provides:
▪ save / delete
▪ streamEntities / pagination
▪ MR input formats on entities (rather than Result)
• Uses HTable or asynchbase
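The DAO contract above can be pictured roughly as follows. This is a minimal Python sketch, not Flurry's Java code: the four required hooks and the provided operations take their names from the slide, but every signature, the `InMemoryTable` stand-in for an HBase table, and the `UserDao` example entity are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class InMemoryTable:
    """Hypothetical stand-in for an HBase table: row key -> {column: value}."""

    def __init__(self):
        self.rows = {}

    def put(self, row_key, cells):
        self.rows.setdefault(row_key, {}).update(cells)

    def delete(self, row_key):
        self.rows.pop(row_key, None)

    def scan(self):
        # HBase scans return rows in sorted row-key order.
        for row_key in sorted(self.rows):
            yield row_key, dict(self.rows[row_key])


class EntityDao(ABC):
    """Subclasses supply the four hooks; the base class provides the rest."""

    def __init__(self, table):
        self.table = table

    # Required hooks (names from the slide, signatures assumed).
    @abstractmethod
    def write_row_index(self, entity): ...

    @abstractmethod
    def read_row_index(self, row_key): ...

    @abstractmethod
    def write_row_contents(self, entity): ...

    @abstractmethod
    def read_key_value(self, entity, column, value): ...

    # Provided operations.
    def save(self, entity):
        self.table.put(self.write_row_index(entity), self.write_row_contents(entity))

    def delete(self, entity):
        self.table.delete(self.write_row_index(entity))

    def stream_entities(self):
        for row_key, cells in self.table.scan():
            entity = self.read_row_index(row_key)
            for column, value in cells.items():
                self.read_key_value(entity, column, value)
            yield entity


class UserDao(EntityDao):
    """Hypothetical entity: a dict with an integer id and a name."""

    def write_row_index(self, entity):
        return entity["id"].to_bytes(8, "big")

    def read_row_index(self, row_key):
        return {"id": int.from_bytes(row_key, "big")}

    def write_row_contents(self, entity):
        return {b"d:name": entity["name"].encode()}

    def read_key_value(self, entity, column, value):
        if column == b"d:name":
            entity["name"] = value.decode()
```

Keeping the row-key encoding in one pair of hooks is what would let a DAO of this shape support two key formats at once, as the row-key migration requires.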
• Change row key format
• DAO supports both formats
1. Create new table
2. Write to both tables
3. Migrate existing data
4. Validate
5. Read from new table
6. Write to (only) new table
7. Drop old table
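The seven steps above can be sketched as a DAO that carries three switches. This is an illustrative Python sketch under stated assumptions: plain dicts stand in for the old and new HBase tables, and `DualFormatDao`, `migrate_existing`, `validate`, and the key-format callables are hypothetical names, not Flurry's implementation.

```python
class DualFormatDao:
    """Routes reads and writes between an old and a new table during migration."""

    def __init__(self, old_table, new_table, old_key, new_key):
        self.old_table, self.new_table = old_table, new_table
        self.old_key, self.new_key = old_key, new_key
        self.write_old = True    # steps 1-5: keep the old table current
        self.write_new = False   # step 2 turns this on
        self.read_new = False    # step 5 turns this on

    def save(self, entity_id, value):
        if self.write_old:
            self.old_table[self.old_key(entity_id)] = value
        if self.write_new:
            self.new_table[self.new_key(entity_id)] = value

    def get(self, entity_id):
        if self.read_new:
            return self.new_table.get(self.new_key(entity_id))
        return self.old_table.get(self.old_key(entity_id))


def migrate_existing(dao, old_key_to_id):
    """Step 3: copy old rows; rows already dual-written are left alone."""
    for old_row_key, value in dao.old_table.items():
        new_row_key = dao.new_key(old_key_to_id(old_row_key))
        dao.new_table.setdefault(new_row_key, value)


def validate(dao, old_key_to_id):
    """Step 4: every old row must exist in the new table with the same value."""
    return all(
        dao.new_table.get(dao.new_key(old_key_to_id(k))) == v
        for k, v in dao.old_table.items()
    )
```

Turning on dual writes before the bulk copy means the copy only has to cover rows that predate step 2, and reads move to the new table only after validation passes.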
• Bottlenecks (not horizontally scalable)
• HMaster (e.g. HLog cleaning falls behind creation [HBASE-9208])
• NameNode
▪ Disable table / shutdown => many HDFS files at once
▪ Scan table directory => slow region assignments
• ZooKeeper (HBase replication)
• JobTracker (heap)
• META region
• Too many regions (250k)
• Max region size 256 MB -> 1 GB -> 5 GB
• Slow reassignments on failure
• Slow hbck recovery
• Lots of META queries / big client cache
▪ Soft refs can exacerbate
• Slow rolling restarts
• More failures (Common and otherwise)
• Zombie RS
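Region count falls roughly in inverse proportion to the maximum region size. The arithmetic below is a back-of-the-envelope sketch using the 430 TB figure from the stats slide. It assumes binary units, treats all tables as one pool, and assumes every region is filled to the maximum, so it gives a lower bound rather than the real count.

```python
import math

TOTAL_BYTES = 430 * 1024**4          # 430 TB after LZO, from the stats slide


def min_regions(max_region_bytes):
    """Lower bound on region count if every region were filled to the maximum."""
    return math.ceil(TOTAL_BYTES / max_region_bytes)


for label, size in [("256 MB", 256 * 1024**2), ("1 GB", 1024**3), ("5 GB", 5 * 1024**3)]:
    print(f"max region size {label}: at least {min_regions(size):,} regions")
```

Under those assumptions the bounds come out to about 1.76 million regions at 256 MB, 440 thousand at 1 GB, and 88 thousand at 5 GB.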
• Latency long tail
• HTable flush write buffer
• GC pauses
• RegionServer failure
• (See The Tail at Scale – Jeff Dean, Luiz André Barroso)
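The central argument of The Tail at Scale is that rare slow responses become common once a single user request fans out to many servers. A small worked calculation makes this concrete. It assumes each backend call is slow independently with the same probability; the 1% figure and the fan-out values are illustrative, not measurements from this deck.

```python
def p_any_slow(p_slow, fanout):
    """Probability that at least one of `fanout` independent calls is slow."""
    return 1.0 - (1.0 - p_slow) ** fanout


for fanout in (1, 10, 100):
    print(f"fan-out {fanout:>3}: {p_any_slow(0.01, fanout):.1%} of requests see a slow call")
```

With a 1% chance of a slow call, a request that touches 100 servers hits at least one slow call about 63% of the time, which is why write-buffer flushes, GC pauses, and RegionServer failures matter even when they are individually rare.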
• Shared cluster for MapReduce and live queries
• IO-bound requests hog handler threads
• Even cached reads get slow
• RegionServer falls behind, stays behind
• If the cluster goes down, it takes a while to come back
• HDFS-5042 Completed files lost after power failure
• ZOOKEEPER-1277 servers stop serving when lower 32 bits of zxid roll over
• ZOOKEEPER-1731 Unsynchronized access to ServerCnxnFactory.connectionBeans results in deadlock
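For context on ZOOKEEPER-1277: a zxid is a 64-bit transaction id whose high 32 bits hold the leader epoch and whose low 32 bits hold a counter within that epoch. The sketch below only illustrates that bit layout and what a plain increment does at the boundary. The helper names are hypothetical and this is not ZooKeeper's code.

```python
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def split_zxid(zxid):
    """Return (epoch, counter) for a 64-bit ZooKeeper transaction id."""
    return zxid >> COUNTER_BITS, zxid & COUNTER_MASK


def make_zxid(epoch, counter):
    """Pack an epoch and a counter into one 64-bit id."""
    return (epoch << COUNTER_BITS) | counter


last = make_zxid(7, COUNTER_MASK)   # counter is at its maximum
print(split_zxid(last))             # epoch 7, counter 0xFFFFFFFF
print(split_zxid(last + 1))         # the carry spills into the epoch field
```

A long-lived leader on a busy ensemble can exhaust the 32-bit counter, at which point a naive increment changes the epoch bits without any election having taken place.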
• Small region size -> many regions
• Nagle’s algorithm
• Trying to solve a crisis you don’t understand (hbck fixSplitParents)
• Setting up replication
• Custom backup / restore
• CopyTable OOM
• Verification
• Compact data matters (even with compression)
• Block cache, network not compressed
• Avoid random reads on non-cached tables (duh!)
• Write cell fragments, combine at read time to avoid doing random reads
• Compact later - coprocessor?
• Can lead to large rows
▪ probabilistic counter
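The "write cell fragments, combine at read time" pattern can be sketched as below. The example uses additive deltas because they are the simplest thing to merge; the slide does not say what the fragments hold. `FragmentRow` and its qualifier scheme are hypothetical, and a dict stands in for the cells of one HBase row.

```python
import itertools


class FragmentRow:
    """Hypothetical row whose value is stored as many small cells (fragments)."""

    def __init__(self):
        self.cells = {}                      # qualifier -> delta
        self._seq = itertools.count()

    def add(self, delta):
        # Blind write: no read of the current total is needed.
        self.cells[f"frag:{next(self._seq):012d}"] = delta

    def read(self):
        # Combine fragments at read time.
        return sum(self.cells.values())

    def compact(self):
        # Later cleanup: collapse all fragments into a single cell.
        total = self.read()
        self.cells = {"total": total}
```

Each update is a pure write, so the write path never pays for a random read. The cost moves to the reader and to row size, which is why a later compaction step is worth considering.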
• HDFS HA
• Snapshots (see how it works with 100k regions on 1000 servers)
• 2000 node clusters
• Test those bottlenecks
• Larger regions, larger HDFS blocks, larger HLogs
• More (independent) clusters
• Load-aware balancing?
• Separate RPC priorities for workloads
• 0.96
• Scaled 1000x and more on the same DB
• If you’re on the edge you need to understand your system
• Monitor
• Open Source
• Load test
• Know your load
• Disk or Cache (or SSDs?)
• And maybe some answers

HBase at Flurry
