Yahoo has long been involved in HBase and its community. In 2013, HBase was offered as a hosted service at Yahoo. Since then, adoption has grown rapidly, and today HBase is used by numerous teams across the company, enabling a diverse set of use cases ranging from near real-time processing to data warehousing.
This growth was made possible by HBase itself, together with enhancements to support multi-tenancy and scale. As our clusters continue to grow and use cases become more demanding, we are working towards supporting a million regions in a single cluster.
In this keynote, we’ll paint a picture of where Yahoo! is today and describe the enhancements we have been working on, both to reach today’s scale and to support a million regions and beyond.
6. Multi-tenancy at Scale
• 35 Tenants
• 800 RegionServers
• 300k regions
• RS Peak 115k requests/sec
7. Divide and Conquer
Group A: RS RS … RS
Group B: RS RS … RS
Group C: RS RS … RS
Group D: RS RS … RS
Group E: RS RS … RS
8. RegionServer Groups
• Group Membership
• Table
• RegionServer
• Coarse Isolation
• Group customization
• Namespace integration
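The group model on this slide can be illustrated with a small sketch. It is not HBase's implementation: the class and method names (`GroupRegistry`, `move_servers`, `move_tables`, `candidate_servers`) and the tenant table names are hypothetical. It only shows the idea that each server and each table belongs to one group, and that a table's regions may be placed only on servers of its group.

```python
class GroupRegistry:
    """Toy model of RegionServer groups (hypothetical, not the HBase API)."""

    DEFAULT = "default"

    def __init__(self, servers):
        # Every server starts in the default group.
        self.server_group = {s: self.DEFAULT for s in servers}
        self.table_group = {}

    def move_servers(self, servers, group):
        for s in servers:
            if s not in self.server_group:
                raise KeyError(f"unknown server: {s}")
            self.server_group[s] = group

    def move_tables(self, tables, group):
        for t in tables:
            self.table_group[t] = group

    def candidate_servers(self, table):
        """Servers allowed to host regions of `table` (coarse isolation)."""
        group = self.table_group.get(table, self.DEFAULT)
        return sorted(s for s, g in self.server_group.items() if g == group)


registry = GroupRegistry([f"rs{i}" for i in range(1, 7)])
registry.move_servers(["rs1", "rs2"], "group_a")
registry.move_servers(["rs3", "rs4"], "group_b")
registry.move_tables(["tenant_a:events"], "group_a")
registry.move_tables(["tenant_b:users"], "group_b")

print(registry.candidate_servers("tenant_a:events"))   # ['rs1', 'rs2']
print(registry.candidate_servers("unassigned_table"))  # ['rs5', 'rs6']
```

Tables that were never moved fall back to the default group, so a noisy tenant in `group_a` cannot land regions on servers reserved for `group_b`.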
9. Multi-tenancy at Scale
• 800 RegionServers
• 40 namespaces
• 40 RegionServer groups
• 4 to 100s of servers
• Up to 2000+ regions per server
• ~1 week rolling upgrade
10. Scaling to 10s of PBs (and Beyond)
• Scale to Millions of Regions (and Beyond)
• Avoid large regions
• Data Locality
• Network utilization
• Datanode load
• Performance
11. Filesystem Layout
• Region directories under table directory
• HDFS data structure bottleneck
• Namenode Hard Limit of ~6.7 Million
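To see why a flat layout runs into a per-directory limit, the sketch below counts how many region directories share a single parent under two layouts. The bucketed scheme (an extra directory level named after the last characters of the region's encoded name) is an assumption for illustration, since the slide only states the bottleneck. The function names, the SHA-256 stand-in for encoded names, and the example table path are hypothetical.

```python
import hashlib
from collections import Counter
from pathlib import PurePosixPath


def encoded_name(region_id: int) -> str:
    # Stand-in for a region's encoded name: a hex digest of its id.
    return hashlib.sha256(str(region_id).encode()).hexdigest()[:32]


def flat_path(table_dir: str, region: str) -> PurePosixPath:
    # Flat layout: every region directory is a direct child of the table directory.
    return PurePosixPath(table_dir) / region


def bucketed_path(table_dir: str, region: str, width: int = 2) -> PurePosixPath:
    # Assumed layout: an extra level named after the last `width` hex characters.
    return PurePosixPath(table_dir) / region[-width:] / region


def max_children(paths) -> int:
    # Largest number of region directories found under any single parent.
    return max(Counter(p.parent for p in paths).values())


table_dir = "/hbase/data/default/big_table"  # example path only
regions = [encoded_name(i) for i in range(100_000)]

flat = [flat_path(table_dir, r) for r in regions]
bucketed = [bucketed_path(table_dir, r) for r in regions]

print(max_children(flat))      # 100000: all regions share one parent
print(max_children(bucketed))  # a few hundred: regions spread over 256 buckets
```

With two hex characters there are 256 buckets, so the largest directory shrinks by roughly that factor while the table directory itself holds only the bucket directories.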
14. Performance Comparison
Test | 1M Regions | 5M Regions | 10M Regions
Normal Table | 20 mins | 4 hours 23 mins | DNF
Humongous | 15 mins 48 secs | 1 hour 27 mins | 2 hours 53 mins
Region directory creation time
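The relative speedup implied by these timings can be worked out directly. The sketch converts the reported durations to seconds and prints the ratio for each region count where both layouts finished. The helper name `seconds` is hypothetical, and `None` stands for the DNF entry.

```python
def seconds(hours=0, mins=0, secs=0):
    # Convert a duration from the slide into seconds.
    return hours * 3600 + mins * 60 + secs


# Region directory creation time, as reported on the slide.
normal = {
    "1M": seconds(mins=20),
    "5M": seconds(hours=4, mins=23),
    "10M": None,  # DNF
}
humongous = {
    "1M": seconds(mins=15, secs=48),
    "5M": seconds(hours=1, mins=27),
    "10M": seconds(hours=2, mins=53),
}

speedups = {}
for size, base in normal.items():
    if base is None:
        continue  # no ratio when the normal table did not finish
    speedups[size] = base / humongous[size]
    print(f"{size} regions: {speedups[size]:.2f}x faster")
# 1M regions: 1.27x faster
# 5M regions: 3.02x faster
```

The gap widens with region count, which is consistent with the normal table not finishing at 10M regions.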
15. ZK Region Assignment
▪ Lock Thrashing
▪ ZK bottlenecks
› List/Mutate Millions of Znodes
› Notification firehose
▪ State is kept in 3 places
› Cached in master
› Zookeeper
› Meta
Diagram labels: RS, Master, Zookeeper, Meta, Region 1, Region 2, RS
16. ZKLess Region Assignment
▪ ZK no longer involved
▪ Master approves all assignments
▪ State is persisted only in Meta
▪ State is updated by the Master
Diagram labels: Meta region, RS, Master, Region 1, Region 2, RS
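The flow on this slide can be sketched as a toy state machine in which the master is the only writer of region state and the meta table is the only place that state is persisted. This is not HBase's code: the state set is simplified, and `Master`, `assign`, `report_transition`, and the `meta` dictionary are hypothetical names used for illustration.

```python
from enum import Enum


class State(Enum):
    OFFLINE = "OFFLINE"
    OPENING = "OPENING"
    OPEN = "OPEN"
    CLOSING = "CLOSING"


# Transitions the master is willing to approve (simplified for illustration).
ALLOWED = {
    (State.OFFLINE, State.OPENING),
    (State.OPENING, State.OPEN),
    (State.OPENING, State.OFFLINE),  # open failed
    (State.OPEN, State.CLOSING),
    (State.CLOSING, State.OFFLINE),
}


class Master:
    """Toy master: the only writer of region state, persisted in `meta`."""

    def __init__(self, meta):
        self.meta = meta  # region name -> (State, owning server or None)

    def assign(self, region, server):
        self._transition(region, State.OPENING, server)

    def report_transition(self, region, server, new_state):
        """Called by a RegionServer; the master validates before persisting."""
        _, owner = self.meta[region]
        if owner != server:
            return False  # report from a server that does not own the region
        try:
            self._transition(region, new_state, server)
        except ValueError:
            return False
        return True

    def _transition(self, region, new_state, server):
        current, _ = self.meta.get(region, (State.OFFLINE, None))
        if (current, new_state) not in ALLOWED:
            raise ValueError(f"{region}: {current.name} -> {new_state.name} rejected")
        owner = None if new_state is State.OFFLINE else server
        self.meta[region] = (new_state, owner)


meta = {}
master = Master(meta)
master.assign("region-1", "rs-a")
ok = master.report_transition("region-1", "rs-a", State.OPEN)
stale = master.report_transition("region-1", "rs-b", State.CLOSING)
print(ok, stale, meta["region-1"][0].name)  # True False OPEN
```

Because every change goes through one validation point and one store, there is no second or third copy of the state to reconcile.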
20. Performance Comparison
Configuration | Scan Meta | Assignment | Total
1 Meta / 1 RS | 56min | 19.79min | 75.79min
1 Meta / 1 RS | 58.63min | 28.16min | 86.79min
32 Meta / 3 RS | 2.91min | 12.56min | 15.47min
32 Meta / 3 RS | 3.6min | 12.54min | 16.4min
Assignment Time for 3 Million Regions
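A quick calculation shows where the gain comes from. The sketch averages the two runs reported for each configuration and compares the Scan Meta and Total columns as given on the slide. Reading "32 Meta / 3 RS" as 32 meta regions spread over 3 RegionServers is an assumption, and the variable names are hypothetical.

```python
# (meta regions, RegionServers hosting meta) -> runs of
# (scan meta, assignment, total), all in minutes, copied from the slide.
runs = {
    (1, 1): [(56.0, 19.79, 75.79), (58.63, 28.16, 86.79)],
    (32, 3): [(2.91, 12.56, 15.47), (3.6, 12.54, 16.4)],
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


avg_scan = {cfg: mean(r[0] for r in rows) for cfg, rows in runs.items()}
avg_total = {cfg: mean(r[2] for r in rows) for cfg, rows in runs.items()}

scan_speedup = avg_scan[(1, 1)] / avg_scan[(32, 3)]
total_speedup = avg_total[(1, 1)] / avg_total[(32, 3)]
print(f"scan meta: {scan_speedup:.1f}x, total: {total_speedup:.1f}x")
# scan meta: 17.6x, total: 5.1x
```

Most of the improvement is in scanning meta; the assignment phase itself shrinks far less, so it dominates the total in the 32-meta configuration.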
21. Data Locality
▪ HDFS
› Hadoop Distributed Filesystem
▪ Region Server
› Serves Regions
› Locality of a Region’s Data blocks
22. Favored Nodes
▪ HDFS
› Dictate block placement on file creation
▪ HBase
› Partially completed in Apache HBase
› Select 3 favored nodes per Region
› 1 node on-rack, 2 nodes off-rack
› Restrict Region Assignment
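The selection rule on this slide can be sketched as follows. This is a minimal illustration, not HBase's favored-node code: the function names and the topology are hypothetical, and placing both off-rack nodes on the same remote rack is an assumption about how "2 nodes off-rack" is meant.

```python
import random


def pick_favored_nodes(topology, rng=random):
    """Pick 3 favored nodes: 1 on a primary rack, 2 on one other rack.

    `topology` maps rack name -> list of server names.
    """
    primary_racks = [r for r, servers in topology.items() if servers]
    remote_candidates = [r for r, servers in topology.items() if len(servers) >= 2]
    # Try primary racks in random order until a valid remote rack exists.
    for primary_rack in rng.sample(primary_racks, len(primary_racks)):
        remotes = [r for r in remote_candidates if r != primary_rack]
        if remotes:
            primary = rng.choice(topology[primary_rack])
            secondary, tertiary = rng.sample(topology[rng.choice(remotes)], 2)
            return [primary, secondary, tertiary]
    raise ValueError("need one rack with a server and another rack with two")


def allowed_servers(favored, live_servers):
    # Restrict assignment: a region may only go to a live favored node.
    return [s for s in favored if s in live_servers]


topology = {
    "rack1": ["rs1", "rs2"],
    "rack2": ["rs3", "rs4"],
    "rack3": ["rs5"],
}
favored = pick_favored_nodes(topology, random.Random(7))
print(favored)
print(allowed_servers(favored, live_servers={"rs1", "rs3", "rs4", "rs5"}))
```

Restricting assignment to these three nodes means a region that moves still lands on a server that already holds a local replica of its blocks, provided HDFS placed the blocks on the same favored nodes at file creation.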