This document discusses Bloomberg's use of HBase for its "medium data" needs in serving financial customers. It outlines challenges including speed, availability, and supporting many users. Bloomberg has invested in HBase to address these, seeing it as applicable to its "medium data" volumes. The document calls for continued work to further bolster HBase's reliability, take advantage of modern hardware, improve multi-tenancy, integrate Spark, and increase analytics efficiency. It expresses optimism for HBase's role in data platforms going forward.
5. HBASE AT BLOOMBERG //
DATA SCIENCE FOR SOCIAL GOOD: GOVERNMENT INNOVATION, PUBLIC HEALTH, ENVIRONMENT, EDUCATION
• September 28: Full Workshop at Bloomberg
• September 30: Showcase at Strata Hadoop
• Call for papers at: bloomberglabs.com/data-science
6. HBASE AT BLOOMBERG //
DATA MANAGEMENT TODAY
• We have a “medium data” problem…
• Speed and availability are paramount
• Hundreds of thousands of users with expensive requests
We’ve built many systems to address these needs
7. HBASE AT BLOOMBERG //
DATA MANAGEMENT CHALLENGES
• Single security analytics on Big Iron
• Replication of Systems and Data
• Complexity kills
Top 500 Supercomputer list, 2013
>96% Linux. 100% of top 40.
9. HBASE AT BLOOMBERG //
THE PREMISE
• We can apply big data techniques to our medium data problem by addressing gaps in existing open systems
• HBase is a good bet
• Part of a broader whole
• The biggest community wins
10. HBASE AT BLOOMBERG //
CHALLENGES
Our requirements from HBase are:
• Read performance – fast with low variability
• High availability
• Operational simplicity
• Efficient use of good hardware
• Expressive power
Bloomberg has been investing in all these aspects of HBase
21. HBASE AT BLOOMBERG //
THE FUTURE IS BRIGHT
• The state of the “Hadoop Database” union is strong
– Increasing adoption
– Strong foundation
– Great community
• Prominent role in the data & analytics platform of the future
• Let’s go create the future