Welcome!
Michael Stack, Software Engineer, Cloudera & HBase PMC Chair
9:00-9:05am
Conference MC Michael Stack, Chair of the HBaseCon 2013 Program Committee, welcomes you to the conference and offers a preview of the day.
The Apache HBase Community: Best Ever and Getting Better
Amr Awadallah, CTO and Co-founder, Cloudera
9:05-9:15am
Amr comments on the explosion of interest in Apache HBase over the past few years, how that interest has influenced the Hadoop stack overall, and why Cloudera considers its involvement in the HBase community to be so important.
State of the Apache HBase Union
Michael Stack & Lars Hofhansl, Architect, Salesforce.com
9:15-9:40am
Release-managers-in-crime Michael and Lars offer a look back, and a look forward, at HBase releases and what they have brought us (and will bring us in the future).
The Apache HBase Ecosystem
Aaron Kimball, Chief Architect, WibiData
9:40-10:05am
Today, HBase stands as Apache Hadoop did years ago: a project with a growing and vibrant community in its own right. In this talk, Aaron will give an overview of some of the projects built on top of HBase that you’ll get a chance to learn about during the day, each of which grew out of a need to use HBase for an application that requires real-time atomic access to data. As an example, he’ll present the motivations for Kiji and how it is helping organizations create amazing new applications using HBase and Hadoop.
Overview of Apache HBase at Facebook (Slides Not Available)
Liyin Tang, Software Engineer, Facebook & HBase PMC Member
10:05-10:30am
In this keynote, you’ll get an overview of how HBase is used at Facebook. Explore Facebook’s applications using HBase as an OLTP service, which require high reliability, efficiency, and scalability, and how HBase can tolerate small network glitches and rack failures. You’ll also learn the use cases for adopting HBase as a batch processing service and various optimizations to scale processing throughput. Finally, learn Facebook’s thoughts about the future of HBase.
3. Goals of HBaseCon 2013
• Bring the Apache HBase community together
• Encourage contributions to the HBase ecosystem
• Share challenges and solutions for HBase
5. HBaseCon 2013 Program Committee
Gary Helmling, Engineer
Lars Hofhansl, Architect
Jonathan Hsieh, Software Engineer
Doug Meil, Chief Software Architect
Andrew Purtell, Systems Architect
Enis Söztutar, Member of Technical Staff
Michael Stack, Software Engineer – Chair
Liyin Tang, Software Engineer
6. Thank You to Our Sponsors
Community Sponsor
Conference Sponsors
Media Sponsors
8. Conference Notes
• Please fill out the overall conference survey
• Reception is 5:40pm – 8:00pm in the Yerba Buena Foyer
• Connecting to the internet
– Wireless network = Marriott Conference
– Passcode = db075b
9. Hosted by
The Apache HBase Community: Best Ever and Getting Better
Amr Awadallah, CTO and Co-founder, Cloudera
@awadallah
10. The Apache HBase Community Has Never Been Healthier
[Charts: JIRA Activity; Commits Activity]
12. The HBase Ecosystem is Rich and Expanding
HBaseCon 2013 speakers from these companies this year
(logos below the dotted line are net-new from 2012!)
13. Top 5 Reasons Cloudera Loves HBase
• Its vibrant community is a benchmark for the entire Apache Hadoop ecosystem.
• It’s a first-class citizen inside the Hadoop stack.
• It allows us to offer support services for which a lot of customers will pay good money.
• It draws top-drawer engineering talent to Cloudera.
• It gives us an excuse to host this tremendous conference and throw a big party for the community!
16. Hosted by
State of the Apache HBase Union
Michael Stack, Software Engineer, Cloudera & HBase PMC Chair and Lars Hofhansl, Architect, Salesforce.com
17. We are your Release Managers!
• Mr. (0.94.x) Lars Hofhansl
• Michael Stack (0.95.x/0.96.x)
51. Tests
• Cluster test module
• Standalone or cluster
• Sizeable
• x data
• x runtime
• “Borrows” test types from all over
• Netflix “ChaosMonkey”
• Apache Accumulo linked-list dataloss checker
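The linked-list data-loss check mentioned above can be illustrated with a small sketch. It assumes nothing about the actual HBase or Accumulo test code: a plain Python dict stands in for the table, and the function names `write_linked_lists` and `verify` are hypothetical. Each row stores the key of the row written before it, so any key that is referenced but absent from the table points to lost data.

```python
import random


def write_linked_lists(table, num_lists, length, rng):
    """Write `num_lists` chains; each row stores the key of the previous row."""
    for _ in range(num_lists):
        prev = None  # the first row of a chain references nothing
        for _ in range(length):
            key = rng.getrandbits(64)
            while key in table:  # avoid accidental key collisions
                key = rng.getrandbits(64)
            table[key] = prev
            prev = key


def verify(table):
    """Return (missing, unreferenced) key sets.

    missing: keys that some row points to but that are not in the table,
             which indicates lost data.
    unreferenced: rows nothing points to; one per chain is expected
                  (the last row written in each chain).
    """
    referenced = {prev for prev in table.values() if prev is not None}
    present = set(table)
    return referenced - present, present - referenced
```

On a healthy table `verify` reports no missing keys and exactly one unreferenced row per chain. Deleting any referenced row makes that key show up in the missing set, which is the signal a real checker would flag after a fault-injection run.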
56. kiji.org
• Entity-centric, simple model
• Types, complex, compound types
• Each cell is schema versioned
• Works across MR & REST, etc.
• Production users
• Open-source
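To make "each cell is schema versioned" concrete, here is a minimal sketch under stated assumptions. It is not Kiji's API: the `SCHEMAS` registry, `encode_cell`, and `decode_cell` are hypothetical names, and JSON stands in for a real binary encoding. The idea shown is only that a cell carries the ID of the schema it was written with, so a reader using a different schema version can still interpret it.

```python
import json

# Hypothetical schema registry: schema ID -> list of (field name, default value).
SCHEMAS = {
    1: [("name", None)],
    2: [("name", None), ("email", "")],
}


def encode_cell(schema_id, record):
    """Serialize a record together with the ID of the schema used to write it."""
    fields = [name for name, _ in SCHEMAS[schema_id]]
    payload = {name: record[name] for name in fields}
    return json.dumps({"schema_id": schema_id, "payload": payload}).encode("utf-8")


def decode_cell(cell, reader_schema_id):
    """Read a cell with the reader's schema.

    Fields the writer did not know about get the reader's default;
    fields the reader does not know about are dropped.
    """
    stored = json.loads(cell.decode("utf-8"))
    payload = stored["payload"]
    return {
        name: payload.get(name, default)
        for name, default in SCHEMAS[reader_schema_id]
    }
```

A cell written with schema 1 and read with schema 2 comes back with `email` filled from its default, while a cell written with schema 2 and read with schema 1 simply omits `email`.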
62. Related: QoS
Next
• Latency resilience/“Latency tolerance”*
• Bring home the outliers
* http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/people/jeff/Berkeley-Latency-Mar2012.pdf
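One latency-tolerance technique discussed in the talk cited above is the backup (hedged) request: if the first replica has not answered after a short delay, ask a second one and take whichever answers first. The sketch below is illustrative only and says nothing about how HBase does or will implement this. `hedged_call`, `slow_replica`, and `fast_replica` are hypothetical names, and the replicas are plain Python callables.

```python
import concurrent.futures as cf
import time


def hedged_call(replicas, hedge_delay):
    """Call replicas[0]; if it has not answered within `hedge_delay` seconds,
    also call replicas[1]. Return the first result to arrive."""
    pool = cf.ThreadPoolExecutor(max_workers=2)
    try:
        futures = [pool.submit(replicas[0])]
        done, _ = cf.wait(futures, timeout=hedge_delay)
        if not done:
            # Primary is slow: send the backup request and race the two.
            futures.append(pool.submit(replicas[1]))
            done, _ = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        return next(iter(done)).result()
    finally:
        # Do not block on the straggler; its result is simply ignored.
        pool.shutdown(wait=False)


def slow_replica():
    time.sleep(0.5)  # simulated outlier
    return "slow"


def fast_replica():
    return "fast"
```

With a 50 ms hedge delay, `hedged_call([slow_replica, fast_replica], 0.05)` returns the fast replica's answer instead of waiting out the outlier. The cost is extra load from the backup requests, which is why the delay is usually set near a high latency percentile.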
76. Big Data Apps are hard to build
• Serialization & versioning
• Deployment
• Communication between teams
• Front end, back end, short request, batch, real time…
Every Java developer should be able to build Big Data Apps – today it’s too hard.
77. Kiji
Kiji is designed to help you build real-time Big Data Applications on Apache HBase
100% Apache 2 licensed
79. Leading design decisions
• Store your data in HBase
• Encode it using Avro
• An entity-centric table design
• Manage a data dictionary around tables
• Distribute writes across the cluster
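A common way to combine an entity-centric design with evenly distributed writes is to prefix each row key with a few bytes of a hash of the entity ID. The sketch below shows that general technique under stated assumptions. It is not taken from Kiji's source: the function name `row_key`, the choice of MD5, and the two-byte prefix are all illustrative.

```python
import hashlib


def row_key(entity_id: str, prefix_bytes: int = 2) -> bytes:
    """Build a row key of the form <hash prefix><entity id>.

    The hash prefix scatters lexicographically adjacent entity IDs
    (user-1, user-2, ...) across the key space, so sequential writes do
    not all land on the same region. MD5 is used here only as a cheap,
    stable hash, not for security.
    """
    raw = entity_id.encode("utf-8")
    digest = hashlib.md5(raw).digest()
    return digest[:prefix_bytes] + raw
```

All data for one entity still lives under a single key, so point reads stay cheap. The trade-off is that range scans over consecutive entity IDs are no longer possible, because the prefix destroys their ordering.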
80. Key features
• Work with big data in rich types with schema evolution
• Guides users to successful application design
• Scala-based modeling language
• Integration with front-end systems
• Deployment of real-time model scoring
81. Kiji
• Go to kiji.org and download the BentoBox
– Zero-config Hadoop + HBase + Kiji instance
– “Batteries included”
• 15-minute quickstart guide and a tutorial with full source code
82. Come attend!
Want a deep dive on Kiji? KijiCon is tomorrow!
A 1-day workshop of tutorials & hacking
Register @ kijicon.eventbrite.com
83. Conclusions
• Each month shows new peak interest in HBase
• The ecosystem is growing
• Open source technologists are working hand in hand to make HBase more accessible
• We’d love your help in the community!