Apache HBase 1.0 Release
Nick Dimiduk, Hortonworks
@xefyr    n10k.com
February 20, 2015
  
Release 1.0

“The theme of (eventual) 1.0 release is to become a stable base for future 1.x series of releases. 1.0 release will aim to achieve at least the same level of stability of 0.98 releases without introducing too many new features.”

Enis Söztutar
HBase 1.0 Release Manager
  
Agenda
•  A Brief History of HBase
•  What is HBase
•  Major Changes for 1.0
•  Upgrade Path
  
A BRIEF HISTORY OF HBASE
How we got here
  
The Early Years
•  2006: BigTable paper published by Google
•  2006: HBase development starts
•  2007: HBase added to Hadoop contrib
•  2007: Release Hadoop 0.15.0
•  2008: Hadoop graduates Incubator
•  2008: HBase becomes Hadoop sub-project
•  2008: Release HBase 0.18.1
•  2009: Release HBase 0.19.0
•  2009: Release HBase 0.20.0
  
Into Production
•  2010: HBase becomes Apache top-level project
•  2011: Release HBase 0.90.0
•  2011: Release HBase 0.92.0
•  2011: HBase: The Definitive Guide published
•  2012: Release HBase 0.94.0
•  2012: First HBaseCon
•  2012: HBase Administration Cookbook published
•  2012: HBase in Action published
  
Modern HBase
•  2013: HBaseCon 2013
•  2013: Release HBase 0.96.0
•  2013: Apache Phoenix enters Incubator
•  2014: Release HBase 0.98.0
•  2014: HBaseCon 2014
•  2014: Apache Phoenix graduates Incubator
•  2015: Release HBase 1.0
…
•  2016: Release HBase 2.0?
  
WHAT IS HBASE
HBase architecture in 5 minutes or less
  
Data Model

[Figure: Table A, with rows such as "a" and column families cf1 and cf2. Each cell is addressed by rowkey, column family, column qualifier, and timestamp, and holds a value. Qualifiers shown include "foo", "bar", "thumb", "2011-07-04", and 1.0001. Timestamped values shown include 1368394261 "hello", 1368393847 "world", 1368394583 7, 1368394583 22, 1368394925 13.6, 1368387684 "almost the loneliest number", 1368396302 "fourth of July", and 1368387247 [3.6 kb png data].]
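
As a rough illustration of the data model above, a table can be pictured as a nested map: rowkey, then column family, then column qualifier, then timestamp, then value. The Python sketch below is not HBase code. The helpers `new_table`, `put`, and `get_latest` and the sample cells are hypothetical, chosen only to echo the labels in the figure.

```python
from collections import defaultdict


def new_table():
    # rowkey -> column family -> qualifier -> {timestamp: value}
    return defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))


def put(table, rowkey, family, qualifier, timestamp, value):
    """Store one cell version; other versions of the same cell are kept."""
    table[rowkey][family][qualifier][timestamp] = value


def get_latest(table, rowkey, family, qualifier):
    """Return (timestamp, value) for the newest version, or None if absent."""
    versions = table.get(rowkey, {}).get(family, {}).get(qualifier, {})
    if not versions:
        return None
    newest = max(versions)
    return newest, versions[newest]


# Illustrative cells only; the mapping of values to qualifiers is assumed.
table = new_table()
put(table, "a", "cf1", "foo", 1368393847, "world")
put(table, "a", "cf1", "foo", 1368394261, "hello")
put(table, "a", "cf1", "bar", 1368394583, 7)

print(get_latest(table, "a", "cf1", "foo"))
```

Keeping every timestamped version under the same qualifier is what lets a read ask either for the newest value or for an older one.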
Logical Architecture

[Figure: Table A holds rows a through p in sorted order, divided into Region 1 through Region 4. The regions are spread across region servers alongside regions of other tables:
Region Server 7: Table A, Region 1; Table A, Region 2; Table G, Region 1070; Table L, Region 25
Region Server 86: Table A, Region 3; Table C, Region 30; Table F, Region 160; Table F, Region 776
Region Server 367: Table A, Region 4; Table C, Region 17; Table E, Region 52; Table P, Region 1116]
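
One way to picture how a rowkey maps to a region and then to a server is a binary search over sorted region start keys. The sketch below is a simplification, not HBase's lookup code. The region boundaries (a to d, e to h, i to l, m to p) are an assumption, since the figure does not state them. The server assignment for Table A follows the figure, and `locate` is a hypothetical helper.

```python
import bisect

# Start key of each region of Table A, in sorted order. The first region
# starts at the empty key so that every rowkey falls into some region.
# These boundaries are assumed for illustration.
REGION_START_KEYS = ["", "e", "i", "m"]
REGION_NAMES = ["Region 1", "Region 2", "Region 3", "Region 4"]

# Region-to-server assignment for Table A as drawn in the figure.
ASSIGNMENT = {
    "Region 1": "Region Server 7",
    "Region 2": "Region Server 7",
    "Region 3": "Region Server 86",
    "Region 4": "Region Server 367",
}


def locate(rowkey):
    """Return (region, server) for the region whose key range holds rowkey."""
    # bisect_right finds the first start key greater than rowkey;
    # the region just before that position owns the key.
    index = bisect.bisect_right(REGION_START_KEYS, rowkey) - 1
    region = REGION_NAMES[index]
    return region, ASSIGNMENT[region]


print(locate("k"))
```

Because rows are kept in sorted order, a single search over the start keys is enough to find the owning region.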
Physical Architecture

[Figure: HBase Client, Master, ZooKeeper, and NameNode, plus a row of hosts, each pairing a RegionServer (HBase layer) with a DataNode (HDFS layer). The background is a book-page excerpt, reproduced once below.]

Given that the underlying data is stored in HDFS, which is available to all clients as a single namespace, all RegionServers have access to the same persisted files in the file system and can therefore host any region (figure 3.8). By physically collocating DataNodes and RegionServers, you can use the data locality property; that is, RegionServers can theoretically read and write to the local DataNode as the primary DataNode.
You may wonder where the TaskTrackers are in this scheme of things. In some HBase deployments, the MapReduce framework isn’t deployed at all if the workload is primarily random reads and writes. In other deployments, where the MapReduce processing is also a part of the workloads, TaskTrackers, DataNodes, and HBase RegionServers can run together.

Figure 3.7 HBase RegionServer and HDFS DataNode processes are typically collocated on the same host

MAJOR CHANGES FOR 1.0
What’s all the excitement about?
  
Stability: Co-Locate Meta with Master
•  Simplify, improve region assignment reliability
–  Fewer components involved in updating “truth”
•  Master embeds a RegionServer
–  Will host only system tables
–  Baby step towards combining RS/Master into a single hbase daemon
•  Backup masters unchanged
–  Can be configured to host user tables while in standby
•  Plumbing is all there, off by default

http://issues.apache.org/jira/browse/HBASE-10569
  
Availability: Region Replicas
•  Multiple RegionServers host a Region
–  One is “primary”, others are “replicas”
–  Only primary accepts writes
•  Client reads against primary only or any
–  Results marked as appropriate
•  Baby step toward quorum reads, writes

http://issues.apache.org/jira/browse/HBASE-10070
http://www.slideshare.net/HBaseCon/features-session-1
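
The read and write rules on this slide can be sketched with a toy model: writes go only to the primary, reads may target the primary or any replica, and a result served by a replica is marked as possibly stale. This is a simplified illustration, not HBase's implementation. `ReplicatedRegion`, `sync_replicas`, and the `stale` flag are hypothetical names.

```python
class ReplicatedRegion:
    """Toy model: one primary copy plus read-only replicas that may lag."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, key, value):
        # Only the primary accepts writes.
        self.primary[key] = value

    def sync_replicas(self):
        # Stand-in for whatever mechanism brings replicas up to date.
        for replica in self.replicas:
            replica.clear()
            replica.update(self.primary)

    def read(self, key, primary_only=True, replica_index=0):
        """Return (value, stale); stale is True when a replica answered."""
        if primary_only:
            return self.primary.get(key), False
        return self.replicas[replica_index].get(key), True


region = ReplicatedRegion(replica_count=2)
region.write("row1", "v1")
region.sync_replicas()
region.write("row1", "v2")  # replicas have not caught up yet

print(region.read("row1"))
print(region.read("row1", primary_only=False))
```

The trade-off is that a replica read can still be answered when the primary is unavailable, at the cost of possibly returning an older value, which is why the result carries a marker.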
  
Usability: Client API Cleanup
•  Improved self-consistency
•  Simpler semantics
•  Easier to maintain
•  Obvious @InterfaceAudience annotations

http://issues.apache.org/jira/browse/HBASE-10602
http://s.apache.org/hbase-1.0-api
https://github.com/ndimiduk/hbase-1.0-api-examples
  
New and Noteworthy
•  Greatly expanded hbase.apache.org/book.html
•  Truncate table shell command
•  Automatic tuning of global MemStore and BlockCache sizes
•  Basic backpressure mechanism
•  BucketCache easier to configure
•  Compressed BlockCache
•  Pluggable replication endpoint
•  A Dockerfile to easily run HBase from source
  
Under the Covers
•  ZooKeeper abstractions
•  Meta table used for assignment
•  Cell-based read/write path
•  Combining mvcc/seqid
•  Sundry security, tags, labels improvements
  
Groundwork for 2.0
•  More, Smaller Regions
–  Millions, 1G or less
–  Less write amplification
–  Splitting hbase:meta
•  Performance
–  More off-heap
–  Less resource contention
–  Faster region failover/recovery
–  Multiple WALs
–  QoS/Quotas/Multi-tenancy
•  Rigging
–  Faster, more intelligent assignment
–  Procedure bus
–  Resumable, query-able operations
•  Other possibilities
–  Quorum/consensus reads, writes?
–  Hydrabase, multi-DC consensus?
–  Streaming RPCs?
–  High level coprocessor API
  
Semantic Versioning
•  Major/Minor/Patch version numbers
–  Only major/minor pre-1.0
•  Dimensions
–  Client/Server wire compatibility
–  Server/Server wire and feature compatibility
–  API compatibility
–  ABI compatibility
•  Proposal up for a vote

http://s.apache.org/hbase-semver
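
To make the major/minor/patch scheme concrete, the sketch below parses a three-part version string and reports which component changed between two releases. It deliberately says nothing about which compatibility dimension holds at which level, since the slide describes that as a proposal still up for a vote. `parse_version` and `change_level` are hypothetical helpers.

```python
def parse_version(text):
    """Split 'major.minor.patch' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def change_level(old, new):
    """Name the most significant component that differs between versions."""
    old_parts, new_parts = parse_version(old), parse_version(new)
    for name, a, b in zip(("major", "minor", "patch"), old_parts, new_parts):
        if a != b:
            return name
    return "none"


print(change_level("1.0.0", "1.0.1"))
print(change_level("1.0.0", "1.1.0"))
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where "1.10.0" sorts before "1.9.0".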
  
UPGRADE PATH
Tell it to me straight, how bad is it?
  
Online/Wire Compatibility
•  Direct migration from 0.94 supported
–  Looks a lot like upgrade from 0.94 to 0.96: requires downtime
–  Not tested yet, will be before release
•  RPC is backward-compatible to 0.96
–  Enables mixing clients and servers across versions
–  So long as no new features are enabled
•  Rolling upgrade "out of the box" from 0.98
•  Rolling upgrade "with some massaging" from 0.96
–  i.e., 0.96 cannot read HFileV3, the new default
–  Not tested yet, will be before release
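
The upgrade routes listed on this slide can be written down as a small lookup keyed by release line. The route descriptions are transcribed from the slide. `release_line` and `upgrade_route` are hypothetical helpers, and the sketch is not an official upgrade tool.

```python
# Upgrade route to 1.0 by source release line, transcribed from the slide.
UPGRADE_ROUTES = {
    "0.94": "direct migration (requires downtime)",
    "0.96": "rolling upgrade with some massaging",
    "0.98": "rolling upgrade out of the box",
}


def release_line(version):
    """Reduce a version such as '0.98.10' to its release line '0.98'."""
    return ".".join(version.split(".")[:2])


def upgrade_route(version):
    """Look up the route; None means the slide does not cover that line."""
    return UPGRADE_ROUTES.get(release_line(version))


print(upgrade_route("0.98.10"))
```

Returning `None` for release lines the slide does not mention keeps the sketch from implying a route that was never stated.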
  
Client Application Compatibility
•  API is backward compatible to 0.96
–  No code change required
–  You’ll start getting new deprecation warnings
–  We recommend you start using new APIs
•  ABI is NOT backward compatible
–  Cannot drop current application jars onto new runtime
–  Recompile your application vs. 1.0 jars
–  Just like 0.96 to 0.98 upgrade
  
Hadoop Versions
•  Hadoop 1.x is NOT supported
–  Bite the bullet; you’ll enjoy the performance benefits
•  Hadoop 2.x only
–  Most thoroughly tested on 2.4.x, 2.5.x
–  Probably works on 2.2.x, 2.3.x, but less thoroughly tested

https://hbase.apache.org/book/configuration.html#hadoop
  
Java Versions
•  JDK 6 is NOT supported!
•  JDK 7 is the target runtime
•  JDK 8 support is experimental

https://hbase.apache.org/book/configuration.html#hadoop
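
The Hadoop and Java requirements from these two slides could be encoded as a simple pre-upgrade check. The tiers below are taken from the slides. `hadoop_status` and `jdk_status` are hypothetical helpers, and versions the slides do not mention are reported as not covered rather than guessed.

```python
def hadoop_status(version):
    """Classify a Hadoop version string using the tiers listed on the slide."""
    major, minor = (int(part) for part in version.split(".")[:2])
    if major != 2:
        # The slide says "Hadoop 2.x only".
        return "not supported"
    if minor in (4, 5):
        return "most thoroughly tested"
    if minor in (2, 3):
        return "probably works, less thoroughly tested"
    return "not covered by the slide"


def jdk_status(major):
    """Classify a JDK major version (6, 7, 8) as described on the slide."""
    if major == 6:
        return "not supported"
    if major == 7:
        return "target runtime"
    if major == 8:
        return "experimental"
    return "not covered by the slide"


print(hadoop_status("2.5.1"), "/", jdk_status(7))
```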
  
1.0.0 RCs Available Now!
•  Release Candidate voting has commenced
•  Last chance to catch show-stopping bugs

RELEASE CANDIDATES NOT FOR PRODUCTION USE

•  Try out the new features
•  Help us test your upgrade path
•  Be a part of history in the making!
•  1.0.0rc5 available 2015-02-19

http://search-hadoop.com/m/DHED40Ih5n
  
Thanks!

[Book cover: Manning; Nick Dimiduk, Amandeep Khurana; foreword by Michael Stack]
hbaseinaction.com

Nick Dimiduk
github.com/ndimiduk
@xefyr
n10k.com

http://www.apache.org/dyn/closer.cgi/hbase/
  

 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 

Último (20)

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 

Apache HBase 1.0 Release

  • 1. Apache HBase 1.0 Release
      Nick Dimiduk, Hortonworks
      @xefyr, n10k.com
      February 20, 2015
  • 2. Release 1.0
      “The theme of (eventual) 1.0 release is to become a stable base for future 1.x series of releases. 1.0 release will aim to achieve at least the same level of stability of 0.98 releases without introducing too many new features.”
      Enis Söztutar, HBase 1.0 Release Manager
  • 3. Agenda
      • A Brief History of HBase
      • What is HBase
      • Major Changes for 1.0
      • Upgrade Path
  • 4. A BRIEF HISTORY OF HBASE
      How we got here
  • 5. The Early Years
      • 2006: BigTable paper published by Google
      • 2006: HBase development starts
      • 2007: HBase added to Hadoop contrib
      • 2007: Release Hadoop 0.15.0
      • 2008: Hadoop graduates Incubator
      • 2008: HBase becomes Hadoop sub-project
      • 2008: Release HBase 0.18.1
      • 2009: Release HBase 0.19.0
      • 2009: Release HBase 0.20.0
  • 6. Into Production
      • 2010: HBase becomes Apache top-level project
      • 2011: Release HBase 0.90.0
      • 2011: Release HBase 0.92.0
      • 2011: HBase: The Definitive Guide published
      • 2012: Release HBase 0.94.0
      • 2012: First HBaseCon
      • 2012: HBase Administration Cookbook published
      • 2012: HBase In Action published
  • 7. Modern HBase
      • 2013: HBaseCon 2013
      • 2013: Release HBase 0.96.0
      • 2013: Apache Phoenix enters Incubator
      • 2014: Release HBase 0.98.0
      • 2014: HBaseCon 2014
      • 2014: Apache Phoenix graduates Incubator
      • 2015: Release HBase 1.0
      …
      • 2016: Release HBase 2.0?
  • 8. WHAT IS HBASE
      HBase architecture in 5 minutes or less
  • 9. Data Model
      [Figure: Table A, with rows grouped into column families (cf1, cf2). Each cell is addressed by rowkey, column family, column qualifier, and timestamp, and holds a value. Sample values include strings ("hello", "world"), numbers (7, 22, 13.6), and a 3.6 KB PNG stored under the qualifier "thumb".]
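The addressing scheme on this slide (rowkey, column family, column qualifier, timestamp, value) can be sketched as a nested map. This is a toy in-memory model only, not HBase's storage format or client API. The names `ToyTable`, `put`, and `get` are hypothetical, and the sample cells are illustrative rather than a faithful copy of the figure.

```python
from collections import defaultdict


class ToyTable:
    """Toy model of the logical data model (hypothetical, not the HBase API)."""

    def __init__(self):
        # rowkey -> (family, qualifier) -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, rowkey, family, qualifier, timestamp, value):
        """Store one versioned cell."""
        self._rows[rowkey][(family, qualifier)][timestamp] = value

    def get(self, rowkey, family, qualifier, timestamp=None):
        """Return the newest value at or before `timestamp` (newest overall if None)."""
        versions = self._rows.get(rowkey, {}).get((family, qualifier), {})
        candidates = [ts for ts in versions if timestamp is None or ts <= timestamp]
        if not candidates:
            return None
        return versions[max(candidates)]


# Illustrative cells: two versions of the same column in row "a".
table = ToyTable()
table.put("a", "cf1", "foo", 1368393847, "world")
table.put("a", "cf1", "foo", 1368394261, "hello")

latest = table.get("a", "cf1", "foo")                       # newest version
older = table.get("a", "cf1", "foo", timestamp=1368394000)  # version as of an earlier time
missing = table.get("b", "cf1", "foo")                      # no such cell
```

Keeping every version under its timestamp, rather than overwriting, is what lets a read ask for the value as of an earlier point in time.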
  • 10. Logical Architecture
      [Figure: Table A, with rows a through p, is divided into Region 1 through Region 4. Regions are spread across Region Servers:]
      • Region Server 7: Table A, Region 1; Table A, Region 2; Table G, Region 1070; Table L, Region 25
      • Region Server 86: Table A, Region 3; Table C, Region 30; Table F, Region 160; Table F, Region 776
      • Region Server 367: Table A, Region 4; Table C, Region 17; Table E, Region 52; Table P, Region 1116
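Finding the region that holds a row comes down to a search over sorted region start keys. The sketch below assumes Table A splits at "e", "i", and "m"; the slide does not state the boundaries, so those are hypothetical. The server assignment follows the figure. `locate` is an illustrative helper, not an HBase call.

```python
import bisect

# Start key of each region of Table A, sorted. The first region starts at the
# empty key so that it covers every row below the first split point.
# The split points ("e", "i", "m") are an assumption for illustration.
REGION_START_KEYS = ["", "e", "i", "m"]
REGION_NAMES = ["Region 1", "Region 2", "Region 3", "Region 4"]

# Region-to-server assignment for Table A as drawn on the slide.
REGION_SERVERS = {
    "Region 1": "Region Server 7",
    "Region 2": "Region Server 7",
    "Region 3": "Region Server 86",
    "Region 4": "Region Server 367",
}


def locate(rowkey):
    """Return (region, server) for the region whose key range contains rowkey."""
    # bisect_right finds the first start key greater than rowkey;
    # the region just before that position owns the row.
    idx = bisect.bisect_right(REGION_START_KEYS, rowkey) - 1
    region = REGION_NAMES[idx]
    return region, REGION_SERVERS[region]


f_location = locate("f")
m_location = locate("m")
```

Because rows are kept sorted by key, a binary search over start keys is enough; no per-row index is needed.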
  • 11. Physical Architecture
      [Figure: HBase Client, Master, ZooKeeper, Name Node, and repeated Region Server / Data Node pairs, grouped under the labels HBase and HDFS. The diagram is layered over a book page showing Figure 3.7, "HBase RegionServer and HDFS DataNode processes are typically collocated on the same host", and the passage below.]
      Given that the underlying data is stored in HDFS, which is available to all clients as a single namespace, all RegionServers have access to the same persisted files in the file system and can therefore host any region (figure 3.8). By physically collocating DataNodes and RegionServers, you can use the data locality property; that is, RegionServers can theoretically read and write to the local DataNode as the primary DataNode. You may wonder where the TaskTrackers are in this scheme of things. In some HBase deployments, the MapReduce framework isn’t deployed at all if the workload is primarily random reads and writes. In other deployments, where the MapReduce processing is also a part of the workloads, TaskTrackers, DataNodes, and HBase RegionServers can run together.
  • 12. MAJOR CHANGES FOR 1.0
      What’s all the excitement about?
  • 13. Stability: Co-Locate Meta with Master
      • Simplify, Improve region assignment reliability
        – Fewer components involved in updating “truth”
      • Master embeds a RegionServer
        – Will host only system tables
        – Baby step towards combining RS/Master into a single hbase daemon
      • Backup masters unchanged
        – Can be configured to host user tables while in standby
      • Plumbing is all there, off by default
      http://issues.apache.org/jira/browse/HBASE-10569
  • 14. Availability: Region Replicas
      • Multiple RegionServers host a Region
        – One is “primary”, others are “replicas”
        – Only primary accepts writes
      • Client reads against primary only or any
        – Results marked as appropriate
      • Baby step toward quorum reads, writes
      http://issues.apache.org/jira/browse/HBASE-10070
      http://www.slideshare.net/HBaseCon/features-session-1
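The read/write rules on this slide can be sketched with a toy model: writes go only to the primary, reads may go to the primary or to any replica, and replica results are marked as possibly stale. `ToyRegion`, `sync_replicas`, and the `(value, stale)` return shape are hypothetical and exist only for illustration; in the real Java client, the corresponding switches are, as far as I know, `Get.setConsistency(Consistency.TIMELINE)` and `Result.isStale()`.

```python
class ToyRegion:
    """Toy model of one region hosted as a primary plus read-only replicas."""

    def __init__(self, replica_count):
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, key, value):
        # Only the primary accepts writes.
        self.primary[key] = value

    def sync_replicas(self):
        # Stand-in for asynchronous propagation from primary to replicas.
        for replica in self.replicas:
            replica.clear()
            replica.update(self.primary)

    def read(self, key, replica_id=None):
        """Return (value, stale). Reads served by a replica are marked stale."""
        if replica_id is None:
            return self.primary.get(key), False
        return self.replicas[replica_id].get(key), True


region = ToyRegion(replica_count=2)
region.write("row1", "v1")

from_primary = region.read("row1")                   # strong read
before_sync = region.read("row1", replica_id=0)      # replica has not caught up yet
region.sync_replicas()
after_sync = region.read("row1", replica_id=0)       # caught up, still marked stale
```

The stale flag does not say the value is wrong, only that it came from a copy that may lag the primary, which is the trade made for higher read availability.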
  • 15. Usability: Client API Cleanup
      • Improved self-consistency
      • Simpler semantics
      • Easier to maintain
      • Obvious @InterfaceAudience annotations
      http://issues.apache.org/jira/browse/HBASE-10602
      http://s.apache.org/hbase-1.0-api
      https://github.com/ndimiduk/hbase-1.0-api-examples
  • 16. New and Noteworthy
      • Greatly expanded hbase.apache.org/book.html
      • Truncate table shell command
      • Automatic tuning of global MemStore and BlockCache sizes
      • Basic backpressure mechanism
      • BucketCache easier to configure
      • Compressed BlockCache
      • Pluggable replication endpoint
      • A Dockerfile to easily run HBase from source
  • 17. Under the Covers
      • ZooKeeper abstractions
      • Meta table used for assignment
      • Cell-based read/write path
      • Combining mvcc/seqid
      • Sundry security, tags, labels improvements
  • 18. Groundwork for 2.0
      • More, Smaller Regions
        – Millions, 1G or less
        – Less write amplification
        – Splitting hbase:meta
      • Performance
        – More off-heap
        – Less resource contention
        – Faster region failover/recovery
        – Multiple WALs
        – QoS/Quotas/Multi-tenancy
      • Rigging
        – Faster, more intelligent assignment
        – Procedure bus
        – Resumable, query-able operations
      • Other possibilities
        – Quorum/consensus reads, writes?
        – Hydrabase, multi-DC consensus?
        – Streaming RPCs?
        – High level coprocessor API
  • 19. Semantic Versioning
      • Major/Minor/Patch version numbers
        – Only major/minor pre-1.0
      • Dimensions
        – Client/Server wire compatibility
        – Server/Server wire and feature compatibility
        – API compatibility
        – ABI compatibility
      • Proposal up for a vote
      http://s.apache.org/hbase-semver
  • 20. UPGRADE PATH
      Tell it to me straight, how bad is it?
  • 21. Online/Wire Compatibility
      • Direct migration from 0.94 supported
        – Looks a lot like upgrade from 0.94 to 0.96: requires downtime
        – Not tested yet, will be before release
      • RPC is backward-compatible to 0.96
        – Enabled mixing clients and servers across versions
        – So long as no new features are enabled
      • Rolling upgrade "out of the box" from 0.98
      • Rolling upgrade "with some massaging" from 0.96
        – i.e., 0.96 cannot read HFileV3, the new default
        – Not tested yet, will be before release
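The upgrade options on this slide amount to a lookup keyed by the release series currently running. The sketch below encodes only what the slide states; the function name `upgrade_path` and the tuple layout are hypothetical, and the untested caveats noted on the slide still apply.

```python
# Release series -> (upgrade style, note), taken from the slide text.
UPGRADE_PATHS = {
    "0.94": ("offline", "direct migration, requires downtime"),
    "0.96": ("rolling", "with some massaging; 0.96 cannot read HFileV3"),
    "0.98": ("rolling", "out of the box"),
}


def upgrade_path(current_version):
    """Return (style, note) for upgrading to 1.0 from a version like '0.98.9'."""
    # Keep only the first two components, e.g. '0.98.9' -> '0.98'.
    series = ".".join(current_version.split(".")[:2])
    if series not in UPGRADE_PATHS:
        raise ValueError("no upgrade path listed for " + current_version)
    return UPGRADE_PATHS[series]


from_098 = upgrade_path("0.98.9")
from_094 = upgrade_path("0.94.26")
```

Series not mentioned on the slide raise an error rather than guessing a path.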
  • 22. Client Application Compatibility
      • API is backward compatible to 0.96
        – No code change required
        – You’ll start getting new deprecation warnings
        – We recommend you start using new APIs
      • ABI is NOT backward compatible
        – Cannot drop current application jars onto new runtime
        – Recompile your application vs. 1.0 jars
        – Just like 0.96 to 0.98 upgrade
  • 23. Hadoop Versions
      • Hadoop 1.x is NOT supported
        – Bite the bullet; you’ll enjoy the performance benefits
      • Hadoop 2.x only
        – Most thoroughly tested on 2.4.x, 2.5.x
        – Probably works on 2.2.x, 2.3.x, but less thoroughly tested
      https://hbase.apache.org/book/configuration.html#hadoop
  • 24. Java Versions
      • JDK 6 is NOT supported!
      • JDK 7 is the target runtime
      • JDK 8 support is experimental
      https://hbase.apache.org/book/configuration.html#hadoop
  • 25. 1.0.0 RCs Available Now!
      • Release Candidate voting has commenced
      • Last chance to catch show-stopping bugs
      RELEASE CANDIDATES NOT FOR PRODUCTION USE
      • Try out the new features
      • Help us test your upgrade path
      • Be a part of history in the making!
      • 1.0.0rc5 available 2015-02-19
      http://search-hadoop.com/m/DHED40Ih5n
  • 26. Thanks!
      [Book cover: Manning; Nick Dimiduk, Amandeep Khurana; foreword by Michael Stack]
      hbaseinaction.com
      Nick Dimiduk: github.com/ndimiduk, @xefyr, n10k.com
      http://www.apache.org/dyn/closer.cgi/hbase/

Editor's Notes

  1. Now with 1000% more Orca!
  2. Stable, reliable, performant
  3. Improving the distributed system “rigging” Consider enabling in highly volatile environments (like EC2)
  4. “paving the way for new features and 2.0”
  5. How to get from here to there
  6. hbaseugcf (43% off HBase in Action, all formats, valid through Nov 20)