© Hortonworks Inc. 2011
HBase and HDFS: Understanding file system usage in HBase
Enis Söztutar
enis [ at ] apache [dot] org
@enissoz
About Me
Architecting the Future of Big Data
• In the Hadoop space since 2007
• Committer and PMC Member in Apache HBase and Hadoop
• Working at Hortonworks as member of Technical Staff
• Twitter: @enissoz
Motivation
• HBase as a database depends on FileSystem for many things
• HBase has to work over HDFS, Linux & Windows
• HBase is the most advanced user of HDFS
• To tune IO performance, you have to understand how HBase does IO
MapReduce:
• Large files
• Few random seeks
• Batch oriented
• High throughput
• Failure handling at task level
• Computation moves to data
HBase:
• Large files
• A lot of random seeks
• Latency sensitive
• Durability guarantees with sync
• Computation generates local data
• Large number of open files
Agenda
• Overview of file types in HBase
• Durability semantics
• IO Fencing / Lease recovery
• Data locality
– Short circuit reads (SSR)
– Checksums
– Block Placement
• Open topics
HBase file types
Overview of file types
• Mainly three types of files in HBase
– Write Ahead Logs (a.k.a. WALs, logs)
– Data files (a.k.a. store files, hfiles)
– References / symbolic or logical links (0 length files)
• Every file is 3-way replicated
Overview of file types
/hbase/.archive
/hbase/.logs/
/hbase/.logs/server1,60020,1370043265148/
/hbase/.logs/server1,60020,1370043265148/server1%2C60020%2C1370043265148.1370050467720
/hbase/.logs/server1,60020,1370043265105/server1%2C60020%2C1370043265105.1370046867591
…
/hbase/.oldlogs
/hbase/usertable/0711fd70ce0df641e9440e4979d67995/family/449e2fa173c14747b9d2e5..
/hbase/usertable/0711fd70ce0df641e9440e4979d67995/family/9103f38174ab48aa898a4b..
/hbase/table1/565bfb6220ca3edf02ac1f425cf18524/f1/49b32d3ee94543fb9055..
/hbase/.hbase-snapshot/usertable_snapshot/0ae3d2a93d3cf34a7cd30../family/12f114..
…
(Callouts on the listing above: Write Ahead Logs, Data files, Links)
Data Files (HFile)
• Immutable once written
• Generated by flush or compactions (sequential writes)
• Read randomly (preads), or sequentially
• Big in size (from flush size up to tens of GBs)
• All data is in blocks (HFile blocks, not to be confused with HDFS blocks)
• Data blocks have target size:
– BLOCKSIZE in column family descriptor
– 64K by default
– Uncompressed and un-encoded size
• Index blocks (leaf, intermediate, root) have target size:
– hfile.index.block.max.size, 128K by default
• Bloom filter blocks have target size:
– io.storefile.bloom.block.size, 128K by default
Data Files (HFile version 2.x)
Data Files
• IO happens at block boundaries
– Random reads => disk seek + read whole block sequentially
– Read blocks are put into the block cache
– Leaf index blocks and bloom filter blocks also go to the block cache
• Use smaller block sizes for faster random-access
– Smaller read + faster in-block search
– Block index becomes bigger, more memory consumption
• Larger block sizes for faster scans
• Think about how many key values will fit in an average block
• Try compression and Data Block Encoding (PREFIX, DIFF, FAST_DIFF, PREFIX_TREE)
– Minimizes file sizes + on-disk block sizes
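The block size trade-off above can be made concrete with some back-of-the-envelope arithmetic. The sketch below is illustrative only: the 200-byte average KeyValue size and the 1 GB file size are assumptions, and it ignores compression and encoding (the block size target is the uncompressed, un-encoded size). The class and method names are hypothetical.

```java
final class BlockSizing {
    /** Approximate number of KeyValues that fit in one HFile data block. */
    static long keyValuesPerBlock(long blockSizeBytes, long avgKeyValueBytes) {
        return blockSizeBytes / avgKeyValueBytes;
    }

    /** Approximate number of data blocks in a store file, rounded up. */
    static long blocksPerFile(long fileSizeBytes, long blockSizeBytes) {
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long fileSize = 1L << 30; // 1 GB store file (assumed for illustration)
        long avgKv = 200;         // assumed average KeyValue size in bytes
        for (long blockSize : new long[] {16 << 10, 64 << 10, 256 << 10}) {
            System.out.printf("block=%dK kvs/block=%d blocks/file=%d%n",
                blockSize >> 10,
                keyValuesPerBlock(blockSize, avgKv),
                blocksPerFile(fileSize, blockSize));
        }
    }
}
```

More blocks per file means more block index entries to keep in memory, while more KeyValues per block means more data read and searched for every random get.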
KeyValue layout:
Field              Type
Key length         Int (4)
Value length       Int (4)
Row length         Short (2)
Row key            Byte[]
Family length      byte
Family             Byte[]
Column qualifier   Byte[]
Timestamp          Long (8)
KeyType            byte
Value              Byte[]
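As a rough illustration of the field layout above, the sketch below serializes one key value into a byte array in the listed field order. It is not HBase's own KeyValue class. It assumes that the key length covers everything from the row length through the key type, that integers are big-endian, and that the key type code 4 stands for a Put; the class name is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class KeyValueLayout {
    /** Encodes one key value in the field order listed above. */
    static byte[] encode(byte[] row, byte[] family, byte[] qualifier,
                         long timestamp, byte keyType, byte[] value) {
        // Assumption: the key spans row length .. key type.
        int keyLength = 2 + row.length + 1 + family.length + qualifier.length + 8 + 1;
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 + keyLength + value.length);
        buf.putInt(keyLength);            // Key length, Int (4)
        buf.putInt(value.length);         // Value length, Int (4)
        buf.putShort((short) row.length); // Row length, Short (2)
        buf.put(row);                     // Row key
        buf.put((byte) family.length);    // Family length, byte
        buf.put(family);                  // Family
        buf.put(qualifier);               // Column qualifier
        buf.putLong(timestamp);           // Timestamp, Long (8)
        buf.put(keyType);                 // KeyType, byte
        buf.put(value);                   // Value
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] kv = encode(
            "r1".getBytes(StandardCharsets.UTF_8),
            "f".getBytes(StandardCharsets.UTF_8),
            "q".getBytes(StandardCharsets.UTF_8),
            1L, (byte) 4 /* assumed code for Put */,
            "v".getBytes(StandardCharsets.UTF_8));
        System.out.println("encoded bytes: " + kv.length);
    }
}
```

Note how the fixed overhead (two lengths, row length, family length, timestamp, key type) is 20 bytes per cell, which is worth keeping in mind when estimating how many key values fit in a block.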
Reference Files / Links
• When a region is split, “reference files” are created referring to the top or bottom half of the parent store file according to the split key
• HBase does not delete data/WAL files, it just “archives” them:
/hbase/.oldlogs
/hbase/.archive
• Logs/HFiles are kept until their TTL expires and neither replication nor snapshots refer to them
– (hbase.master.logcleaner.ttl, 10min)
– (hbase.master.hfilecleaner.ttl, 5min)
• HFileLink: an application-specific kind of hard / soft link
• HBase snapshots are logical links to files (with backrefs)
Write Ahead Logs
• One logical WAL per region / one physical per regionserver
• Rolled frequently
– hbase.regionserver.logroll.multiplier (0.95)
– hbase.regionserver.hlog.blocksize (default file system block size)
• Chronologically ordered set of files, only the last one is open for writing
• Exceeding hbase.regionserver.maxlogs (32) will force a flush
• Old log files are deleted as a whole
• Every edit is appended
• Writes to the WAL are sequential, with very frequent syncs (hundreds of times per sec)
• Reads are only sequential, from replication and crash recovery
• One log file per region server limits the write throughput per region server
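To see how the two roll settings above interact, here is a minimal sketch. It assumes the roll threshold is the product of hbase.regionserver.hlog.blocksize and hbase.regionserver.logroll.multiplier and that a roll is requested once the current file grows past it; the 64 MB block size is only an example value, and the class and method names are hypothetical.

```java
final class WalRollSize {
    /** Assumed roll threshold: WAL block size times the roll multiplier. */
    static long rollSizeBytes(long blockSizeBytes, double multiplier) {
        return (long) (blockSizeBytes * multiplier);
    }

    /** True once the current WAL file has grown past the roll threshold. */
    static boolean shouldRoll(long currentFileLength, long rollSizeBytes) {
        return currentFileLength > rollSizeBytes;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // example file system block size
        long rollSize = rollSizeBytes(blockSize, 0.95);
        System.out.println("roll threshold bytes: " + rollSize);
        System.out.println("roll at 60 MB? " + shouldRoll(60L * 1024 * 1024, rollSize));
        System.out.println("roll at 62 MB? " + shouldRoll(62L * 1024 * 1024, rollSize));
    }
}
```

Keeping the threshold just under one block means each WAL file usually fits in a single HDFS block.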
Durability
(as in ACID)
Overview of Write Path
1. Client sends the operations over RPC (Put/Delete)
2. Obtain row locks
3. Obtain the next mvcc write number
4. Tag the cells with the mvcc write number
5. Add the cells to the memstores (changes not visible yet)
6. Append a WALEdit to WAL, do not sync
7. Release row locks
8. Sync WAL
9. Advance mvcc, make changes visible
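The ordering of steps 2 to 9 can be sketched with a toy, single-threaded model. This is not HBase code: the class, its fields, and the single lock standing in for row locks are all hypothetical, and the "WAL" is just an in-memory list. It only shows why a cell added to the memstore stays invisible until the WAL is synced and mvcc is advanced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

final class ToyWritePath {
    /** A cell tagged with the mvcc write number it was written under. */
    static final class Cell {
        final String row;
        final String value;
        final long writeNumber;
        Cell(String row, String value, long writeNumber) {
            this.row = row;
            this.value = value;
            this.writeNumber = writeNumber;
        }
    }

    private final ReentrantLock rowLock = new ReentrantLock(); // one lock for all rows, for brevity
    private final List<Cell> memstore = new ArrayList<>();
    private final List<String> wal = new ArrayList<>();
    private int syncedEntries = 0;    // WAL entries covered by the last sync
    private long nextWriteNumber = 1;
    private long readPoint = 0;       // readers see cells with writeNumber <= readPoint

    void put(String row, String value) {
        long writeNumber;
        rowLock.lock();                                      // 2. obtain row lock
        try {
            writeNumber = nextWriteNumber++;                 // 3. next mvcc write number
            memstore.add(new Cell(row, value, writeNumber)); // 4 + 5. tag cell, add to memstore
            wal.add(row + "=" + value);                      // 6. append to WAL, no sync yet
        } finally {
            rowLock.unlock();                                // 7. release row lock
        }
        syncedEntries = wal.size();                          // 8. sync WAL (simulated)
        readPoint = Math.max(readPoint, writeNumber);        // 9. advance mvcc, make visible
    }

    /** Returns the latest visible value for the row, or null if none. */
    String get(String row) {
        String latest = null;
        for (Cell c : memstore) {
            if (c.row.equals(row) && c.writeNumber <= readPoint) {
                latest = c.value;
            }
        }
        return latest;
    }

    int syncedEntries() {
        return syncedEntries;
    }
}
```

Syncing after the row lock is released keeps the lock hold time short, which matters because the sync is the slow part of the write.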
Durability
• 0.94 and before:
– HTable property “DEFERRED_LOG_FLUSH” and
– Mutation.setWriteToWAL(false)
• 0.94 and 0.96:
Durability     Semantics
USE_DEFAULT    Use global HBase default, OR table default (SYNC_WAL)
SKIP_WAL       Do not write updates to WAL
ASYNC_WAL      Write entries to WAL asynchronously (hbase.regionserver.optionallogflushinterval, 1 sec default)
SYNC_WAL       Write entries to WAL, flush to datanodes
FSYNC_WAL      Write entries to WAL, fsync in datanodes
Durability
• 0.94 Durability setting per Mutation (HBASE-7801) / per table (HBASE-8375)
• Allows intermixing different durability settings for updates to the same table
• Durability is chosen from the mutation, unless it is USE_DEFAULT, in which case the Table’s Durability is used
• Limit the amount of time an edit can live in the memstore (HBASE-5930)
– hbase.regionserver.optionalcacheflushinterval
– Default 1hr
– Important for SKIP_WAL
– Causes a flush if there are unflushed edits that are older than optionalcacheflushinterval
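The rule for picking the effective durability of an update can be written down in a few lines. The sketch below is a stand-alone illustration with its own copy of the enum values, not HBase's implementation; it assumes that a table left at USE_DEFAULT falls back to SYNC_WAL, as the USE_DEFAULT entry in the durability semantics suggests. The class and method names are hypothetical.

```java
final class DurabilityResolution {
    /** Local copy of the durability levels, for illustration only. */
    enum Durability { USE_DEFAULT, SKIP_WAL, ASYNC_WAL, SYNC_WAL, FSYNC_WAL }

    /** The mutation's setting wins unless it is USE_DEFAULT. */
    static Durability effective(Durability mutation, Durability table) {
        if (mutation != Durability.USE_DEFAULT) {
            return mutation;
        }
        // Assumption: a table at USE_DEFAULT means the global default, SYNC_WAL.
        return table == Durability.USE_DEFAULT ? Durability.SYNC_WAL : table;
    }

    public static void main(String[] args) {
        System.out.println(effective(Durability.ASYNC_WAL, Durability.SYNC_WAL));
        System.out.println(effective(Durability.USE_DEFAULT, Durability.SKIP_WAL));
        System.out.println(effective(Durability.USE_DEFAULT, Durability.USE_DEFAULT));
    }
}
```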
Durability
public enum Durability {
  USE_DEFAULT,
  SKIP_WAL,
  ASYNC_WAL,
  SYNC_WAL,
  FSYNC_WAL
}

Per Table:
HTableDescriptor htd = new HTableDescriptor("myTable");
htd.setDurability(Durability.ASYNC_WAL);
admin.createTable(htd);

Shell:
hbase(main):007:0> create 't12', 'f1', DURABILITY=>'ASYNC_WAL'

Per mutation:
Put put = new Put(rowKey);
put.setDurability(Durability.ASYNC_WAL);
table.put(put);
Durability (Hflush / Hsync)
• hflush(): Flush the data packet down the datanode pipeline. Wait for acks.
• hsync(): Flush the data packet down the pipeline. Have datanodes execute the FSYNC equivalent. Wait for acks.
• hflush() is currently the default; hsync() usage in HBase is not implemented (HBASE-5954). hsync() is also not optimized (2x slower) and is only in Hadoop 2.0.
• hflush() does not lose data, unless all 3 replicas die without syncing to disk (datacenter power failure)
• Ensure that the log is replicated 3 times:
– hbase.regionserver.hlog.tolerable.lowreplication defaults to the FileSystem default replication count (3 for HDFS)
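public interface Syncable {
  // Flush buffered data down the datanode pipeline and wait for acks.
  public void hflush() throws IOException;
  // Like hflush(), and also have the datanodes sync the data to disk.
  public void hsync() throws IOException;
}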
IO Fencing
Fencing is the process of isolating a node of a computer
cluster or protecting shared resources when a node appears
to be malfunctioning
IO Fencing
[Figure: IO fencing diagram. Participants: Client, Region Server A (dying) with its WAL, Region Server B with its WAL, Master, Zookeeper. Events shown: edit, Append+sync, ack, RegionServer A session timeout, RegionServer A znode deleted, Master assigns Region1 (from Region Server A to B), YouAreDeadException, abort.]
IO Fencing
• Split Brain
• Ensure that a region is only hosted by a single region server at any time
• If the master thinks that a region server no longer hosts the region, the RS should not be able to accept and sync() updates
• Master renames the region server logs directory on HDFS:
– Current WAL cannot be rolled, new log file cannot be created
– Before replaying each WAL, recoverLease() is called
– recoverLease => lease recovery + block recovery
– Ensure that WAL is closed, and all data is visible (file length)
• Guarantees for region data files:
– Compactions => Remove files + add files
– Flushes => Allowed since resulting data is idempotent
• HBASE-2231, HBASE-7878, HBASE-8449
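The "call recoverLease() before replaying" step is typically a retry loop, since lease and block recovery can take a while. The sketch below shows only that loop shape, under assumptions: the lease call is abstracted as a BooleanSupplier that returns true once the file is closed (HDFS's DistributedFileSystem.recoverLease(Path) reports this as a boolean), and the attempt count, pause, and class name are hypothetical.

```java
import java.util.function.BooleanSupplier;

final class LeaseRecoveryLoop {
    /**
     * Calls recoverLease until it reports that the file is closed,
     * or until maxAttempts is reached. Returns true if the file was closed.
     */
    static boolean recoverUntilClosed(BooleanSupplier recoverLease,
                                      int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (recoverLease.getAsBoolean()) {
                return true; // WAL is closed, its full length is now visible
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis); // give lease + block recovery time to finish
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for the real lease call: succeeds on the third attempt.
        boolean closed = recoverUntilClosed(() -> ++calls[0] >= 3, 5, 10L);
        System.out.println("closed=" + closed + " after " + calls[0] + " calls");
    }
}
```

Only after this returns true is it safe to split and replay the WAL, because a dying region server can no longer append to it.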
Data Locality
Short circuit reads, checksums, block placement
HDFS local reads (short circuit reads)
• Bypasses the datanode layer and goes directly to the OS files
• Hadoop 1.x implementation:
– DFSClient asks the local datanode for the local paths of a block
– Datanode checks whether the user has permission
– Client gets the path for the block, opens the file with FileInputStream
hdfs-site.xml:
dfs.block.local-path-access.user = hbase
dfs.datanode.data.dir.perm = 750
hbase-site.xml:
dfs.client.read.shortcircuit = true
[Figure: short circuit read path. Components: HBase Client, RegionServer (Hadoop FileSystem, DFSClient, BlockReader), Datanode, OS Filesystem (ext3), Disks. RPC links are shown between components.]
HDFS local reads (short circuit reads)
• Hadoop 2.0 implementation (HDFS-347)
– Keep the legacy implementation
– Use Unix Domain sockets to pass the File Descriptor (FD)
– Datanode opens the block file and passes the FD to the BlockReaderLocal running in the RegionServer process
– More secure than the previous implementation
– Windows also supports domain sockets, but the native APIs need to be implemented
• Local buffer size: dfs.client.read.shortcircuit.buffer.size
– BlockReaderLocal fills this whole buffer every time HBase tries to read an HFile block
– dfs.client.read.shortcircuit.buffer.size = 1MB vs 64KB HFile block size
– SSR buffer is a direct buffer (in Hadoop 2, not in Hadoop 1)
– Non-heap memory usage: # regions x # stores x # avg store files x # avg blocks per file x SSR buffer size
– 10 regions x 2 x 4 x (1GB / 64MB) x 1 MB = 1.28GB
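The memory formula above is easy to plug numbers into. The sketch below reproduces the slide's example; the class and method names are hypothetical, and the inputs are the slide's own figures.

```java
final class SsrBufferEstimate {
    /** regions x stores x avg store files x avg blocks per file x SSR buffer size. */
    static long estimateBytes(long regions, long storesPerRegion, long avgStoreFilesPerStore,
                              long avgBlocksPerFile, long ssrBufferBytes) {
        return regions * storesPerRegion * avgStoreFilesPerStore
            * avgBlocksPerFile * ssrBufferBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blocksPerFile = (1024 * mb) / (64 * mb); // 1GB file / 64MB HDFS block = 16
        long bytes = estimateBytes(10, 2, 4, blocksPerFile, 1 * mb);
        System.out.println(bytes / mb + " MB"); // 1280 MB, the slide's 1.28GB
    }
}
```

Dropping the buffer size toward the HFile block size shrinks this figure proportionally, for example to 80 MB with a 64 KB buffer and the same inputs.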
Checksums
• HDFS checksums are not inlined
• Two files per block, one for data, one for checksums (HDFS-2699)
• A random positioned read causes 2 seeks
• HBase checksums come with 0.94 (HDP 1.2+), HBASE-5074
[Figure: block file blk_123456789 next to its checksum file .blk_123456789.meta. Each data chunk (dfs.bytes-per-checksum, 512 bytes) has a checksum chunk (4 bytes).]
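The per-chunk scheme (one 4-byte checksum for every 512-byte data chunk) can be sketched as follows. This is an illustration, not HDFS code: it uses java.util.zip.CRC32 as a stand-in for whatever checksum type the cluster is configured with, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.zip.CRC32;

final class ChunkedChecksums {
    /** Computes one 4-byte checksum per bytesPerChecksum-sized chunk of data. */
    static int[] checksums(byte[] data, int bytesPerChecksum) {
        int chunks = (data.length + bytesPerChecksum - 1) / bytesPerChecksum;
        int[] sums = new int[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int offset = i * bytesPerChecksum;
            int length = Math.min(bytesPerChecksum, data.length - offset);
            crc.reset();
            crc.update(data, offset, length);
            sums[i] = (int) crc.getValue(); // keep the low 32 bits: 4 bytes per chunk
        }
        return sums;
    }

    /** Recomputes the checksums and compares them with the stored ones. */
    static boolean verify(byte[] data, int bytesPerChecksum, int[] expected) {
        return Arrays.equals(checksums(data, bytesPerChecksum), expected);
    }

    public static void main(String[] args) {
        byte[] data = new byte[1300];
        int[] sums = checksums(data, 512);
        System.out.println("chunks=" + sums.length + " checksum bytes=" + sums.length * 4);
        data[700] ^= 1; // corrupt one byte in the second chunk
        System.out.println("still valid? " + verify(data, 512, sums));
    }
}
```

Because the checksums live in a separate file from the data, verifying a random read needs a second seek, which is the cost the slides refer to.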
Checksums
• HFile version 2.1 writes checksums per HFile block
• HDFS checksum verification is bypassed on block read; verification is done by HBase
• If a checksum fails, we go back to reading checksums from HDFS for “some time”
• Due to the double checksum bug (HDFS-3429) in remote reads in Hadoop 1, not enabled by default for now. Benchmark it yourself
hbase.regionserver.checksum.verify = true
hbase.hstore.bytes.per.checksum = 16384
hbase.hstore.checksum.algorithm = CRC32C
Never set this:
dfs.client.read.shortcircuit.skip.checksum = false
[Figure: an HFile made of HFile blocks. Legend: block header, HFile data block chunk, checksum chunk.]
Default Block Placement Policy
[Figure: four servers (Rack 1 / Server 1, Rack N / Server M, Rack L / Server K, Rack X / Server Y), each with a RegionServer and a DataNode. Region A and Region B and their StoreFiles are shown on one RegionServer; block replicas (b1, b2, b3, b9) are spread across the DataNodes.]
Data locality for HBase
• Poor data locality when the region is moved:
– As a result of load balancing
– Region server crash + failover
• Most of the data won’t be local unless the files are compacted
• Idea (from Facebook): Regions have affiliated nodes (primary, secondary, tertiary), HBASE-4755
• When writing a data file, give hints to the NN that we want these locations for block replicas (HDFS-2576)
• LB should assign the region to one of the affiliated nodes on server crash
– Keep data locality
– SSR will still work
• Reduces data loss probability
Default Block Placement Policy
[Figure: the same four servers (Rack 1 / Server 1, Rack N / Server M, Rack L / Server K, Rack X / Server Y), each with a RegionServer and a DataNode. Region A and Region B and their StoreFiles are shown on one RegionServer; each DataNode holds replicas of blocks such as b1, b2, b3 and b9.]
Other considerations
• HBase riding over Namenode HA
– Both Hadoop 1 (NFS based) and Hadoop 2 HA (QJM, etc)
– Heavily tested with full stack HA
• Retry HDFS operations
• Isolate FileSystem usage from HBase internals
• Hadoop 2 vs Hadoop 1 performance
– Hadoop 2 is coming!
• HDFS snapshots vs HBase snapshots
– HBase DOES NOT use HDFS snapshots
– Need hardlinks
– Super flush API
• HBase security vs HDFS security
– All files are owned by HBase principal
– No ACLs in HDFS. Allowing a user to read HFiles / snapshots directly is hard
Open Topics
• HDFS hard links
– Rethink how we do snapshots, backups, etc
• Parallel writes for WAL
– Reduce latency on WAL syncs
• SSD storage, cache
– SSD storage type in Hadoop or local filesystem
– Using SSDs as a secondary cache
– Selectively place tables / column families on SSD
• HDFS zero-copy reads (HDFS-3051, HADOOP-8148)
• HDFS inline checksums (HDFS-2699)
• HDFS Quorum reads (HBASE-7509)
Thanks
Questions?
Enis Söztutar
enis [ at ] apache [dot] org
@enissoz

HBaseCon 2013: Apache HBase and HDFS - Understanding Filesystem Usage in HBase

  • 1. © Hortonworks Inc. 2011. HBase and HDFS: Understanding file system usage in HBase. Enis Söztutar, enis [ at ] apache [dot] org, @enissoz
  • 2. About Me
      • In the Hadoop space since 2007
      • Committer and PMC Member in Apache HBase and Hadoop
      • Working at Hortonworks as a member of Technical Staff
      • Twitter: @enissoz
  • 3. Motivation
      • HBase as a database depends on the FileSystem for many things
      • HBase has to work over HDFS, Linux and Windows
      • HBase is the most advanced user of HDFS
      • To tune IO performance, you have to understand how HBase does IO
      • MapReduce: large files, few random seeks, batch oriented, high throughput, failure handling at task level, computation moves to data
      • HBase: large files, a lot of random seeks, latency sensitive, durability guarantees with sync, computation generates local data, large number of open files
  • 4. Agenda
      • Overview of file types in HBase
      • Durability semantics
      • IO Fencing / Lease recovery
      • Data locality
          – Short circuit reads (SSR)
          – Checksums
          – Block Placement
      • Open topics
  • 5. HBase file types
  • 6. Overview of file types
      • Mainly three types of files in HBase
          – Write Ahead Logs (a.k.a. WALs, logs)
          – Data files (a.k.a. store files, HFiles)
          – References / symbolic or logical links (0 length files)
      • Every file is 3-way replicated
  • 7. Overview of file types
      /hbase/.archive
      /hbase/.logs/
      /hbase/.logs/server1,60020,1370043265148/
      /hbase/.logs/server1,60020,1370043265148/server1%2C60020%2C1370043265148.1370050467720
      /hbase/.logs/server1,60020,1370043265105/server1%2C60020%2C1370043265105.1370046867591
      …
      /hbase/.oldlogs
      /hbase/usertable/0711fd70ce0df641e9440e4979d67995/family/449e2fa173c14747b9d2e5..
      /hbase/usertable/0711fd70ce0df641e9440e4979d67995/family/9103f38174ab48aa898a4b..
      /hbase/table1/565bfb6220ca3edf02ac1f425cf18524/f1/49b32d3ee94543fb9055..
      /hbase/.hbase-snapshot/usertable_snapshot/0ae3d2a93d3cf34a7cd30../family/12f114..
      …
      Callout labels on the slide: Write Ahead Logs, Data files, Links
  • 8. Data Files (HFile)
      • Immutable once written
      • Generated by flushes or compactions (sequential writes)
      • Read randomly (preads) or sequentially
      • Big in size (from the flush size up to tens of GBs)
      • All data is in blocks (HFile blocks, not to be confused with HDFS blocks)
      • Data blocks have a target size:
          – BLOCKSIZE in the column family descriptor
          – 64K by default
          – Uncompressed and un-encoded size
      • Index blocks (leaf, intermediate, root) have a target size:
          – hfile.index.block.max.size, 128K by default
      • Bloom filter blocks have a target size:
          – io.storefile.bloom.block.size, 128K by default
  • 9. Data Files (HFile version 2.x)
  • 10. Data Files
      • IO happens at block boundaries
          – Random reads => disk seek + read whole block sequentially
          – Read blocks are put into the block cache
          – Leaf index blocks and bloom filter blocks also go to the block cache
      • Use smaller block sizes for faster random access
          – Smaller read + faster in-block search
          – Block index becomes bigger, more memory consumption
      • Use larger block sizes for faster scans
      • Think about how many key values will fit in an average block
      • Try compression and Data Block Encoding (PREFIX, DIFF, FAST_DIFF, PREFIX_TREE)
          – Minimizes file sizes + on-disk block sizes
      • KeyValue layout (field, type):
          Key length        int (4)
          Value length      int (4)
          Row length        short (2)
          Row key           byte[]
          Family length     byte
          Family            byte[]
          Column qualifier  byte[]
          Timestamp         long (8)
          KeyType           byte
          Value             byte[]
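The slide's advice to think about how many key values fit in an average block can be turned into a rough calculation. The following is a minimal sketch that only adds up the fields of the KeyValue layout listed above; the class name, method names, and the sample row, family, qualifier, and value sizes are hypothetical, and it deliberately ignores block headers, compression, and data block encoding.

```java
// Rough estimate of serialized KeyValue size and cells per HFile data block.
// Assumption: only the fields from the slide's layout table are counted.
class KeyValueSizeEstimate {
    // Fixed-width fields from the layout table, in bytes.
    static final int KEY_LENGTH_FIELD = 4;    // int
    static final int VALUE_LENGTH_FIELD = 4;  // int
    static final int ROW_LENGTH_FIELD = 2;    // short
    static final int FAMILY_LENGTH_FIELD = 1; // byte
    static final int TIMESTAMP_FIELD = 8;     // long
    static final int KEY_TYPE_FIELD = 1;      // byte

    // Sum of the fixed-width fields plus the variable-length byte arrays.
    static long serializedSize(int rowLen, int familyLen, int qualifierLen, int valueLen) {
        return KEY_LENGTH_FIELD + VALUE_LENGTH_FIELD
                + ROW_LENGTH_FIELD + rowLen
                + FAMILY_LENGTH_FIELD + familyLen
                + qualifierLen
                + TIMESTAMP_FIELD + KEY_TYPE_FIELD
                + (long) valueLen;
    }

    // Whole cells that fit in a block of the given uncompressed size.
    static long cellsPerBlock(long blockSizeBytes, long cellSizeBytes) {
        return blockSizeBytes / cellSizeBytes;
    }

    public static void main(String[] args) {
        // Hypothetical schema: 16-byte row key, 1-byte family, 6-byte qualifier, 100-byte value.
        long cell = serializedSize(16, 1, 6, 100);
        long perBlock = cellsPerBlock(64 * 1024, cell); // 64K is the default BLOCKSIZE from the slides
        System.out.println("bytes per cell: " + cell + ", cells per 64K block: " + perBlock);
    }
}
```

Note that every cell repeats the row key, family, and qualifier, which is why short names and data block encoding reduce the per-cell overhead.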
  • 11. Reference Files / Links
      • When a region is split, “reference files” are created that refer to the top or bottom half of the parent store file, according to the split key
      • HBase does not delete data/WAL files, it just “archives” them
          – /hbase/.oldlogs
          – /hbase/.archive
      • Logs/HFiles are kept until their TTL expires and neither replication nor snapshots refer to them
          – (hbase.master.logcleaner.ttl, 10min)
          – (hbase.master.hfilecleaner.ttl, 5min)
      • HFileLink: an application-specific kind of hard / soft link
      • HBase snapshots are logical links to files (with backrefs)
  • 12. Write Ahead Logs
      • One logical WAL per region / one physical WAL per region server
      • Rolled frequently
          – hbase.regionserver.logroll.multiplier (0.95)
          – hbase.regionserver.hlog.blocksize (default file system block size)
      • Chronologically ordered set of files; only the last one is open for writing
      • Exceeding hbase.regionserver.maxlogs (32) will cause a forced flush
      • Old log files are deleted as a whole
      • Every edit is appended
      • Sequential writes to the WAL, synced very frequently (hundreds of times per sec)
      • Only sequential reads, from replication and crash recovery
      • One log file per region server limits the write throughput per region server
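The two roll settings above can be read together as a size threshold. The sketch below assumes the log is rolled once its size passes blocksize × multiplier, and that a flush is forced once the number of log files exceeds maxlogs; the slide lists the settings but does not spell out the formula, so treat the arithmetic and all names here as illustrative assumptions rather than HBase's actual code.

```java
// Sketch of WAL roll and forced-flush decisions under the stated assumptions.
class WalRollSketch {
    // Assumed threshold: hbase.regionserver.hlog.blocksize * hbase.regionserver.logroll.multiplier.
    static long rollThresholdBytes(long blockSizeBytes, double multiplier) {
        return (long) (blockSizeBytes * multiplier);
    }

    // Roll when the current log file has grown past the threshold.
    static boolean shouldRoll(long currentLogBytes, long blockSizeBytes, double multiplier) {
        return currentLogBytes > rollThresholdBytes(blockSizeBytes, multiplier);
    }

    // Force a memstore flush when there are more log files than hbase.regionserver.maxlogs.
    static boolean shouldForceFlush(int logFileCount, int maxLogs) {
        return logFileCount > maxLogs;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // hypothetical 64 MB file system block size
        System.out.println("roll threshold: " + rollThresholdBytes(blockSize, 0.95) + " bytes");
        System.out.println("force flush at 33 logs: " + shouldForceFlush(33, 32));
    }
}
```

Keeping the threshold slightly below the block size would let each log file fit in a single HDFS block.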
  • 13. Durability (as in ACID)
  • 14. Overview of Write Path
      1. Client sends the operations over RPC (Put/Delete)
      2. Obtain row locks
      3. Obtain the next mvcc write number
      4. Tag the cells with the mvcc write number
      5. Add the cells to the memstores (changes not visible yet)
      6. Append a WALEdit to WAL, do not sync
      7. Release row locks
      8. Sync WAL
      9. Advance mvcc, make changes visible
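The ordering of the steps above is what keeps an edit invisible to readers until it is durable. The following is a toy, single-process model of steps 2 to 9, not HBase code: the class, its in-memory "memstore" and "WAL" lists, and the split into `stage` and `syncAndPublish` are all hypothetical, chosen only so the visibility rule can be observed between step 7 and step 8.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.locks.ReentrantLock;

// Toy model of the write path ordering; every structure here is an in-memory stand-in.
class WritePathSketch {
    static final class Cell {
        final String row;
        final String value;
        final long writeNumber;
        Cell(String row, String value, long writeNumber) {
            this.row = row;
            this.value = value;
            this.writeNumber = writeNumber;
        }
    }

    private final Map<String, ReentrantLock> rowLocks = new HashMap<>();
    private final List<Cell> memstore = new ArrayList<>();
    private final List<String> wal = new ArrayList<>();
    private int syncedEntries = 0;    // WAL entries covered by the last sync
    private long nextWriteNumber = 1; // next mvcc write number to hand out
    private long readPoint = 0;       // highest write number visible to readers

    // Steps 2-7: lock the row, tag the cell, add to memstore, append to WAL (no sync), unlock.
    long stage(String row, String value) {
        ReentrantLock lock = rowLocks.computeIfAbsent(row, r -> new ReentrantLock());
        lock.lock();                                         // 2. obtain row lock
        try {
            long writeNumber = nextWriteNumber++;            // 3. next mvcc write number
            memstore.add(new Cell(row, value, writeNumber)); // 4 + 5. tag cell, add to memstore
            wal.add(writeNumber + ":" + row + "=" + value);  // 6. append to WAL, do not sync
            return writeNumber;
        } finally {
            lock.unlock();                                   // 7. release row lock
        }
    }

    // Steps 8-9: sync the WAL, then advance the read point so the edit becomes visible.
    void syncAndPublish(long writeNumber) {
        syncedEntries = wal.size();                          // 8. sync WAL
        readPoint = Math.max(readPoint, writeNumber);        // 9. advance mvcc
    }

    // Readers only see cells whose write number is at or below the read point.
    Optional<String> get(String row) {
        for (int i = memstore.size() - 1; i >= 0; i--) {
            Cell c = memstore.get(i);
            if (c.row.equals(row) && c.writeNumber <= readPoint) {
                return Optional.of(c.value);
            }
        }
        return Optional.empty();
    }

    int syncedEntries() {
        return syncedEntries;
    }

    public static void main(String[] args) {
        WritePathSketch region = new WritePathSketch();
        long w = region.stage("row1", "v1");
        System.out.println("visible before sync: " + region.get("row1").isPresent());
        region.syncAndPublish(w);
        System.out.println("visible after sync:  " + region.get("row1").isPresent());
    }
}
```

Releasing the row lock before the sync (step 7 before step 8) means the slow sync does not block other writers to the same row, while the mvcc read point still hides the edit until the sync completes.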
  • 15. Durability
    • 0.94 and before:
      – HTable property “DEFERRED_LOG_FLUSH” and
      – Mutation.setWriteToWAL(false)
    • 0.94 and 0.96:
      Durability: Semantics
      USE_DEFAULT: Use global hbase default, OR table default (SYNC_WAL)
      SKIP_WAL: Do not write updates to WAL
      ASYNC_WAL: Write entries to WAL asynchronously (hbase.regionserver.optionallogflushinterval, 1 sec default)
      SYNC_WAL: Write entries to WAL, flush to datanodes
      FSYNC_WAL: Write entries to WAL, fsync in datanodes
  • 16. Durability
    • 0.94: Durability setting per Mutation (HBASE-7801) / per table (HBASE-8375)
    • Allows intermixing different durability settings for updates to the same table
    • Durability is taken from the mutation, unless it is USE_DEFAULT, in which case the table’s Durability is used
    • Limit the amount of time an edit can live in the memstore (HBASE-5930)
      – hbase.regionserver.optionalcacheflushinterval
      – Default 1hr
      – Important for SKIP_WAL
      – Causes a flush if there are unflushed edits older than optionalcacheflushinterval
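The resolution rule (mutation first, then table, then the global default) and the periodic-flush check are small enough to sketch directly. The enum below is a local copy of the values listed in the deck so that the example is self-contained; the class and method names are hypothetical, and SYNC_WAL as the final fallback follows the table on the previous slide.

```java
final class DurabilityResolutionSketch {
    // Local stand-in for HBase's Durability enum, so the sketch compiles on its own.
    enum Durability { USE_DEFAULT, SKIP_WAL, ASYNC_WAL, SYNC_WAL, FSYNC_WAL }

    // The mutation's setting wins unless it is USE_DEFAULT; then the table's
    // setting is used; if that is also USE_DEFAULT, fall back to SYNC_WAL.
    static Durability effective(Durability mutation, Durability table) {
        if (mutation != Durability.USE_DEFAULT) {
            return mutation;
        }
        return table == Durability.USE_DEFAULT ? Durability.SYNC_WAL : table;
    }

    // Periodic flush check: flush if the oldest unflushed edit has lived in the
    // memstore longer than hbase.regionserver.optionalcacheflushinterval.
    static boolean needsPeriodicFlush(long oldestEditMs, long nowMs, long flushIntervalMs) {
        return nowMs - oldestEditMs > flushIntervalMs;
    }

    public static void main(String[] args) {
        System.out.println(effective(Durability.USE_DEFAULT, Durability.ASYNC_WAL));
        System.out.println(effective(Durability.SKIP_WAL, Durability.SYNC_WAL));
    }
}
```

The periodic flush matters most for SKIP_WAL edits, since the memstore is their only copy until a flush writes them to an hfile.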
  • 17. Durability
      public enum Durability {
        USE_DEFAULT,
        SKIP_WAL,
        ASYNC_WAL,
        SYNC_WAL,
        FSYNC_WAL
      }
    Per table:
      HTableDescriptor htd = new HTableDescriptor("myTable");
      htd.setDurability(Durability.ASYNC_WAL);
      admin.createTable(htd);
    Shell:
      hbase(main):007:0> create 't12', 'f1', DURABILITY=>'ASYNC_WAL'
    Per mutation:
      Put put = new Put(rowKey);
      put.setDurability(Durability.ASYNC_WAL);
      table.put(put);
  • 18. Durability (hflush / hsync)
    • hflush(): Flush the data packet down the datanode pipeline. Wait for acks.
    • hsync(): Flush the data packet down the pipeline. Have datanodes execute an fsync equivalent. Wait for acks.
    • hflush is currently the default; hsync() usage in HBase is not implemented (HBASE-5954). It is also not optimized (2x slower) and is available only in Hadoop 2.0.
    • hflush does not lose data unless all 3 replicas die without syncing to disk (datacenter power failure)
    • Ensure that the log is replicated 3 times: hbase.regionserver.hlog.tolerable.lowreplication defaults to the FileSystem default replication count (3 for HDFS)
      public interface Syncable {
        public void hflush() throws IOException;
        public void hsync() throws IOException;
      }
  • 20. IO Fencing
    Fencing is the process of isolating a node of a computer cluster, or protecting shared resources, when a node appears to be malfunctioning
  • 21. IO Fencing
    [Diagram: a Client sends edits for Region1 to Region Server A (dying), which does append+sync to its WAL and acks. RegionServer A's ZooKeeper session times out and its znode is deleted; the Master assigns Region1 to Region Server B, which takes client edits with its own WAL append+sync and ack. Region Server A receives a YouAreDeadException and aborts.]
  • 22. IO Fencing
    • Split brain
    • Ensure that a region is hosted by only a single region server at any time
    • If the master thinks that a region server no longer hosts the region, the RS should not be able to accept and sync() updates
    • Master renames the region server logs directory on HDFS:
      – Current WAL cannot be rolled, new log file cannot be created
      – For each WAL, recoverLease() is called before replaying
      – recoverLease => lease recovery + block recovery
      – Ensure that the WAL is closed and all data is visible (file length)
    • Guarantees for region data files:
      – Compactions => Remove files + add files
      – Flushes => Allowed, since the resulting data is idempotent
    • HBASE-2231, HBASE-7878, HBASE-8449
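The effect of the directory rename can be illustrated on a local file system with java.nio.file: once the logs directory has been renamed, a writer that still holds the old path can no longer create a new log file there. This is only a local analogy under assumed names (the temp directory, "server1", the "-splitting" suffix, and the file names are all illustrative); it does not use HDFS and does not model lease recovery, which is what closes a file that is already open.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

final class RenameFencingSketch {
    // Returns true if, after the rename, creating a new log under the old path fails.
    static boolean newLogCreationIsFenced() throws IOException {
        Path root = Files.createTempDirectory("fencing-sketch");
        Path logsDir = Files.createDirectory(root.resolve("server1"));
        Files.createFile(logsDir.resolve("wal.1"));

        // "Master": rename the server's logs directory before replaying its logs.
        Path renamed = root.resolve("server1-splitting");
        Files.move(logsDir, renamed);

        // "Dying region server": tries to roll to a new log under the old path.
        try {
            Files.createFile(logsDir.resolve("wal.2"));
            return false; // creation succeeded, so the writer was not fenced
        } catch (NoSuchFileException e) {
            return true;  // the old directory is gone, so no new log can be created
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("fenced: " + newLogCreationIsFenced());
    }
}
```

The rename stops new files from appearing, but a WAL that was already open can still receive appends, which is why the slide pairs the rename with recoverLease() on each WAL before replay.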
  • 23. Data Locality
    Short circuit reads, checksums, block placement
  • 24. HDFS local reads (short circuit reads)
    • Bypasses the datanode layer and goes directly to the OS files
    • Hadoop 1.x implementation:
      – DFSClient asks the local datanode for the local paths of a block
      – Datanode checks whether the user has permission
      – Client gets the path for the block, opens the file with FileInputStream
    hdfs-site.xml
      dfs.block.local-path-access.user = hbase
      dfs.datanode.data.dir.perm = 750
    hbase-site.xml
      dfs.client.read.shortcircuit = true
    [Diagram labels: HBase Client, RPC, RegionServer (Hadoop FileSystem, DFSClient, BlockReader), RPC, Datanode, OS Filesystem (ext3), Disks]
  • 25. HDFS local reads (short circuit reads)
    • Hadoop 2.0 implementation (HDFS-347)
      – Keep the legacy implementation
      – Use Unix domain sockets to pass the file descriptor (FD)
      – Datanode opens the block file and passes the FD to the BlockReaderLocal running in the region server process
      – More secure than the previous implementation
      – Windows also supports domain sockets, need to implement native APIs
    • Local buffer size: dfs.client.read.shortcircuit.buffer.size
      – BlockReaderLocal fills this whole buffer every time HBase tries to read an HFile block
      – dfs.client.read.shortcircuit.buffer.size = 1MB vs 64KB HFile block size
      – SSR buffer is a direct buffer (in Hadoop 2, not in Hadoop 1)
      – # regions x # stores x # avg store files x # avg blocks per file x SSR buffer size
      – 10 regions x 2 x 4 x (1GB / 64MB) x 1MB = 1.28GB non-heap memory usage
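The buffer estimate on this slide is a straight product, so it is easy to rerun with your own numbers. The sketch below reproduces the slide's example (10 regions, 2 stores, 4 store files, 1GB files with 64MB HDFS blocks, 1MB buffer); the class and method names are hypothetical.

```java
final class SsrBufferEstimate {
    // regions x stores x store files x blocks per file x SSR buffer size
    static long estimateBytes(long regions, long storesPerRegion, long filesPerStore,
                              long blocksPerFile, long bufferBytes) {
        return regions * storesPerRegion * filesPerStore * blocksPerFile * bufferBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        long blocksPerFile = (1024 * mb) / (64 * mb); // 1GB file / 64MB HDFS block = 16
        long bytes = estimateBytes(10, 2, 4, blocksPerFile, 1 * mb);
        System.out.println("buffers: " + (bytes / mb));
        System.out.println("direct memory in MB: " + (bytes / mb));
    }
}
```

The product is 1280 buffers of 1MB each, which is the slide's 1.28GB when 1GB is counted as 1000MB. Dropping the buffer size toward the 64KB HFile block size shrinks the estimate proportionally.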
  • 26. Checksums
    • HDFS checksums are not inlined
    • Two files per block, one for data, one for checksums (HDFS-2699)
    • A random positioned read causes 2 seeks
    • HBase checksums come with 0.94 (HDP 1.2+). HBASE-5074.
    [Diagram: files blk_123456789 and .blk_123456789.meta; legend: data chunk (dfs.bytes-per-checksum, 512 bytes), checksum chunk (4 bytes)]
  • 27. Checksums
    • HFile version 2.1 writes checksums per HFile block
    • HDFS checksum verification is bypassed on block read; verification is done by HBase
    • If a checksum fails, we go back to reading checksums from HDFS for “some time”
    • Due to the double checksum bug (HDFS-3429) in remote reads in Hadoop 1, not enabled by default for now. Benchmark it yourself
        hbase.regionserver.checksum.verify = true
        hbase.hstore.bytes.per.checksum = 16384
        hbase.hstore.checksum.algorithm = CRC32C
      Never set this:
        dfs.client.read.shortcircuit.skip.checksum = false
    [Diagram: an HFile made of HFile blocks; legend: block header, HFile data block chunk, checksum chunk]
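To show what "checksums per HFile block" amounts to, here is a minimal sketch that computes one CRC32C per 16384-byte chunk of a 64KB buffer and re-verifies it, using java.util.zip.CRC32C from the JDK (Java 9 or later). It assumes nothing about HBase's on-disk format: the class and method names are hypothetical, and storing the checksums in a separate int array is a simplification of keeping them inline with the block.

```java
import java.util.Arrays;
import java.util.zip.CRC32C;

final class BlockChecksumSketch {
    // One CRC32C value per bytesPerChecksum-sized chunk of the block.
    static int[] checksums(byte[] block, int bytesPerChecksum) {
        int chunks = (block.length + bytesPerChecksum - 1) / bytesPerChecksum;
        int[] sums = new int[chunks];
        for (int i = 0; i < chunks; i++) {
            int offset = i * bytesPerChecksum;
            int length = Math.min(bytesPerChecksum, block.length - offset);
            CRC32C crc = new CRC32C();
            crc.update(block, offset, length);
            sums[i] = (int) crc.getValue();
        }
        return sums;
    }

    // Recompute and compare; a mismatch means the block bytes changed.
    static boolean verify(byte[] block, int bytesPerChecksum, int[] expected) {
        return Arrays.equals(checksums(block, bytesPerChecksum), expected);
    }

    public static void main(String[] args) {
        byte[] block = new byte[64 * 1024];
        Arrays.fill(block, (byte) 7);
        int[] sums = checksums(block, 16384);
        System.out.println("checksum chunks: " + sums.length);
        block[100] ^= 1; // simulate a corrupted byte
        System.out.println("still valid: " + verify(block, 16384, sums));
    }
}
```

Because the checksums travel with the block data, one read fetches both, which avoids the second seek to a separate checksum file described on the previous slide.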
  • 28. Default Block Placement Policy
    [Diagram: Rack 1 / Server 1 runs a RegionServer hosting Region A and Region B (several StoreFiles each) next to a DataNode; Rack N / Server M, Rack L / Server K and Rack X / Server Y each run a RegionServer and a DataNode. Store file blocks (b1, b2, b3, b9) appear on the DataNodes.]
  • 29. Data locality for HBase
    • Poor data locality when the region is moved:
      – As a result of load balancing
      – Region server crash + failover
    • Most of the data won’t be local unless the files are compacted
    • Idea (from Facebook): Regions have affiliated nodes (primary, secondary, tertiary), HBASE-4755
    • When writing a data file, give hints to the NN that we want these locations for block replicas (HDFS-2576)
    • LB should assign the region to one of the affiliated nodes on server crash
      – Keep data locality
      – SSR will still work
    • Reduces data loss probability
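The affiliated-nodes idea can be sketched as a selection rule for the load balancer: on a crash, prefer the first affiliated node (primary, then secondary, then tertiary) that is still alive, since it already holds block replicas. This is an illustration of the idea only; the class, method names, and server names are hypothetical, and the locality ratio is a simple local-bytes over total-bytes measure assumed for the example.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class FavoredNodesSketch {
    // Pick the first affiliated node, in priority order, that is still alive.
    static Optional<String> pickServer(List<String> affiliatedNodes, Set<String> liveServers) {
        for (String node : affiliatedNodes) {
            if (liveServers.contains(node)) {
                return Optional.of(node);
            }
        }
        return Optional.empty(); // no affiliated node is alive; fall back to any server
    }

    // Fraction of a region's store file bytes that have a replica on the hosting server.
    static double locality(long localBytes, long totalBytes) {
        return totalBytes == 0 ? 0.0 : (double) localBytes / totalBytes;
    }

    public static void main(String[] args) {
        List<String> affiliated = List.of("server1", "server2", "server3");
        // server1 (primary) has crashed.
        System.out.println(pickServer(affiliated, Set.of("server2", "server3", "server9")));
        System.out.println(locality(750, 1000));
    }
}
```

If the region lands on an affiliated node, its blocks are already local, so short circuit reads keep working without waiting for a compaction to rewrite the files.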
  • 30. Default Block Placement Policy
    [Diagram: Rack 1 / Server 1 runs a RegionServer hosting Region A (three StoreFiles) and Region B (two StoreFiles) next to a DataNode; Rack N / Server M, Rack L / Server K and Rack X / Server Y each run a RegionServer and a DataNode. Store file blocks (b1, b2, b3, b9) appear on each of the four DataNodes.]
  • 31. Other considerations
    • HBase riding over Namenode HA
      – Both Hadoop 1 (NFS based) and Hadoop 2 HA (QJM, etc)
      – Heavily tested with full stack HA
    • Retry HDFS operations
    • Isolate FileSystem usage from HBase internals
    • Hadoop 2 vs Hadoop 1 performance
      – Hadoop 2 is coming!
    • HDFS snapshots vs HBase snapshots
      – HBase DOES NOT use HDFS snapshots
      – Need hardlinks
      – Super flush API
    • HBase security vs HDFS security
      – All files are owned by the HBase principal
      – No ACLs in HDFS. Allowing a user to read HFiles / snapshots directly is hard
  • 32. Open Topics
    • HDFS hard links
      – Rethink how we do snapshots, backups, etc
    • Parallel writes for WAL
      – Reduce latency on WAL syncs
    • SSD storage, cache
      – SSD storage type in Hadoop or local filesystem
      – Using SSDs as a secondary cache
      – Selectively place tables / column families on SSD
    • HDFS zero-copy reads (HDFS-3051, HADOOP-8148)
    • HDFS inline checksums (HDFS-2699)
    • HDFS Quorum reads (HBASE-7509)
  • 33. Thanks
    Questions?
    Enis Söztutar, enis [ at ] apache [dot] org, @enissoz