PERFORMANCE TUNING & CLUSTER
ADMINISTRATION
2012/8/2
Scott Miao
AGENDA
 Course Credit
 Performance Tuning
 More…
 Cluster Administration
 More…
2
COURSE CREDIT
 Show up: 30 points
 Ask a question: each question earns 5 points
 Hands-on: 40 points
 70 points will pass this course
 Course credit is calculated once for each finished course
 The course credit will be sent to you and your supervisor by email
3
PERFORMANCE TUNING
 Garbage Collection Tuning
 MSLAB
 Compression
 Optimizing Splits and Compactions
 Load Balancing
 Merging Regions
 Client API: Best Practices
 Configuration
 Load Tests
4
GARBAGE COLLECTION TUNING
 The process of rewriting the heap generation in question is called garbage collection (GC)
 GC parameters only need to be added to the region
servers
 The JRE comes with basic assumptions
 Regarding what your programs are doing, how they
create objects, how they allocate the heap to handle
data, and so on
 These assumptions work well in a lot of cases
 But they do NOT work well for HBase…
 Especially for write-heavy use cases
 HBase cannot safely rely on the JRE assumptions alone
5
6
https://service.ithome.com.tw/20120720Java/index3.html#3
7
GARBAGE COLLECTION TUNING –
WRITE-HEAVY USE CASES (1/2)
 The memstore flushes data once it reaches the configured minimum flush size, hbase.hregion.memstore.flush.size
 This leaves holes of varying sizes in the heap
 Data resides in different locations in the generational architecture of the Java heap
 Depending on how long the data was in memory
 Young generation (new generation)
 The space can be reclaimed quickly and no harm is done
 Old generation (tenured generation)
 Data promoted to this location if it stays in memory for a longer
period of time
8
GARBAGE COLLECTION TUNING –
WRITE-HEAVY USE CASES (2/2)
 The JRE tries to reuse the holes created by data that has been written to disk
 If an allocation requests a heap size that does not fit into one of those holes
 The JRE needs to compact the fragmented heap
 Young to Old
 The promotion of longer-living objects from the young to the old
generation
 Old to Stop-The-World
 There is no longer enough space for a young allocation caused by
the fragmentation
 Falls back to the stop-the-world garbage collector
 Rewrites the entire heap space and compacts it to the remaining
active objects
 If this fails, you will see a promotion failure in your
garbage collection logs
9
10
What does the heap look like?
GARBAGE COLLECTION TUNING –
SPECIFY THE YOUNG GENERATION SIZE
 Young generation
 is between 128 MB and 512 MB
 Old generation
 holds the remaining available heap, which is usually
many gigabytes of memory
 Using 128 MB is a good starting point
 Further observation of the JVM metrics should be
conducted
 Specify the young generation size like so
 -XX:MaxNewSize=128m -XX:NewSize=128m
 One convenient option
 -Xmn128m
11
GARBAGE COLLECTION TUNING –
GC OPTIONS SETTING
 GC Options setting for HBase
 Adding them in the hbase-env.sh configuration file
 HBASE_OPTS variable for all HBase
 HBASE_REGIONSERVER_OPTS variable for all region
servers
 Enable the JRE’s log output for garbage collection
details
 Monitor it for occurrences of
 "concurrent mode failure" or "promotion failed" messages
12
export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
  -XX:+PrintGCTimeStamps -Xloggc:$HBASE_HOME/logs/gc-$(hostname)-hbase.log"
GARBAGE COLLECTION TUNING –
GC STRATEGY FOR YOUNG GENERATION
 Recommended value for young generation
 -XX:+UseParNewGC
 Use the Parallel New Collector
 It stops the entire Java process to clean up the young
generation heap
 Since Young generation’s size is small in comparison
 Usually less than a few hundred milliseconds
13
GARBAGE COLLECTION TUNING –
GC STRATEGY FOR OLD GENERATION
 Recommended value for old generation
 -XX:+UseConcMarkSweepGC
 Use the Concurrent Mark-Sweep Collector (CMS)
 It tries to do as much work concurrently as
possible, without stopping the Java process
 It takes extra effort and an increased CPU load
 Avoids the required stops to rewrite a fragmented old
generation heap
 If you hit the promotion error
 It falls back to stop-the-world again
14
GARBAGE COLLECTION TUNING –
GC STRATEGY FOR OLD GENERATION
 A switch for CMS
 -XX:CMSInitiatingOccupancyFraction=70
 A percentage that specifies when the background
process starts
 Avoids the concurrent mode failure
 The background process to mark and sweep the heap for
collection is still running when the heap runs out of usable
space
 Falls back to stop-the-world again
 Setting the initiating occupancy fraction to 70%
 20% block cache + 40% memstore limits = 60%, by default
 Starts the background process at an appropriate time
 Early enough, but not too early
15
GARBAGE COLLECTION TUNING - SUMMARY
 Recommended GC options
 The Alex Su’s GC options
 GC Options Reference
16
export HBASE_REGIONSERVER_OPTS="-Xmx8g -Xms8g -Xmn128m -XX:+UseParNewGC \
  -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -verbose:gc \
  -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
  -Xloggc:$HBASE_HOME/logs/gc-$(hostname)-hbase.log"
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
-Xloggc:<%= hbase_log_path %>/hbase-regionserver-gc-`date +%F-%H-%M-%S`.log \
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled \
-XX:CMSInitiatingOccupancyFraction=70 -XX:PrintFLSStatistics=1 \
-XX:+HeapDumpOnOutOfMemoryError \
-XX:HeapDumpPath=<%= hbase_log_path %>/hbase-regionserver.hprof
http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
MSLAB - QUESTION
 How do we solve the stop-the-world issue?
 The key to reducing these compacting collections is to reduce fragmentation
 If only objects of exactly the same size are allocated from the heap
 Subsequent allocations of new objects of the exact same size will always reuse these holes
 There is then no promotion error, and therefore no stop-the-world compacting collection is required
17
MSLAB –
MEMSTORE-LOCAL ALLOCATION BUFFER (1/3)
 MSLABs are buffers of fixed sizes containing KeyValue instances of varying sizes
1. When a buffer cannot completely fit a newly added KeyValue, it is considered full
2. A new buffer is created, once again of the given fixed size
 Enabled by default in version 0.92
 Disabled in version 0.90 of HBase
 hbase.hregion.memstore.mslab.enabled property
 It is recommended that you test your setup with this feature
18
MSLAB –
MEMSTORE-LOCAL ALLOCATION BUFFER (2/3)
 The size of each allocated, fixed-sized buffer
 hbase.hregion.memstore.mslab.chunksize property
 Default is 2 MB
 Based on your KeyValue instances, you may have to adjust
this value
 E.g., if your cells are 100 KB in size, you need to increase the chunk size to fit more than just a few cells
 An upper boundary of what is stored in the buffers
 hbase.hregion.memstore.mslab.max.allocation property
 Default 256 KB
 Any cell (KeyValue) that is larger will be directly allocated in the Java heap
19
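A minimal hbase-site.xml sketch wiring together the three MSLAB properties above (the values shown are the defaults quoted on these slides):

<property>
  <name>hbase.hregion.memstore.mslab.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hbase.hregion.memstore.mslab.chunksize</name>
  <value>2097152</value> <!-- 2 MB, the default chunk size -->
</property>
<property>
  <name>hbase.hregion.memstore.mslab.max.allocation</name>
  <value>262144</value> <!-- 256 KB; larger cells go straight to the Java heap -->
</property>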
MSLAB –
MEMSTORE-LOCAL ALLOCATION BUFFER (3/3)
 MSLABs do not come without a cost
 They are more wasteful in regard to heap usage
 You will most likely not fill every buffer to the last byte
 A tradeoff
 Use MSLABs and benefit from better garbage collection, but incur the extra space that is required
 Do NOT use MSLABs and benefit from better memory efficiency, but deal with the problems caused by garbage collection pauses
 You could plan to restart the servers every few days, or weeks, before the pause happens
 The buffers require an additional byte array copy operation and are therefore slightly slower
 Measure the impact on your workload
20
COMPRESSION
 A number of compression algorithms that can be
enabled at the column family level
 It is recommended
 Enable compression unless you have a reason not to do
so
 For example, when using already compressed content, such
as JPEG images
 Compression usually will yield overall better
performance
 The overhead of the CPU performing the compression
/de-compression is less than what is required to read
more data from disk
21
COMPRESSION – AVAILABLE CODECS
 It is recommended
 Snappy/Zippy (in Bigtable)
 Released by Google under the BSD License
 HBase 0.92 ships with the required JNI libraries to be able to use it
 Must install the native binary library on all region servers
 LZO (Lempel-Ziv-Oberhumer)
 A lossless data compression algorithm that is focused on
decompression speed, and written in ANSI C
 HBase cannot ship with LZO because of licensing issues
 incompatible GNU General Public License (GPL)
 LZO installation needs to be performed separately, after HBase has
been installed
22
http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details.cgi?id=437
COMPRESSION –
COMPRESSION TEST TOOL
 Use command
 hbase org.apache.hadoop.hbase.util.CompressionTest
<path> <none|gz|lzo|snappy>
 Example
./bin/hbase org.apache.hadoop.hbase.util.CompressionTest \
  /user/larsgeorge/test.gz gz
 It returns a result based on the test
 On success:
 On failure:
23
…
SUCCESS
Exception in thread "main" java.lang.RuntimeException: 
java.lang.ClassNotFoundException:
com.hadoop.compression.lzo.LzoCodec
…
COMPRESSION – STARTUP CHECK
 A fast-failing setup notices the missing libraries
 Instead of running into issues later
 For example, to check the Snappy and LZO compression libraries
 The server will abort at startup with an IOException stating
 "Compression codec <codec-name> not supported, aborting RS construction"
 Copy the changed configuration file to all region servers and restart them afterward
24
<property>
  <name>hbase.regionserver.codecs</name>
  <value>snappy,lzo</value>
</property>
COMPRESSION – ENABLING COMPRESSION
 Install the JNI libraries
 Install native compression libraries
 Specify the chosen algorithm in the column family schema
 In HBase shell
 create 'testtable', { NAME => 'colfam1', COMPRESSION => 'GZ' }
 In API
 HColumnDescriptor.setCompressionType(…)
 Refer to ppt#003, p#11
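A short sketch of the API route, assuming the 0.92-era client classes (conf would be your HBaseConfiguration instance):

import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.io.hfile.Compression;

// Equivalent to COMPRESSION => 'GZ' in the shell example above
HColumnDescriptor colfam = new HColumnDescriptor("colfam1");
colfam.setCompressionType(Compression.Algorithm.GZ);
HTableDescriptor desc = new HTableDescriptor("testtable");
desc.addFamily(colfam);
new HBaseAdmin(conf).createTable(desc);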
25
OPTIMIZING SPLITS AND COMPACTIONS
- SPLIT/COMPACTION STORMS
 If you grow your regions at roughly the same rate
 Eventually they all need to be split at about the same time
 This causes a large spike in disk I/O because of the required compactions to rewrite the split regions
 Refer to ppt#004, p#13
26
OPTIMIZING SPLITS AND COMPACTIONS –
MANAGED SPLITTING (1/2)
 You can turn automatic splitting off and manually invoke the split and major_compact commands
 Setting Region Maximum File Size
 hbase.hregion.max.filesize property for the entire cluster
 table level by API
 HTableDescriptor.setMaxFileSize(…)
 Refer to ppt#003, p#7
 You could set it to a very high number
 But it is better to set this value to a reasonable upper boundary
 Such as 100 GB
 Long.MAX_VALUE is not recommended, in case the manual splits fail to run
 Then you can time-control them
 Running them staggered across all regions
 Spreads the I/O load as much as possible, avoiding any
split/compaction storm
 Use HBase shell + cron (a minimal sketch follows below)
 Or write your own code with the HBase Admin API
 Refer to ppt#003, p#21
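A minimal sketch of the cron-driven approach; the table names and the nightly schedule are placeholders, with each table given its own staggered slot:

# crontab entries: staggered nightly major compactions, one table per hour
0 1 * * * echo "major_compact 'testtable'"  | ${HBASE_HOME}/bin/hbase shell
0 2 * * * echo "major_compact 'testtable2'" | ${HBASE_HOME}/bin/hbase shell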
27
OPTIMIZING SPLITS AND COMPACTIONS –
MANAGED SPLITTING (2/2)
 RegionSplitter Class (added in version 0.90.2)
 Another way to split existing regions
 Rolling split feature
 Split the existing regions while waiting long enough for the
involved compactions to complete
 API docs
 An additional advantage
 Have better control over which regions are available at
any time
 In rare cases, you need to do very low-level debugging
 With automated splits this is hard to do
 Because the region may already have been split into two daughter regions
28
OPTIMIZING SPLITS AND COMPACTIONS –
REGION HOTSPOTTING
 You may be dealing with a write pattern that is causing a
specific region to run hot
 Use Region Server Metrics to observe
 Refer to ppt#005, p#12
 Key design approaches
 Salt keys, random keys, etc
 Refer to ppt#004, p#52
 The only other way to alleviate this situation
 Manually split a hot region into one or more new regions, at exact boundaries
 You can specify any row key within a specific region
 So you are able to generate halves that are completely different in size
 Refer to ppt#003, p#21
 This cannot deal with completely sequential key ranges
 Those are always going to hit one region for a considerable amount of time
29
OPTIMIZING SPLITS AND COMPACTIONS –
PRESPLITTING REGIONS (1/3)
 Managing splits manually is useful
 Therefore, start with a larger number of regions right from the table creation
 This means creating a table with the required number of regions
 Three ways…
 HBase shell
 create, refer to ppt#003, p#37
 API
 HBaseAdmin.createTable(…), refer to ppt#003, p#16
 RegionSplitter Class
 By default, MD5StringSplit class to partition the row keys into
ranges
 Use -D split.algorithm=<your-algorithm-class> for other
implementation
30
/bin/hbase org.apache.hadoop.hbase.util.RegionSplitter
usage: RegionSplitter <TABLE>
OPTIMIZING SPLITS AND COMPACTIONS –
PRESPLITTING REGIONS (2/3)
 RegionSplitter with MD5StringSplit sample
31
testtable,,1309766006467.c0937d09f1da31f2a6c2950537a61093.
testtable,0ccccccc,1309766006467.83a0a6a949a6150c5680f39695450d8a.
testtable,19999998,1309766006467.1eba79c27eb9d5c2f89c3571f0d87a92.
testtable,26666664,1309766006467.7882cd50eb22652849491c08a6180258.
testtable,33333330,1309766006467.cef2853e36bd250c1b9324bac03e4bc9.
testtable,3ffffffc,1309766006467.00365940761359fee14d41db6a73ffc5.
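For reference, output like the above can be produced by an invocation along these lines (a sketch; the -c flag setting the number of presplit regions is assumed from the 0.90.2+ RegionSplitter options):

./bin/hbase org.apache.hadoop.hbase.util.RegionSplitter -c 10 testtable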
OPTIMIZING SPLITS AND COMPACTIONS –
PRESPLITTING REGIONS (3/3)
 How many presplit regions?
 Start low, with 10 presplit regions per server, and watch as data grows over time
 It is better to err on the side of too few regions and use a rolling split later
 If presplit regions are too thin
 Increase the hbase.hregion.majorcompaction property
 Refer to ppt#004, p#19
 If data size grows too large
 Use the RegionSplitter utility to perform a rolling split of all
regions
 The main objective is to avoid split/compaction storm
32
LOAD BALANCING – BALANCER (1/3)
 The master has a built-in feature
 Called the balancer
 By default, runs every five minutes
 hbase.balancer.period property
 Attempts to equal out the number of assigned regions per region server
 Within one region of the average number per server
 Determines a new assignment plan
 Describes which regions should be moved where
 Then starts the process of moving the regions by calling the unassign() method
 Refer to ppt#003, p#22
33
LOAD BALANCING - BALANCER (2/3)
 The balancer has an upper limit on how long it is allowed to run
 hbase.balancer.max.balancing property
 Defaults to half of the balancer period value
 i.e., 2.5 minutes
 The balancer switch
 Toggle the balancer status between enabled and disabled
 HBase shell
 balance_switch command, refer to ppt#003, p#39
 balanceSwitch() API method, refer to ppt#003, p#22
34
LOAD BALANCING - BALANCER (3/3)
 Can be explicitly started
 HBase shell
 balancer command, refer to ppt#003, p#39
 balancer() API method, refer to ppt#003, p#22
 Returns true
 If any work was done
 Returns false
 If the balancer was switched off
 If there was no work to be done
 If the master was not able to run the balancer
 E.g., if there is a region currently in transition, the balancer run will be skipped
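For example, from the HBase shell (following the return-value semantics above):

hbase(main):001:0> balance_switch true
hbase(main):002:0> balancer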
35
LOAD BALANCING - MOVE
 You can also use the move command
 To assign regions to other servers
 HBase shell
 move command, refer to ppt#003, p#39
 move() API method, refer to ppt#003, p#22
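A sketch of the shell form; the encoded region name and the destination server name ('host,port,startcode') are placeholders:

hbase(main):001:0> move 'ENCODED_REGIONNAME', 'host1.foo.com,60020,1309766006467'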
36
MERGING REGIONS
 Sometimes you may need to merge regions
 For example, after you have removed a large amount of
data and you want to reduce the number of regions
hosted by each server
 HBase allows you to merge two adjacent regions
 The HBase cluster must be offline, but HDFS must still be running
37
/bin/hbase org.apache.hadoop.hbase.util.Merge
Usage: bin/hbase merge <table-name> <region-1> <region-2>
CLIENT API: BEST PRACTICES (1/3)
 Disable auto-flush
 When performing a lot of put operations
 Refer to ppt#002, p#9
 Use scanner-caching
 Set Scan.setCaching() method to something greater than the
default of 1 if needed
 Refer to ppt#002, p#26
 Limit scan scope
 If only a small number of the available columns are to be
processed, only those should be specified in the input scan
 For example, use Scan.addFamily() method
 Refer to ppt#002, p#24
38
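A minimal sketch of the first three practices, assuming the 0.92-era client API and a table 'testtable' with family 'colfam1':

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "testtable");
table.setAutoFlush(false);                // buffer puts on the client side
// ... many table.put(...) calls ...
table.flushCommits();                     // ship the buffered puts in one batch

Scan scan = new Scan();
scan.setCaching(100);                     // rows fetched per RPC, instead of the default 1
scan.addFamily(Bytes.toBytes("colfam1")); // limit the scan to the needed family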
CLIENT API: BEST PRACTICES (2/3)
 Close ResultScanners
 To avoid performance problems
 Leaving scanners open may cause problems on the region servers
 Refer to ppt#002, p#25
 Block cache usage
 Scan instances can be set to use the block cache in the
region server via the setCacheBlocks() method
 true by default, meaning the default settings of the table and family are used
 API docs
 Server side block cache settings
 Refer to ppt#003, p#12
39
CLIENT API: BEST PRACTICES (3/3)
 Optimal loading of row keys
 When performing a table scan where only the row keys
are needed
 Use a FilterList with a MUST_PASS_ALL operator + FirstKeyOnlyFilter + KeyOnlyFilter (see the sketch below)
 Refer to ppt#002, p#43 & 46
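A sketch of the row-keys-only scan described above, using the filter classes named on this slide:

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;

Scan scan = new Scan();
FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ALL);
filters.addFilter(new FirstKeyOnlyFilter()); // only the first KeyValue of each row
filters.addFilter(new KeyOnlyFilter());      // strip the value, keep only the key
scan.setFilter(filters);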
 Turn off WAL on Puts
 One way of increasing throughput on Puts is to call writeToWAL(false), but there might be data loss
 Consider using the bulk loading techniques instead
40
CONFIGURATION (1/6)
 Advanced options you can consider adjusting
based on your use case
 Most properties are configured in hbase-site.xml
 Others are in hbase-env.sh
 Decrease ZooKeeper timeout
 The default timeout between a region server and the
ZooKeeper quorum is three minutes
 Tune the timeout down to a minute, or even less, so the
master notices failures sooner
 zookeeper.session.timeout property
 Be careful of the "Juliet Pause"
41
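An hbase-site.xml sketch lowering the session timeout to one minute (60000 ms), per the guidance above:

<property>
  <name>zookeeper.session.timeout</name>
  <value>60000</value>
</property>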
CONFIGURATION (2/6)
 Increase handlers
 The number of threads that are kept open to answer
incoming requests to user tables
 The default is 10
 hbase.regionserver.handler.count property
 Keep this number low when the payload per request
approaches megabytes
 And high when the payload is small
 Increase heap settings
 HBASE_HEAPSIZE setting in hbase-env.sh file
 Consider using HBASE_REGIONSERVER_OPTS instead of changing the global HBASE_HEAPSIZE
 Region servers may need more memory than the master
42
CONFIGURATION (3/6)
 Enable data compression
 Should enable compression for the storage files
 In most cases, boosts performance
 Increase region size
 Consider going to larger regions to cut down on the total
number of regions on your cluster
 Fewer regions to manage makes for a smoother-running
cluster
43
CONFIGURATION (4/6)
 Adjust block cache size
 The amount of heap used for the block cache is specified as a
percentage
 Defaults to 20%
 hfile.block.cache.size property
 It is good if you have mainly reading workloads
 Adjust memstore limits
 Memstore heap usage
 hbase.regionserver.global.memstore.upperLimit property
 Defaults to 40%
 hbase.regionserver.global.memstore.lowerLimit property
 Defaults to 35%
 Control the amount of flushing that will take place once the server is
required to free heap space
 Mainly read-oriented workloads
 Consider reducing both limits to make more room for the block cache
 Handling many writes
 Increase the memstore limits to reduce the excessive amount of I/O this causes
44
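A hedged hbase-site.xml sketch for a read-heavy cluster, shifting heap from the memstores to the block cache (the exact values are illustrative):

<property>
  <name>hfile.block.cache.size</name>
  <value>0.3</value>
</property>
<property>
  <name>hbase.regionserver.global.memstore.upperLimit</name>
  <value>0.35</value>
</property>
<property>
  <name>hbase.regionserver.global.memstore.lowerLimit</name>
  <value>0.3</value>
</property>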
CONFIGURATION (5/6)
 Increase blocking store files
 The region servers block further updates from clients to
give compactions time to reduce the number of files
 Default is seven files
 hbase.hstore.blockingStoreFiles property
 Increase block multiplier
 A safety latch that blocks any further updates from clients
when the memstores exceed the multiplier * flush size limit
 hbase.hregion.memstore.block.multiplier property
 Default to 2
 If you have enough memory, can increase this value to
handle spikes more gracefully
 Refer to ppt#003, p#8
45
CONFIGURATION (6/6)
 Decrease maximum logfiles
 Controls how often flushes occur, based on the number of WAL files on disk
 Default is 32
 hbase.regionserver.maxlogs property
 Can be high in a write-heavy use case
 Lower it to force the servers to flush data more often to
disk
46
LOAD TESTS
 It is advisable to run performance tests to verify
functionality of your cluster
 These tests give you a baseline which you can refer
to
 After making changes to the configuration of the cluster
 Or the schemas of your tables
 Doing a burn-in of your cluster
 Shows you how much you can gain from it
 But this does not replace a test with the load as expected from your use case
47
LOAD TESTS –
PERFORMANCE EVALUATION (1/2)
 HBase ships with its own tool to execute a
performance evaluation
 Performance Evaluation (PE)
 Wiki
 http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation
48
/bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation
Usage: java org.apache.hadoop.hbase.PerformanceEvaluation 
[--miniCluster] [--nomapred] [--rows=ROWS] <command> <nclients>
LOAD TESTS –
PERFORMANCE EVALUATION (2/2)
 Example
49
/bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1
11/07/03 13:18:34 INFO hbase.PerformanceEvaluation: Start class 
org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest at 
offset 0 for 1048576 rows
...
11/07/03 13:18:41 INFO hbase.PerformanceEvaluation: 0/104857/1048576
...
11/07/03 13:18:45 INFO hbase.PerformanceEvaluation: 0/209714/1048576
...
11/07/03 13:20:03 INFO hbase.PerformanceEvaluation: 0/1048570/1048576
11/07/03 13:20:03 INFO hbase.PerformanceEvaluation: Finished class 
org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest 
in 89062ms at offset 0 for 1048576 rows
LOAD TESTS – YCSB (1/2)
 Yahoo! Cloud Serving Benchmark* (YCSB)
 It is a suite of tools that can be used to run comparable
workloads against different storage systems
 Also a reasonable tool for performing an HBase cluster burn-in or performance test
 Using YCSB is preferred over the HBase-supplied
Performance Evaluation
 Offers more options
 Can combine read and write workloads
 Home page
 http://research.yahoo.com/Web_Information_Management/YCSB
50
LOAD TESTS – YCSB (2/2)
 Use HBase shell
 create 'usertable', 'family'
 git pull
 cd ${GIT_HOME}/hbase-training/006/ycsb
 Run command
 Then you can see performance metrics in the ycsb-load.log file
51
java -cp "${HBASE_CONF_DIR}:core-0.1.4.jar:hbase-binding-0.1.4.jar" \
  com.yahoo.ycsb.Client -load -db com.yahoo.ycsb.db.HBaseClient \
  -P workloads/workloada -p columnfamily=family -p recordcount=1000 \
  -s > ycsb-load.log
CLUSTER ADMINISTRATION
52
 Operational Tasks
 Node Decommission
 Rolling Restarts
 Adding Backup Master
 Adding a Region Server
 Data Task
 Export
 Import
 CopyTable Tool
 Bulk Import
 Troubleshooting
 HBase Fsck
 Analyzing the Logs
OPERATIONAL TASKS – NODE DECOMMISSION (1/2)
 Use one of the following scripts
 In normal HBase distribution
 In tm distribution
 Disable the Load Balancer before
Decommissioning a node
 In hbase shell
 balance_switch false
 Regions could be offline for a good period of time
 If there are many regions on the server
 All regions must close
 Then the master notices the region server's ZooKeeper znode being removed
53
${HBASE_HOME}/bin/hbase-daemon.sh stop regionserver
${TM_PUPPET_HOME}/bin/services/shutdown-regionservers.sh [<host> ...]
OPERATIONAL TASKS – NODE DECOMMISSION (2/2)
 Stop a region server gracefully
 Lets the node gradually shed its load and then shut itself down
 Available from HBase 0.90.2
 ${HBASE_HOME}/bin/graceful_stop.sh
 Example
 Check the HOSTNAME on your HBase master UI
 Refer to ppt#003, p#41
 IP address is NOT supported at present
54
${HBASE_HOME}/bin/graceful_stop.sh HOSTNAME
OPERATIONAL TASKS – ROLLING RESTARTS
 Also use graceful_stop.sh
 Steps as follows
1. Ensure the cluster is consistent
 Fix it if inconsistent
2. Restart the master
3. Disable the region balancer
4. Run the graceful_stop.sh script per region server
5. Restart the master again
 Clear out the dead servers list and reenable the balancer
6. Run hbck to ensure the cluster is consistent
55
hbase hbck
hbase hbck -fix
${HBASE_HOME}/bin/hbase-daemon.sh stop master; \
  ${HBASE_HOME}/bin/hbase-daemon.sh start master
echo "balance_switch false" | ${HBASE_HOME}/bin/hbase shell
for i in `cat conf/regionservers|sort`; do \
  ./bin/graceful_stop.sh --restart --reload --debug $i; \
done &> /tmp/log.txt &
OPERATIONAL TASKS –
ADDING BACKUP MASTER (1/2)
 To prevent the Single Point of Failure
 If the machine currently hosting the active master fails, the system can fall back to a backup master
 Underlying operations
1. A dedicated ZooKeeper znode: /hbase/master
2. All master processes race to create it; the first one to create it wins (becomes the current master)
 This happens at startup
3. All other master processes simply loop around the znode check and wait for it to disappear
 Triggering the race again
56
OPERATIONAL TASKS –
ADDING BACKUP MASTER (2/2)
 How to start multiple backup master processes
 Use the original way to start a master process
 In the tm distribution
 Or specifically start a backup master process
57
${HBASE_HOME}/bin/hbase-daemon.sh start master
${TM_PUPPET_HOME}/bin/services/startup-hmaster.sh [<host> ...]
${HBASE_HOME}/bin/hbase-daemon.sh start master --backup
OPERATIONAL TASKS –
ADDING A REGION SERVER
 In normal HBase distribution
 Edit the ${HBASE_HOME}/conf/regionservers
 To add newly added region server’s host name
 Two scripts can use…
 ${HBASE_HOME}/bin/start-hbase.sh
 It will skip the already-running region servers and start the newly added region server listed in the regionservers file
 ${HBASE_HOME}/bin/hbase-daemon.sh start regionserver
 Must be executed on the newly added region server
 In tm distribution
 A new feature; not covered here
58
DATA TASK
 You may be required to move the data as a whole
or in parts
 Archive data for backup purposes
 To bootstrap another cluster
59
hadoop jar ${HBASE_HOME}/hbase-0.91.0-SNAPSHOT.jar
An example program must be given as the first argument.
Valid program names are:
…
completebulkload: Complete a bulk data load.
copytable: Export a table from local cluster to peer cluster
export: Write table data to HDFS.
import: Import data written by Export.
importtsv: Import data in TSV format.
…
http://hbase.apache.org/book/ops_mgt.html
DATA TASK – EXPORT (1/3)
60
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar export
Usage: Export [-D <property=value>]* <tablename> <outputdir> \
  [<versions> [<starttime> [<endtime>]]]
DATA TASK - EXPORT (2/3)
61
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar export \
  testtable /user/larsgeorge/backup-testtable
11/06/25 15:58:29 INFO mapred.JobClient: Running job: job_201106251558_0001
11/06/25 15:58:30 INFO mapred.JobClient: map 0% reduce 0%
…
11/06/25 15:59:40 INFO mapred.JobClient: map 100% reduce 0%
11/06/25 15:59:42 INFO mapred.JobClient: Job complete: job_201106251558_0001
11/06/25 15:59:42 INFO mapred.JobClient: Counters: 6
11/06/25 15:59:42 INFO mapred.JobClient: Job Counters
11/06/25 15:59:42 INFO mapred.JobClient: Rack-local map tasks=32
11/06/25 15:59:42 INFO mapred.JobClient: Launched map tasks=32
11/06/25 15:59:42 INFO mapred.JobClient: FileSystemCounters
11/06/25 15:59:42 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=3648
11/06/25 15:59:42 INFO mapred.JobClient: Map-Reduce Framework
11/06/25 15:59:42 INFO mapred.JobClient: Map input records=0
11/06/25 15:59:42 INFO mapred.JobClient: Spilled Records=0
11/06/25 15:59:42 INFO mapred.JobClient: Map output records=0
DATA TASK - EXPORT (3/3)
 Each part-m-nnnnn file contains a piece of the exported data
 Together they form the full backup of the table
 Use the hadoop distcp command to move the directory from one cluster to another, and perform the import there
62
hadoop dfs -lsr /user/larsgeorge/backup-testtable
drwxr-xr-x - ... 0 2011-06-25 15:58 _logs
-rw-r--r-- 1 ... 114 2011-06-25 15:58 part-m-00000
-rw-r--r-- 1 ... 114 2011-06-25 15:58 part-m-00001
…
-rw-r--r-- 1 ... 114 2011-06-25 15:59 part-m-00030
-rw-r--r-- 1 ... 114 2011-06-25 15:59 part-m-00031
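A hedged sketch of the distcp step mentioned above; the NameNode URIs are placeholders for your source and target clusters:

hadoop distcp hdfs://cluster1:8020/user/larsgeorge/backup-testtable \
  hdfs://cluster2:8020/user/larsgeorge/backup-testtable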
DATA TASK – IMPORT (1/2)
63
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar import
Usage: Import <tablename> <inputdir>
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar import \
  testtable /user/larsgeorge/backup-testtable
11/06/25 17:09:48 INFO mapreduce.TableOutputFormat: Created table instance 
for testtable
11/06/25 17:09:48 INFO input.FileInputFormat: Total input paths to process : 32
11/06/25 17:09:49 INFO mapred.JobClient: Running job: job_201106251558_0003
11/06/25 17:09:50 INFO mapred.JobClient: map 0% reduce 0%
11/06/25 17:10:04 INFO mapred.JobClient: map 6% reduce 0%
…
11/06/25 17:10:51 INFO mapred.JobClient: Job Counters
11/06/25 17:10:51 INFO mapred.JobClient: Launched map tasks=32
11/06/25 17:10:51 INFO mapred.JobClient: Data-local map tasks=32
11/06/25 17:10:51 INFO mapred.JobClient: FileSystemCounters
11/06/25 17:10:51 INFO mapred.JobClient: HDFS_BYTES_READ=3648
11/06/25 17:10:51 INFO mapred.JobClient: Map-Reduce Framework
11/06/25 17:10:51 INFO mapred.JobClient: Map input records=0
11/06/25 17:10:51 INFO mapred.JobClient: Spilled Records=0
11/06/25 17:10:51 INFO mapred.JobClient: Map output records=0
DATA TASK - IMPORT (2/2)
 Use the Import job to store the data in a different table
 It must have the same schema
 Both the Export and Import commands are per-table only
 Using the hadoop distcp command to copy the entire /hbase directory in HDFS
 Is not recommended
 It may copy store files that are halfway through a memstore flush operation
64
DATA TASK – COPYTABLE TOOL (1/2)
 Designed to bootstrap cluster replication
 Make a copy of an existing table from the master
cluster to the slave cluster
65
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar copytable
Usage: CopyTable [--rs.class=CLASS] [--rs.impl=IMPL] [--starttime=X]
[--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] <tablename>
DATA TASK – COPYTABLE TOOL (2/2)
 The copy of the table is stored on the same cluster
66
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar copytable \
  --new.name=testtable3 testtable
11/06/26 15:20:07 INFO mapreduce.TableOutputFormat: 
Created table instance for testtable3
11/06/26 15:20:07 INFO mapred.JobClient: Running job: job_201106261454_0003
11/06/26 15:20:08 INFO mapred.JobClient: map 0% reduce 0%
11/06/26 15:20:19 INFO mapred.JobClient: map 6% reduce 0%
…
11/06/26 15:21:04 INFO mapred.JobClient: map 100% reduce 0%
11/06/26 15:21:06 INFO mapred.JobClient: Job complete: job_201106261454_0003
11/06/26 15:21:06 INFO mapred.JobClient: Counters: 5
11/06/26 15:21:06 INFO mapred.JobClient: Job Counters
11/06/26 15:21:06 INFO mapred.JobClient: Launched map tasks=32
11/06/26 15:21:06 INFO mapred.JobClient: Data-local map tasks=32
11/06/26 15:21:06 INFO mapred.JobClient: Map-Reduce Framework
11/06/26 15:21:06 INFO mapred.JobClient: Map input records=0
11/06/26 15:21:06 INFO mapred.JobClient: Spilled Records=0
11/06/26 15:21:06 INFO mapred.JobClient: Map output records=0
DATA TASK – BULK IMPORT (1/2)
 The importtsv tool
 Takes files containing data in tab-separated value (TSV) format
 By default, it uses the HBase put() API to insert data into HBase one row at a time
 By setting the importtsv.bulk.output option, it generates files using HFileOutputFormat
 These can subsequently be bulk-loaded into HBase by the completebulkload tool
67
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar importtsv
Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>
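A hedged example combining importtsv with the bulk-output option; the column mapping and paths are placeholders, and HBASE_ROW_KEY marks the row-key column:

hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar importtsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,colfam1:col1 \
  -Dimporttsv.bulk.output=/user/larsgeorge/myoutput \
  mytable /user/larsgeorge/input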
DATA TASK – BULK IMPORT (2/2)
 completebulkload Tool
 Is used to import the data into the running cluster
 After a data import has been prepared
 By using the importtsv tool with the importtsv.bulk.output
option
 By some other MapReduce job using the
HFileOutputFormat
68
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar completebulkload \
  -conf ~/my-hbase-site.xml /user/larsgeorge/myoutput mytable
TROUBLESHOOTING – HBASE FSCK (1/4)
 Shell Command
 ${HBASE_HOME}/bin/hbase hbck
 Once started
 It scans the .META. table to gather all the pertinent information it holds
 It also scans the HDFS root directory HBase is configured to use
 Then it compares the collected details to report on inconsistencies and integrity issues
 Consistency check
 Whether the region is listed in .META. and exists in HDFS
 And is assigned to exactly one region server
 Integrity check
 Compares the regions with the table details to find missing regions
 And those that have holes or overlaps in their row key ranges
69
TROUBLESHOOTING – HBASE FSCK (2/4)
70
${HBASE_HOME}/bin/hbase hbck -h
Usage: fsck [opts]
where [opts] are:
-details Display full report of all regions.
-timelag {timeInSeconds} Process only regions that have not experienced
any metadata updates in the last {{timeInSeconds} seconds.
-fix Try to fix some of the errors.
-sleepBeforeRerun {timeInSeconds} Sleep this many seconds before checking
if the fix worked if run with -fix
-summary Print only summary of the tables and status.
TROUBLESHOOTING – HBASE FSCK (3/4)
 Running hbck with no options prints the normal level of detail
71
${HBASE_HOME}/bin/hbase hbck
Number of Tables: 40
Number of live region servers: 19
Number of dead region servers: 0
Number of empty REGIONINFO_QUALIFIER rows in .META.: 0
Summary:
...
testtable2 is okay.
Number of regions: 1
Deployed on: host11.foo.com:60020
0 inconsistencies detected.
Status: OK
TROUBLESHOOTING – HBASE FSCK (4/4)
 ${HBASE_HOME}/bin/hbase hbck -fix
 Repairs the following issues
 Assigns .META. to a single new server if it is unassigned
 Reassigns .META. to a single new server if it is assigned to multiple servers
 Assigns a user table region to a new server if it is unassigned
 Reassigns a user table region to a single new server if it is assigned to multiple servers
 Reassigns a user table region to a new server if the current server does not match what the .META. table refers to
 hbck may report inconsistencies which are temporary, or transitional only
 Rerun the tool a few times to confirm a permanent problem
72
TROUBLESHOOTING – ANALYZING THE LOGS (1/2)
Server type | Default logfile | tm settings
HBase Master | $HBASE_HOME/logs/hbase-<user>-master-<hostname>.log | /var/log/hbase/hbase-<user>-master-<hostname>.log
HBase RegionServer | $HBASE_HOME/logs/hbase-<user>-regionserver-<hostname>.log | /var/log/hbase/hbase-<user>-regionserver-<hostname>.log
ZooKeeper | Console log output only | /var/log/hbase/hbase-<user>-zookeeper-<hostname>.log
NameNode | $HADOOP_HOME/logs/hadoop-<user>-namenode-<hostname>.log | /var/log/hadoop/hadoop-<user>-namenode-<hostname>.log
DataNode | $HADOOP_HOME/logs/hadoop-<user>-datanode-<hostname>.log | /var/log/hadoop/hadoop-<user>-datanode-<hostname>.log
JobTracker | $HADOOP_HOME/logs/hadoop-<user>-jobtracker-<hostname>.log | /var/log/hadoop/hadoop-<user>-jobtracker-<hostname>.log
TaskTracker | $HADOOP_HOME/logs/hadoop-<user>-tasktracker-<hostname>.log | /var/log/hadoop/hadoop-<user>-tasktracker-<hostname>.log
73
TROUBLESHOOTING – ANALYZING THE LOGS (2/2)
 It is useful to begin with the master logfile first
 It acts as the coordinator service of the entire cluster
 Find where the processes began logging ERROR-level messages
 You should be able to identify the root cause
 A lot of subsequent messages are often side effects of the original problem
 It is recommended to use the error log event metric under the System Event Metrics group
 It gives you a graph showing where the server(s) started logging an increasing number of error messages in the logfiles
 If you find an error message
 Google it !!
 Use the online resources to search for the message in
the public mailing lists
 Search Hadoop
74
HANDS-ON – USE YCSB
 New VM list
 Because more VMs are not affordable at present :p
 ${YOUR_HOME}=${GIT_HOME}/hbase-training/006/hands-on/${YOUR_NAME}
 mkdir ${YOUR_HOME}
 cd ${YOUR_HOME}; cp -rf ../../ycsb/* .
 Use HBase shell
 create '<YOUR_NAMED_TABLE>', 'family'
 Run YCSB with a record count of 5000
 And output a ycsb-load.log file
 Hands-on result
 Put the ycsb-load.log file under ${YOUR_HOME}
75
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 

PT&CA – Performance Tuning & Cluster Administration

  • 13. GARBAGE COLLECTION TUNING – GC STRATEGY FOR YOUNG GENERATION
     Recommended value for the young generation
       -XX:+UseParNewGC
       Uses the Parallel New Collector
     It stops the entire Java process to clean up the young generation heap
       Since the young generation is small in comparison, this usually takes less than a few hundred milliseconds
  • 14. GARBAGE COLLECTION TUNING – GC STRATEGY FOR OLD GENERATION
     Recommended value for the old generation
       -XX:+UseConcMarkSweepGC
       Uses the Concurrent Mark-Sweep Collector (CMS)
     It tries to do as much work concurrently as possible, without stopping the Java process
       This takes extra effort and an increased CPU load
       It avoids the otherwise required stops to rewrite a fragmented old generation heap
     If you hit the promotion error, it falls back to stop-the-world again
  • 15. GARBAGE COLLECTION TUNING – GC STRATEGY FOR OLD GENERATION
     A switch for CMS
       -XX:CMSInitiatingOccupancyFraction=70
       A percentage that specifies when the background process starts
     Avoids the concurrent mode failure
       That failure occurs when the background process to mark and sweep the heap is still running as the heap runs out of usable space
       In that case the JVM falls back to stop-the-world again
     Why an initiating occupancy fraction of 70%?
       20% block cache + 40% memstore limits = 60%, by default
       This starts the background process at the appropriate time: early enough, but not too early
  • 16. GARBAGE COLLECTION TUNING – SUMMARY
     Recommended GC options
export HBASE_REGIONSERVER_OPTS="-Xmx8g -Xms8g -Xmn128m -XX:+UseParNewGC \
  -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
  -Xloggc:$HBASE_HOME/logs/gc-$(hostname)-hbase.log"
     Alex Su's GC options
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
  -Xloggc:<%= hbase_log_path %>/hbase-regionserver-gc-`date +%F-%H-%M-%S`.log \
  -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled \
  -XX:CMSInitiatingOccupancyFraction=70 -XX:PrintFLSStatistics=1 \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:HeapDumpPath=<%= hbase_log_path %>/hbase-regionserver.hprof
     GC Options Reference
       http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
  • 17. MSLAB – QUESTION
     For solving the stop-the-world issue
     The key to reducing these compacting collections is to reduce fragmentation
       If only objects of exactly the same size are allocated from the heap, subsequent allocations of new objects of the exact same size will always reuse the resulting holes
       Then there is no promotion error, and therefore no stop-the-world compacting collection is required
  • 18. MSLAB – MEMSTORE-LOCAL ALLOCATION BUFFER (1/3)
     MSLABs are buffers of fixed sizes containing KeyValue instances of varying sizes
      1. If a buffer cannot completely fit a newly added KeyValue, it is considered full
      2. A new buffer is then created, once again of the given fixed size
     Enabled by default in version 0.92, disabled in version 0.90 of HBase
       hbase.hregion.memstore.mslab.enabled property
     It is recommended that you test your setup with this feature
  • 19. MSLAB – MEMSTORE-LOCAL ALLOCATION BUFFER (2/3)
     The size of each allocated, fixed-size buffer
       hbase.hregion.memstore.mslab.chunksize property
       Default is 2 MB
       Based on your KeyValue instances, you may have to adjust this value
         E.g., with cells of 100 KB in size, you need to increase the MSLAB size to fit more than just a few cells
     An upper boundary on what is stored in the buffers
       hbase.hregion.memstore.mslab.max.allocation property
       Default is 256 KB
       Any cell (KeyValue) that is larger will be allocated directly in the Java heap
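     The two MSLAB sizes above are ordinary HBase configuration keys. A minimal sketch of setting them programmatically, assuming only the stock HBaseConfiguration API; the values are the defaults quoted above, and in production the settings belong in hbase-site.xml on every region server rather than in client code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class MslabSettingsSketch {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Enable MSLAB (the default in 0.92; disabled by default in 0.90).
    conf.setBoolean("hbase.hregion.memstore.mslab.enabled", true);
    // Fixed chunk size of each buffer; raise it if your cells are large.
    conf.setInt("hbase.hregion.memstore.mslab.chunksize", 2 * 1024 * 1024);
    // Cells above this size bypass MSLAB and go straight to the Java heap.
    conf.setInt("hbase.hregion.memstore.mslab.max.allocation", 256 * 1024);
    System.out.println(conf.get("hbase.hregion.memstore.mslab.chunksize"));
  }
}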
  • 20. MSLAB – MEMSTORE-LOCAL ALLOCATION BUFFER (3/3)
     MSLABs do not come without a cost
       They are more wasteful in regard to heap usage, as they will most likely not fill every buffer to the last byte
     A tradeoff
       Use MSLABs and benefit from better garbage collection, but incur the extra space that is required
       Do NOT use MSLABs and benefit from better memory efficiency, but deal with the problems caused by garbage collection pauses
         You could plan to restart the servers every few days, or weeks, before the pause happens
     The buffers require an additional byte array copy operation and are therefore slightly slower
       Measure the impact on your workload
  • 21. COMPRESSION
     A number of compression algorithms can be enabled at the column family level
     It is recommended to enable compression unless you have a reason not to do so
       For example, when using already compressed content, such as JPEG images
     Compression usually yields better overall performance
       The CPU overhead of performing the compression/decompression is less than what is required to read more data from disk
  • 22. COMPRESSION – AVAILABLE CODECS
     Recommended: Snappy/Zippy (used in Bigtable)
       Released by Google under the BSD License
       HBase 0.92 ships with the required JNI libraries to be able to use it
       You must install the native binary library on all region servers
     LZO (Lempel-Ziv-Oberhumer)
       A lossless data compression algorithm that is focused on decompression speed, written in ANSI C
       HBase cannot ship with LZO because of licensing issues
         The GNU General Public License (GPL) is incompatible
       LZO installation needs to be performed separately, after HBase has been installed
     http://norfolk.cs.washington.edu/htbin-post/unrestricted/colloq/details.cgi?id=437
  • 23. COMPRESSION – COMPRESSION TEST TOOL
     Usage
       hbase org.apache.hadoop.hbase.util.CompressionTest <path> <none|gz|lzo|snappy>
     Example
       ./bin/hbase org.apache.hadoop.hbase.util.CompressionTest /user/larsgeorge/test.gz gz
     It returns a result based on the test
       On success
         … SUCCESS
       On failure
         Exception in thread "main" java.lang.RuntimeException: java.lang.ClassNotFoundException: com.hadoop.compression.lzo.LzoCodec …
  • 24. COMPRESSION – STARTUP CHECK
     A fast-failing setup notices missing libraries at startup, instead of running into issues later
     For example, to check for the Snappy and LZO compression libraries:
<property>
  <name>hbase.regionserver.codecs</name>
  <value>snappy,lzo</value>
</property>
     The server will abort at startup with an IOException stating
       "Compression codec <codec-name> not supported, aborting RS construction"
     Copy the changed configuration file to all region servers and restart them afterward
  • 25. COMPRESSION – ENABLING COMPRESSION
     Install the JNI libraries
     Install the native compression libraries
     Specify the chosen algorithm in the column family schema (see the sketch below)
       In the HBase shell
         create 'testtable', { NAME => 'colfam1', COMPRESSION => 'GZ' }
       In the API
         HColumnDescriptor.setCompressionType(…)
       Refer to ppt#003, p#11
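     A minimal sketch of the API route, mirroring the shell example above with 0.92-era client classes; the table and family names are the illustrative ones from the shell command:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.io.hfile.Compression;

public class CreateCompressedTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTableDescriptor desc = new HTableDescriptor("testtable");
    HColumnDescriptor colfam = new HColumnDescriptor("colfam1");
    // Enable GZIP compression for this column family; Snappy or LZO work
    // the same way once their native libraries are installed.
    colfam.setCompressionType(Compression.Algorithm.GZ);
    desc.addFamily(colfam);
    admin.createTable(desc);
  }
}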
  • 26. OPTIMIZING SPLITS AND COMPACTIONS – SPLIT/COMPACTION STORMS
     When your regions all grow at roughly the same rate, they eventually all need to be split at about the same time
     This causes a large spike in disk I/O because of the required compactions to rewrite the split regions
     Refer to ppt#004, p#13
  • 27. OPTIMIZING SPLITS AND COMPACTIONS – MANAGED SPLITTING (1/2)
     You can turn automated splitting off and manually invoke the split and major_compact commands
     Set the region maximum file size
       hbase.hregion.max.filesize property for the entire cluster
       At the table level by API
         HTableDescriptor.setMaxFileSize(…)
         Refer to ppt#003, p#7
       To a very high number
         Better to set this value to a reasonable upper boundary, such as 100 GB
         Long.MAX_VALUE is not recommended, in case the manual splits fail to run
     Then you can time-control the splits
       Running them staggered across all regions spreads the I/O load as much as possible, avoiding any split/compaction storm
       Use the HBase shell + cron, or write your own code with the HBase Admin API (see the sketch below)
       Refer to ppt#003, p#21
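     A minimal sketch of time-controlled, staggered splits and a follow-up major compaction with the 0.92-era HBaseAdmin API; the table name and the pause between regions are illustrative assumptions, and such a job would typically be run from cron at a quiet time:

import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.HServerAddress;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;

public class StaggeredSplitSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTable table = new HTable(conf, "testtable");
    Map<HRegionInfo, HServerAddress> regions = table.getRegionsInfo();
    for (HRegionInfo region : regions.keySet()) {
      // Ask the master to split this region, then pause so the resulting
      // compactions never pile up across the whole table at once.
      admin.split(region.getRegionNameAsString());
      Thread.sleep(60 * 1000L);
    }
    // A managed major compaction can be requested the same way.
    admin.majorCompact("testtable");
    table.close();
  }
}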
  • 28. OPTIMIZING SPLITS AND COMPACTIONS – MANAGED SPLITTING (2/2)
     RegionSplitter class (added in version 0.90.2)
       Another way to split existing regions
       Rolling split feature
         Splits the existing regions while waiting long enough for the involved compactions to complete
       See the API docs
     An additional advantage
       You have better control over which regions are available at any time
       In rare cases you need to do very low-level debugging
         With automated splits this is hard, because the region in question may already have been split into two daughter regions
  • 29. OPTIMIZING SPLITS AND COMPACTIONS – REGION HOTSPOTTING
     You may be dealing with a write pattern that is causing a specific region to run hot
       Use the region server metrics to observe this
       Refer to ppt#005, p#12
     Key design approaches
       Salted keys, random keys, etc.
       Refer to ppt#004, p#52
     The only other way to alleviate this situation
       Manually split the hot region into one or more new regions, at exact boundaries (see the sketch below)
       You can specify any row key within the specific region, so you are able to generate halves that are completely different in size
       Refer to ppt#003, p#21
     This cannot help with completely sequential key ranges
       Those are always going to hit one region for a considerable amount of time
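     A hedged sketch of splitting a hot region at an explicit row key boundary. It assumes an HBaseAdmin.split(tableOrRegion, splitPoint) overload, which is present in newer releases (0.94+); older versions may only split a region at its midpoint. The table name and split key are made up:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class ManualSplitSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    // Split 'testtable' at row key "row-5000"; any key inside the hot
    // region works, so the resulting halves may differ greatly in size.
    admin.split(Bytes.toBytes("testtable"), Bytes.toBytes("row-5000"));
  }
}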
  • 30. OPTIMIZING SPLITS AND COMPACTIONS – PRESPLITTING REGIONS (1/3)
     Managing splits manually is most useful when you start with a larger number of regions right from the table creation
     Three ways to create a table with the required number of regions
       HBase shell
         create, refer to ppt#003, p#37
       API (see the sketch after slide 32)
         HBaseAdmin.createTable(…), refer to ppt#003, p#16
       RegionSplitter class
/bin/hbase org.apache.hadoop.hbase.util.RegionSplitter
usage: RegionSplitter <TABLE>
         By default, the MD5StringSplit class partitions the row keys into ranges
         Use -D split.algorithm=<your-algorithm-class> for another implementation
  • 31. OPTIMIZING SPLITS AND COMPACTIONS – PRESPLITTING REGIONS (2/3)
     RegionSplitter with MD5StringSplit sample
testtable,,1309766006467.c0937d09f1da31f2a6c2950537a61093.
testtable,0ccccccc,1309766006467.83a0a6a949a6150c5680f39695450d8a.
testtable,19999998,1309766006467.1eba79c27eb9d5c2f89c3571f0d87a92.
testtable,26666664,1309766006467.7882cd50eb22652849491c08a6180258.
testtable,33333330,1309766006467.cef2853e36bd250c1b9324bac03e4bc9.
testtable,3ffffffc,1309766006467.00365940761359fee14d41db6a73ffc5.
  • 32. OPTIMIZING SPLITS AND COMPACTIONS – PRESPLITTING REGIONS (3/3)
     How many presplit regions?
       Start low, with 10 presplit regions per server, and watch as data grows over time
       It is better to err on the side of too few regions and use a rolling split later
     If presplit regions are too thin
       Increase the hbase.hregion.majorcompaction property
       Refer to ppt#004, p#19
     If the data size grows too large
       Use the RegionSplitter utility to perform a rolling split of all regions
     The main objective is to avoid split/compaction storms
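     A minimal sketch of presplitting at table-creation time via the client API. The createTable(desc, startKey, endKey, numRegions) variant splits the given key space evenly; the key range and region count here are illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PresplitTableSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTableDescriptor desc = new HTableDescriptor("testtable");
    desc.addFamily(new HColumnDescriptor("colfam1"));
    // Create the table with 20 regions, evenly spaced between the two
    // boundary keys; rows outside the range land in the first/last region.
    admin.createTable(desc, Bytes.toBytes("0000000000"),
        Bytes.toBytes("9999999999"), 20);
  }
}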
  • 33. LOAD BALANCING – BALANCER (1/3)
     The master has a built-in feature, called the balancer
     By default, it runs every five minutes
       hbase.balancer.period property
     It attempts to even out the number of assigned regions per region server
       Within one region of the average number per server
     It determines a new assignment plan, which describes which regions should be moved where
       It then starts the process of moving the regions by calling the unassign() method
       Refer to ppt#003, p#22
  • 34. LOAD BALANCING – BALANCER (2/3)
     The balancer has an upper limit on how long it is allowed to run
       hbase.balancer.max.balancing property
       Defaults to half of the balancer period value, i.e., 2.5 minutes
     The balancer switch toggles the balancer between enabled and disabled
       HBase shell: balance_switch command, refer to ppt#003, p#39
       balanceSwitch() API method, refer to ppt#003, p#22
  • 35. LOAD BALANCING – BALANCER (3/3)
     The balancer can be explicitly started
       HBase shell: balancer command, refer to ppt#003, p#39
       balancer() API method, refer to ppt#003, p#22
     Returns true
       If any work was done
     Returns false
       If the balancer was switched off
       If there was no work to be done
       If the master was not able to run the balancer
         If a region is currently in transition, the balancer run is skipped
  • 36. LOAD BALANCING – MOVE
     You can also use move to assign regions to other servers (see the sketch below)
       HBase shell: move command, refer to ppt#003, p#39
       move() API method, refer to ppt#003, p#22
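     A minimal sketch of the balancer and move calls with the 0.92-era HBaseAdmin API. The encoded region name is borrowed from the RegionSplitter sample output above and the destination server string is a placeholder; look the real values up in the master UI:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class BalancerSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    boolean oldState = admin.balanceSwitch(true);  // enable the balancer
    boolean ranOk = admin.balancer();              // trigger one balancer run
    System.out.println("was enabled: " + oldState + ", ran: " + ranOk);
    // Move a single region to a specific server, named in the
    // host,port,startcode format; a null destination lets the master choose.
    admin.move(Bytes.toBytes("c0937d09f1da31f2a6c2950537a61093"),
        Bytes.toBytes("host11.foo.com,60020,1309766006467"));
  }
}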
  • 37. MERGING REGIONS
     Sometimes you may need to merge regions
       For example, after you have removed a large amount of data and you want to reduce the number of regions hosted by each server
     HBase allows you to merge two adjacent regions
       The HBase cluster must be offline while the merge runs, though HDFS must stay up
/bin/hbase org.apache.hadoop.hbase.util.Merge
Usage: bin/hbase merge <table-name> <region-1> <region-2>
  • 38. CLIENT API: BEST PRACTICES (1/3)
     Disable auto-flush
       When performing a lot of put operations
       Refer to ppt#002, p#9
     Use scanner caching
       Set the Scan.setCaching() method to something greater than the default of 1 if needed
       Refer to ppt#002, p#26
     Limit the scan scope
       If only a small number of the available columns are to be processed, only those should be specified in the input scan
       For example, use the Scan.addFamily() method
       Refer to ppt#002, p#24
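     A minimal sketch of these three practices with the 0.92-era client API; the table, family, row count, and caching value are illustrative assumptions:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class WriteAndScanSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "testtable");
    // Disable auto-flush so puts are buffered client-side instead of
    // causing one RPC each...
    table.setAutoFlush(false);
    for (int i = 0; i < 10000; i++) {
      Put put = new Put(Bytes.toBytes("row-" + i));
      put.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual"),
          Bytes.toBytes("val-" + i));
      table.put(put);
    }
    table.flushCommits();  // ...and send the buffered puts in batches.
    // Scanner caching and scan scope: fetch 500 rows per RPC instead of
    // the default 1, and restrict the scan to the one family needed.
    Scan scan = new Scan();
    scan.setCaching(500);
    scan.addFamily(Bytes.toBytes("colfam1"));
    ResultScanner scanner = table.getScanner(scan);
    scanner.close();  // closing scanners is covered on the next slide
    table.close();
  }
}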
  • 39. CLIENT API: BEST PRACTICES (2/3)
     Close ResultScanners
       Leaving scanners open may cause problems on the region servers, so close them promptly to avoid performance problems
       Refer to ppt#002, p#25
     Block cache usage
       Scan instances can be set to use the block cache in the region server via the setCacheBlocks() method
       true by default, so the default settings of the table and family are used
       See the API docs
     Server-side block cache settings
       Refer to ppt#003, p#12
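     A minimal sketch of the scanner hygiene described above: close the ResultScanner in a finally block, and skip the block cache for a one-off full scan so it does not evict hotter data. The names are illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ScannerCloseSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "testtable");
    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("colfam1"));
    scan.setCacheBlocks(false);  // one-off full scan: do not pollute the cache
    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result result : scanner) {
        System.out.println(result);
      }
    } finally {
      scanner.close();  // always release the server-side scanner resources
      table.close();
    }
  }
}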
  • 40. CLIENT API: BEST PRACTICES (3/3)
     Optimal loading of row keys
       When performing a table scan where only the row keys are needed
       Use a FilterList with a MUST_PASS_ALL operator + FirstKeyOnlyFilter + KeyOnlyFilter (see the sketch below)
       Refer to ppt#002, p#43 & 46
     Turn off the WAL on Puts
       One way of increasing throughput on Puts is to call writeToWAL(false), but there might be data loss
       Consider using the bulk loading techniques instead
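     A minimal sketch of a row-keys-only scan and a WAL-less put, following the practices above; the table and family names are illustrative:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class RowKeysOnlySketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "testtable");
    // FirstKeyOnlyFilter returns one cell per row; KeyOnlyFilter strips
    // the values, so only row keys travel over the wire.
    FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    filters.addFilter(new FirstKeyOnlyFilter());
    filters.addFilter(new KeyOnlyFilter());
    Scan scan = new Scan();
    scan.setFilter(filters);
    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result result : scanner) {
        System.out.println(Bytes.toString(result.getRow()));
      }
    } finally {
      scanner.close();
    }
    // Trading durability for throughput: this put skips the WAL entirely.
    Put put = new Put(Bytes.toBytes("row-1"));
    put.add(Bytes.toBytes("colfam1"), Bytes.toBytes("qual"), Bytes.toBytes("val"));
    put.setWriteToWAL(false);
    table.put(put);
    table.close();
  }
}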
  • 41. CONFIGURATION (1/6)
     Advanced options you can consider adjusting based on your use case
       Most properties are configured in hbase-site.xml; others are in hbase-env.sh
     Decrease the ZooKeeper timeout
       The default timeout between a region server and the ZooKeeper quorum is three minutes
       Tune the timeout down to a minute, or even less, so the master notices failures sooner
       zookeeper.session.timeout property
       Be careful of the "Juliet Pause"
  • 42. CONFIGURATION (2/6)
     Increase handlers
       The number of threads that are kept open to answer incoming requests to user tables
       The default is 10
       hbase.regionserver.handler.count property
       Keep this number low when the payload per request approaches megabytes, and high when the payload is small
     Increase heap settings
       HBASE_HEAPSIZE setting in the hbase-env.sh file
       Consider using HBASE_REGIONSERVER_OPTS instead of changing the global HBASE_HEAPSIZE
         Region servers may need more memory than the master
  • 43. CONFIGURATION (3/6)
     Enable data compression
       You should enable compression for the storage files; in most cases it boosts performance
     Increase the region size
       Consider going to larger regions to cut down on the total number of regions on your cluster
       Fewer regions to manage makes for a smoother-running cluster
  • 44. CONFIGURATION (4/6)
     Adjust the block cache size
       The amount of heap used for the block cache is specified as a percentage
       Defaults to 20%
       hfile.block.cache.size property
       A larger block cache is good if you have mainly reading workloads
     Adjust the memstore limits
       hbase.regionserver.global.memstore.upperLimit property, defaults to 40%
       hbase.regionserver.global.memstore.lowerLimit property, defaults to 35%
       Together these control the amount of flushing that takes place once the server is required to free heap space
       Mainly read-oriented workloads
         Consider reducing both limits to make more room for the block cache
       Handling many writes
         Increase the memstore limits to reduce the excessive amount of I/O the flushing causes
  • 45. CONFIGURATION (5/6)
     Increase blocking store files
       The region servers block further updates from clients to give compactions time to reduce the number of files
       The default is seven files
       hbase.hstore.blockingStoreFiles property
     Increase the block multiplier
       A safety latch that blocks any further updates from clients when the memstores exceed the multiplier * flush size limit
       hbase.hregion.memstore.block.multiplier property
       Defaults to 2
       If you have enough memory, you can increase this value to handle spikes more gracefully
       Refer to ppt#003, p#8
  • 46. CONFIGURATION (6/6)
     Decrease the maximum number of logfiles
       Controls how often flushes occur, based on the number of WAL files on disk
       The default is 32
       hbase.regionserver.maxlogs property
       In a write-heavy use case the number of logs can stay high
       Lower it to force the servers to flush data to disk more often
  • 47. LOAD TESTS
     It is advisable to run performance tests to verify the functionality of your cluster
     These tests give you a baseline to refer back to
       After making changes to the configuration of the cluster
       Or to the schemas of your tables
     Doing a burn-in of your cluster will show you how much you can gain from it
       But this does not replace a test with the load expected from your use case
  • 48. LOAD TESTS – PERFORMANCE EVALUATION (1/2)
     HBase ships with its own tool to execute a performance evaluation
       Performance Evaluation (PE)
     Wiki
       http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation
/bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation
Usage: java org.apache.hadoop.hbase.PerformanceEvaluation [--miniCluster] [--nomapred] [--rows=ROWS] <command> <nclients>
  • 49. LOAD TESTS – PERFORMANCE EVALUATION (2/2)
     Example
/bin/hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 1
11/07/03 13:18:34 INFO hbase.PerformanceEvaluation: Start class org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest at offset 0 for 1048576 rows
...
11/07/03 13:18:41 INFO hbase.PerformanceEvaluation: 0/104857/1048576
...
11/07/03 13:18:45 INFO hbase.PerformanceEvaluation: 0/209714/1048576
...
11/07/03 13:20:03 INFO hbase.PerformanceEvaluation: 0/1048570/1048576
11/07/03 13:20:03 INFO hbase.PerformanceEvaluation: Finished class org.apache.hadoop.hbase.PerformanceEvaluation$SequentialWriteTest in 89062ms at offset 0 for 1048576 rows
  • 50. LOAD TESTS – YCSB (1/2)
     Yahoo! Cloud Serving Benchmark (YCSB)
       A suite of tools that can be used to run comparable workloads against different storage systems
       Also a reasonable tool for performing an HBase cluster burn-in or performance test
     Using YCSB is preferred over the HBase-supplied Performance Evaluation
       It offers more options and can combine read and write workloads
     Home page
       http://research.yahoo.com/Web_Information_Management/YCSB
  • 51. LOAD TESTS – YCSB (2/2)
     Use the HBase shell
       create 'usertable', 'family'
     git pull
     cd ${GIT_HOME}/hbase-training/006/ycsb
     Run the command
java -cp "${HBASE_CONF_DIR}:core-0.1.4.jar:hbase-binding-0.1.4.jar" com.yahoo.ycsb.Client -load -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloada -p columnfamily=family -p recordcount=1000 -s > ycsb-load.log
     Then you can see the performance metrics in the ycsb-load.log file
  • 52. CLUSTER ADMINISTRATION
     Operational Tasks
       Node Decommission
       Rolling Restarts
       Adding Backup Masters
       Adding a Region Server
     Data Tasks
       Export
       Import
       CopyTable Tool
       Bulk Import
     Troubleshooting
       HBase Fsck
       Analyzing the Logs
  • 53. OPERATIONAL TASKS – NODE DECOMMISSION (1/2)
     Use the following script
       In the normal HBase distribution
${HBASE_HOME}/bin/hbase-daemon.sh stop regionserver
       In the tm distribution
${TM_PUPPET_HOME}/bin/services/shutdown-regionservers.sh [<host> ...]
     Disable the load balancer before decommissioning a node
       In the HBase shell: balance_switch false
     Regions could be offline for a good period of time
       With many regions on the server, all regions have to close before the master notices the region server's ZooKeeper znode being removed
  • 54. OPERATIONAL TASKS – NODE DECOMMISSION (2/2)
     Stop a region server gradually
       The node gradually sheds its load and then shuts itself down
       Available from HBase 0.90.2
       ${HBASE_HOME}/bin/graceful_stop.sh
     Example
${HBASE_HOME}/bin/graceful_stop.sh HOSTNAME
       Check the HOSTNAME on your HBase master UI
       Refer to ppt#003, p#41
       IP addresses are NOT supported at present
  • 55. OPERATIONAL TASKS – ROLLING RESTARTS
     Also uses graceful_stop.sh
     Steps as follows
      1. Ensure the cluster is consistent; fix it if inconsistent
         hbase hbck
         hbase hbck -fix
      2. Restart the master
         ${HBASE_HOME}/bin/hbase-daemon.sh stop master; ${HBASE_HOME}/bin/hbase-daemon.sh start master
      3. Disable the region balancer
         echo "balance_switch false" | ${HBASE_HOME}/bin/hbase shell
      4. Run the graceful_stop.sh script per region server
         for i in `cat conf/regionservers|sort`; do ./bin/graceful_stop.sh --restart --reload --debug $i; done &> /tmp/log.txt &
      5. Restart the master again
         This clears out the dead servers list; then reenable the balancer
      6. Run hbck to ensure the cluster is consistent
  • 56. OPERATIONAL TASKS – ADDING BACKUP MASTERS (1/2)
     To prevent a single point of failure
       If the machine currently hosting the active master fails, the system can fall back to a backup master
     Underlying operations
      1. There is a dedicated ZooKeeper znode, /hbase/master
      2. All master processes race to create it, and the first one to create it wins (it becomes the current master)
         This happens at startup
      3. All other master processes simply loop around the znode check and wait for it to disappear
         Its disappearance triggers the race again
  • 57. OPERATIONAL TASKS – ADDING BACKUP MASTERS (2/2)
     How to start multiple backup master processes
       Use the original way to start a master process
${HBASE_HOME}/bin/hbase-daemon.sh start master
       In the tm distribution
${TM_PUPPET_HOME}/bin/services/startup-hmaster.sh [<host> ...]
       Specifically start a backup master process
${HBASE_HOME}/bin/hbase-daemon.sh start master --backup
  • 58. OPERATIONAL TASKS – ADDING A REGION SERVER
     In the normal HBase distribution
       Edit ${HBASE_HOME}/conf/regionservers to add the newly added region server's host name
       Two scripts can be used
         ${HBASE_HOME}/bin/start-hbase.sh
           It skips the already running region servers and starts the newly added region server listed in the regionservers file
         ${HBASE_HOME}/bin/hbase-daemon.sh start regionserver
           Must be executed on the newly added region server
     In the tm distribution
       New feature, not covered here
  • 59. DATA TASKS
     You may be required to move the data as a whole or in parts
       Archive data for backup purposes
       Bootstrap another cluster
hadoop jar ${HBASE_HOME}/hbase-0.91.0-SNAPSHOT.jar
An example program must be given as the first argument.
Valid program names are:
…
completebulkload: Complete a bulk data load.
copytable: Export a table from local cluster to peer cluster
export: Write table data to HDFS.
import: Import data written by Export.
importtsv: Import data in TSV format.
…
     http://hbase.apache.org/book/ops_mgt.html
  • 60. DATA TASKS – EXPORT (1/3)
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar export
Usage: Export [-D <property=value>]* <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]]
  • 61. DATA TASKS – EXPORT (2/3)
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar export testtable /user/larsgeorge/backup-testtable
11/06/25 15:58:29 INFO mapred.JobClient: Running job: job_201106251558_0001
11/06/25 15:58:30 INFO mapred.JobClient: map 0% reduce 0%
…
11/06/25 15:59:40 INFO mapred.JobClient: map 100% reduce 0%
11/06/25 15:59:42 INFO mapred.JobClient: Job complete: job_201106251558_0001
11/06/25 15:59:42 INFO mapred.JobClient: Counters: 6
11/06/25 15:59:42 INFO mapred.JobClient: Job Counters
11/06/25 15:59:42 INFO mapred.JobClient: Rack-local map tasks=32
11/06/25 15:59:42 INFO mapred.JobClient: Launched map tasks=32
11/06/25 15:59:42 INFO mapred.JobClient: FileSystemCounters
11/06/25 15:59:42 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=3648
11/06/25 15:59:42 INFO mapred.JobClient: Map-Reduce Framework
11/06/25 15:59:42 INFO mapred.JobClient: Map input records=0
11/06/25 15:59:42 INFO mapred.JobClient: Spilled Records=0
11/06/25 15:59:42 INFO mapred.JobClient: Map output records=0
  • 62. DATA TASKS – EXPORT (3/3)
     Each part-m-nnnnn file contains a piece of the exported data
       Together they form the full backup of the table
     Use the hadoop distcp command to move the directory from one cluster to another, and perform the import there
hadoop dfs -lsr /user/larsgeorge/backup-testtable
drwxr-xr-x - ... 0 2011-06-25 15:58 _logs
-rw-r--r-- 1 ... 114 2011-06-25 15:58 part-m-00000
-rw-r--r-- 1 ... 114 2011-06-25 15:58 part-m-00001
…
-rw-r--r-- 1 ... 114 2011-06-25 15:59 part-m-00030
-rw-r--r-- 1 ... 114 2011-06-25 15:59 part-m-00031
  • 63. DATA TASKS – IMPORT (1/2)
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar import
Usage: Import <tablename> <inputdir>
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar import testtable /user/larsgeorge/backup-testtable
11/06/25 17:09:48 INFO mapreduce.TableOutputFormat: Created table instance for testtable
11/06/25 17:09:48 INFO input.FileInputFormat: Total input paths to process : 32
11/06/25 17:09:49 INFO mapred.JobClient: Running job: job_201106251558_0003
11/06/25 17:09:50 INFO mapred.JobClient: map 0% reduce 0%
11/06/25 17:10:04 INFO mapred.JobClient: map 6% reduce 0%
…
11/06/25 17:10:51 INFO mapred.JobClient: Job Counters
11/06/25 17:10:51 INFO mapred.JobClient: Launched map tasks=32
11/06/25 17:10:51 INFO mapred.JobClient: Data-local map tasks=32
11/06/25 17:10:51 INFO mapred.JobClient: FileSystemCounters
11/06/25 17:10:51 INFO mapred.JobClient: HDFS_BYTES_READ=3648
11/06/25 17:10:51 INFO mapred.JobClient: Map-Reduce Framework
11/06/25 17:10:51 INFO mapred.JobClient: Map input records=0
11/06/25 17:10:51 INFO mapred.JobClient: Spilled Records=0
11/06/25 17:10:51 INFO mapred.JobClient: Map output records=0
  • 64. DATA TASKS – IMPORT (2/2)
     Use the Import job to store the data in a different table
       It must have the same schema
     Both the export and import commands are per-table only
     Using the hadoop distcp command to copy the entire /hbase directory in HDFS is not recommended
       It may copy store files that are halfway through a memstore flush operation
  • 65. DATA TASKS – COPYTABLE TOOL (1/2)
     Designed to bootstrap cluster replication
       Make a copy of an existing table from the master cluster to the slave cluster
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar copytable
Usage: CopyTable [--rs.class=CLASS] [--rs.impl=IMPL] [--starttime=X] [--endtime=Y] [--new.name=NEW] [--peer.adr=ADR] <tablename>
  • 66. DATA TASKS – COPYTABLE TOOL (2/2)
     In this example, the copy of the table is stored on the same cluster
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar copytable --new.name=testtable3 testtable
11/06/26 15:20:07 INFO mapreduce.TableOutputFormat: Created table instance for testtable3
11/06/26 15:20:07 INFO mapred.JobClient: Running job: job_201106261454_0003
11/06/26 15:20:08 INFO mapred.JobClient: map 0% reduce 0%
11/06/26 15:20:19 INFO mapred.JobClient: map 6% reduce 0%
…
11/06/26 15:21:04 INFO mapred.JobClient: map 100% reduce 0%
11/06/26 15:21:06 INFO mapred.JobClient: Job complete: job_201106261454_0003
11/06/26 15:21:06 INFO mapred.JobClient: Counters: 5
11/06/26 15:21:06 INFO mapred.JobClient: Job Counters
11/06/26 15:21:06 INFO mapred.JobClient: Launched map tasks=32
11/06/26 15:21:06 INFO mapred.JobClient: Data-local map tasks=32
11/06/26 15:21:06 INFO mapred.JobClient: Map-Reduce Framework
11/06/26 15:21:06 INFO mapred.JobClient: Map input records=0
11/06/26 15:21:06 INFO mapred.JobClient: Spilled Records=0
11/06/26 15:21:06 INFO mapred.JobClient: Map output records=0
  • 67. DATA TASKS – BULK IMPORT (1/2)
     The importtsv tool
       Given files containing data in tab-separated value (TSV) format
       By default, it uses the HBase put() API to insert data into HBase one row at a time
       By setting the importtsv.bulk.output option, it generates files using HFileOutputFormat instead
         These can subsequently be bulk-loaded into HBase by the completebulkload tool
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar importtsv
Usage: importtsv -Dimporttsv.columns=a,b,c <tablename> <inputdir>
  • 68. DATA TASKS – BULK IMPORT (2/2)
     The completebulkload tool
       Is used to import the prepared data into the running cluster
       A data import can be prepared
         By using the importtsv tool with the importtsv.bulk.output option
         By some other MapReduce job using the HFileOutputFormat
hadoop jar $HBASE_HOME/hbase-0.91.0-SNAPSHOT.jar completebulkload -conf ~/my-hbase-site.xml /user/larsgeorge/myoutput mytable
  • 69. TROUBLESHOOTING – HBASE FSCK (1/4)
     Shell command
       ${HBASE_HOME}/bin/hbase hbck
     Once started, it
       Scans the .META. table to gather all the pertinent information it holds
       Scans the HDFS root directory HBase is configured to use
       Compares the collected details to report on inconsistencies and integrity issues
     Consistency check
       Whether the region is listed in .META. and exists in HDFS
       Whether it is also assigned to exactly one region server
     Integrity check
       Compares the regions with the table details to find missing regions
       Also finds regions that have holes or overlaps in their row key ranges
  • 70. TROUBLESHOOTING – HBASE FSCK (2/4)
${HBASE_HOME}/bin/hbase hbck -h
Usage: fsck [opts]
where [opts] are:
  -details Display full report of all regions.
  -timelag {timeInSeconds} Process only regions that have not experienced any metadata updates in the last {timeInSeconds} seconds.
  -fix Try to fix some of the errors.
  -sleepBeforeRerun {timeInSeconds} Sleep this many seconds before checking if the fix worked, if run with -fix
  -summary Print only summary of the tables and status.
  • 71. TROUBLESHOOTING – HBASE FSCK (3/4)
     Invoking it with no options at all produces the normal output detail
${HBASE_HOME}/bin/hbase hbck
Number of Tables: 40
Number of live region servers: 19
Number of dead region servers: 0
Number of empty REGIONINFO_QUALIFIER rows in .META.: 0
Summary:
...
testtable2 is okay.
Number of regions: 1
Deployed on: host11.foo.com:60020
0 inconsistencies detected.
Status: OK
  • 72. TROUBLESHOOTING – HBASE FSCK (4/4)
     ${HBASE_HOME}/bin/hbase hbck -fix
     Repairs the following issues
       Assigns .META. to a single new server if it is unassigned
       Reassigns .META. to a single new server if it is assigned to multiple servers
       Assigns a user table region to a new server if it is unassigned
       Reassigns a user table region to a single new server if it is assigned to multiple servers
       Reassigns a user table region to a new server if the current server does not match what the .META. table refers to
     hbck may report inconsistencies that are temporal, or transitional only
       Rerun the tool a few times to confirm a permanent problem
  • 73. TROUBLESHOOTING – ANALYZING THE LOGS (1/2)
     Server type        | Default logfile                                             | tm settings
     HBase Master       | $HBASE_HOME/logs/hbase-<user>-master-<hostname>.log        | /var/log/hbase/hbase-<user>-master-<hostname>.log
     HBase RegionServer | $HBASE_HOME/logs/hbase-<user>-regionserver-<hostname>.log  | /var/log/hbase/hbase-<user>-regionserver-<hostname>.log
     ZooKeeper          | Console log output only                                     | /var/log/hbase/hbase-<user>-zookeeper-<hostname>.log
     NameNode           | $HADOOP_HOME/logs/hadoop-<user>-namenode-<hostname>.log    | /var/log/hadoop/hadoop-<user>-namenode-<hostname>.log
     DataNode           | $HADOOP_HOME/logs/hadoop-<user>-datanode-<hostname>.log    | /var/log/hadoop/hadoop-<user>-datanode-<hostname>.log
     JobTracker         | $HADOOP_HOME/logs/hadoop-<user>-jobtracker-<hostname>.log  | /var/log/hadoop/hadoop-<user>-jobtracker-<hostname>.log
     TaskTracker        | $HADOOP_HOME/logs/hadoop-<user>-tasktracker-<hostname>.log | /var/log/hadoop/hadoop-<user>-tasktracker-<hostname>.log
  • 74. TROUBLESHOOTING – ANALYZING THE LOGS (2/2)
     It is useful to begin with the master logfile
       The master acts as the coordinator service of the entire cluster
     Find where the processes began logging ERROR-level messages to identify the root cause
       A lot of subsequent messages are often side effects of the original problem
     Recommended: use the error log event metric under the System Event Metrics group
       It gives you a graph showing where the server(s) started logging an increasing number of error messages in the logfiles
     If you find an error message
       Google it!!
       Use the online resources to search for the message in the public mailing lists
         Search Hadoop
  • 75. HANDS-ON – USE YCSB
     New VM list
       Because dedicated VMs are not affordable at present :p
     ${YOUR_HOME}=${GIT_HOME}/hbase-training/006/hands-on/${YOUR_NAME}
       mkdir ${YOUR_HOME}
       cd ${YOUR_HOME}; cp -rf ../../ycsb/* .
     Use the HBase shell
       create <YOUR_NAMED_TABLE>, 'family'
     Run YCSB with a record count of 5000
       And output the ycsb-load.log file
     Hands-on result
       Put the ycsb-load.log file under ${YOUR_HOME}