HBase Application Performance Improvement
1. HBase Cache & Performance
Biju Nair
Boston Hadoop User Group Meet-up
28 May 2015
2. HBase Overview
• Key value store
• Column family oriented
• Data stored as byte[]
• Data indexed by key value
• Data stored in sorted order by key
• Data model doesn’t have to be pre-defined
• Scales horizontally
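The "sorted order by key" and "byte[]" points above can be illustrated with a toy in-memory store. This is a minimal sketch, not HBase code: the class name `SortedByteStore`, its methods, and the placeholder values are hypothetical, and the row keys reuse only the visible prefixes of the example keys shown elsewhere in the deck.

```python
import bisect


class SortedByteStore:
    """Toy store: byte-string keys kept in sorted order, byte-string values."""

    def __init__(self):
        self._keys = []    # row keys (bytes), always sorted
        self._values = {}  # row key -> value (bytes)

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)  # keep keys sorted on insert
        self._values[key] = value

    def get(self, key: bytes):
        return self._values.get(key)

    def scan(self, start: bytes, stop: bytes):
        """Yield (key, value) pairs with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]


store = SortedByteStore()
# Insert out of order; the store keeps the keys sorted.
for ticker in (b"orcl", b"appl", b"msft", b"ge", b"ibm"):
    store.put(ticker + b",company", b"placeholder")

scanned = [key for key, _ in store.scan(b"g", b"j")]
print(scanned)  # [b'ge,company', b'ibm,company']
```

Because keys are sorted, a range scan only needs two binary searches to find its slice, which is the property that makes key design matter for read performance.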
5. HBase Overview
[Diagram: Client; ZooKeeper; HBase Master; five Region Servers. Row keys shown across the Region Servers, in sorted order: appl,company… ; ge,company… ; ibm,company… ; msft,company… ; orcl,company…]
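One way to read the diagram is that each Region Server serves a contiguous range of sorted row keys, so a row key can be routed with a binary search over the range start keys. The sketch below assumes exactly that. The start keys are the visible key prefixes from the diagram, and the server names `rs1` to `rs5` and the function `locate_region_server` are hypothetical; real clients look up region locations through ZooKeeper and HBase's meta table rather than a hard-coded list.

```python
import bisect

# Assumed region start keys (visible prefixes from the diagram) and
# hypothetical server names, one region per server for simplicity.
REGION_START_KEYS = [b"appl", b"ge", b"ibm", b"msft", b"orcl"]
REGION_SERVERS = ["rs1", "rs2", "rs3", "rs4", "rs5"]


def locate_region_server(row_key: bytes) -> str:
    """Return the server whose key range contains row_key."""
    # Index of the last start key that is <= row_key.
    idx = bisect.bisect_right(REGION_START_KEYS, row_key) - 1
    # Keys that sort before the first start key go to the first region.
    return REGION_SERVERS[max(idx, 0)]


print(locate_region_server(b"ibm,company"))  # rs3
print(locate_region_server(b"hp,company"))   # rs2
print(locate_region_server(b"aa,company"))   # rs1
```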
6. Use Case: Data and Query
• Time series data
– Tickers and attributes
– Monthly data stored in a column; 256 bytes
– Up to 20 years worth of data
• Queries
– “get”s for up to 1 year data; 3072 bytes
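The sizes on this slide are consistent with simple arithmetic, assuming 256 bytes is the size of one monthly value and a one-year "get" returns 12 of them. The 20-year figure below is derived from the same assumption and ignores key, timestamp, and other storage overhead; the constant names are illustrative.

```python
BYTES_PER_MONTHLY_VALUE = 256  # from the slide
MONTHS_PER_YEAR = 12
MAX_YEARS_STORED = 20          # from the slide

# One-year "get": 12 monthly values.
bytes_per_one_year_get = BYTES_PER_MONTHLY_VALUE * MONTHS_PER_YEAR

# Full 20-year series of raw values (overhead not included).
bytes_per_full_series = bytes_per_one_year_get * MAX_YEARS_STORED

print(bytes_per_one_year_get)  # 3072
print(bytes_per_full_series)   # 61440
```

A one-year read therefore touches about 5% of a full 20-year series.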
7. Use Case: Requirements
• Meet “get” query performance requirements
– Under 10 ms for 99% of queries
– Median latency 2 to 3 ms
– 99.99% latency under 50 ms
• Efficient HBase cluster capacity utilization
– 32 cores per node
– 128 GB of memory per node
– SSD storage in all nodes
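The latency requirements above can be checked mechanically against a set of measured "get" times. The following is a minimal sketch using nearest-rank percentiles. The function names and the sample data are made up for illustration, and treating "median 2 to 3 ms" as an upper bound of 3 ms is an assumption.

```python
import math
import statistics
from fractions import Fraction


def percentile(sorted_ms, pct):
    """Nearest-rank percentile; pct is a string like "99.99" to avoid float rounding."""
    rank = math.ceil(Fraction(pct) * len(sorted_ms) / 100)
    return sorted_ms[max(rank, 1) - 1]


def check_get_latency(latencies_ms):
    """Compare measured latencies (ms) with the three stated targets."""
    data = sorted(latencies_ms)
    return {
        "p99_under_10_ms": percentile(data, "99") < 10,
        "median_at_most_3_ms": statistics.median(data) <= 3,
        "p99_99_under_50_ms": percentile(data, "99.99") < 50,
    }


# Synthetic samples, invented for illustration only.
steady = [2] * 9900 + [9] * 99 + [40]
spiky = [2] * 9800 + [60] * 200  # e.g. pauses affecting 2% of requests

print(check_get_latency(steady))
print(check_get_latency(spiky))
```

In the second sample the median is unchanged, yet both tail targets fail, which is why tail percentiles are stated separately from the median.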
8. Baseline Test Observations
• Spikes in read response times
• Less than 10% utilization of RS node CPUs
• Less than 15% utilization of RS node memory
• Block cache utilization was inefficient
– Low hit ratio and high eviction rates
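A low hit ratio combined with a high eviction rate is the pattern a cache shows when the blocks being read do not fit in it. The toy simulation below reproduces that pattern with a plain LRU policy. It is an illustration only: HBase's on-heap block cache is more elaborate than this, the slide does not state the cause of the observed behaviour, and the function name and access pattern are invented.

```python
from collections import OrderedDict


def simulate_lru(capacity_blocks, accesses):
    """Return (hit_ratio, evictions) for a plain LRU cache of block ids."""
    cache = OrderedDict()
    hits = 0
    evictions = 0
    for block_id in accesses:
        if block_id in cache:
            hits += 1
            cache.move_to_end(block_id)    # mark as most recently used
        else:
            cache[block_id] = True
            if len(cache) > capacity_blocks:
                cache.popitem(last=False)  # evict least recently used
                evictions += 1
    return hits / len(accesses), evictions


# 10 passes over 8 distinct blocks (invented access pattern).
accesses = list(range(8)) * 10

too_small = simulate_lru(4, accesses)
big_enough = simulate_lru(8, accesses)
print(too_small)   # (0.0, 76)
print(big_enough)  # (0.9, 0)
```

With room for only half the working set, every access misses and almost every access evicts; once the working set fits, only the first pass misses.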
14. Memory utilization/Latency Spikes
• JVM GC contributed to latency spikes
• Increase in heap size increased GC time
– Prevented using all the available memory
• Proposed change: Use off-heap caching
– Minimize spikes in response time due to GC
– Increased utilization of node memory
15. HBase Off-Heap Caching
[Diagram: HBase Memory (RS): Mem Store; Block cache (L1): Idx & BF data. Off-heap cache (L2): Tbl Data (Bucket Cache). HBase Storage: WAL; HFiles.]
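The split in the diagram, with index and Bloom filter blocks in L1 and table data blocks in L2, can be sketched as a cache that routes each block by type. This is a toy model under stated assumptions: both tiers use plain LRU here, the class names and the block type labels "INDEX", "BLOOM", and "DATA" are hypothetical, and the real Bucket Cache manages off-heap memory very differently.

```python
from collections import OrderedDict

# Illustrative labels for index / Bloom filter blocks (assumed names).
META_BLOCK_TYPES = {"INDEX", "BLOOM"}


class LruTier:
    """One cache tier with a fixed capacity in blocks."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def get(self, key):
        if key in self.blocks:
            self.blocks.move_to_end(key)
            return self.blocks[key]
        return None

    def put(self, key, block):
        self.blocks[key] = block
        self.blocks.move_to_end(key)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)


class TieredBlockCache:
    """L1 holds index/Bloom blocks; L2 holds table data blocks."""

    def __init__(self, l1_capacity, l2_capacity):
        self.l1 = LruTier(l1_capacity)  # stands in for the on-heap cache
        self.l2 = LruTier(l2_capacity)  # stands in for the off-heap cache

    def _tier_for(self, block_type):
        return self.l1 if block_type in META_BLOCK_TYPES else self.l2

    def cache_block(self, key, block_type, block):
        self._tier_for(block_type).put(key, block)

    def get_block(self, key, block_type):
        return self._tier_for(block_type).get(key)


cache = TieredBlockCache(l1_capacity=2, l2_capacity=3)
cache.cache_block("hfile1:idx", "INDEX", b"index bytes")
cache.cache_block("hfile1:0", "DATA", b"row bytes")

print(list(cache.l1.blocks))  # ['hfile1:idx']
print(list(cache.l2.blocks))  # ['hfile1:0']
```

Keeping the small, frequently used index and Bloom blocks on heap while moving the bulk of the data blocks off heap is what lets the cache grow without growing the garbage-collected heap.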
20. Impact of Using Off-Heap Cache
Get Performance with L1 cache
Get Performance with L1 & L2 cache
Note: L1 cache test used 38 GB of data, L1+L2 test used 3 TB of data
Avg 3.872 3.995 3.936 4.007 4.052
Median 1 1 1 1 1
95% 14 14 14 15 15
99% 20 20 20 20 20
99.90% 27 27 27 28 28
99.99% 36 36 36 37 37
99.999% 208 310 332 207 232
Max 1360 1906 1736 1359 1363
807Mil797107Mil7Requests
Avg 3.429 2.552 3.447 3.502 3.554
Median 2 2 2 2 2
95% 10 8 10 10 10
99% 18 14 18 18 18
99.9% 30 23 30 30 31
Max 78 1135 58 77 67
1 Mil Rows > 1 Mil Requests
21. Maximize CPU & Memory Utilization
• Run additional RS per node
• Throughput increased 50% when RS increased to 2
– Throughput reduced on AWS cluster
– There was no degradation on the response time
– Throughput increase tapered after 3 RS per node
• Note: Maintenance overhead using multi-RS
22. Known Issues
• Using “offheap” option of BucketCache prevents RS start
– [HBASE-10643]
– Can be mitigated using tmpfs
• LoadIncrementalHFiles doesn’t work with BucketCache
– [HBASE-10500]
• BucketCache for different block sizes is not configurable
– [HBASE-10641] Fixed
23. Key Takeaways
• Store what is really required
– Understand the query pattern
– Leverage column family (CF) to group data
• Choose appropriate block size for table/CF
• Use off heap cache to minimize latency spikes
• Test all assumptions