At Twitter we started out with a large monolithic cluster that served most of our use-cases. As usage expanded and the cluster grew accordingly, we realized we needed to split the cluster by access pattern. This allows us to tune the access policy, SLA, and configuration for each cluster. We will explain our various use-cases, their performance requirements, and operational considerations, and how those are served by the corresponding clusters. We will also discuss what our baseline Hadoop node looks like. Various, sometimes competing, considerations such as storage size, disk IO, CPU throughput, fewer fast cores versus many slower cores, bonded 1 GbE network interfaces versus a single 10 GbE card, 1 TB, 2 TB, or 3 TB disk drives, and power draw all need to be weighed in a trade-off where cost and performance are the major factors. We will show how we arrived at quite different hardware platforms at Twitter, not only saving money but also increasing performance.
4. @Twitter#HadoopSummit2013
Scale
Scaling limits:
• JobTracker: 10's of thousands of jobs per day; 10's of thousands of concurrent slots
• Namenode: 250-300 M objects in a single namespace
• Namenode at ~100 GB heap -> full GC pauses
• Shipping job jars to 1,000's of nodes
• JobHistory server at a few 100's of thousands of job history/conf files
[Chart: # Nodes]
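To see why the namenode heap lands in the 100 GB range at this object count, here is a rough sizing sketch; the per-object heap cost is a rule-of-thumb assumption, not a figure from the talk.

```python
# Rough namenode heap sizing sketch. The per-object heap cost is a
# rule-of-thumb assumption (not a figure from the talk).
objects = 300e6            # files + blocks in the namespace (slide: 250-300 M)
bytes_per_object = 300     # assumed average heap cost per namespace object

heap_gb = objects * bytes_per_object / 2**30
print(f"Estimated namenode heap: ~{heap_gb:.0f} GB")   # ~84 GB, i.e. 100 GB-class heaps
```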
5. @Twitter#HadoopSummit2013
When / why to split clusters?
• In principle, preference for a single cluster
  • Common logs, shared free space, reduced admin burden, more rack diversity
• Varying SLAs
• Workload diversity
  • Storage intensive
  • Processing (CPU / disk IO) intensive
  • Network intensive
• Data access
  • Hot, Warm, Cold
8. @Twitter#HadoopSummit2013
Service criteria for hardware
• Hadoop does not need live HDD swap
• Twitter DC: no SLA on data nodes
• Rack SLA: only 1 rack down at any time in a cluster
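The rack SLA leans on HDFS keeping replicas on more than one rack, which in turn needs a topology script that maps datanodes to racks (wired in via net.topology.script.file.name on Hadoop 2). Below is a minimal sketch of such a script; the host-to-rack mapping is hypothetical, a real script would query the site's host inventory.

```python
#!/usr/bin/env python
# Minimal HDFS topology script sketch: the namenode passes datanode IPs or
# hostnames as arguments and expects one rack path per argument on stdout.
# The mapping below is hypothetical; a real script would query the site's
# host inventory.
import sys

RACK_OF = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.11": "/dc1/rack2",
}

print(" ".join(RACK_OF.get(host, "/default-rack") for host in sys.argv[1:]))
```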
9. @Twitter#HadoopSummit2013
Baseline Hadoop Server (~ early 2012)
[Diagram: dual E56xx sockets with 3 DIMMs each, PCH, GbE NIC, HBA + SAS expander]
Characteristics:
• Standard 2U server
• 20 servers / rack
• E5645 CPU (dual 6-core)
• 72 GB memory
• 12 x 2 TB HDD
• 2 x 1 GbE
Works for the general cluster, but...
• Need more density for storage
• Potential IO bottlenecks
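As a sanity check, the baseline column of the rack-level table later in the deck follows directly from these per-node specs; a quick sketch of the arithmetic:

```python
# Baseline rack arithmetic from the per-node spec above.
servers = 20
raw_tb        = servers * 12 * 2    # 12 x 2 TB HDD per node  -> 480 TB raw
dram_gb       = servers * 72        # 72 GB per node          -> 1,440 GB
cpu_sockets   = servers * 2         # dual-socket             -> 40 sockets
internal_gbps = servers * 2 * 1     # 2 x 1 GbE per node      -> 40 Gbps
print(raw_tb, dram_gb, cpu_sockets, internal_gbps)   # 480 1440 40 40
```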
10. @Twitter#HadoopSummit2013
Hadoop Server: Possible evolution
[Diagram: dual E5-26xx or E5-24xx sockets with 4 DIMMs each, HBA + SAS expander to 16 x 2T? / 16 x 3T? / 24 x 3T?, GbE NIC or 10GbE?]
Characteristics:
• + CPU performance
• ? 20 servers / rack
• Candidate for DW
Can deploy into the general DW cluster, but...
• Too much CPU for storage-intensive apps
• Server failure domain too large if we scale up disks
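A back-of-the-envelope sketch of the failure-domain concern: the scaled-up node carries three times the raw data of a baseline node, so losing one takes far longer to re-protect. The aggregate re-replication rate used below is an illustrative assumption, not a measured figure.

```python
# Failure-domain sketch: raw data per node, and a rough re-replication time.
baseline_node_tb = 12 * 2    # 24 TB per baseline node
scaled_node_tb   = 24 * 3    # 72 TB per scaled-up node (24 x 3T option)

rereplication_gbps = 10      # assumed aggregate re-replication rate (illustrative)
hours = scaled_node_tb * 8e12 / (rereplication_gbps * 1e9) / 3600
print(f"baseline node: {baseline_node_tb} TB raw, scaled-up node: {scaled_node_tb} TB raw")
print(f"~{hours:.0f} h to re-replicate {scaled_node_tb} TB at {rereplication_gbps} Gbps (assumed)")
```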
13. @Twitter#HadoopSummit2013
THS variant for Hadoop-Proc and HBase
[Diagram: single E3-12xx socket with 2 DIMMs, PCH, SAS HBA, 10GbE NIC]
Characteristics:
• + IO performance
• Few fast cores
• E3-1230 V2 CPU
• 32 GB memory
• 12 x 1 TB HDD
• SSD boot
• 1 x 10 GbE
Processing / throughput focus:
• Cost efficient (single socket, 1 TB drives)
• More disk and network IO per socket
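The "more disk and network IO per socket" point is easy to quantify from the two specs; a small comparison sketch:

```python
# Disk spindles and network bandwidth per CPU socket: baseline vs THS Proc.
configs = {
    "Baseline (dual E5645)": {"sockets": 2, "spindles": 12, "net_gbps": 2 * 1},
    "THS Proc (single E3)":  {"sockets": 1, "spindles": 12, "net_gbps": 10},
}
for name, c in configs.items():
    print(f"{name}: {c['spindles'] / c['sockets']:.0f} spindles/socket, "
          f"{c['net_gbps'] / c['sockets']:.0f} Gbps/socket")
# Baseline (dual E5645): 6 spindles/socket, 1 Gbps/socket
# THS Proc (single E3): 12 spindles/socket, 10 Gbps/socket
```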
14. @Twitter#HadoopSummit2013
THS for cold cluster
[Diagram: single E3-12xx socket with 2 DIMMs, PCH, SAS HBA, GbE NIC]
Characteristics:
• Disk efficiency
• Some compute
• E3-1230 V2 CPU
• 32 GB memory
• 12 x 3 TB HDD
• 2 x 1 GbE
Combination of the previous 2 use cases:
• Space & power efficient
• Storage dense, with some processing capabilities
15. @Twitter#HadoopSummit2013
Rack-level view (THS = Twitter Hadoop Server)

                      Baseline        THS Backups      THS Proc        THS Cold
Power                 ~8 kW           ~8 kW            ~8 kW           ~8 kW
CPU sockets; DRAM     40; 1,440 GB    40; 640 GB       40; 1,280 GB    40; 1,280 GB
Spindles; TB raw      240; 480 TB     480; 1,440 TB    480; 480 TB     480; 1,440 TB
Uplink; Internal BW   20; 40 Gbps     20; 80 Gbps      40; 400 Gbps    20; 80 Gbps

[Rack diagrams: 1 GbE ToR switches, with a 10 GbE ToR for the Proc rack]
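The THS columns derive directly from 40 single-socket nodes per rack; a quick sketch of the arithmetic for the Proc and Cold racks:

```python
# THS rack columns derived from 40 single-socket nodes per rack.
nodes = 40
proc = {"spindles": nodes * 12, "raw_tb": nodes * 12 * 1, "internal_gbps": nodes * 10}
cold = {"spindles": nodes * 12, "raw_tb": nodes * 12 * 3, "internal_gbps": nodes * 2 * 1}
print("Proc:", proc)   # 480 spindles, 480 TB raw, 400 Gbps internal
print("Cold:", cold)   # 480 spindles, 1,440 TB raw, 80 Gbps internal
```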
16. @Twitter#HadoopSummit2013
Processing performance comparison

Benchmark                                            Baseline Server    THS (-Cold)
TestDFSIO (write, replication = 1)                   360 MB/s / node    780 MB/s / node
TeraGen (30 TB, replication = 3)                     1:36 hrs           1:35 hrs
TeraSort (30 TB, replication = 3)                    6:11 hrs           4:22 hrs
2 parallel TeraSorts (30 TB each, replication = 3)   10:36 hrs          6:21 hrs
Application #1                                       4:37 min           3:09 min
Application set #2                                   13:3 hrs           10:57 hrs

Performance benchmark setup:
• Each cluster: 102 nodes of the respective type
• Efficient server = 3 racks, Baseline = 5+ racks
• "Dated" stack: CentOS 5.5, Sun 1.6 JRE, Hadoop 2.0.3
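For a rough sense of the gains, the speedups implied by the table (same node count on both sides, with the THS cluster fitting in 3 racks versus 5+):

```python
# Speedups implied by the benchmark table (h:mm run times).
def to_hours(hms):
    h, m = hms.split(":")
    return int(h) + int(m) / 60

runs = {
    "TeraSort 30 TB":       ("6:11", "4:22"),
    "2 parallel TeraSorts": ("10:36", "6:21"),
}
for name, (baseline, ths) in runs.items():
    print(f"{name}: {to_hours(baseline) / to_hours(ths):.2f}x faster on THS")
# TeraSort 30 TB: ~1.42x; 2 parallel TeraSorts: ~1.67x
```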
19. @Twitter#HadoopSummit2013
Recap
• At a certain scale it makes sense to split into multiple clusters
• For us: RT, PROC, DW, COLD, BACKUPS, TST, EXP
• For large enough clusters, depending on use-case, it may be worth choosing different HW configurations