1. Hadoop on a Personal Supercomputer
Paul Dingman – Chief Technologist, Integration Division
pdingman@pervasive.com
PERVASIVE DATA INNOVATION
2. Pervasive and Hadoop
• Pervasive Software develops software products to manage, integrate
and analyze data.
• Innovation Lab projects around big data include:
– Hadoop
• Accelerate MapReduce (DataRush Community Edition)
• High-speed add-ons for HBase, Avro, Hive (TurboRush)
• Augment Sqoop
• Enhance ETL capabilities
– Benchmarks
• Terasort
• TPC-H
• SIEM/LogAnalytics EPS
• Genomics
3. Why are many-core systems interesting?
• Many-core processors make it possible to concentrate large amounts
of processing power in a single machine. Coupled with newer storage
technologies, these systems can have high-speed access to tremendous
amounts of storage.
• We have done a lot of work with multi-core systems at Pervasive
Software. Our Pervasive DataRush™ Dataflow Engine takes advantage of
all available processor cores to efficiently process large volumes
of data:
– Analytics
– Data mining
– Genomics
• Potential cost and energy savings due to the need for fewer nodes.
• Potential performance gains by eliminating inter-node data exchange.
4. Pervasive DataRush™ Speed and Scalability
• World-record performance set running the Smith-Waterman algorithm
• Code written on an 8-core machine scaled to 384 cores with no changes!
5. Malstone-B10* Scalability
[Chart: Run-time for 10B rows vs. core count]
• 2 cores: 370.0 minutes
• 4 cores: 192.4 minutes (~3.2 hours)
• 8 cores: 90.3 minutes (~1.5 hours)
• 16 cores: 51.6 minutes (under 1 hour)
• 32 cores: 31.5 minutes
* Cyber security benchmark from the Open Cloud Consortium
6. How well does Hadoop work on many-core systems?
• One of the areas we wanted to explore was how well Hadoop works on
systems with many cores. In other words, is it possible to run Hadoop
in an environment where you can exploit the cores for complex
operations while still keeping the benefits of the distributed
environment provided by Hadoop and HDFS?
7. Master Node (NameNode/JobTracker)
Commodity box:
• 2 Intel Xeon L5310 CPUs, 1.6 GHz (8 cores)
• 16 GB local DRAM (ECC)
• 8 SATA hard disks (4 TB total; 8 × 500 GB local spindles)
• Mellanox ConnectX-2 VPI dual-port InfiniBand adapter
8. Slave Nodes (DataNode/TaskTracker)
• 4 AMD Opteron 6172 CPUs (48 cores)
• Supermicro motherboard
• 1 LSI 8-port HBA (6 Gb/s)
• 2 SATA SSDs (512 GB)
• 256 GB local DRAM (ECC)
• 32 SATA hard disks (64 TB total: 24 × 2 TB spindles for HDFS in
JBOD, plus 8 local spindles)
• Mellanox ConnectX-2 VPI dual-port InfiniBand adapter
9. Hadoop Cluster
[Diagram: one master node and two slave nodes, each with local DRAM
and 2 TB HDFS disks, interconnected via IPoIB]
• CentOS 5.6
• Hadoop (Cloudera CDH3u0)
• 104 cores (8/48/48)
• 128 TB storage (96 TB HDFS)
• 512 GB of memory
• 40 Gb InfiniBand interconnects (IPoIB)
10. Hadoop Tuning
• We worked from the bottom up.
– Linux (various kernels and kernel settings)
– File systems (EXT2, EXT3, EXT4)
– Drivers (HBA)
– JVMs
• Initial tests were done using a single “fat” node (same
configuration as the worker nodes).
– This made it easier to test different disk configurations.
• For Hadoop tests we primarily used 100 GB Terasort jobs; this
exercised all phases of the MapReduce process while not being too
large to run frequently (see the sketch below).
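As a rough illustration of the harness, here is a minimal sketch of driving one 100 GB Terasort round trip programmatically, assuming the TeraGen/TeraSort example classes shipped in the CDH3-era hadoop-examples jar; the HDFS paths are hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.examples.terasort.TeraGen;
import org.apache.hadoop.examples.terasort.TeraSort;
import org.apache.hadoop.util.ToolRunner;

public class TerasortRun {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // 1,000,000,000 rows x 100 bytes/row = 100 GB of input data.
    ToolRunner.run(conf, new TeraGen(),
        new String[] {"1000000000", "/benchmarks/terasort-in"});
    // Sort the generated data; this exercises map, shuffle, and reduce.
    ToolRunner.run(conf, new TeraSort(),
        new String[] {"/benchmarks/terasort-in", "/benchmarks/terasort-out"});
  }
}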
11. Lessons Learned with Single-Node Tuning
• We found we could comfortably run 40 maps and 20 reducers given
memory and CPU constraints.
• Use a large block size for HDFS.
– Execution time for map tasks was around 1 minute with a 512 MB block size.
• More spindles are better.
– A 1:1 ratio of map tasks to local HDFS spindles works well.
– EXT2 seems to work well with JBOD.
• Dedicate spindles to temporary files on each worker node.
• Configure JVM settings with a larger heap size to avoid spills.
– Parallel GC seemed to help as well.
• Compression of map outputs is a huge win (LZO).
• HBase scales well on fat nodes with DataRush (>5M rows/sec bulk
load; >10M rows/sec sequential scan).
A configuration sketch capturing these settings follows.
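These lessons map onto standard CDH3 / Hadoop 0.20 configuration properties. A minimal sketch, assuming the old-API property names of that release; the heap size, sort-buffer size, and local directories are illustrative values, not the presenters' actual files.

import org.apache.hadoop.conf.Configuration;

public class FatNodeConf {
  // Builds a Configuration reflecting the fat-node lessons above.
  public static Configuration create() {
    Configuration conf = new Configuration();
    conf.set("dfs.block.size", String.valueOf(512L << 20));          // 512 MB HDFS blocks
    conf.set("mapred.tasktracker.map.tasks.maximum", "40");          // 40 map slots per node
    conf.set("mapred.tasktracker.reduce.tasks.maximum", "20");       // 20 reduce slots per node
    conf.set("mapred.local.dir",
        "/mnt/tmp1/mapred,/mnt/tmp2/mapred");                        // dedicated temp spindles (hypothetical mounts)
    conf.set("mapred.compress.map.output", "true");                  // compress intermediate map outputs
    conf.set("mapred.map.output.compression.codec",
        "com.hadoop.compression.lzo.LzoCodec");                      // LZO codec
    conf.set("mapred.child.java.opts", "-Xmx2g -XX:+UseParallelGC"); // larger heaps, parallel GC
    conf.set("io.sort.mb", "512");                                   // larger sort buffer to reduce spills
    return conf;
  }
}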
12. Varying Spindles for HDFS
[Chart: Terasort average execution time (seconds) vs. number of
2 TB HDFS disks: 8, 16, 24, 32, 40, 48]
13. Varying Spindles for Intermediate Outputs
[Chart: Terasort average execution time (seconds) vs. drives used
for intermediate map output: 4 × 2 TB, 8 × 2 TB, 16 × 2 TB,
Fusion I/O drive, flash RAID 0 (4 × 2 TB)]
15. Clustering the Nodes
• We had a total of 64 hard disks for the cluster and had to split them
between the two nodes.
• Installed and configured the OpenFabrics OFED stack to enable IPoIB.
• Reconfigured Hadoop to cluster the nodes (see the sketch below).
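A minimal sketch of the pointer settings involved, using the old-API property names from CDH3; the host name "master", the port numbers, and the replication factor are assumptions for illustration, not the presenters' configuration.

import org.apache.hadoop.conf.Configuration;

public class ClusterConf {
  // Points HDFS and MapReduce daemons/clients at the master node.
  public static Configuration create() {
    Configuration conf = new Configuration();
    conf.set("fs.default.name", "hdfs://master:8020"); // NameNode, reached over IPoIB
    conf.set("mapred.job.tracker", "master:8021");     // JobTracker on the master
    conf.set("dfs.replication", "2");                  // assumption: only two DataNodes available
    return conf;
  }
}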
17. Comparisons with Amazon Clusters
• The Amazon clusters were used to get a better idea of what to expect
using more conventionally sized Hadoop nodes (non-EMR).
• We used 'Cluster Compute Quadruple Extra Large' instances
– 23 GB of memory
– 33.5 EC2 Compute Units (dual Intel Xeon X5570 quad-core “Nehalem”
processors; 8 cores total)
– 1690 GB of instance storage (2 spindles)
– Very high I/O performance (10 GbE)
• Used a similar Hadoop configuration, but dialed back the number of
maps and reducers due to lower core count.
• Used cluster sizes that were roughly core-count equivalent for
comparison.
20. Conclusions
• From what we have seen, Hadoop works very well on many-core
systems. In fact, Hadoop runs quite well on even a single-node
many-core system.
• Using denser nodes may make failures more expensive for some
system components. When using disk arrays, the handling of hard disk
failures should be comparable to that on smaller nodes.
• The MapReduce framework treats all intermediate outputs as remote
resources, so the copy phase of MapReduce doesn't benefit from data
locality.