Hadoop on a Personal Supercomputer


Paul Dingman – Chief Technologist, Integration Division
pdingman@pervasive.com




Pervasive and Hadoop

• Pervasive Software develops software products to manage, integrate
  and analyze data.
• Innovation Lab projects around big data include:
    – Hadoop
       • Accelerate MapReduce (DataRush Community Edition)
       • High-speed add-ons for HBase, Avro, Hive (TurboRush)
       • Augment Sqoop
       • Enhance ETL capabilities
    – Benchmarks
       • Terasort
       • TPC-H
       • SIEM/LogAnalytics EPS
       • Genomics



Why are many-core systems interesting?

• Many-core processors make it possible to concentrate large amounts
  of processing power in a single machine. Coupled with newer storage
  technologies, these systems can have high-speed access to tremendous
  amounts of storage.
• We have done a lot of work with multi-core systems at Pervasive
  Software. Our Pervasive DataRush™ Dataflow Engine takes advantage of
  all available processor cores to efficiently process large volumes
  of data.
    – Analytics
    – Data mining
    – Genomics
• Potential cost and energy savings due to the need for fewer nodes.
• Potential performance gains by eliminating inter-node data exchange.



Pervasive DataRush™ Speed and Scalability




• World-record performance set running the Smith-Waterman algorithm
• Code written on an 8-core machine scaled to 384 cores with no changes!


Malstone-B10* Scalability

Run time for 10B rows:

   Core count   Run time (minutes)
   2 cores      370.0
   4 cores      192.4   (3.2 hours)
   8 cores       90.3   (1.5 hours)
   16 cores      51.6   (under 1 hour)
   32 cores      31.5


    * Cyber security benchmark from the Open Cloud Consortium

How well does Hadoop work on many-core systems?
• One of the areas we wanted to explore was how well Hadoop works on
  systems with lots of cores. In other words, is it possible to run
  Hadoop in an environment where you could exploit the cores for
  complex operations but still have the benefits of the distributed
  environment provided by Hadoop and HDFS?




Master Node (NameNode/JobTracker)

Commodity box:
• 2 Intel Xeon L5310 CPUs @ 1.6 GHz (8 cores total)
• 16 GB DRAM (ECC)
• 8 SATA hard disks, 8 × 500 GB (4 TB total; local, 8 spindles)
• Mellanox ConnectX-2 VPI dual-port Infiniband adapter

Slave Nodes (DataNode/TaskTracker)

• 4 AMD Opteron 6172 CPUs (48 cores total)
• Supermicro motherboard
• 1 LSI 8-port HBA (6 Gbps)
• 2 SATA SSDs (512 GB)
• 256 GB DRAM (ECC)
• 32 SATA hard disks, 32 × 2 TB (64 TB total): 24 spindles for HDFS
  (JBOD) and 8 local spindles for intermediate data
• Mellanox ConnectX-2 VPI dual-port Infiniband adapter
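The 24/8 spindle split maps directly onto Hadoop's storage settings:
dfs.data.dir lists the HDFS disks and mapred.local.dir lists the
dedicated temp disks. A minimal sketch using the CDH3-era property
names; the mount points (/data/dNN, /scratch/sN) are hypothetical:

    import org.apache.hadoop.conf.Configuration;

    public class SpindleLayout {
        public static Configuration spindleConf() {
            Configuration conf = new Configuration();

            // 24 JBOD spindles dedicated to HDFS block storage
            // (normally set once in hdfs-site.xml, not per job).
            StringBuilder dfs = new StringBuilder();
            for (int i = 1; i <= 24; i++) {
                if (i > 1) dfs.append(',');
                dfs.append(String.format("/data/d%02d/dfs", i));
            }
            conf.set("dfs.data.dir", dfs.toString());

            // 8 separate spindles for intermediate map output
            // (mapred-site.xml on each worker).
            StringBuilder local = new StringBuilder();
            for (int i = 1; i <= 8; i++) {
                if (i > 1) local.append(',');
                local.append(String.format("/scratch/s%d/mapred/local", i));
            }
            conf.set("mapred.local.dir", local.toString());
            return conf;
        }
    }
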
Hadoop Cluster
• CentOS 5.6
• Hadoop (Cloudera CDH3u0)
• One master and two slave nodes, linked by IPoIB
• Cluster totals:
   – 104 cores (8/48/48)
   – 128 TB storage (96 TB HDFS)
   – 512 GB of memory
   – 40 Gb Infiniband interconnects (IPoIB)
Hadoop Tuning

• We worked from the bottom up.
     –   Linux (various kernels and kernel settings)
     –   File systems (EXT2, EXT3, EXT4)
     –   Drivers (HBA)
     –   JVMs
• Initial tests were done using a single “fat” node (same config as
  the worker nodes). This made it easier to test different disk
  configurations.
• For Hadoop tests we primarily ran 100 GB Terasort jobs. This test
  exercised all phases of the MapReduce process while remaining small
  enough to run frequently. (A driver sketch follows below.)
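For reference, the 100 GB jobs can be driven from the stock Hadoop
examples (the hadoop-examples jar shipped with CDH3). A minimal
sketch; the HDFS paths are hypothetical, and one billion 100-byte
rows yields the 100 GB input:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.examples.terasort.TeraGen;
    import org.apache.hadoop.examples.terasort.TeraSort;
    import org.apache.hadoop.util.ToolRunner;

    public class TerasortHarness {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Generate 1,000,000,000 rows x 100 bytes = 100 GB of input.
            int rc = ToolRunner.run(conf, new TeraGen(),
                    new String[] {"1000000000", "/benchmarks/tera-in"});
            if (rc == 0) {
                // Sort it; this exercises map, shuffle, merge, and reduce.
                rc = ToolRunner.run(conf, new TeraSort(),
                        new String[] {"/benchmarks/tera-in",
                                      "/benchmarks/tera-out"});
            }
            System.exit(rc);
        }
    }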




Lessons Learned with Single Node Tuning

• We found we could comfortably run 40 maps and 20 reducers given
  memory and CPU constraints.
• Use a large block size for HDFS (see the configuration sketch after
  this list).
   – Execution time for map tasks was around 1 minute using a 512 MB
     block size.
• More spindles are better.
   – A 1:1 ratio of map tasks to local HDFS spindles works well.
   – EXT2 seems to work well with JBOD.
• Dedicate spindles to temporary files on each worker node.
• Configure JVM settings with a larger heap size to avoid spills.
   – Parallel GC seemed to help as well.
• Compression of map outputs is a huge win (LZO).
• HBase scales well on fat nodes with DataRush (>5M rows/sec bulk
  load; >10M rows/sec sequential scan).
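Taken together, these lessons boil down to a handful of settings. A
minimal sketch using the CDH3-era (Hadoop 0.20) property names; the
exact values are illustrative, not the tuned production numbers:

    import org.apache.hadoop.mapred.JobConf;

    public class TuningSketch {
        public static JobConf tunedJobConf() {
            JobConf conf = new JobConf();

            // Large HDFS block size (512 MB) so each map task runs
            // about a minute.
            conf.setLong("dfs.block.size", 512L * 1024 * 1024);

            // Compress intermediate map output with LZO (a huge win).
            conf.setBoolean("mapred.compress.map.output", true);
            conf.set("mapred.map.output.compression.codec",
                     "com.hadoop.compression.lzo.LzoCodec");

            // Larger child heap and sort buffer to avoid spills;
            // parallel GC seemed to help as well.
            conf.set("mapred.child.java.opts", "-Xmx2g -XX:+UseParallelGC");
            conf.setInt("io.sort.mb", 512);

            // Slot counts live in mapred-site.xml on each tasktracker:
            //   mapred.tasktracker.map.tasks.maximum    = 40
            //   mapred.tasktracker.reduce.tasks.maximum = 20
            return conf;
        }
    }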



Varying Spindles for HDFS
[Bar chart: average Terasort execution time in seconds (y-axis, 0-900)
as the number of 2 TB HDFS disks varies across 8, 16, 24, 32, 40,
and 48.]
Varying Spindles for Intermediate Outputs
[Bar chart: average Terasort execution time in seconds (y-axis, 0-800)
by drives used for intermediate map output: 4 x 2 TB, 8 x 2 TB,
16 x 2 TB, Fusion-io drive (flash), and RAID 0 (4 x 2 TB).]
[Chart: single-node 100 GB Terasort task timeline showing the number
of running tasks (0-70) versus execution time in seconds, by phase:
maps, shuffle, merge, reduce.]
Clustering the Nodes

• We had a total of 64 hard disks for the cluster and had to split
  them between the two slave nodes.
• Installed and configured OpenFabrics OFED to enable IPoIB.
• Reconfigured Hadoop to cluster the nodes (see the sketch below).
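The reconfiguration itself is ordinary Hadoop plumbing: point the
filesystem and JobTracker URIs at the master's IPoIB interface so
cluster traffic rides the Infiniband link. A minimal sketch; the
hostname master-ib0 and the (CDH-default) ports are assumptions, and
these properties normally live in core-site.xml and mapred-site.xml:

    import org.apache.hadoop.conf.Configuration;

    public class ClusterPointers {
        public static Configuration clusterConf() {
            Configuration conf = new Configuration();
            // HDFS namenode, reached over IPoIB (core-site.xml).
            conf.set("fs.default.name", "hdfs://master-ib0:8020");
            // JobTracker, also over IPoIB (mapred-site.xml).
            conf.set("mapred.job.tracker", "master-ib0:8021");
            return conf;
        }
    }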




[Chart: cluster 100 GB Terasort task timeline showing the number of
running tasks (0-120) versus execution time in seconds, by phase:
maps, shuffle, merge, reduce.]
Comparisons with Amazon Clusters

• The Amazon clusters were used to get a better idea of what to
  expect using more conventionally sized Hadoop nodes (non-EMR).
• We used 'Cluster Compute Quadruple Extra Large' (cc1.4xlarge)
  instances:
   – 23 GB of memory
   – 33.5 EC2 Compute Units (dual quad-core Intel Xeon X5570 “Nehalem”
     processors; 8 cores total)
   – 1690 GB of instance storage (2 spindles)
   – Very high I/O performance (10 GbE)
• Used a similar Hadoop configuration, but dialed back the number of
  maps and reducers due to the lower core count (see the sketch below).
• Used cluster sizes that were roughly core-count equivalent for
  comparison.
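As a rough illustration of “dialed back”: keeping approximately the
same slots-per-core ratio as the 40 maps / 20 reducers used on the
48-core nodes gives about 7 maps and 3 reducers on an 8-core
cc1.4xlarge node. The values below reflect that assumption, not the
actual configuration used in the tests:

    import org.apache.hadoop.mapred.JobConf;

    public class AmazonSlots {
        public static JobConf amazonNodeConf() {
            JobConf conf = new JobConf();
            // ~40/48 maps per core and ~20/48 reducers per core,
            // scaled down to an 8-core node (illustrative values).
            conf.setInt("mapred.tasktracker.map.tasks.maximum", 7);
            conf.setInt("mapred.tasktracker.reduce.tasks.maximum", 3);
            return conf;
        }
    }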




Per Node Comparison


Feature       Amazon cc1.4xlarge   Personal Supercomputer
Cores         8                    48
Memory        23 GB                256 GB
Memory/core   2.875 GB             5.333 GB
Spindles      2                    32 (24 HDFS / 8 temp)
Storage       1.690 TB             64 TB
Network       10 Gb Ethernet       40 Gb Infiniband (IPoIB)




Performance Comparison

Configuration                              Cores   Execution     Run-time      MB/dollar
                                                   time (secs)   cost (cents)
PSC single node                            48      712           40            250
Amazon HPC cluster (6 workers + master)    48      743           231           43
PSC cluster (2 workers + master)           96      388           94            106
Amazon HPC cluster (11 workers + master)   88      460           245           41
Conclusions

• From what we have seen, Hadoop works very well on many-core
  systems. In fact, Hadoop runs quite well even on a single-node
  many-core system.
• Using denser nodes may make failures more expensive for some system
  components. When using disk arrays, the handling of hard disk
  failures should be comparable to smaller nodes.
• The MapReduce framework treats all intermediate outputs as remote
  resources, so the copy phase of MapReduce doesn't benefit from data
  locality.




Questions?

Follow up / more information:

Visit our booth
Pervasive DataRush for Hadoop:
www.pervasivedatarush.com/Technology/PervasiveDataRushforHadoop.aspx
Presentation content – paul.dingman@pervasive.com
