Taking the guesswork out of your Hadoop Infrastructure

Steve Watt
@wattsteve
Agenda – Clearing up common misconceptions

Web Scale Hadoop Origins
 Single/Dual Socket 1+ GHz
 4-8 GB RAM
 2-4 Cores
 1 x 1GbE NIC
 (2-4) x 1 TB SATA Drives

Commodity in 2013
 Dual Socket 2+ GHz
 24-48 GB RAM
 4-6 Cores
 (2-4) x 1GbE NICs
 (4-14) x 2 TB SATA Drives

The enterprise perspective is also different.
You can quantify what is right for you
Balancing Performance and Storage Capacity with Price

[Figure: a triangle trading off Price, Storage Capacity, and Performance]
“We’ve profiled our Hadoop applications so we know what type of
infrastructure we need.”

    Said no-one. Ever.
Profiling your Hadoop Cluster
High-level design goals

It’s pretty simple:
1) Instrument the cluster
2) Run your workloads
3) Analyze the numbers

Don’t do paper exercises. Hadoop has a way of blowing all
your hypotheses out of the water.

Let’s walk through a 10 TB TeraSort on a full 42u rack:
- 2 x HP DL360p (JobTracker and NameNode)
- 18 x HP DL380p Hadoop slaves (18 map slots, 12 reduce slots)

Each slave: 64 GB RAM, dual 6-core Intel 2.9 GHz, 2 x HP P420 Smart
Array controllers, 16 x 1TB SFF disks, 4 x 1GbE bonded NICs
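For reference, a run like this can be reproduced with the TeraSort programs
in the standard Hadoop examples jar; a minimal sketch, assuming an MRv1-era
distribution (the jar name and HDFS paths are illustrative):

# Generate 10 TB of input: teragen writes 100-byte rows, so 10^11 rows
hadoop jar hadoop-examples.jar teragen 100000000000 /terasort/input

# Sort it
hadoop jar hadoop-examples.jar terasort /terasort/input /terasort/output

# Verify the output is globally sorted
hadoop jar hadoop-examples.jar teravalidate /terasort/output /terasort/report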
Instrumenting the Cluster
The key is to capture the data. Use whatever framework you’re comfortable with.

Analysis using the Linux SAR tool:

- An outer script starts the SAR gathering scripts on each node, then starts
  the Hadoop job (see the sketch below).

- The SAR scripts on each node gather I/O, CPU, memory and network metrics
  for that node for the duration of the job.

- Upon completion, the SAR data is converted to CSV and loaded into MySQL so
  we can do ad-hoc analysis of the data.

- Aggregations/summations are done via SQL.

- Excel is used to generate charts.
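A minimal sketch of such an outer driver script, assuming passwordless ssh
to the workers and sysstat installed on every node; the hostnames, file
paths and 30-second sample interval are illustrative:

#!/bin/bash
NODES="wkr01 wkr02 wkr03"   # worker hostnames (illustrative)
INTERVAL=30                 # seconds between SAR samples

# Start a background sar collector on each node, writing a binary data file
for n in $NODES; do
  ssh "$n" "nohup sar -o sar_test.dat $INTERVAL >/dev/null 2>&1 &"
done

# Run the workload being profiled
hadoop jar hadoop-examples.jar terasort /terasort/input /terasort/output

# Stop the collectors once the job completes
for n in $NODES; do
  ssh "$n" "pkill -x sar"
done

Once collection stops, the sadf exports on the next slide convert each
node’s binary sar_test.dat into CSV, which can then be bulk-loaded into
MySQL (e.g. with LOAD DATA INFILE) for the ad-hoc SQL analysis described
above.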
Examples
Performed for each node – results copied to a repository node

ssh wkr01 sadf -d sar_test.dat -- -u     > wkr01_cpu_util.csv
ssh wkr01 sadf -d sar_test.dat -- -b     > wkr01_io_rate.csv
ssh wkr01 sadf -d sar_test.dat -- -n DEV > wkr01_net_dev.csv

ssh wkr02 sadf -d sar_test.dat -- -u     > wkr02_cpu_util.csv
ssh wkr02 sadf -d sar_test.dat -- -b     > wkr02_io_rate.csv
ssh wkr02 sadf -d sar_test.dat -- -n DEV > wkr02_net_dev.csv

…  Each file name is prefixed with the node name (i.e., “wkrnn”)
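The same exports are easy to script rather than type per node; a small
sketch assuming the same wkrNN naming:

# Export CPU, I/O and network metrics for every worker into per-node CSVs
for n in wkr01 wkr02; do
  ssh "$n" sadf -d sar_test.dat -- -u     > "${n}_cpu_util.csv"
  ssh "$n" sadf -d sar_test.dat -- -b     > "${n}_io_rate.csv"
  ssh "$n" sadf -d sar_test.dat -- -n DEV > "${n}_net_dev.csv"
done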
I/O Subsystem Test Chart
The I/O subsystem is at only around 10% of its total throughput capacity

Run the dd tests first to understand the read and write
throughput capabilities of your I/O subsystem.

- X axis is time
- Y axis is MB per second

TeraSort for this server design is not I/O bound: 1.6 GB/s is the
upper bound, and this run is less than 10% utilized.
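A minimal sketch of such a dd baseline, run per data disk while the cluster
is idle; the mount point, file size and block size are illustrative, and
the write test creates (then removes) a 10 GB scratch file:

# Sequential write test: 10 GB to one data disk, bypassing the page cache
dd if=/dev/zero of=/data1/ddtest bs=1M count=10240 oflag=direct

# Sequential read test of the same file, again bypassing the page cache
dd if=/data1/ddtest of=/dev/null bs=1M iflag=direct

rm /data1/ddtest

Summing the per-disk results across all 16 disks gives the aggregate
ceiling (roughly 1.6 GB/s here) that the chart’s utilization is measured
against.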
Network Subsystem Test Chart
Network throughput utilization per server is less than a quarter of capacity

A 1GbE NIC can drive up to 1 Gb/s for reads and 1 Gb/s for writes,
which is roughly 4 Gb/s, or 400 MB/s, total across all four bonded NICs.

- Y axis is throughput (MB/sec)
- X axis is elapsed time

Rx  = received MB/sec
Tx  = transmitted MB/sec
Tot = total MB/sec
CPU Subsystem Test Chart
CPU utilization is high and I/O wait is low – where you want to be

Each data point is taken every 30 seconds.
- Y axis is percent utilization of the CPUs
- X axis is the timeline of the run

CPU utilization is captured by analyzing how busy each core is and
averaging across the cores.

I/O wait (one of the ten SAR CPU metrics) measures the percentage of
time the CPU is waiting on I/O.
Memory Subsystem Test Chart
A high ratio of cached to total memory shows the CPU is not waiting on memory

- X axis is elapsed time
- Y axis is memory utilization

- Percent cached is a SAR memory metric (kb_cached). It tells you how many
  blocks of memory are cached in the Linux page cache, i.e. the amount of
  memory the kernel is using to cache data.

- This means we’re doing a good job of caching data, so the JVM is not
  having to do I/Os and incur I/O wait time (reflected in the previous
  slide).
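The memory series behind this chart can be exported the same way as the
earlier per-node examples; a sketch using sar’s memory report (the -r flag,
whose CSV output includes the kbcached column):

# Export memory utilization (including kbcached) for one worker
ssh wkr01 sadf -d sar_test.dat -- -r > wkr01_mem_util.csv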
Tuning from your Server Configuration baseline

We’ve now established that the server configuration we used works pretty
well. But what if you want to tune it, i.e. sacrifice some performance for
cost, or remove an I/O, network or memory limitation? The levers fall into
four areas:

I/O Subsystem
- Types of disks / controllers
- Number of disks / controllers

Compute
- Processor type (Socket R 130 W vs. Socket B 95 W)
- Number of cores
- Number of memory channels
- 4GB vs. 8GB DIMMs

Network
- Type of network (1GbE or 10GbE)
- Number of NICs per server
- Switch port availability
- Deep buffering

Data Center
- Floor space (rack density)
- Power and cooling constraints

Annual power and cooling costs of a cluster are 1/3 the cost of acquiring
the cluster.
Introducing: Hadoop Ops whack-a-mole




What you really want is a shared service...

The Server Building Blocks

NameNode and JobTracker configurations

1u, 64 GB of RAM, 2 x 2.9 GHz 6-core Intel Socket R processors, 4
small form factor disks in a RAID configuration, 1 disk controller

Hadoop Slave configurations

- 2u, 48 GB of RAM, 2 x 2.4 GHz 6-core Intel Socket B processors, 1
  high-performing disk controller with twelve 2TB large form factor
  data disks in a JBOD configuration
- 24 TB of storage capacity, low power consumption CPUs
Single Rack Config
Single rack configuration: 42u rack enclosure

This rack is the initial building block that configures all the key
management services for a production scale-out cluster of up to 4000
nodes. Top to bottom:

- 2 x 1GbE TOR switches (1u 1GbE switches)
- 1 x Management node (1u, 2 sockets, 4 disks)
- 1 x JobTracker node (2u, 2 sockets, 12 disks)
- 1 x NameNode
- (3-18) x Worker nodes (2u, 2 sockets, 12 disks each)
- Open for KVM switch
- Open 1u
Multi-Rack Config
Scale-out configuration: 1 or more additional 42u racks

The extension building block. One adds 1 or more racks of this block to
the single rack configuration to build out a cluster that scales to 4000
nodes.

- 2 x 10GbE aggregation switches (1u switches), uplinking the single
  rack RA and each extension rack

Each extension rack, top to bottom:
- 2 x 1GbE TOR switches (1u 1GbE switches)
- 19 x Worker nodes (2u, 2 sockets, 12 disks each)
- Open for KVM switch
- Open 2u
System on a Chip and Hadoop
Intel ATOM and ARM processor server cartridges

Hadoop has come full circle:

 4-8 GB RAM
 4 cores @ 1.8 GHz
 2-4 TB of disk
 1GbE NICs

- Amazing density advances: 270+ servers in a rack (compared to 21!).

- Trevor Robinson from Calxeda is a member of our community working on
  this. He’d love some help: trevor.robinson@calxeda.com
Thanks for attending

Thanks to our partner HP for letting me present their findings. If you’re
interested in learning more about these findings, please check out
hp.com/go/hadoop to see detailed reference architectures and varying server
configurations for Hortonworks, MapR and Cloudera.

Steve Watt    swatt@redhat.com    @wattsteve
HBase Testing – Results still TBD
YCSB workloads A and C were unable to drive the CPU

[Charts: CPU utilization (%), network usage (MB/sec), disk usage (MB/sec)
and memory usage (%), sampled per minute over each ~21-minute run]

Two workloads tested:

- YCSB workload “C”: 100% read-only requests, random access to the DB,
  using a unique primary key value

- YCSB workload “A”: 50% reads and 50% updates, random access to the DB,
  using a unique primary key value

- When data is cached in the HBase cache, performance is really good and we
  can occupy the CPU. Ops throughput (and hence CPU utilization) dwindles
  when data exceeds the HBase cache but is available in the Linux cache.
  Resolution TBD – related JIRA: HDFS-2246
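For reference, these workloads come from YCSB’s standard workload set and
run against HBase through its bundled binding; a minimal sketch, with the
binding name, column family, and record/operation counts as illustrative
assumptions (they are not the actual test parameters):

# Load the shared dataset once (column family and counts illustrative)
bin/ycsb load hbase -P workloads/workloada \
    -p columnfamily=f1 -p recordcount=100000000

# Workload C: 100% reads over random unique keys
bin/ycsb run hbase -P workloads/workloadc \
    -p columnfamily=f1 -p operationcount=10000000

# Workload A: 50% reads / 50% updates over random unique keys
bin/ycsb run hbase -P workloads/workloada \
    -p columnfamily=f1 -p operationcount=10000000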

Editor’s notes

  1. The average enterprise customer is running 1 or 2 racks in production. In that scenario, you have redundant switches in each rack, you RAID-mirror the OS and Hadoop runtime, and you JBOD the data drives, because losing racks and servers is costly. Very different from the web-scale perspective.
  2. Hadoop slave server configurations are about balancing the following: performance (keeping the CPU as busy as possible), storage capacity (important for HDFS), and price (managing the above at a price you can afford). Commodity servers do not mean cheap. At scale you want to fully optimize performance and storage capacity for your workloads to keep costs down.
  3. So, as any good infrastructure designer will tell you, you begin by profiling your Hadoop applications.
  4. But generally speaking, you can design a pretty decent cluster infrastructure baseline that you can later optimize, provided you get these things right.
  5. So, as any good infrastructure designer will tell you, you begin by profiling your Hadoop applications.
  6. In reality, you don’t want Hadoop clusters popping up all over the place in your company; it becomes untenable to manage. You really want a single large cluster managed as a service. If that’s the case, then you don’t want to optimize your cluster for just one workload; you want to optimize it for multiple concurrent workloads (dependent on the scheduler) and have a balanced cluster.