Storage on EC2 (& Cassandra)
Tom Wilkie
Cassandra Workshop 8/06/11




Wednesday, 8 June 2011
ACHTUNG!
  • Data only collected over past 5 days
  • Didn’t repeat experiments (that much)
  • EC2 is a moving target

Consider:
  • Ephemeral vs EBS
  • ... vs Instance Type
  • ... vs RAID level
  • ... vs # threads
  • (... vs storage engine)

Not considering:
  • Cluster Performance
  • Internode latency, throughput
  • Tuning...

CORRELATED FAILURES ...
m1.large     7.5 GB RAM, 4 CU, 64-bit, ‘High’ IO
m1.xlarge    15 GB RAM, 8 CU, 64-bit, ‘High’ IO
c1.xlarge    7 GB RAM, 20 CU, 64-bit, ‘High’ IO

Cassandra 0.7.6, CentOS 5.5, OpenJDK...

Ephemeral Storage



  

   ephemeral [ih-fem-er-uhl] –adjective
   1. lasting a very short time; short-lived; transitory: the ephemeral joys of childhood.
   2. lasting but one day: an ephemeral flower.
   –noun
   3. anything short-lived, as certain insects.




Ephemeral Storage
Seek Performance
[Plot: Seek / s (0 to 8000) vs # Devices (1 to 4); series: m1.large, ephemeral; m1.xlarge, ephemeral; c1.xlarge, ephemeral. Annotation: "7000 IOPs from a disk??"]
http://www.slideshare.net/davegardnerisme/running-cassandra-on-amazon-ec2
Ephemeral Storage
Seek Performance
[Plot: Seek / s (0 to 1000) vs # Devices (1 to 4); series: m1.large, ephemeral; m1.xlarge, ephemeral; c1.xlarge, ephemeral]
Ephemeral Throughput
m1.xlarge
[Plot: Throughput (MB/s) (0 to 500) vs # Devices (1 to 4); series: Write (Raid-0, dd); Read (Raid-0, dd); Write (Random 10MB chunks); Read (Random 10MB chunks)]
 # dd if=/dev/zero of=/dev/sdd bs=512k count=20000
 ...
 10485760000 bytes (10 GB) copied, 201.995 seconds, 51.9 MB/s
 # dd if=/dev/zero of=/dev/sdd bs=512k count=20000
 ...
 10485760000 bytes (10 GB) copied, 80.3673 seconds, 130 MB/s
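
As a sanity check on the two runs above, the reported rates follow directly from the byte count and the elapsed time. This is a minimal sketch in Python; the function name `throughput_mb_s` is illustrative, and it assumes dd's convention of 1 MB = 10^6 bytes.

```python
def throughput_mb_s(total_bytes: int, seconds: float) -> float:
    """Throughput in MB/s, using decimal megabytes (1 MB = 10**6 bytes)."""
    return total_bytes / seconds / 1e6


# bs=512k count=20000 means 20000 blocks of 512 KiB each.
total_bytes = 512 * 1024 * 20000

first_run = throughput_mb_s(total_bytes, 201.995)   # first dd invocation
second_run = throughput_mb_s(total_bytes, 80.3673)  # same command, run again

print(f"{total_bytes} bytes written per run")
print(f"first run:  {first_run:.1f} MB/s")
print(f"second run: {second_run:.1f} MB/s")
print(f"ratio:      {second_run / first_run:.2f}x")
```

The identical command is roughly 2.5 times faster the second time, which is the kind of gap that would fit the suspicion of an indirection layer under ephemeral storage.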

  • Max 4 devices per instance
  • Data goes away when instance is terminated (or crashes!)
  • Suspect there is some sort of indirection layer underneath - thin provisioning / dedupe / CoW or something
  • Linux software RAID sucks

CORRELATED FAILURES ...
What happens if a bug in your software causes all your nodes to crash?
  i.e. say a memory leak causes an OOM... on all nodes



EBS

EBS Seek performance
[Plot: Seeks / s (0 to 3000) vs # Devices (0 to 30); legend as shown: m1.large, ebs; m1.large, ebs; c1.xlarge, ebs]
EBS Random Reads
m1.xlarge, raid-0
[Plot: Total Seek / s (0 to 1000) vs # Threads (1 to 10); one series per legend entry, 1 to 10]
EBS Random Reads
m1.xlarge, raid-0
[Plot: Max seek / s (0 to 1000) vs # Devices (0 to 10)]
EBS Random Reads
m1.xlarge, raid-0
[Plot: Seeks per device per second (0 to 450) vs # Devices (1 to 10); series: max, min, avg]
EBS Throughput
m1.xlarge
[Plot: Throughput (MB/s) (0 to 350) vs # Devices (1 to 10); legend (partly garbled in extraction): Write (Raid-0, dd); Read (Raid-0, dd); Read (Random 10MB chunks)]
  • Limited to ~100 IOPS per device?
      • Or just 10ms latency?
  • Seems to scale pretty linearly for random IO
  • Sequential IO limited by network bandwidth, independent of # devices
      • shared with other network traffic?
  • Linux software RAID sucks
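
The first two bullets may describe the same ceiling from two angles: a device that answers one request at a time with about 10 ms latency cannot exceed about 100 IOPS. The sketch below is a rough model, not a measurement. The function names, the one-outstanding-request-per-thread assumption, and the even spread of requests across a RAID-0 stripe are all assumptions made for illustration.

```python
def per_device_iops(latency_ms: float) -> float:
    """IOPS for a device that serves one request at a time at the given latency."""
    return 1000.0 / latency_ms


def striped_random_iops(devices: int, threads: int, latency_ms: float = 10.0) -> float:
    """Idealised random-read IOPS across a RAID-0 stripe.

    Assumes each thread keeps exactly one request in flight and requests
    spread evenly, so at most min(devices, threads) devices are busy at once.
    """
    busy_devices = min(devices, threads)
    return busy_devices * per_device_iops(latency_ms)


# With enough threads, the model scales linearly in the number of devices.
for n in (1, 2, 4, 10):
    print(n, "devices:", striped_random_iops(devices=n, threads=10), "IOPS")
```

Under these assumptions the total grows linearly with device count until the number of threads becomes the limit, which matches the "scales pretty linearly" observation only in shape, not in exact values.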

CORRELATED FAILURES ...
What happens when EBS breaks?
  http://storagemojo.com/2011/04/29/amazons-ebs-outage/
  http://status.heroku.com/incident/151




(image) + (image) = ???
“Use Elastic Block Storage”
  http://stackoverflow.com/questions/4714879/deploy-cassandra-on-ec2

“Raid 0 EBS drives are the way to go”
  http://coreyhulen.org/2010/10/03/%EF%BB%BFcassandra-performance-tests-on-ec2/

“we recommend using raid0 ephemeral disks”
  http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cold-boot-performance-problems-td5615829.html#a5615889




http://coreyhulen.org/2010/10/03/%EF%BB%BFcassandra-performance-tests-on-ec2/
Insert Rates by Instance Type
[Bar chart: Inserts / s (0 to 35000) for m1.large, ephemeral; m1.xlarge, ephemeral; c1.xlarge, ephemeral; m1.large, ebs; m1.xlarge, ebs; c1.xlarge, ebs]
100 threads, batch mutate size 100, values length 10, 1 column per row, 300 million values
Get Rates by Instance Type
[Bar chart: Gets / s (0 to 1700) for m1.xlarge, ephemeral and m1.xlarge, ebs]
100 threads, 700 thousand values
Range Query Rates by Instance Type
Too slow. No results


TODO
  • Repeat experiments
  • # threads vs # devices for ephemeral
  • Repeat experiments
  • Cluster performance - scaling, latency, throughput etc
  • Repeat experiments
  • Strategies for mixed EBS and Ephemeral?
  • Repeat experiments
$470
         110 million IOs, 360 GB-months, 560 machine hours
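
For a rough sense of scale, the bill can be spread over the machine hours. This is only a blended figure: it assumes the $470 is the total for all the usage listed, so it also absorbs the storage and I/O charges, which the slide does not break out. Variable names are illustrative.

```python
total_cost_usd = 470   # figure from the slide
machine_hours = 560    # figure from the slide

# Blended cost per machine hour; storage (GB-months) and I/O charges
# are folded in because no per-item breakdown is given.
cost_per_hour = total_cost_usd / machine_hours
print(f"${cost_per_hour:.2f} per machine hour (blended)")
```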



Questions?
  http://github.com/acunu
  http://bitbucket.org/acunu
  http://www.slideshare.net/acunu


+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 

Storage on EC2 (& Cassandra), Cassandra Workshop, Berlin Buzzwords

  • 1. Storage on EC2 (& Cassandra) Tom Wilkie Cassandra Workshop 8/06/11 Wednesday, 8 June 2011
  • 2. ACHTUNG! Data only collected over past 5 days. Didn’t repeat experiments (that much). EC2 is a moving target.
  • 3. Consider: Ephemeral vs EBS; ... vs instance type; ... vs RAID level; ... vs # threads; (... vs storage engine). Not considering: cluster performance; internode latency, throughput; tuning... Stamped across the slide: CORRELATED FAILURES...
  • 4. m1.large: 7.5 GB RAM, 4 CU, 64-bit, ‘High’ IO. m1.xlarge: 15 GB RAM, 8 CU, 64-bit, ‘High’ IO. c1.xlarge: 7 GB RAM, 20 CU, 64-bit, ‘High’ IO. Cassandra 0.7.6, CentOS 5.5, OpenJDK...
  • 6. ephemeral [ih-fem-er-uhl], adjective: 1. lasting a very short time; short-lived; transitory: the ephemeral joys of childhood. 2. lasting but one day: an ephemeral flower. Noun: 3. anything short-lived, as certain insects.
  • 7. Ephemeral Storage Seek Performance. [Plot: seeks/s (0 to 8000) vs # devices (1 to 4); series: m1.large, m1.xlarge, c1.xlarge, all ephemeral.] Annotation: 7000 IOPs from a disk?? http://www.slideshare.net/davegardnerisme/running-cassandra-on-amazon-ec2
  • 8. Ephemeral Storage Seek Performance. [Plot: seeks/s (0 to 1000) vs # devices (1 to 4); series: m1.large, m1.xlarge, c1.xlarge, all ephemeral.]
  • 9. Ephemeral Throughput, m1.xlarge. [Plot: throughput in MB/s (0 to 500) vs # devices (1 to 4); series: Write (RAID-0, dd), Read (RAID-0, dd), Write (random 10MB chunks), Read (random 10MB chunks).]
  • 10. # dd if=/dev/zero of=/dev/sdd bs=512k count=20000 ... 10485760000 bytes (10 GB) copied, 201.995 seconds, 51.9 MB/s; # dd if=/dev/zero of=/dev/sdd bs=512k count=20000 ... 10485760000 bytes (10 GB) copied, 80.3673 seconds, 130 MB/s
  • 11. Max 4 devices per instance. Data goes away when the instance is terminated (or crashes!). Suspect there is some sort of indirection layer underneath: thin provisioning / dedupe / CoW or something. Linux software RAID sucks.
  • 12. CORRELATED FAILURES... What happens if a bug in your software causes all your nodes to crash? E.g. say a memory leak causes an OOM... on all nodes.
  • 14. EBS Seek Performance. [Plot: seeks/s (0 to 3000) vs # devices (0 to 30); legend as extracted: m1.large, ebs; m1.large, ebs; c1.xlarge, ebs.]
  • 15. EBS Random Reads, m1.xlarge, raid-0. [Plot: total seeks/s (0 to 1000) vs # threads (1 to 10); ten series labelled 1 to 10.]
  • 16. EBS Random Reads, m1.xlarge, raid-0. [Plot: max seeks/s (0 to 1000) vs # devices (0 to 10).]
  • 17. EBS Random Reads, m1.xlarge, raid-0. [Plot: seeks per device per second (0 to 450) vs # devices (1 to 10); series: max, min, avg.]
  • 18. EBS Throughput, m1.xlarge. [Plot: throughput in MB/s (0 to 350) vs # devices (1 to 10); legend entries recoverable from the extraction: Write (RAID-0, dd), Read (RAID-0, dd), Read (random 10MB chunks).]
  • 19. Limited to ~100 IOPS per device? Or just 10ms latency? Seems to scale pretty linearly for random IO. Sequential IO limited by network bandwidth, independent of # devices (shared with other network traffic?). Linux software RAID sucks.
  • 20. CORRELATED FAILURES... What happens when EBS breaks? http://storagemojo.com/2011/04/29/amazons-ebs-outage/ http://status.heroku.com/incident/151
  • 21. + II ???
  • 22. “Use Elastic Block Storage” http://stackoverflow.com/questions/4714879/deploy-cassandra-on-ec2 “Raid 0 EBS drives are the way to go” http://coreyhulen.org/2010/10/03/%EF%BB%BFcassandra-performance-tests-on-ec2/ “we recommend using raid0 ephemeral disks” http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cold-boot-performance-problems-td5615829.html#a5615889
  • 25. Insert Rates by Instance Type. [Bar chart: inserts/s (0 to 35000) for m1.large, m1.xlarge and c1.xlarge, each on ephemeral and on EBS.] 100 threads, batch mutate size 100, values length 10, 1 column per row, 300 million values.
  • 28. Get Rates by Instance Type. [Bar chart: gets/s (0 to 1700) for m1.xlarge, ephemeral and m1.xlarge, ebs.] 100 threads, 700 thousand values.
  • 30. Range Query Rates by Instance Type: Too slow. No results.
  • 32. TODO: Repeat experiments. # threads vs # devices for ephemeral. Repeat experiments. Cluster performance: scaling, latency, throughput etc. Repeat experiments. Strategies for mixed EBS and ephemeral? Repeat experiments.
  • 33. $470: 110 million IOs, 360 GB-months, 560 machine hours.
  • 34. Questions? http://github.com/acunu http://bitbucket.org/acunu http://www.slideshare.net/acunu
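
The throughput figures that dd prints on slide 10 can be recomputed from the byte count and elapsed times in the same output. The sketch below is an illustration added here, not part of the talk; the helper name `throughput_mb_s` is made up, and it assumes dd reports decimal megabytes (1 MB = 10^6 bytes).

```python
def throughput_mb_s(total_bytes: int, seconds: float) -> float:
    """Throughput in decimal megabytes per second (1 MB = 10**6 bytes)."""
    return total_bytes / seconds / 1e6


# Figures copied from the two dd runs on slide 10.
TOTAL_BYTES = 10485760000  # 20000 blocks of 512 KiB (bs=512k count=20000)
first_run = throughput_mb_s(TOTAL_BYTES, 201.995)
second_run = throughput_mb_s(TOTAL_BYTES, 80.3673)

print(f"first run:  {first_run:.1f} MB/s")
print(f"second run: {second_run:.1f} MB/s")
print(f"speed-up:   {second_run / first_run:.2f}x")
```

Under that assumption the recomputed values agree with the 51.9 MB/s and 130 MB/s that dd reported. The same command against the same device ran roughly 2.5 times faster the second time, which would be consistent with the indirection layer suspected on slide 11.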
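
Slide 19 asks whether EBS is capped at ~100 IOPS per device or simply has ~10 ms latency. The two readings coincide if each device serves one request at a time, since a 10 ms round trip then allows at most 100 requests per second. The following is a minimal sketch of that model, not a measurement: it assumes one outstanding request per thread and perfectly even striping across devices, and the function and parameter names are hypothetical.

```python
def max_iops(latency_s: float, devices: int, threads: int) -> float:
    """Upper bound on random-read IOPS for a latency-bound RAID-0 set.

    Assumes each device serves one request at a time and each thread
    keeps exactly one request outstanding, so the number of requests
    in flight is limited by whichever of the two is smaller.
    """
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    in_flight = min(devices, threads)
    return in_flight / latency_s


# 10 ms per request, 10 reader threads, varying number of EBS devices.
for devices in (1, 2, 5, 10):
    bound = max_iops(0.010, devices, threads=10)
    print(f"{devices:2d} devices: at most {bound:.0f} IOPS")
```

In this model the bound grows linearly with the number of devices until the thread count becomes the limit, in line with the roughly linear scaling for random IO noted on slide 19. With 10 devices and 10 threads it gives 1000 IOPS, which is the top of the y-axis on slides 15 and 16.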