SlideShare a Scribd company logo
1 of 28
Download to read offline
Performance comparisons and trade-offs for various
               MySQL replication schemes


               Darpan Dinker                          Brian O’Krafka,
               VP Engineering                         Chief Architect
               Schooner Information Technology, Inc.
               http://www.schoonerinfotech.com/




Oracle, MySQL, Java are registered trademarks of Oracle and/or its affiliates.
Other names may be trademarks of their respective owners.
MySQL Replication


     • Replication for databases is critical
          – High Availability: avoid SPOF, fail-over for service continuity, DR
          – Scale-Out: scale reads on replicas
          – Misc. administrative tasks: upgrades, schema changes, backup, PITR ...

     • MySQL/InnoDB with base replication has serious gaps
          – Failover is not straight-forward
          – Possibility of data loss and data inconsistency
          – Possibility of stale data read on Slaves
          – Writes on Master need to be de-rated to Slave’s single-thread applier
            performance (forcing unnecessary sharding)
          – Applications and deployments need to work around these issues or live with the
            consequences

     • Several alternatives exist with different design considerations


© 2011 Schooner Information Technology, p2
Technology Covered: Replication Options for MySQL/InnoDB


                                             Qualitative Comparison

          MySQL-Specific Replication                        External Replication
          1. MySQL async                                    4. GoldenGate
               – Available in MySQL 5.1, 5.5                5. Tungsten
          2. MySQL semi-sync                                6. DRBD
               – Available in MySQL 5.5                      – Available on Linux
          3. Schooner sync
               – Available in Schooner Active
                 Cluster




© 2011 Schooner Information Technology, p3
#1 MySQL Asynchronous Replication
                                                                                           Read version based on tx=50


                                 Master mysqld                                                           Slave mysqld

                                    Tx.Commit(101)                                                       Repl.apply(51)



                                   tx=101                tx=101                                 tx=51             tx=51

                                             InnoDB MySQL         Log events pulled by Slave     Relay     InnoDB
                           DB                                                                                             DB
                                              Tx log Bin log                                      log       Tx log
                                      Last tx=100 Last tx=100                                    Last tx=70 Last tx=50

          • Loosely coupled master/slave                                          • Read on slave can give old data
            relationship                                                          • No checksums in binary or relay log
                     • Master does not wait for Slave                               stored on disk, data corruption possible
                     • Slave determines how much to read                          • Upon a Master’s failure
                       and from which point in the binary log
                                                                                        • Slave may not have latest committed
                     • Slave can be arbitrarily behind master in                          data resulting in data loss
                       reading and applying changes
                                                                                        • Fail-over to a slave is stalled until all
                                                                                          transactions in relay log have been
                                                                                          committed – not instantaneous

© 2011 Schooner Information Technology, p4
#2 MySQL Semi-synchronous Replication

                                                                                             Read version based on tx=50


                                 Master mysqld                                                              Slave mysqld

                                    Tx.Commit(101)                                                          Repl.apply(51)



                                   tx=101                tx=101                                    tx=51            tx=51
                                                                  Log for tx=100 pulled by Slave
                                             InnoDB MySQL                                           Relay     InnoDB
                           DB                                         Slave ACK for tx=100                                   DB
                                              Tx log Bin log                                         log       Tx log
                                      Last tx=100 Last tx=100                                      Last tx=100 Last tx=50

          • Semi-coupled master/slave relationship                                  • Read on slave can give old data
                     • On commit, Master waits for an ACK                           • No checksums in binary or relay log
                       from Slave                                                     stored on disk, data corruption possible
                     • Slave logs the transaction event in relay
                                                                                    • Upon a Master’s failure
                       log and ACKs (may not apply yet)
                                                                                          • Fail-over to a slave is stalled until all
                     • Slave can be arbitrarily behind master in
                                                                                            transactions in relay log have been
                       applying changes
                                                                                            committed – not instantaneous



© 2011 Schooner Information Technology, p5
#3 Schooner MySQL Active Cluster (SAC):
     An integrated HA and replication solution for MySQL/InnoDB
                                                                                           Read version based on tx=100


                                 Master mysqld                 Log for tx=100 pushed to Slave           Slave mysqld
                                                                    Slave ACK for tx=100            Repl.apply(100)
                                    Tx.Commit(101)


                                                                                                               tx=100
                                   tx=101

                                             InnoDB MySQL                                                  InnoDB
                           DB                                                                                            DB
                                              Tx log Bin log                                                Tx log
                                      Last tx=100 Last tx=100                                              Last tx=100

          • Tightly-coupled master/slave                                          • Read on slave gives latest committed
            relationship                                                            data
                     • After commit, all Slaves guaranteed to                     • Checksums in log stored on disk
                       receive and commit the change
                                                                                  • Upon a Master’s failure
                     • Slave in lock-step with Master
                                                                                        • Fail-over to a slave is fully integrated
                                                                                          and automatic
                                                                                        • Application writes continue on new
                                                                                          master instantaneously


© 2011 Schooner Information Technology, p6
#4 Oracle GoldenGate
                                                                                                        Read version based on tx=50


                                 Master mysqld                                                                             Slave mysqld

                                    Tx.Commit(101)                                                                          Repl.apply(51)



                                   tx=101          tx=101                                                                               tx=51
                                                                            Log for tx=71 sent to Slave
                                             InnoDB MySQL Source                                                  Dest.        InnoDB
                           DB                                                                                                                   DB
                                              Tx log Bin log Trail                                                Trail         Tx log
                                      Last tx=100 Last tx=100                                                    Last tx=70 Last tx=50

          • Loosely coupled master/slave                                                   • Read on slave can give old data
            relationship                                                                   • Heterogeneous Database support
                     • Master does not wait for Slave                                          • Oracle, Microsoft SQL Server, IBM
                     • Changes applied on slave similar to                                       DB2, MySQL
                       MySQL
                     • Slave can be arbitrarily behind master in
                       reading and applying changes


Oracle and MySQL are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

© 2011 Schooner Information Technology, p7
#5 Tungsten Replicator
                                                                                               Read version based on tx=50


                                 Master mysqld                                                                        Slave mysqld

                                    Tx.Commit(101)                                                                     Repl.apply(51)



                                   tx=101          tx=101                                                                       tx=51
                                                                   Log for tx=71 sent to Slave
                                                                Tx                                          Tx
                                             InnoDB MySQL                                                        InnoDB
                           DB                                History                                     History                        DB
                                              Tx log Bin log                                                      Tx log
                                                               Log                                         Log
                                      Last tx=100 Last tx=100                                             Last tx=70 Last tx=50

          • Loosely coupled master/slave                                        • Read on slave can give old data*
            relationship                                                        • Heterogeneous Database support
                     • Master does not wait for Slave                                • MySQL, PostgreSQL
                     • Changes applied on slave similar to
                       MySQL
                                                                                • Global Tx ID, useful to point to new
                                                                                  master upon failure
                     • Slave can be arbitrarily behind master in
                       reading and applying changes                             • SaaS & ISP feature: Parallel replication
                                                                                  for multi-tenant MySQL databases

Tungsten is a registered trademark of Continuent                                http://code.google.com/p/tungsten-replicator/

© 2011 Schooner Information Technology, p8
#6 Linux DRBD



                                 Master mysqld
                                                                         Heartbeat                   STAND-BY
                                    Tx.Commit(101)
                                                                                                (No mysqld running)



                                   tx=101          tx=101
                                                                  Block {dev-a, 20123}
                                             InnoDB MySQL                                               InnoDB MySQL
                           DB                                   ACK for Block {dev-a, 20123}
                                                                                                DB
                                              Tx log Bin log                                             Tx log Bin log
                                      Last tx=100 Last tx=100                                         Last tx=100 Last tx=100

          • Active-Passive mirroring at block device                           • Stand-by server does not service load
                     • After each commit, the Stand-by server                  • No data-loss
                       is guaranteed to have identical blocks
                                                                               • Upon a Master’s failure
                       on device
                                                                                     • MySQL is started on stand-by, database
                     • Stand-by in lock-step with Master
                                                                                       recovery takes ~minutes
                                                                                     • Stand-by is made new Master
                                                                                     • Application writes may use VIPs to write
                                                                                       to new Master when its ready


© 2011 Schooner Information Technology, p9
Replication Options for MySQL/InnoDB


                           Quantitative Comparison: Performance
     • Performance comparison of:
       1. MySQL asynchronous (v5.5.8)
       2. MySQL semi-synchronous (v5.5.8)
       3. Schooner Active Cluster (SAC) synchronous replication (v3.1)

     • Benchmark: DBT2 open-source transaction processing benchmark




© 2011 Schooner Information Technology, p10
DBT2 Benchmark

   • Open source benchmark available at http://osdldbt.sourceforge.net
        • On-line Transaction Processing (OLTP) performance test
        • Fair-usage implementation of the TPC-C benchmark
        • Simulates a wholesale parts supplier with a database containing inventory and
          customer information

   • Benchmark scale determined by number of warehouses
        • Results here based on a scale of 1000

   • Use InnoDB storage engine with 48GB buffer pool and full
     consistency/durability settings

   • Schooner has found that optimizing MySQL/InnoDB for DBT2 yields
     significant benefit for many real customer workloads


© 2011 Schooner Information Technology, p11
Measurement Setup
                                                        SWITCH         •    Arista 10Gb Ethernet
                                                                            switch

                               SERVERS




                                                                           CLIENT
       • IBM 3650 2RU Westmere Server                        •   DBT2 client runs on master server
       • CPU: 2x Intel Xeon X5670 processors, 6 cores/12
         threads per processor, 2.93 GHz
       • DRAM: 72GB
       • HDD: 2x300GB 10k RPM HDD RAID-1 for logging, with
         LSI M5015 controller with NVRAM writeback cache
       • Flash: 8x200GB OCZ MLC SSD’s, with 1 LSI 9211
         controller, md RAID-0
       • Network: 1x Chelseo 10Gb Ethernet NIC




© 2011 Schooner Information Technology, p12
Results: DBT2 Performance
     DBT2 Throughput (kTpm)                            DBT2 Response Time (ms)

        120                                            6
        100                                            5
          80                                           4
          60                                           3
          40                                           2
          20                                           1
             0                                         0
                    2-node 5.5 2-node 5.5     2-node       2-node 5.5 2-node 5.5   2-node
                      async       semi         SAC           async       semi       SAC




© 2011 Schooner Information Technology, p13
Results: DBT2 Performance
     DBT2 Throughput (kTpm)                            DBT2 Response Time (ms)

        120                                            6
        100                                            5
          80                                           4
          60                                           3
          40                                           2
          20                                           1
             0                                         0
                    2-node 5.5 2-node 5.5     2-node        2-node 5.5 2-node 5.5   2-node
                      async       semi         SAC            async       semi       SAC

         5.5 Async and Semi-sync
      limited by serial slave applier                           SAC optimizations
                                                           yields lower response times
              SAC optimizations
       yield 4-5x boost in throughput
© 2011 Schooner Information Technology, p14
Results: Master CPU Utilization
                                              Master CPU Utilization (%)
                                              1600
                                              1400
                                              1200
                                              1000
                                              800
                                              600
                                              400
                                              200
                                                0
                                                      2-node 2-node 2-node
                                                     5.5 async 5.5 semi SAC




© 2011 Schooner Information Technology, p15
Results: Master CPU Utilization
                                              Master CPU Utilization (%)
                                              1600
                                              1400
                                              1200
                                              1000
                                              800
                                              600
                                              400
                                              200
                                                0
                                                      2-node 2-node 2-node
                                                     5.5 async 5.5 semi SAC


                                                 Higher throughput means
                                                  higher CPU utilization,
                                                   better system balance
© 2011 Schooner Information Technology, p16
Results: Transient Behavior on Master
                        5.5 Async                         5.5 Semi-sync                 SAC




                                              Inconsistent performance with 5.5 Async

© 2011 Schooner Information Technology, p17
Results: Slave Utilization
                                              Slave CPU Utilization (%)
                                              600
                                              500
                                              400
                                              300
                                              200
                                              100
                                                0
                                                    2-node 5.5   2-node 5.5   2-node SAC
                                                      async         semi




© 2011 Schooner Information Technology, p18
Results: Slave Utilization
                                              Slave CPU Utilization (%)
                                              600
                                              500
                                              400
                                              300
                                              200
                                              100
                                                0
                                                    2-node 5.5   2-node 5.5   2-node SAC
                                                      async         semi

                                                      Single applier in 5.5
                                                      saturates one core
                                          Even at 5X throughput on SAC Slave,
                                        enough headroom to service Read traffic
© 2011 Schooner Information Technology, p19
Results: Transient Behavior on Slave
               5.5 Async                         5.5 Semi-sync              SAC




                                              Fluctuations with 5.5 Async

© 2011 Schooner Information Technology, p20
Results: Storage IO Utilization
                                                   Master Storage IOPS
                                                     Utilization (%)
                                              16
                                              14
                                              12
                                              10
                                               8
                                               6
                                               4
                                               2
                                               0
                                                    2-node 5.5 2-node 5.5 2-node SAC
                                                      async       semi




© 2011 Schooner Information Technology, p21
Results: Storage IO Utilization
                                                   Master Storage IOPS
                                                     Utilization (%)
                                              16
                                              14
                                              12
                                              10
                                               8
                                               6
                                               4
                                               2
                                               0
                                                    2-node 5.5 2-node 5.5 2-node SAC
                                                      async       semi


                                                   Higher throughput means
                                                   higher storage bandwidth,
                                                     better system balance
© 2011 Schooner Information Technology, p22
Results: Network Utilization
               Replication                                    Replication
          Network Utilization (%)                      Network Bandwidth (MB/s)
         40                                            50
         35                                            45
                                                       40
         30                                            35
         25                                            30
         20                                            25
         15                                            20
                                                       15
         10                                            10
          5                                             5
          0                                             0
                    2-node 5.5 2-node 5.5     2-node        2-node 5.5 2-node 5.5   2-node
                      async       semi         SAC            async       semi       SAC




© 2011 Schooner Information Technology, p23
Results: Network Utilization
               Replication                                           Replication
          Network Utilization (%)                             Network Bandwidth (MB/s)
         40                                                    50
         35                                                    45
                                                               40
         30                                                    35
         25                                                    30
         20                                                    25
         15                                                    20
                                                               15
         10                                                    10
          5                                                     5
          0                                                     0
                    2-node 5.5 2-node 5.5        2-node             2-node 5.5 2-node 5.5   2-node
                      async       semi            SAC                 async       semi       SAC


                                                Higher throughput means
                                              higher replication bandwidth,
                                                  better system balance
© 2011 Schooner Information Technology, p24
Results: Summary

       • Sustainable replication performance of 5.5.8 Async and
         Semi-sync is severely limited

       • Deeply integrated synchronous replication in SAC yield 4-
         5X boost in sustainable replication performance

       • High performance and fully synchronous replication are not
         mutually exclusive




© 2011 Schooner Information Technology, p25
Conclusion
 • Asynchronous replication is de-facto standard for 10+ years, issues with
   consistency, data-loss and service continuity

 • Recently released semi-synchronous replication mitigates the need to DRBD to
   avoid data-loss on failover, however consistency and service continuity issues
   exist

 • Schooner synchronous provides consistent reads on Slaves instantaneous
   failover with zero data-loss

 • For database interoperability, GoldenGate and Tungsten are a good choice

 • Performance comparisons between MySQL 5.5.8 async/semi-sync and
   Schooner Active Cluster using DBT2 show:
       • 5.5.8 async/semi-sync is severely limited by serial slave applier
       • Parallel slave appliers plus Schooner core optimizations in SAC yield 4-5x
         increase in throughput at lower response times, while maintaining read
         consistency across the cluster.


© 2011 Schooner Information Technology, p26
Backup Slides
Reference:
     MySQL Replication Solutions Compared

                                                     Schooner               MySQL 5.5       MySQL 5.1/5.5                     Linux Heartbeat +
                                                                                                                 Tungsten                             GoldenGate
                                                 Active Cluster 3.1         (semi-sync)        (async)                              DRBD
   Eliminates Slave Lag                                   Y                      N                   N              N                Y                    N
   Slave Consistent w/ Master                             Y                      N                   N              N                N                    N
   Write scalability                                    High                 Moderate            Moderate          Low              Low               Moderate
   Read scalability                                     High                   High                High           High              N/A                  High
   Data loss Probability on Failover                    Zero                   Zero             High            High             Zero                  High
                                                                                          Slave-lag-catch- Slave-lag-catch- InnoDB recovery       Slave-lag-catch-
   Planned Failover time (stop mysql)                  <2sec           Slave-lag-catch-up
                                                                                                 up               up             time                    up
                                                                       #                    #                #                #                   #
                                                                       ~8sec + slave-lag-   ~8sec + slave-   ~8sec + slave-    ~8sec + InnoDB      ~8sec + slave-
   Automated Unplanned Failover time                   <8sec
                                                                          catch-up          lag-catch-up     lag-catch-up      recovery time       lag-catch-up
   @
       Manual Unplanned Failover time                   N/A                Minutes-Hours    Minutes-Hours Minutes-Hours        Minutes-Hours      Minutes-Hours

   Checksum in replication logs                           Y                      N                   N              Y                N                    ?
   Replication Solution                                Built-in               Built-in            Built-in       External         External             External
   Replicate to Non-MySQL Databases                      N^                     N^                  N^              Y                N^                   Y
   Point-in Time Recovery (PITR)                          Y                      Y                  Y               Y                Y                    Y
   Automated DB Instantiation                             Y                      N                  N               Y                N                    Y
   Incremental data movement                              Y                      Y                   Y              Y                Y                    Y
   Rolling upgrade                                        Y                     N                   N               Y                N                    Y
   Automated Rolling migration                            Y                     N                   N               ?                N                    ?


     #   Using Multi-master for MySQL (MMM), Flipper, etc. assuming 8sec for heartbeat retries
     @   Manual response to an alert and running trouble-shooting and replication commands/scripts
     ^   Possible to leverage complementing solution like GoldenGate

© 2011 Schooner Information Technology, p28

More Related Content

Viewers also liked

Week3 Lecture Database Design
Week3 Lecture Database DesignWeek3 Lecture Database Design
Week3 Lecture Database Design
Kevin Element
 
Database Design
Database DesignDatabase Design
Database Design
learnt
 
English gcse final tips
English gcse final tipsEnglish gcse final tips
English gcse final tips
mrhoward12
 
Advanced Postgres Monitoring
Advanced Postgres MonitoringAdvanced Postgres Monitoring
Advanced Postgres Monitoring
Denish Patel
 
How to analyze and tune sql queries for better performance webinar
How to analyze and tune sql queries for better performance webinarHow to analyze and tune sql queries for better performance webinar
How to analyze and tune sql queries for better performance webinar
oysteing
 

Viewers also liked (20)

3 relational model
3 relational model3 relational model
3 relational model
 
MySQL Replication: Pros and Cons
MySQL Replication: Pros and ConsMySQL Replication: Pros and Cons
MySQL Replication: Pros and Cons
 
Distributed Postgres
Distributed PostgresDistributed Postgres
Distributed Postgres
 
Week3 Lecture Database Design
Week3 Lecture Database DesignWeek3 Lecture Database Design
Week3 Lecture Database Design
 
Database Design
Database DesignDatabase Design
Database Design
 
2 entity relationship_model
2 entity relationship_model2 entity relationship_model
2 entity relationship_model
 
English gcse final tips
English gcse final tipsEnglish gcse final tips
English gcse final tips
 
Postgres-XC Write Scalable PostgreSQL Cluster
Postgres-XC Write Scalable PostgreSQL ClusterPostgres-XC Write Scalable PostgreSQL Cluster
Postgres-XC Write Scalable PostgreSQL Cluster
 
Escalabilidade, Sharding, Paralelismo e Bigdata com PostgreSQL? Yes, we can!
Escalabilidade, Sharding, Paralelismo e Bigdata com PostgreSQL? Yes, we can!Escalabilidade, Sharding, Paralelismo e Bigdata com PostgreSQL? Yes, we can!
Escalabilidade, Sharding, Paralelismo e Bigdata com PostgreSQL? Yes, we can!
 
Database design concept
Database design conceptDatabase design concept
Database design concept
 
Best Practices for Database Schema Design
Best Practices for Database Schema DesignBest Practices for Database Schema Design
Best Practices for Database Schema Design
 
Database Schema
Database SchemaDatabase Schema
Database Schema
 
Database design
Database designDatabase design
Database design
 
Database design
Database designDatabase design
Database design
 
Advanced Postgres Monitoring
Advanced Postgres MonitoringAdvanced Postgres Monitoring
Advanced Postgres Monitoring
 
The NoSQL Way in Postgres
The NoSQL Way in PostgresThe NoSQL Way in Postgres
The NoSQL Way in Postgres
 
Postgres in Amazon RDS
Postgres in Amazon RDSPostgres in Amazon RDS
Postgres in Amazon RDS
 
Scaling postgres
Scaling postgresScaling postgres
Scaling postgres
 
How to analyze and tune sql queries for better performance webinar
How to analyze and tune sql queries for better performance webinarHow to analyze and tune sql queries for better performance webinar
How to analyze and tune sql queries for better performance webinar
 
MySQL Optimizer Cost Model
MySQL Optimizer Cost ModelMySQL Optimizer Cost Model
MySQL Optimizer Cost Model
 

Recently uploaded

Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Recently uploaded (20)

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 

Performance Comparisons And Trade Offs For Various My Sql Replication Schemes Presentation

  • 1. Performance comparisons and trade-offs for various MySQL replication schemes Darpan Dinker Brian O’Krafka, VP Engineering Chief Architect Schooner Information Technology, Inc. http://www.schoonerinfotech.com/ Oracle, MySQL, Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
  • 2. MySQL Replication • Replication for databases is critical – High Availability: avoid SPOF, fail-over for service continuity, DR – Scale-Out: scale reads on replicas – Misc. administrative tasks: upgrades, schema changes, backup, PITR ... • MySQL/InnoDB with base replication has serious gaps – Failover is not straight-forward – Possibility of data loss and data inconsistency – Possibility of stale data read on Slaves – Writes on Master need to be de-rated to Slave’s single-thread applier performance (forcing unnecessary sharding) – Applications and deployments need to work around these issues or live with the consequences • Several alternatives exist with different design considerations © 2011 Schooner Information Technology, p2
  • 3. Technology Covered: Replication Options for MySQL/InnoDB Qualitative Comparison MySQL-Specific Replication External Replication 1. MySQL async 4. GoldenGate – Available in MySQL 5.1, 5.5 5. Tungsten 2. MySQL semi-sync 6. DRBD – Available in MySQL 5.5 – Available on Linux 3. Schooner sync – Available in Schooner Active Cluster © 2011 Schooner Information Technology, p3
  • 4. #1 MySQL Asynchronous Replication Read version based on tx=50 Master mysqld Slave mysqld Tx.Commit(101) Repl.apply(51) tx=101 tx=101 tx=51 tx=51 InnoDB MySQL Log events pulled by Slave Relay InnoDB DB DB Tx log Bin log log Tx log Last tx=100 Last tx=100 Last tx=70 Last tx=50 • Loosely coupled master/slave • Read on slave can give old data relationship • No checksums in binary or relay log • Master does not wait for Slave stored on disk, data corruption possible • Slave determines how much to read • Upon a Master’s failure and from which point in the binary log • Slave may not have latest committed • Slave can be arbitrarily behind master in data resulting in data loss reading and applying changes • Fail-over to a slave is stalled until all transactions in relay log have been committed – not instantaneous © 2011 Schooner Information Technology, p4
  • 5. #2 MySQL Semi-synchronous Replication Read version based on tx=50 Master mysqld Slave mysqld Tx.Commit(101) Repl.apply(51) tx=101 tx=101 tx=51 tx=51 Log for tx=100 pulled by Slave InnoDB MySQL Relay InnoDB DB Slave ACK for tx=100 DB Tx log Bin log log Tx log Last tx=100 Last tx=100 Last tx=100 Last tx=50 • Semi-coupled master/slave relationship • Read on slave can give old data • On commit, Master waits for an ACK • No checksums in binary or relay log from Slave stored on disk, data corruption possible • Slave logs the transaction event in relay • Upon a Master’s failure log and ACKs (may not apply yet) • Fail-over to a slave is stalled until all • Slave can be arbitrarily behind master in transactions in relay log have been applying changes committed – not instantaneous © 2011 Schooner Information Technology, p5
  • 6. #3 Schooner MySQL Active Cluster (SAC): An integrated HA and replication solution for MySQL/InnoDB Read version based on tx=100 Master mysqld Log for tx=100 pushed to Slave Slave mysqld Slave ACK for tx=100 Repl.apply(100) Tx.Commit(101) tx=100 tx=101 InnoDB MySQL InnoDB DB DB Tx log Bin log Tx log Last tx=100 Last tx=100 Last tx=100 • Tightly-coupled master/slave • Read on slave gives latest committed relationship data • After commit, all Slaves guaranteed to • Checksums in log stored on disk receive and commit the change • Upon a Master’s failure • Slave in lock-step with Master • Fail-over to a slave is fully integrated and automatic • Application writes continue on new master instantaneously © 2011 Schooner Information Technology, p6
  • 7. #4 Oracle GoldenGate Read version based on tx=50 Master mysqld Slave mysqld Tx.Commit(101) Repl.apply(51) tx=101 tx=101 tx=51 Log for tx=71 sent to Slave InnoDB MySQL Source Dest. InnoDB DB DB Tx log Bin log Trail Trail Tx log Last tx=100 Last tx=100 Last tx=70 Last tx=50 • Loosely coupled master/slave • Read on slave can give old data relationship • Heterogeneous Database support • Master does not wait for Slave • Oracle, Microsoft SQL Server, IBM • Changes applied on slave similar to DB2, MySQL MySQL • Slave can be arbitrarily behind master in reading and applying changes Oracle and MySQL are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. © 2011 Schooner Information Technology, p7
  • 8. #5 Tungsten Replicator Read version based on tx=50 Master mysqld Slave mysqld Tx.Commit(101) Repl.apply(51) tx=101 tx=101 tx=51 Log for tx=71 sent to Slave Tx Tx InnoDB MySQL InnoDB DB History History DB Tx log Bin log Tx log Log Log Last tx=100 Last tx=100 Last tx=70 Last tx=50 • Loosely coupled master/slave • Read on slave can give old data* relationship • Heterogeneous Database support • Master does not wait for Slave • MySQL, PostgreSQL • Changes applied on slave similar to MySQL • Global Tx ID, useful to point to new master upon failure • Slave can be arbitrarily behind master in reading and applying changes • SaaS & ISP feature: Parallel replication for multi-tenant MySQL databases Tungsten is a registered trademark of Continuent http://code.google.com/p/tungsten-replicator/ © 2011 Schooner Information Technology, p8
  • 9. #6 Linux DRBD Master mysqld Heartbeat STAND-BY Tx.Commit(101) (No mysqld running) tx=101 tx=101 Block {dev-a, 20123} InnoDB MySQL InnoDB MySQL DB ACK for Block {dev-a, 20123} DB Tx log Bin log Tx log Bin log Last tx=100 Last tx=100 Last tx=100 Last tx=100 • Active-Passive mirroring at block device • Stand-by server does not service load • After each commit, the Stand-by server • No data-loss is guaranteed to have identical blocks • Upon a Master’s failure on device • MySQL is started on stand-by, database • Stand-by in lock-step with Master recovery takes ~minutes • Stand-by is made new Master • Application writes may use VIPs to write to new Master when its ready © 2011 Schooner Information Technology, p9
  • 10. Replication Options for MySQL/InnoDB Quantitative Comparison: Performance • Performance comparison of: 1. MySQL asynchronous (v5.5.8) 2. MySQL semi-synchronous (v5.5.8) 3. Schooner Active Cluster (SAC) synchronous replication (v3.1) • Benchmark: DBT2 open-source transaction processing benchmark © 2011 Schooner Information Technology, p10
  • 11. DBT2 Benchmark • Open source benchmark available at http://osdldbt.sourceforge.net • On-line Transaction Processing (OLTP) performance test • Fair-usage implementation of the TPC-C benchmark • Simulates a wholesale parts supplier with a database containing inventory and customer information • Benchmark scale determined by number of warehouses • Results here based on a scale of 1000 • Use InnoDB storage engine with 48GB buffer pool and full consistency/durability settings • Schooner has found that optimizing MySQL/InnoDB for DBT2 yields significant benefit for many real customer workloads © 2011 Schooner Information Technology, p11
  • 12. Measurement Setup SWITCH • Arista 10Gb Ethernet switch SERVERS CLIENT • IBM 3650 2RU Westmere Server • DBT2 client runs on master server • CPU: 2x Intel Xeon X5670 processors, 6 cores/12 threads per processor, 2.93 GHz • DRAM: 72GB • HDD: 2x300GB 10k RPM HDD RAID-1 for logging, with LSI M5015 controller with NVRAM writeback cache • Flash: 8x200GB OCZ MLC SSD’s, with 1 LSI 9211 controller, md RAID-0 • Network: 1x Chelseo 10Gb Ethernet NIC © 2011 Schooner Information Technology, p12
  • 13. Results: DBT2 Performance DBT2 Throughput (kTpm) DBT2 Response Time (ms) 120 6 100 5 80 4 60 3 40 2 20 1 0 0 2-node 5.5 2-node 5.5 2-node 2-node 5.5 2-node 5.5 2-node async semi SAC async semi SAC © 2011 Schooner Information Technology, p13
  • 14. Results: DBT2 Performance DBT2 Throughput (kTpm) DBT2 Response Time (ms) 120 6 100 5 80 4 60 3 40 2 20 1 0 0 2-node 5.5 2-node 5.5 2-node 2-node 5.5 2-node 5.5 2-node async semi SAC async semi SAC 5.5 Async and Semi-sync limited by serial slave applier SAC optimizations yields lower response times SAC optimizations yield 4-5x boost in throughput © 2011 Schooner Information Technology, p14
  • 15. Results: Master CPU Utilization Master CPU Utilization (%) 1600 1400 1200 1000 800 600 400 200 0 2-node 2-node 2-node 5.5 async 5.5 semi SAC © 2011 Schooner Information Technology, p15
  • 16. Results: Master CPU Utilization Master CPU Utilization (%) 1600 1400 1200 1000 800 600 400 200 0 2-node 2-node 2-node 5.5 async 5.5 semi SAC Higher throughput means higher CPU utilization, better system balance © 2011 Schooner Information Technology, p16
  • 17. Results: Transient Behavior on Master 5.5 Async 5.5 Semi-sync SAC Inconsistent performance with 5.5 Async © 2011 Schooner Information Technology, p17
  • 18. Results: Slave Utilization Slave CPU Utilization (%) 600 500 400 300 200 100 0 2-node 5.5 2-node 5.5 2-node SAC async semi © 2011 Schooner Information Technology, p18
  • 19. Results: Slave Utilization Slave CPU Utilization (%) 600 500 400 300 200 100 0 2-node 5.5 2-node 5.5 2-node SAC async semi Single applier in 5.5 saturates one core Even at 5X throughput on SAC Slave, enough headroom to service Read traffic © 2011 Schooner Information Technology, p19
  • 20. Results: Transient Behavior on Slave 5.5 Async 5.5 Semi-sync SAC Fluctuations with 5.5 Async © 2011 Schooner Information Technology, p20
  • 21. Results: Storage IO Utilization Master Storage IOPS Utilization (%) 16 14 12 10 8 6 4 2 0 2-node 5.5 2-node 5.5 2-node SAC async semi © 2011 Schooner Information Technology, p21
  • 22. Results: Storage IO Utilization Master Storage IOPS Utilization (%) 16 14 12 10 8 6 4 2 0 2-node 5.5 2-node 5.5 2-node SAC async semi Higher throughput means higher storage bandwidth, better system balance © 2011 Schooner Information Technology, p22
  • 23. Results: Network Utilization Replication Replication Network Utilization (%) Network Bandwidth (MB/s) 40 50 35 45 40 30 35 25 30 20 25 15 20 15 10 10 5 5 0 0 2-node 5.5 2-node 5.5 2-node 2-node 5.5 2-node 5.5 2-node async semi SAC async semi SAC © 2011 Schooner Information Technology, p23
  • 24. Results: Network Utilization Replication Replication Network Utilization (%) Network Bandwidth (MB/s) 40 50 35 45 40 30 35 25 30 20 25 15 20 15 10 10 5 5 0 0 2-node 5.5 2-node 5.5 2-node 2-node 5.5 2-node 5.5 2-node async semi SAC async semi SAC Higher throughput means higher replication bandwidth, better system balance © 2011 Schooner Information Technology, p24
  • 25. Results: Summary • Sustainable replication performance of 5.5.8 Async and Semi-sync is severely limited • Deeply integrated synchronous replication in SAC yield 4- 5X boost in sustainable replication performance • High performance and fully synchronous replication are not mutually exclusive © 2011 Schooner Information Technology, p25
  • 26. Conclusion • Asynchronous replication is de-facto standard for 10+ years, issues with consistency, data-loss and service continuity • Recently released semi-synchronous replication mitigates the need to DRBD to avoid data-loss on failover, however consistency and service continuity issues exist • Schooner synchronous provides consistent reads on Slaves instantaneous failover with zero data-loss • For database interoperability, GoldenGate and Tungsten are a good choice • Performance comparisons between MySQL 5.5.8 async/semi-sync and Schooner Active Cluster using DBT2 show: • 5.5.8 async/semi-sync is severely limited by serial slave applier • Parallel slave appliers plus Schooner core optimizations in SAC yield 4-5x increase in throughput at lower response times, while maintaining read consistency across the cluster. © 2011 Schooner Information Technology, p26
  • 28. Reference: MySQL Replication Solutions Compared Schooner MySQL 5.5 MySQL 5.1/5.5 Linux Heartbeat + Tungsten GoldenGate Active Cluster 3.1 (semi-sync) (async) DRBD Eliminates Slave Lag Y N N N Y N Slave Consistent w/ Master Y N N N N N Write scalability High Moderate Moderate Low Low Moderate Read scalability High High High High N/A High Data loss Probability on Failover Zero Zero High High Zero High Slave-lag-catch- Slave-lag-catch- InnoDB recovery Slave-lag-catch- Planned Failover time (stop mysql) <2sec Slave-lag-catch-up up up time up # # # # # ~8sec + slave-lag- ~8sec + slave- ~8sec + slave- ~8sec + InnoDB ~8sec + slave- Automated Unplanned Failover time <8sec catch-up lag-catch-up lag-catch-up recovery time lag-catch-up @ Manual Unplanned Failover time N/A Minutes-Hours Minutes-Hours Minutes-Hours Minutes-Hours Minutes-Hours Checksum in replication logs Y N N Y N ? Replication Solution Built-in Built-in Built-in External External External Replicate to Non-MySQL Databases N^ N^ N^ Y N^ Y Point-in Time Recovery (PITR) Y Y Y Y Y Y Automated DB Instantiation Y N N Y N Y Incremental data movement Y Y Y Y Y Y Rolling upgrade Y N N Y N Y Automated Rolling migration Y N N ? N ? # Using Multi-master for MySQL (MMM), Flipper, etc. assuming 8sec for heartbeat retries @ Manual response to an alert and running trouble-shooting and replication commands/scripts ^ Possible to leverage complementing solution like GoldenGate © 2011 Schooner Information Technology, p28