MySQL HA with PaceMaker
    Kris Buytaert
   #opendbcamp
Kris Buytaert
●   I used to be a Dev, Then Became an Op,
●   Today I feel like a dev again
●   Senior Linux and Open Source Consultant @inuits.be
●   „Infrastructure Architect“
●   Building Clouds since before the Cloud
●   Surviving the 10th floor test
●   Co-Author of some books
●   Guest Editor at some sites
In this presentation
●   High Availability ?
●   MySQL HA Solutions
●   Linux HA / Pacemaker
What is HA Clustering ?

●   One service goes down
     => others take over its work
●   IP address takeover, service takeover,
●   Not designed for high-performance
●   Not designed for high throughput (load
    balancing)
Lies, Damn Lies, and Statistics
Counting nines (slide by Alan R)

 Availability    Downtime per year
 99.9999%        30 sec
 99.999%         5 min
 99.99%          52 min
 99.9%           9 hr
 99%             3.5 day
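The figures in the table can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming downtime is budgeted over a 365-day year; the function name is made up for illustration and is not part of any HA tool.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a non-leap year


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in seconds, for a given availability."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for pct in (99.9999, 99.999, 99.99, 99.9, 99.0):
        secs = downtime_seconds_per_year(pct)
        print(f"{pct}%: {secs:,.0f} s = {secs / 60:,.1f} min = {secs / 3600:,.2f} h")
```

The slide rounds these values, so the computed numbers come out slightly different from the table.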
The Rules of HA

●   Keep it Simple
●   Keep it Simple
●   Prepare for Failure
●   Complexity is the enemy of reliability
●   Test your HA setup
Eliminating the SPOF
●   Find out what Will Fail
    •   Disks
    •   Fans
    •   Power (Supplies)
●   Find out what Can Fail
    •   Network
    •   Going Out Of Memory
Data vs Connection
●   DATA :
    •   Replication
    •   Shared storage
    •   DRBD
●   Connection
    •   LVS
    •   Proxy
    •   Heartbeat / Pacemaker
Shared Storage
●   1 MySQL instance
●   Monitor MySQL node
●   Stonith
●   $$$              1+1 <> 2
●   Storage = SPOF
●   Split Brain :(
DRBD
●   Distributed Replicated Block Device
●   In the Linux Kernel
●   Usually only 1 mount
    •   Multi mount as of 8.X
        •   Requires GFS / OCFS2
●   Regular FS ext3 ...
●   Only 1 MySQL instance Active accessing data
●   Upon Failover MySQL needs to be started on
    other node
DRBD(2)
●   What happens when you pull the plug of a
    Physical machine ?
    •   Minimal Timeout
    •   Why did the crash happen ?
    •   Is my data still correct ?
    •   Innodb Consistency Checks ?
        •   Lengthy ?
        •   Check your BinLog size
Other Solutions Today

●   MySQL Cluster NDBD
●   Multi Master Replication
●   MySQL Proxy
●   MMM
●   Flipper
●   BYO
●   ....
Pulling Traffic
●   Eg. for Cluster, MultiMaster setups
    •   DNS
    •   Advanced Routing
    •   LVS


    •   Or the upcoming slides
Linux-HA PaceMaker
●   Plays well with others
●   Manages more than MySQL
●   ...v3 .. don't even think about the rest anymore
●   http://clusterlabs.org/
Heartbeat v1
•   Max 2 nodes
•   No fine-grained resources
•   Monitoring using “mon”

/etc/ha.d/ha.cf
/etc/ha.d/haresources
mdb-a.menos.asbucenter.dz ntc-restart-mysql mon IPaddr2::10.8.0.13/16/bond0 
    IPaddr2::10.16.0.13/16/bond0.16 mon



/etc/ha.d/authkeys
Heartbeat v2
 •   Stability issues
 •   Forking ?

“A consulting Opportunity” (LMB)
Clone Resource
Clones in v2 were buggy
Resources were started on 2 nodes
Stopped again on “1”
Heartbeat v3

•   No more /etc/ha.d/haresources
•   No more xml
•   Better integrated monitoring
•   /etc/ha.d/ha.cf has crm=yes
Pacemaker ?
●   Not a fork
●   Only CRM Code taken out of Heartbeat
●   As of Heartbeat 2.1.3
    •   Support for both OpenAIS / HeartBeat
    •   Different release cycle than Heartbeat
Heartbeat, OpenAIS, Corosync ?
●   All Messaging Layers
●   Initially only Heartbeat
●   OpenAIS
●   Heartbeat got unmaintained
●   OpenAIS had heisenbugs :(
●   Corosync
●   Heartbeat maintenance taken over by LinBit
●   CRM Detects which layer
Pacemaker Architecture
[Diagram: Pacemaker; Heartbeat or OpenAIS; Cluster Glue]
●   stonithd: The Heartbeat fencing subsystem.
●   lrmd: Local Resource Management Daemon. Interacts
    directly with resource agents (scripts).
●   pengine: Policy Engine. Computes the next state of the
    cluster based on the current state and the configuration.
●   cib: Cluster Information Base. Contains definitions of all
    cluster options, nodes, resources, their relationships to
    one another and current status. Synchronizes updates to
    all cluster nodes.
●   crmd: Cluster Resource Management Daemon. Largely
    a message broker for the PEngine and LRM, it also elects
    a leader to co-ordinate the activities of the cluster.
●   openais: messaging and membership layer.
●   heartbeat: messaging layer, an alternative to OpenAIS.
●   ccm: Short for Consensus Cluster Membership. The
    Heartbeat membership layer.
Configuring Heartbeat Correctly

heartbeat::hacf {"clustername":
    hosts   => ["host-a","host-b"],
    hb_nic  => ["bond0"],
    hostip1 => ["10.0.128.11"],
    hostip2 => ["10.0.128.12"],
    ping    => ["10.0.128.4"],
}

heartbeat::authkeys {"ClusterName":
    password => "ClusterName",
}

http://github.com/jtimberman/puppet/tree/master/heartbeat/
CRM
●   Cluster Resource Manager
●   Keeps Nodes in Sync
●   XML Based
●   cibadmin
●   Cli manageable
●   Crm

configure
property $id="cib-bootstrap-options" \
        stonith-enabled="FALSE" \
        no-quorum-policy=ignore \
        start-failure-is-fatal="FALSE"
rsc_defaults $id="rsc_defaults-options" \
        migration-threshold="1" \
        failure-timeout="1"
primitive d_mysql ocf:local:mysql \
        op monitor interval="30s" \
        params test_user="sure" test_passwd="illtell" test_table="test.table"
primitive ip_db ocf:heartbeat:IPaddr2 \
        params ip="172.17.4.202" nic="bond0" \
        op monitor interval="10s"
group svc_db d_mysql ip_db
commit
Heartbeat Resources
●   LSB
●   Heartbeat resource (+status)
●   OCF (Open Cluster FrameWork) (+monitor)
●   Clones (don't use in HAv2)
●   Multi State Resources
LSB Resource Agents
●   LSB == Linux Standards Base
●   LSB resource agents are standard System V-
    style init scripts commonly used on Linux and
    other UNIX-like OSes
●   LSB init scripts are stored under /etc/init.d/
●   This enables Linux-HA to immediately support
    nearly every service that comes with your
    system, and most packages which come with
    their own init script
●   It's straightforward to change an LSB script to
    an OCF script
OCF
●   OCF == Open Cluster Framework
●   OCF Resource agents are the most powerful type of
    resource agent we support
●   OCF RAs are extended init scripts
    • They have additional actions:
      • monitor – for monitoring resource health
      • meta-data – for providing information about the RA

●   OCF RAs are located in
    /usr/lib/ocf/resource.d/provider-name/
Monitoring
●   Defined in the OCF Resource script
●   Configured in the parameters
●   You have to support multiple states
    •   Not running
    •   Running
    •   Failed
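The three states map onto the exit codes that the OCF resource agent API defines for the monitor action. This is a minimal sketch of that mapping; the two boolean probes are hypothetical stand-ins for whatever checks a real agent would perform.

```python
# Exit codes defined by the OCF resource agent API.
OCF_SUCCESS = 0       # resource is running
OCF_ERR_GENERIC = 1   # resource has failed
OCF_NOT_RUNNING = 7   # resource is cleanly stopped


def monitor_exit_code(process_alive: bool, health_check_ok: bool) -> int:
    """Map two hypothetical probes onto the three monitor states."""
    if not process_alive:
        return OCF_NOT_RUNNING
    if not health_check_ok:
        return OCF_ERR_GENERIC
    return OCF_SUCCESS


if __name__ == "__main__":
    print(monitor_exit_code(process_alive=True, health_check_ok=False))
```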
Anatomy of a Cluster config

•   Cluster properties
•   Resource Defaults
•   Primitive Definitions
•   Resource Groups and Constraints
Cluster Properties

property $id="cib-bootstrap-options" \
     stonith-enabled="FALSE" \
     no-quorum-policy="ignore" \
     start-failure-is-fatal="FALSE"



no-quorum-policy="ignore": we ignore the loss of quorum, which a 2 node cluster cannot avoid when one node fails.

start-failure-is-fatal: when set to FALSE, the cluster will instead use the resource's failcount and the value of resource-failure-stickiness.
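Why quorum is a problem with exactly two nodes can be shown with a small calculation. This sketch assumes a plain majority rule (more than half of the configured nodes must be reachable); the function name is illustrative only.

```python
def has_quorum(total_nodes: int, live_nodes: int) -> bool:
    """Majority rule: strictly more than half of the nodes must be reachable."""
    return live_nodes * 2 > total_nodes


if __name__ == "__main__":
    # In a 2 node cluster the surviving node never holds a majority.
    print("2 nodes, 1 alive:", has_quorum(2, 1))
    # In a 3 node cluster one node can fail without losing quorum.
    print("3 nodes, 2 alive:", has_quorum(3, 2))
```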
Resource Defaults

rsc_defaults $id="rsc_defaults-options" \
     migration-threshold="1" \
     failure-timeout="1" \
     resource-stickiness="INFINITY"


failure-timeout is how long after a failure the resource has to wait before it can come back to the node on which it failed. A bare number is read as seconds, so a 60 second timeout is written failure-timeout="60s".

migration-threshold=1 means that after 1 failure the resource will try to start on the other node.

resource-stickiness=INFINITY means that the resource really wants to stay where it is now.
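How migration-threshold and failure-timeout interact can be illustrated with a toy model. This is a simplified sketch, not Pacemaker's actual implementation: the class, its method names, and the 60 second default are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FailureTracker:
    """Toy model of per-node failcounts; not Pacemaker's real implementation."""

    migration_threshold: int = 1
    failure_timeout: float = 60.0  # seconds; assumed value for illustration
    # node name -> (failcount, time of the most recent failure)
    failures: dict = field(default_factory=dict)

    def _expire(self, node: str, now: float) -> None:
        """Forget a node's failures once failure_timeout has passed."""
        if node in self.failures:
            _, last = self.failures[node]
            if now - last >= self.failure_timeout:
                del self.failures[node]

    def record_failure(self, node: str, now: float) -> None:
        """Increase the failcount of the resource on the given node."""
        self._expire(node, now)
        count, _ = self.failures.get(node, (0, now))
        self.failures[node] = (count + 1, now)

    def may_run_on(self, node: str, now: float) -> bool:
        """A node stays eligible while its failcount is below the threshold."""
        self._expire(node, now)
        count, _ = self.failures.get(node, (0, now))
        return count < self.migration_threshold


if __name__ == "__main__":
    tracker = FailureTracker(migration_threshold=1, failure_timeout=60)
    tracker.record_failure("db-a", now=0)
    print("db-a at t=30:", tracker.may_run_on("db-a", now=30))
    print("db-a at t=60:", tracker.may_run_on("db-a", now=60))
```

With a threshold of 1, a single failure moves the resource away, and the node becomes eligible again only once the timeout has expired.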
Primitive Definitions

primitive d_mine ocf:custom:tomcat \
     params instance_name="mine" \
     monitor_urls="health.html" \
     monitor_use_ssl="no" \
     op monitor interval="15s" \
     on-fail="restart"

primitive ip_mine_svc ocf:heartbeat:IPaddr2 \
     params ip="10.8.4.131" cidr_netmask="16" nic="bond0" \
     op monitor interval="10s"
Parsing a config
●   Isn't always done correctly
●   Even a verify won't find all issues
●   Unexpected behaviour might occur
Where a resource runs
•   multi state resources
    •  Master – Slave
       •   e.g. mysql master-slave, drbd
•   Clones
    •  Resources that can run on multiple nodes, e.g.
       •   Multimaster mysql servers
       •   Mysql slaves
       •   Stateless applications
•   location
    •  Preferred location to run resource, e.g. based on hostname
•   colocation
    •  Resources that have to live together
       •   e.g. ip address + service
•   order
    •  Define what resource has to start first, or wait for another resource
•   groups
    •  Colocation + order
eg. A Service on DRBD
●   DRBD can only be active on 1 node
●   The filesystem needs to be mounted on that
    active DRBD node

group svc_mine d_mine ip_mine

ms ms_drbd_storage drbd_storage \
     meta master_max="1" master_node_max="1" clone_max="2" clone_node_max="1" notify="true"

colocation fs_on_drbd inf: svc_mine ms_drbd_storage:Master

order fs_after_drbd inf: ms_drbd_storage:promote svc_mine:start

location cli-prefer-svc_db svc_db \
     rule $id="cli-prefer-rule-svc_db" inf: #uname eq db-a
A MySQL Resource
●   OCF
    •   Clone
        •   Where do you hook up the IP ?
    •   Multi State
        •   But we have Master Master replication
    •   Meta Resource
        •   Dummy resource that can monitor
            •   Connection
            •   Replication state
Simple 2 node example
primitive d_mysql ocf:ntc:mysql \
     op monitor interval="30s" \
     params test_user="just" test_passwd="kidding" test_table="really"

primitive ip_mysql_svc ocf:heartbeat:IPaddr2 \
     params ip="10.8.0.30" cidr_netmask="24" nic="bond0" \
     op monitor interval="10s"

group svc_mysql d_mysql ip_mysql_svc
Monitor your Setup
●   Not just connectivity
●   Also functional
    •   Query data
    •   Check resultset is correct
●   Check replication
    •   MaatKit
    •   OpenARK
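A functional check of the kind described above runs a real query and compares the result set with what is expected. This is a minimal sketch written against the generic Python DB-API; the function name and the table are hypothetical, and sqlite3 stands in for MySQL only so that the example runs without a server.

```python
import sqlite3


def functional_check(conn, query: str, expected_rows: list) -> bool:
    """Run a query over a DB-API 2.0 connection and compare the result set."""
    cur = conn.cursor()
    try:
        cur.execute(query)
        return list(cur.fetchall()) == expected_rows
    except Exception:
        # Any error (lost connection, missing table, ...) counts as a failed check.
        return False
    finally:
        cur.close()


if __name__ == "__main__":
    # sqlite3 is an assumption for the demo; a MySQL driver connection would be used in practice.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ha_check (id INTEGER, value TEXT)")
    conn.execute("INSERT INTO ha_check VALUES (1, 'ok')")
    print(functional_check(conn, "SELECT id, value FROM ha_check", [(1, "ok")]))
```

Treating every exception as a failed check keeps the monitor simple: the cluster only needs to know whether the database answered correctly, not why it did not.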
How to deal with replication state ?
●   Multiple slaves
    •   Use the DRBD OCF resource
●   2 masters only: use your own script
    •   Replication is slow on the active node
        •   Shouldn't happen; talk to HR / cfgmgmt people
    •   Replication is slow on the passive node
        •   Weight--
    •   Replication breaks on the active node
        •   Send out a warning, don't modify weights, and check the other node
    •   Replication breaks on the passive node
        •   Fence off the passive node
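The four cases for a two-master setup amount to a small decision table. This sketch only encodes the rules from the slide; the function and action names are invented for illustration and do not come from any existing resource agent.

```python
def replication_action(node_role: str, replication_state: str) -> str:
    """Pick an action for a node.

    node_role is 'active' or 'passive'; replication_state is 'slow' or 'broken'.
    """
    actions = {
        ("active", "slow"): "escalate_to_people",      # should not happen
        ("passive", "slow"): "lower_weight",
        ("active", "broken"): "warn_and_check_peer",   # weights stay unchanged
        ("passive", "broken"): "fence_passive_node",
    }
    try:
        return actions[(node_role, replication_state)]
    except KeyError:
        raise ValueError(f"unknown combination: {node_role!r}, {replication_state!r}")


if __name__ == "__main__":
    print(replication_action("passive", "broken"))
```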
Adding MySQL to the stack
[Diagram, top to bottom: Service IP MySQL; Resource MySQL (“MySQLd” on each node, with Replication between them); Cluster Stack (Pacemaker, HeartBeat); Hardware (Node A, Node B)]
Pitfalls & Solutions
●   Monitor:
    •   Replication state
    •   Replication Lag


●   MaatKit
●   OpenARK
Conclusion
●   Plenty of Alternatives
●   Think about your Data
●   Think about getting Queries to that Data
●   Complexity is the enemy of reliability
●   Keep it Simple
●   Monitor inside the DB
Contact
Kris Buytaert Kris.Buytaert@inuits.be

Further Reading
@KrisBuytaert
http://www.krisbuytaert.be/blog/
http://www.inuits.be/
http://www.virtualization.com/
http://www.oreillygmt.com/




Inuits
't Hemeltje
Gemeentepark 2
2930 Brasschaat
891.514.231
+32 473 441 636

Esquimaux
Kheops Business Center
Avenue Georges Lemaître 54
6041 Gosselies
889.780.406
+32 495 698 668

MySQL HA with Pacemaker

  • 1. MySQL HA with PaceMaker Kris Buytaert #opendbcamp
  • 2. Kris Buytaert ● I used to be a Dev, Then Became an Op, ● Today I feel like a dev again ● Senior Linux and Open Source Consultant @inuits.be ● „Infrastructure Architect“ ● Building Clouds since before the Cloud ● Surviving the 10th floor test ● Co-Author of some books ● Guest Editor at some sites
  • 3. In this presentation ● High Availability ? ● MySQL HA Solutions ● Linux HA / Pacemaker
  • 4. What is HA Clustering ? ● One service goes down => others take over its work ● IP address takeover, service takeover, ● Not designed for high-performance ● Not designed for high troughput (load balancing)
  • 5. Lies, Damn Lies, and Statistics Counting nines (slide by Alan R) 99.9999% 30 sec 99.999% 5 min 99.99% 52 min 99.9% 9  hr   99% 3.5 day
  • 6. The Rules of HA ● Keep it Simple ● Keep it Simple ● Prepare for Failure ● Complexity is the enemy of reliability ● Test your HA setup
  • 7. Eliminating the SPOF ● Find out what Will Fail • Disks • Fans • Power (Supplies) ● Find out what Can Fail • Network • Going Out Of Memory
  • 8. Data vs Connection ● DATA : • Replication • Shared storage • DRBD ● Connection • LVS • Proxy • Heartbeat / Pacemaker
  • 9. Shared Storage ● 1 MySQL instance ● Monitor MySQL node ● Stonith ● $$$ 1+1 <> 2 ● Storage = SPOF ● Split Brain :(
  • 10. DRBD ● Distributed Replicated Block Device ● In the Linux Kernel ● Usually only 1 mount • Multi mount as of 8.X • Requires GFS / OCFS2 ● Regular FS ext3 ... ● Only 1 MySQL instance Active accessing data ● Upon Failover MySQL needs to be started on other node
  • 11. DRBD(2) ● What happens when you pull the plug of a Physical machine ? • Minimal Timeout • Why did the crash happen ? • Is my data still correct ? • Innodb Consistency Checks ? • Lengthy ? • Check your BinLog size
  • 12. Other Solutions Today ● MySQL Cluster NDBD ● Multi Master Replication ● MySQL Proxy ● MMM ● Flipper ● BYO ● ....
  • 13. Pulling Traffic ● Eg. for Cluster, MultiMaster setups • DNS • Advanced Routing • LVS • Or the upcoming slides
  • 14. Linux-HA PaceMaker ● Plays well with others ● Manages more than MySQL ● ...v3 .. don't even think about the rest anymore ● http://clusterlabs.org/
  • 15. Heartbeat v1 • Max 2 nodes • No finegrained resources • Monitoring using “mon” /etc/ha.d/ha.cf /etc/ha.d/haresources mdb-a.menos.asbucenter.dz ntc-restart-mysql mon IPaddr2::10.8.0.13/16/bond0 IPaddr2::10.16.0.13/16/bond0.16 mon /etc/ha.d/authkeys
  • 16. Heartbeat v2 • Stability issues • Forking ? “A consulting Opportunity” LMB
  • 17. Clone Resource Clones in v2 were buggy Resources were started on 2 nodes Stopped again on “1”
  • 18. Heartbeat v3 • No more /etc/ha.d/haresources • No more xml • Better integrated monitoring • /etc/ha.d/ha.cf has • crm=yes
  • 19. Pacemaker ? ● Not a fork ● Only CRM Code taken out of Heartbeat ● As of Heartbeat 2.1.3 • Support for both OpenAIS / HeartBeat • Different Release Cycles than Heartbeat
  • 20. Heartbeat, OpenAis, Corosync ? ● All Messaging Layers ● Initially only Heartbeat ● OpenAIS ● Heartbeat got unmaintained ● OpenAIS had heisenbugs :( ● Corosync ● Heartbeat maintenance taken over by LinBit ● CRM Detects which layer
  • 21. [Diagram labels: Pacemaker; Heartbeat or OpenAIS; Cluster Glue]
  • 22. Pacemaker Architecture ● stonithd: The Heartbeat fencing subsystem. ● lrmd: Local Resource Management Daemon. Interacts directly with resource agents (scripts). ● pengine: Policy Engine. Computes the next state of the cluster based on the current state and the configuration. ● cib: Cluster Information Base. Contains definitions of all cluster options, nodes, resources, their relationships to one another and current status. Synchronizes updates to all cluster nodes. ● crmd: Cluster Resource Management Daemon. Largely a message broker for the PEngine and LRM, it also elects a leader to co-ordinate the activities of the cluster. ● openais: messaging and membership layer. ● heartbeat: messaging layer, an alternative to OpenAIS. ● ccm: Short for Consensus Cluster Membership. The Heartbeat membership layer.
  • 23. Configuring Heartbeat Correctly heartbeat::hacf {"clustername": hosts => ["host-a","host-b"], hb_nic => ["bond0"], hostip1 => ["10.0.128.11"], hostip2 => ["10.0.128.12"], ping => ["10.0.128.4"], } heartbeat::authkeys {"ClusterName": password => "ClusterName", } http://github.com/jtimberman/puppet/tree/master/heartbeat/
  • 24. CRM ● Cluster Resource Manager ● Keeps Nodes in Sync ● XML Based ● cibadmin ● Cli manageable ● Crm. Example: configure property $id="cib-bootstrap-options" stonith-enabled="FALSE" no-quorum-policy=ignore start-failure-is-fatal="FALSE" rsc_defaults $id="rsc_defaults-options" migration-threshold="1" failure-timeout="1" primitive d_mysql ocf:local:mysql op monitor interval="30s" params test_user="sure" test_passwd="illtell" test_table="test.table" primitive ip_db ocf:heartbeat:IPaddr2 params ip="172.17.4.202" nic="bond0" op monitor interval="10s" group svc_db d_mysql ip_db commit
  • 25. Heartbeat Resources ● LSB ● Heartbeat resource (+status) ● OCF (Open Cluster FrameWork) (+monitor) ● Clones (don't use in HAv2) ● Multi State Resources
  • 26. LSB Resource Agents ● LSB == Linux Standards Base ● LSB resource agents are standard System V- style init scripts commonly used on Linux and other UNIX-like OSes ● LSB init scripts are stored under /etc/init.d/ ● This enables Linux-HA to immediately support nearly every service that comes with your system, and most packages which come with their own init script ● It's straightforward to change an LSB script to an OCF script
  • 27. OCF ● OCF == Open Cluster Framework ● OCF Resource agents are the most powerful type of resource agent we support ● OCF RAs are extended init scripts • They have additional actions: • monitor – for monitoring resource health • meta-data – for providing information about the RA ● OCF RAs are located in /usr/lib/ocf/resource.d/provider-name/
  • 28. Monitoring ● Defined in the OCF Resource script ● Configured in the parameters ● You have to support multiple states • Not running • Running • Failed
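The three states above map onto the exit codes an OCF resource agent's monitor action returns. The following is a minimal sketch in Python for illustration only (real agents are usually shell scripts); the exit code values 0, 1 and 7 are the standard OCF ones, while the `monitor` function and its two boolean inputs are hypothetical.

```python
from enum import IntEnum


class OcfExit(IntEnum):
    """Subset of the standard OCF exit codes relevant to monitoring."""
    SUCCESS = 0        # resource is running and healthy
    ERR_GENERIC = 1    # resource is running but failed its check
    NOT_RUNNING = 7    # resource is cleanly stopped


def monitor(process_alive: bool, health_check_ok: bool) -> OcfExit:
    """Map the observed state of a service onto a monitor exit code.

    Both arguments are hypothetical inputs: how they are obtained
    (pid file, test query, ...) depends on the resource agent.
    """
    if not process_alive:
        return OcfExit.NOT_RUNNING
    if not health_check_ok:
        return OcfExit.ERR_GENERIC
    return OcfExit.SUCCESS
```

Distinguishing "not running" from "failed" matters because the cluster treats a clean stop differently from a failure that counts against the node.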
  • 29. Anatomy of a Cluster config • Cluster properties • Resource Defaults • Primitive Definitions • Resource Groups and Constraints
  • 30. Cluster Properties property $id="cib-bootstrap-options" stonith-enabled="FALSE" no-quorum-policy="ignore" start-failure-is-fatal="FALSE" no-quorum-policy: we'll ignore the loss of quorum on a 2 node cluster. start-failure-is-fatal: when set to FALSE, the cluster will instead use the resource's failcount and the value of resource-failure-stickiness.
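Why quorum loss has to be ignored on a 2 node cluster follows from the majority rule. This is a minimal sketch assuming one vote per node and a strict-majority definition of quorum; `has_quorum` is an illustrative name, not a Pacemaker API.

```python
def has_quorum(votes_present: int, total_votes: int) -> bool:
    """Strict majority: more than half of all votes must be present."""
    return 2 * votes_present > total_votes


if __name__ == "__main__":
    # Two nodes, one survives: 1 of 2 votes is not a majority.
    print("2-node cluster, 1 node left:", has_quorum(1, 2))
    # Three nodes, one fails: 2 of 3 votes is still a majority.
    print("3-node cluster, 2 nodes left:", has_quorum(2, 3))
```

With two nodes the survivor can never hold a majority, so without `no-quorum-policy="ignore"` it would stop its resources exactly when it is needed.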
  • 31. Resource Defaults rsc_defaults $id="rsc_defaults-options" migration-threshold="1" failure-timeout="1" resource-stickiness="INFINITY" failure-timeout means that after a failure there will be a 60 second timeout before the resource can come back to the node on which it failed. migration-threshold=1 means that after 1 failure the resource will try to start on the other node. resource-stickiness=INFINITY means that the resource really wants to stay where it is now.
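The interplay of migration-threshold and failure-timeout can be pictured with a simplified model. The sketch below assumes that the failcount is forgotten once the most recent failure is older than failure-timeout, and it ignores details such as when the cluster actually re-evaluates expiry; the function names are hypothetical.

```python
from typing import Sequence


def failcount(failure_times: Sequence[float], now: float,
              failure_timeout: float) -> int:
    """Number of failures still counted against a node (simplified model)."""
    if not failure_times:
        return 0
    # Assumption: all failures expire once the latest one is old enough.
    if now - max(failure_times) >= failure_timeout:
        return 0
    return len(failure_times)


def node_allowed(failure_times: Sequence[float], now: float,
                 migration_threshold: int, failure_timeout: float) -> bool:
    """A node may run the resource while its failcount is below the threshold."""
    return failcount(failure_times, now, failure_timeout) < migration_threshold
```

For example, with a threshold of 1 and a 60 second timeout, a node that failed at t=100 is excluded at t=130 and eligible again at t=160.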
  • 32. Primitive Definitions primitive d_mine ocf:custom:tomcat params instance_name="mine" monitor_urls="health.html" monitor_use_ssl="no" op monitor interval="15s" on-fail="restart" primitive ip_mine_svc ocf:heartbeat:IPaddr2 params ip="10.8.4.131" cidr_netmask="16" nic="bond0" op monitor interval="10s"
  • 33. Parsing a config ● Isn't always done correctly ● Even a verify won't find all issues ● Unexpected behaviour might occur
  • 34. Where a resource runs • multi state resources • Master / Slave, e.g. mysql master-slave, drbd • Clones • Resources that can run on multiple nodes, e.g. • Multimaster mysql servers • Mysql slaves • Stateless applications • location • Preferred location to run resource, e.g. based on hostname • colocation • Resources that have to live together • e.g. ip address + service • order • Define what resource has to start first, or wait for another resource • groups • Colocation + order
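The last bullet, groups being colocation plus order, can be made concrete by expanding a group into the pairwise constraints it implies. This is a conceptual sketch, not how Pacemaker stores groups internally; `expand_group` and the tuple layout are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]


def expand_group(members: Sequence[str]) -> Tuple[List[Pair], List[Pair]]:
    """Return (order, colocation) constraints implied by a group.

    order:      (first, then)     -> 'first' starts before 'then'
    colocation: (resource, with_) -> 'resource' runs where 'with_' runs
    """
    neighbours = list(zip(members, members[1:]))
    order = [(first, then) for first, then in neighbours]
    colocation = [(then, first) for first, then in neighbours]
    return order, colocation


if __name__ == "__main__":
    order, colocation = expand_group(["d_mysql", "ip_mysql_svc"])
    print("order:", order)
    print("colocation:", colocation)
```

Each member is ordered after, and colocated with, the member listed before it, which is why the order of names in a `group` line matters.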
  • 35. eg. A Service on DRBD ● DRBD can only be active on 1 node ● The filesystem needs to be mounted on that active DRBD node group svc_mine d_mine ip_mine ms ms_drbd_storage drbd_storage meta master_max="1" master_node_max="1" clone_max="2" clone_node_max="1" notify="true" colocation fs_on_drbd inf: svc_mine ms_drbd_storage:Master order fs_after_drbd inf: ms_drbd_storage:promote svc_mine:start location cli-prefer-svc_db svc_db rule $id="cli-prefer-rule-svc_db" inf: #uname eq db-a
  • 36. A MySQL Resource ● OCF • Clone • Where do you hook up the IP ? • Multi State • But we have Master Master replication • Meta Resource • Dummy resource that can monitor • Connection • Replication state
  • 37. Simple 2 node example primitive d_mysql ocf:ntc:mysql op monitor interval="30s" params test_user="just" test_passwd="kidding" test_table="really" primitive ip_mysql_svc ocf:heartbeat:IPaddr2 params ip="10.8.0.30" cidr_netmask="24" nic="bond0" op monitor interval="10s" group svc_mysql d_mysql ip_mysql_svc
  • 38. Monitor your Setup ● Not just connectivity ● Also functional • Query data • Check resultset is correct ● Check replication • MaatKit • OpenARK
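A functional check of the kind described here goes beyond "can I connect" and compares a query result with what is expected. The sketch below deliberately takes the query function as a parameter instead of using a specific MySQL driver; `functional_check`, `run_query` and the sample query are hypothetical names for illustration.

```python
from typing import Callable, Sequence, Tuple

Row = Tuple[object, ...]


def functional_check(run_query: Callable[[str], Sequence[Row]],
                     query: str,
                     expected: Sequence[Row]) -> bool:
    """Return True only if the query runs and yields the expected rows."""
    try:
        rows = run_query(query)
    except Exception:
        # Any error (connection refused, missing table, ...) is a failure.
        return False
    return list(rows) == list(expected)


if __name__ == "__main__":
    # Stand-in for a real database call, used only to exercise the check.
    def fake_query(sql: str) -> Sequence[Row]:
        return [(1,)]

    print(functional_check(fake_query, "SELECT 1", [(1,)]))
```

In a resource agent, a failed check of this kind would be reported as a failed monitor rather than as "not running".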
  • 39. How to deal with replication state ? ● Multiple slaves • Use Drbd ocf resource ● 2 masters only: use own script • Replication is slow on the active node • Shouldn't happen, talk to HR / cfgmgmt people • Replication is slow on the passive node • Weight-- • Replication breaks on the active node • Send out warning, don't modify weights and check other node • Replication breaks on the passive node • Fence off the passive node
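The four master-master cases on this slide amount to a small decision table. A minimal sketch, assuming the custom script already knows the node's role and the kind of replication problem; the enum names and action strings are illustrative and only paraphrase the slide.

```python
from enum import Enum


class Role(Enum):
    ACTIVE = "active"
    PASSIVE = "passive"


class Problem(Enum):
    SLOW = "replication is slow"
    BROKEN = "replication is broken"


# Decision table paraphrasing the slide; action strings are illustrative.
ACTIONS = {
    (Role.ACTIVE, Problem.SLOW): "escalate to people (should not happen)",
    (Role.PASSIVE, Problem.SLOW): "decrease weight of this node",
    (Role.ACTIVE, Problem.BROKEN): "warn, keep weights, check other node",
    (Role.PASSIVE, Problem.BROKEN): "fence this node",
}


def decide(role: Role, problem: Problem) -> str:
    """Look up the reaction for a node role and replication problem."""
    return ACTIONS[(role, problem)]


if __name__ == "__main__":
    for (role, problem), action in ACTIONS.items():
        print(f"{role.value:8} | {problem.value:22} | {action}")
```

Keeping the policy in one table makes it easy to review that the active node is never fenced or down-weighted because of a replication problem alone.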
  • 40. Adding MySQL to the stack [Diagram labels: Replication; Service IP; MySQL; "MySQLd" (twice); Resource; MySQL; Cluster Stack; Pacemaker; HeartBeat; Node A; Node B; Hardware]
  • 41. Pitfalls & Solutions ● Monitor, • Replication state • Replication Lag ● MaatKit ● OpenARK
  • 42. Conclusion ● Plenty of Alternatives ● Think about your Data ● Think about getting Queries to that Data ● Complexity is the enemy of reliability ● Keep it Simple ● Monitor inside the DB
  • 43. Contact ● Kris Buytaert, Kris.Buytaert@inuits.be ● Further Reading: @KrisBuytaert http://www.krisbuytaert.be/blog/ http://www.inuits.be/ http://www.virtualization.com/ http://www.oreillygmt.com/ ● Inuits: 't Hemeltje, Gemeentepark 2, 2930 Brasschaat, 891.514.231, +32 473 441 636 ● Esquimaux: Kheops Business Center, Avenue Georges Lemaître 54, 6041 Gosselies, 889.780.406, +32 495 698 668