SlideShare uma empresa Scribd logo
1 de 259
Baixar para ler offline
#Fosdem 2014 #MySQL & Friends
#devroom

15 Tips to improve your
Galera Cluster Experience
Who am I ?
Frédéric Descamps “lefred”
@lefred
http://about.me/lefred
Percona Consultant since 2011
Managing MySQL since 3.23
devops believer
I installed my first galera cluster in feb 2010
2/5/14
Who am I ?
Frédéric Descamps “lefred”
@lefred
http://about.me/lefred
Percona Consultant since 2011
Managing MySQL since 3.23
devops believer
I installed my first galera cluster in feb 2010
2/5/14
Ready for countdown ?
15
How to perform point in time recovery ?
Binary log must be enabled
log­slave­updates should be enabled

2/5/14
The environment

writes

writes

2/5/14

writes
Suddenly !

2/5/14
Suddenly !
Oups ! Dim0 truncated a production table...
:-S
We can have 2 scenarios :
–
–

2/5/14

The application can keep running even without that table
The application musts be stopped !
Suddenly !
Oups ! Dim0 truncated a production table...
:-S
We can have 2 scenarios :
–
–

2/5/14

The application can keep running even without that table
The application musts be stopped !
Suddenly !
Oups ! Dim0 truncated a production table...
:-S
We can have 2 scenarios :
–
–

2/5/14

The application can keep running even without that table
The application musts be stopped !
Suddenly !
Oups ! Dim0 truncated a production table...
:-S
We can have 2 scenarios :
–
–

2/5/14

The application can keep running even without that table
The application musts be stopped !
Scenario 1: application must be stopped !

2/5/14
Scenario 1: application must be stopped !
We have Xtrabackup (and it creates daily backups!)
We have binary logs
These are the steps :
– Stop the each node of the cluster

2/5/14
Scenario 1: application must be stopped !
We have Xtrabackup (and it creates daily backups!)
We have binary logs
These are the steps :
– Stop the each node of the cluster

2/5/14
Scenario 1: application must be stopped !
We have Xtrabackup (and it creates daily backups!)
We have binary logs
These are the steps :
– Stop the each node of the cluster

2/5/14
Scenario 1: application must be stopped !
We have Xtrabackup (and it creates daily backups!)
We have binary logs
These are the steps :
– Stop the each node of the cluster

2/5/14
Scenario 1: application must be stopped !
We have Xtrabackup (and it creates daily backups!)
We have binary logs
These are the steps :
– Stop the each node of the cluster
/etc/init.d/mysql stop
or
service mysql stop

2/5/14
Scenario 1: application must be stopped !
We have Xtrabackup (and it creates daily backups!)
We have binary logs
These are the steps :
– Stop the each node of the cluster
– Find the binlog file and position before “the
event” happened

2/5/14
Scenario 1: application must be stopped !
We have Xtrabackup (and it creates daily backups!)
We have binary logs
These are the steps :
– Stop the each node of the cluster
– Find the binlog file and position before “the
event” happened
# mysqlbinlog binlog.000001 | grep truncate ­B 2
#140123 23:37:03 server id 1  end_log_pos 1224 
Query thread_id=4 exec_time=0 error_code=0
SET TIMESTAMP=1390516623/*!*/;
truncate table speakers
2/5/14
Scenario 1: application must be stopped !
We have Xtrabackup (and it creates daily backups!)
We have binary logs
These are the steps :
– Stop the each node of the cluster
– Find the binlog file and position before “the
event” happened
– Restore the backup on one node

2/5/14
Scenario 1: application must be stopped !
We have Xtrabackup (and it creates daily backups!)

# cp binlog.00001 ~
# innobackupex ­­apply­log .  
We
etc..have binary logs

These are the steps :
– Stop the each node of the cluster
– Find the binlog file and position before “the
event” happened
– Restore the backup on one node

2/5/14
Scenario 1: application must be stopped !
We have Xtrabackup (and it creates daily backups!)
# /etc/init.d/mysql bootstrap­pxc
We have binary logs

These are the steps :
– Stop the each node of the cluster
– Find the binlog file and position before “the
event” happened
– Restore the backup on one node
– Restart that node (being sure the application doesn't
connect to it)

2/5/14
Scenario 1: application must be stopped ! (2)

2/5/14
Scenario 1: application must be stopped ! (2)
–

2/5/14

Replay all the binary logs since the backup BUT
the position of the event
Scenario 1: application must be stopped ! (2)
–

Replay all the binary logs since the backup BUT
the position of the event

# cat xtrabackup_binlog_info
Binlog.000001   565
# mysqlbinlog binlog.000001 | grep end_log_pos | 
grep 1224 ­B 1
#140123 23:36:53 server id 1  end_log_pos 1136 
#140123 23:37:03 server id 1  end_log_pos 1224 
# mysqlbinlog binlog.000001 ­j 565  
­­stop­position 1136 | mysql

2/5/14
Scenario 1: application must be stopped ! (2)

2/5/14
Scenario 1: application must be stopped ! (2)
–
–
–

2/5/14

Replay all the binary logs since the backup BUT
the position of the event
Start other nodes 1 by 1 and let them perform
SST
Enable connections from the application
Scenario 1: application must be stopped ! (2)
–
–
–

2/5/14

Replay all the binary logs since the backup BUT
the position of the event
Start other nodes 1 by 1 and let them perform
SST
Enable connections from the application
Scenario 1: application must be stopped ! (2)
–
–
–

2/5/14

Replay all the binary logs since the backup BUT
the position of the event
Start other nodes 1 by 1 and let them perform
SST
Enable connections from the application
Scenario 2: application can keep running

2/5/14
Scenario 2: application can keep running
We have Xtrabackup (and it creates daily backups!)
We have binary logs
These are the steps :
– Take care of quorum (add garbd, change pc.weight,
pc.ignore_quorum)

–
–

2/5/14

Find the binlog file and position before “the
event” happened (thank you dim0!)
Remove one node from the cluster (and be sure the
app doesn't connect to it, load-balancer...)
Scenario 2: application can keep running
We have Xtrabackup (and it creates daily backups!)
We have binary logs
These are the steps :
– Take care of quorum (add garbd, change pc.weight,
pc.ignore_quorum)

–
–

2/5/14

Find the binlog file and position before “the
event” happened (thank you dim0!)
Remove one node from the cluster (and be sure the
app doesn't connect to it, load-balancer...)
Scenario 2: application can keep running
We have Xtrabackup (and it creates daily backups!)
We have binary logs
These are the steps :
– Take care of quorum (add garbd, change pc.weight,
pc.ignore_quorum)

–
–

2/5/14

Find the binlog file and position before “the
event” happened (thank you dim0!)
Remove one node from the cluster (and be sure the
app doesn't connect to it, load-balancer...)
Scenario 2: application can keep running
We have Xtrabackup (and it creates daily backups!)
We have binary logs
These are the steps :
– Take care of quorum (add garbd, change pc.weight,
pc.ignore_quorum)

–
–

2/5/14

Find the binlog file and position before “the
event” happened (thank you dim0!)
Remove one node from the cluster (and be sure the
app doesn't connect to it, load-balancer...)
Scenario 2: application can keep running (2)

2/5/14
Scenario 2: application can keep running (2)
–
–
–
–
–
–

2/5/14

Restore the backup on the node we stopped
Start mysql without joining the cluster (--wsrepcluster-address=dummy://)

Replay the binary log until the position of “the
event”
Export the table we need (mysqldump)
Import it on the cluster
Restart mysql on the off-line node and let it
perform SST
Scenario 2: application can keep running (2)
–
–
–
–
–
–

2/5/14

Restore the backup on the node we stopped
Start mysql without joining the cluster (--wsrepcluster-address=dummy://)

Replay the binary log until the position of “the
event”
Export the table we need (mysqldump)
Import it on the cluster
Restart mysql on the off-line node and let it
perform SST
Scenario 2: application can keep running (2)
–
–
–
–
–
–

2/5/14

Restore the backup on the node we stopped
Start mysql without joining the cluster (--wsrepcluster-address=dummy://)

Replay the binary log until the position of “the
event”
Export the table we need (mysqldump)
Import it on the cluster
Restart mysql on the off-line node and let it
perform SST
Scenario 2: application can keep running (2)
–
–
–
–
–
–

2/5/14

Restore the backup on the node we stopped
Start mysql without joining the cluster (--wsrepcluster-address=dummy://)

Replay the binary log until the position of “the
event”
Export the table we need (mysqldump)
Import it on the cluster
Restart mysql on the off-line node and let it
perform SST
Scenario 2: application can keep running (2)
–
–
–
–
–
–

2/5/14

Restore the backup on the node we stopped
Start mysql without joining the cluster (--wsrepcluster-address=dummy://)

Replay the binary log until the position of “the
event”
Export the table we need (mysqldump)
Import it on the cluster
Restart mysql on the off-line node and let it
perform SST
Scenario 2: application can keep running (2)
–
–
–
–
–
–

2/5/14

Restore the backup on the node we stopped
Start mysql without joining the cluster (--wsrepcluster-address=dummy://)

Replay the binary log until the position of “the
event”
Export the table we need (mysqldump)
Import it on the cluster
Restart mysql on the off-line node and let it
perform SST
14
Reduce “donation” time during XtraBackup SST
When performing SST with Xtrabackup the
donor can still be active
by default this is disabled in clustercheck
(AVAILABLE_WHEN_DONOR=0)

Running Xtrabackup can increase the load
(CPU / IO) on the server

2/5/14
Reduce “donation” time during XtraBackup SST (2)

Using Xtrabackup 2.1 features helps to reduce
the time of backup on the donor
[mysqld]

wsrep_sst_method=xtrabackup­v2
wsrep_sst_auth=root:dim0DidItAgain
[sst]
streamfmt=xbstream
[xtrabackup]
compress
compact
parallel=8
compress­threads=8
2/5/14

rebuild­threads=8

compress & compact can reduce the size
of payload transferred among nodes
but in general it slows down the process
13
Move asynchronous slave to a new master in 5.5
binlog.000039
102

writes

B

A
binlog.000002
102

as
yn
c

writes

master_host=A
master_log_file=binlog.000002
master_pos=102
2/5/14

C
writes

binlog.000001
402
Move asynchronous slave to a new master in 5.5
(2)
binlog.000039
102

writes

B

writes

as
yn
c

C

master_host=B
master_log_file=binlog.000002
master_pos=102
2/5/14

writes

binlog.000001
402
Move asynchronous slave to a new master in 5.5
(2)
binlog.000039
102

writes

B

writes

as
yn
c

C

master_host=B
master_log_file=binlog.000002
master_pos=102
2/5/14

writes

binlog.000001
402
Move asynchronous slave to a new master in
5.5 (3)

2/5/14
Move asynchronous slave to a new master in
5.5 (3)
How can we know which file and position
need to be used by the async slave ?
Find the last received Xid in the relay log on
the async slave (using mysqlbinlog)

2/5/14
Move asynchronous slave to a new master in
5.5 (3)
How can we know which file and position
need to be used by the async slave ?
Find the last received Xid in the relay log on
the async slave (using mysqlbinlog)

2/5/14
Move asynchronous slave to a new master in
5.5 (3)
How can we know which file and position
need to be used by the async slave ?
Find the last received Xid in the relay log on
the async slave (using mysqlbinlog)
# mysqlbinlog  percona4­relay­bin.000002 | tail
MjM5NDMxMDMxOTEtNTI4NzYxMTUxMDctMTM3NTAyNTI2NjUtNTc1ODY3MTc0MTg=
'/*!*/;
# at 14611057
#140131 12:48:12 server id 1  end_log_pos 29105924  Xid = 30097
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
2/5/14
Move asynchronous slave to a new master in
5.5 (3)

2/5/14
Move asynchronous slave to a new master in
5.5 (3)
How can we know which file and position need
to be used by the async slave ?
Find the last received Xid in the relay log on the
async slave (using mysqlbinlog)
Find in the new master which binary position
matches that same Xid
Use the binary log file and the position for your
CHANGE MASTER statement
2/5/14
Move asynchronous slave to a new master in
5.5 (3)
How can we know which file and position need
to be used by the async slave ?
Find the last received Xid in the relay log on the
async slave (using mysqlbinlog)
Find in the new master which binary position
matches that same Xid
Use the binary log file and the position for your
CHANGE MASTER statement
2/5/14
Move asynchronous slave to a new master in
5.5 (3)
How can we know which file and position need
to be used by the async slave ?
# mysqlbinlog percona3­bin.000004 | grep 'Xid = 30097'
#140131 12:48:12 server id 1  end_log_pos 28911093 

Xid = 30097

Find the last received Xid in the relay log on the
async slave (using mysqlbinlog)
Find in the new master which binary position
matches that same Xid
Use the binary log file and the position for your
CHANGE MASTER statement

2/5/14
Move asynchronous slave to a new master in
5.5 (3)
How can we know which file and position need
to be used by the async slave ?
Find the last received Xid in the relay log on the
async slave (using mysqlbinlog)
Find in the new master which binary position
matches that same Xid
Use the binary log file and the position for your
CHANGE MASTER statement
2/5/14
Move asynchronous slave to a new master in
5.5 (3)
How can we know which file and position need
to be used by the async slave ?

Async mysql> slave stop;

Async mysql> change master to master_host='percona3', 
          ­> master_log_file='percona3­bin.000004',
          ­> master_log_pos=28911093;

Find the last received Xid in the relay log on the
async slave (using mysqlbinlog)
Async mysql> start slave;
Find in the new master which binary position
matches that same Xid
Use the binary log file and the position for your
CHANGE MASTER statement
2/5/14
12
Move asynchronous slave to a new master in 5.6

2/5/14
Move asynchronous slave to a new master in 5.6
With 5.6 and GTID it's easier !
... but ...
It requires rsync SST (binlogs are needed)
Or since Jan 30th
wsrep_sst_xtrabackup­v2 supports
Xtrabackup 2.1.7 that makes is possible !!!
Just change master ;-)
2/5/14
Move asynchronous slave to a new master in 5.6
With 5.6 and GTID it's easier !
... but ...
It requires rsync SST (binlogs are needed)
Or since Jan 30th
wsrep_sst_xtrabackup­v2 supports
Xtrabackup 2.1.7 that makes is possible !!!
Just change master ;-)
2/5/14
Move asynchronous slave to a new master in 5.6
With 5.6 and GTID it's easier !
... but ...
It requires rsync SST (binlogs are needed)
Or since Jan 30th
wsrep_sst_xtrabackup­v2 supports
Xtrabackup 2.1.7 that makes is possible !!!
Just change master ;-)
2/5/14
Move asynchronous slave to a new master in 5.6
With 5.6 and GTID it's easier !
... but ...
It requires rsync SST (binlogs are needed)
Or since Jan 30th
wsrep_sst_xtrabackup­v2 supports
Xtrabackup 2.1.7 that makes is possible !!!
Just change master ;-)
2/5/14
Move asynchronous slave to a new master in 5.6
With 5.6 and GTID it's easier !
... but ...
It requires rsync SST (binlogs are needed)
Or since Jan 30th
wsrep_sst_xtrabackup­v2 supports
Xtrabackup 2.1.7 that makes is possible !!!
Just change master ;-)
2/5/14
11
Allow longer downtime for a node

2/5/14
Allow longer downtime for a node
When a node goes off-line, when it joins again
the cluster, it sends its last replicated event to
the donor
If the donor can send all next events, IST will
be performed (very fast)
If not... SST is mandatory

2/5/14
Allow longer downtime for a node
When a node goes off-line, when it joins again
the cluster, it sends its last replicated event to
the donor
If the donor can send all next events, IST will
be performed (very fast)
If not... SST is mandatory

2/5/14
Allow longer downtime for a node
When a node goes off-line, when it joins again
the cluster, it sends its last replicated event to
the donor
If the donor can send all next events, IST will
be performed (very fast)
If not... SST is mandatory

2/5/14
Allow longer downtime for a node (2)

2/5/14
Allow longer downtime for a node (2)
Those events are stored on a cache on disk:
galera.cache
The size of the cache is 128Mb by default
It can be increased using gcache.size provider
option:

2/5/14
Allow longer downtime for a node (2)
Those events are stored on a cache on disk:
galera.cache
The size of the cache is 128Mb by default
It can be increased using gcache.size provider
option:

2/5/14
Allow longer downtime for a node (2)
Those events are stored on a cache on disk:
galera.cache
The size of the cache is 128Mb by default
It can be increased using gcache.size provider
option:

2/5/14
Allow longer downtime for a node (2)
Those events are stored on a cache on disk:
galera.cache
The size of the cache is 128Mb by default
It can be increased using gcache.size provider
option:
In /etc/my.cnf:
wsrep_provider_options = “gcache.size=1G”
2/5/14
10
Then choose the right donor !
Let's imagine this:
Event 1

B

C

A
writes

2/5/14

Event 1

Event 1
Then choose the right donor ! (2)
Let's imagine this:
Event 1
Event 2

B
C
A
writes

2/5/14

Event 1
Event 1
Event 2
Then choose the right donor ! (3)
Let's imagine this:
Event 1
Event 2

Join:
last event = 1

B
C

A
writes

2/5/14

Event 1
Event 2

Event 1
Then choose the right donor ! (4)
Let's imagine this:
Event 1
Event 2

IST

B
C

A
writes

2/5/14

Event 1
Event 2

Event 1
Then choose the right donor ! (5)
Let's imagine this:
Event 1
Event 2

B

C

A
writes

2/5/14

Event 1
Event 2

Event 1
Event 2
Then choose the right donor ! (6)
Let's imagine this:
Event 1
Event 2

Let's format
the disk

B

C
A
writes

2/5/14

Event 1
Event 2
Then choose the right donor ! (7)
Let's imagine this:
Event 1
Event 2

Join:
no cluster info

B

C
A
writes

2/5/14

Event 1
Event 2
Then choose the right donor ! (8)
Full SST needed
Event 1
Event 2

SST

B

C
A
writes

2/5/14

Event 1
Event 2
Then choose the right donor ! (9)
This is what we have now:
Event 1
Event 2
Event 3

B

C

A
writes

2/5/14

Event 1
Event 2
Event 3

Event 3
Then choose the right donor ! (10)
Let's remove node B for maintenance
Event 1
Event 2
Event 3

B

C

A
writes

2/5/14

Event 1
Event 2
Event 3
Event 4

Event 3
Event 4
Then choose the right donor ! (11)
Now let's remove node C to replace a disk :-(
Event 1
Event 2
Event 3

B

C

A
writes

2/5/14

Event 1
Event 2
Event 3
Event 4
Event 5
Then choose the right donor ! (12)
Node C joins again and performs SST
Event 1
Event 2
Event 3

B

C

A
writes

2/5/14

Event 1
Event 2
Event 3
Event 4
Event 5
Then choose the right donor ! (12)
Node C joins again and performs SST
Event 1
Event 2
Event 3

B

C

A
writes

2/5/14

Event 1
Event 2
Event 3
Event 4
Event 5
Event 6

Event 6
Then choose the right donor ! (13)
Node B joins again but donor selection is not
clever yet...
Event 1
Event 2
Event 3

Join:
last event = 3

B

C

A
writes

2/5/14

Event 1
Event 2
Event 3
Event 4
Event 5
Event 6

Event 6
Then choose the right donor ! (13)
Node B joins again but donor selection is not
clever yet...
Event 1
Event 2
Event 3

Join:
last event = 3

B

SST will be needed !
C

A
writes

2/5/14

Event 1
Event 2
Event 3
Event 4
Event 5
Event 6

Event 6
Then choose the right donor ! (14)
So how to tell node B that it needs to use
node A?
Event 1
Event 2
Event 3

Join:
last event = 3

B

# /etc/init.d/mysql start ­­wsrep­sst_donor=nodeA
C

A
writes

2/5/14

Event 1
Event 2
Event 3
Event 4
Event 5
Event 6

Event 6
Then choose the right donor ! (15)

2/5/14
Then choose the right donor ! (15)
With 5.6 you have now the possibility to know the
lowest sequence number in gcache using
wsrep_local_cached_downto
To know the latest event's sequence number on
the node that joins the cluster, you have two
possibilities:

2/5/14
Then choose the right donor ! (15)
With 5.6 you have now the possibility to know the
lowest sequence number in gcache using
wsrep_local_cached_downto
To know the latest event's sequence number on
the node that joins the cluster, you have two
possibilities:

2/5/14
Then choose the right donor ! (15)
With 5.6 you have now the possibility to know the
lowest sequence number in gcache using
wsrep_local_cached_downto
To know the latest event's sequence number on
the node that joins the cluster, you have two
possibilities:
# cat grasdate.dat
# GALERA saved state
version: 2.1
uuid:    41920174­7ec6­11e3­a05a­6a2ab4033f05
seqno:   11
cert_index:

2/5/14
Then choose the right donor ! (15)
With 5.6 you have now the possibility to know the
lowest sequence number in gcache using
wsrep_local_cached_downto
To know the latest event's sequence number on
the node that joins the cluster, you have two
possibilities:
# cat grasdate.dat
# GALERA saved state
version: 2.1
uuid:    41920174­7ec6­11e3­a05a­6a2ab4033f05
seqno:   11
cert_index:

2/5/14
Then choose the right donor ! (15)
With 5.6 you have now the possibility to know the
lowest sequence number in gcache using
wsrep_local_cached_downto
To know the latest event's sequence number on
the node that joins the cluster, you have two
possibilities:
# mysqld_safe ­­wsrep­recover
140124 10:46:32 mysqld_safe Logging to '/var/lib/mysql/percona1_error.log'.
140124 10:46:32 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140124 10:46:32 mysqld_safe Skipping wsrep­recover 
for 41920174­7ec6­11e3­a05a­6a2ab4033f05:11 pair
140124 10:46:32 mysqld_safe Assigning 41920174­7ec6­11e3­a05a­6a2ab4033f05:11 
to wsrep_start_position
140124 10:46:34 mysqld_safe mysqld from pid file /var/lib/mysql/percona1.pid ended
2/5/14
Then choose the right donor ! (15)
With 5.6 you have now the possibility to know the
lowest sequence number in gcache using
wsrep_local_cached_downto
To know the latest event's sequence number on
the node that joins the cluster, you have two
possibilities:
# mysqld_safe ­­wsrep­recover
140124 10:46:32 mysqld_safe Logging to '/var/lib/mysql/percona1_error.log'.
140124 10:46:32 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140124 10:46:32 mysqld_safe Skipping wsrep­recover 
for 41920174­7ec6­11e3­a05a­6a2ab4033f05:11 pair
140124 10:46:32 mysqld_safe Assigning 41920174­7ec6­11e3­a05a­6a2ab4033f05:11 
to wsrep_start_position
140124 10:46:34 mysqld_safe mysqld from pid file /var/lib/mysql/percona1.pid ended
2/5/14
9
Measuring Max Replication Throughput

2/5/14
Measuring Max Replication Throughput
Since (5.5.33) wsrep_desync can be used to
find out how fast a node can replicate
The process is to collect the amount of
transactions (events) during peak time for a define
time range (let's take 1 min)

2/5/14
Measuring Max Replication Throughput
Since (5.5.33) wsrep_desync can be used to
find out how fast a node can replicate
The process is to collect the amount of
transactions (events) during peak time for a define
time range (let's take 1 min)

2/5/14
Measuring Max Replication Throughput
Since (5.5.33) wsrep_desync can be used to
find out how fast a node can replicate
The process is to collect the amount of
transactions (events) during peak time for a define
time range (let's take 1 min)
mysql> pager grep wsrep
mysql> show global status like 'wsrep_last_committed'; 
    ­> select sleep(60); 
    ­> show global status like 'wsrep_last_committed';
| wsrep_last_committed | 61472 |
| wsrep_last_committed | 69774 |
2/5/14
Measuring Max Replication Throughput
Since (5.5.33) wsrep_desync can be used to
find out how fast a node can replicate
The process is to collect the amount of
transactions (events) during peak69774 – 61472 = 8302
time for a define
8302 / 60 = 138.36 tps
time range (let's take 1 min)
mysql> pager grep wsrep
mysql> show global status like 'wsrep_last_committed'; 
    ­> select sleep(60); 
    ­> show global status like 'wsrep_last_committed';
| wsrep_last_committed | 61472 |
| wsrep_last_committed | 69774 |
2/5/14
Measuring Max Replication Throughput

2/5/14
Measuring Max Replication Throughput
Since (5.5.33) wsrep_desync can be used to find
out how fast a node can replicate
The process is to collect the amount of transactions
(events) during peak time for a define time range (let's
take 1 min)
Then collect the amount of transactions and the
duration to process them after the node was in
desync mode and not allowing writes
In desync mode, the node doesn't sent flow control
messages to the cluster
2/5/14
Measuring Max Replication Throughput
Since (5.5.33) wsrep_desync can be used to find
out how fast a node can replicate
The process is to collect the amount of transactions
(events) during peak time for a define time range (let's
take 1 min)
Then collect the amount of transactions and the
duration to process them after the node was in
desync mode and not allowing writes
In desync mode, the node doesn't sent flow control
messages to the cluster
2/5/14
Measuring Max Replication Throughput
Since (5.5.33) wsrep_desync can be used to find
out how fast a node can replicate
The process is to collect the amount of transactions
(events) during peak time for a define time range (let's
take 1 min)
Then collect the amount of transactions and the
duration to process them after the node was in
desync mode and not allowing writes
In desync mode, the node doesn't sent flow control
messages to the cluster
2/5/14
Measuring Max Replication Throughput
set global wsrep_desync=ON; flush tables with read lock; 
show global status like 'wsrep_last_committed'; 
select sleep( 60 ); unlock tables;

Since (5.5.33) wsrep_desync can be used to find
out how fast a node can replicate

+­­­­­­­­­­­­­­­­­­­­­­+­­­­­­­­+
| Variable_name        | Value  |
+­­­­­­­­­­­­­­­­­­­­­­+­­­­­­­­+
| wsrep_last_committed | 145987 |
+­­­­­­­­­­­­­­­­­­­­­­+­­­­­­­­+

The process is to collect the amount of transactions
(events) during peak time for a define time range (let's
take 1 min)
Then collect the amount of transactions and the
duration to process them after the node was in
desync mode and not allowing writes
In desync mode, the node doesn't sent flow control
messages to the cluster

2/5/14
Measuring Max Replication Throughput
In another terminal you run myq_gadget and
when wsrep_local_recv_queue (Queue 
Dn) is back to 0 check again the value of
wsrep_last_committed.

2/5/14
Measuring Max Replication Throughput
In another terminal you run myq_gadget and
when wsrep_local_recv_queue (Queue 
Dn) is back to 0 check again the value of
wsrep_last_committed.
LefredPXC / percona3 / Galera 2.8(r165)
Wsrep    Cluster  Node     Queue   Ops     Bytes     Flow    Conflct PApply        Commit 
    time P cnf  #  cmt sta  Up  Dn  Up  Dn   Up   Dn pau snt lcf bfa dst oooe oool wind
13:25:24 P   7  3 Dono T/T   0  8k   0   0    0    0 0.0   0   0   0 125    0    0    0
13:25:25 P   7  3 Dono T/T   0  8k   0 197    0 300K 0.0   0   0   0 145   90    0    2
...
13:26:46 P   7  3 Dono T/T   0   7   0 209    0 318K 0.0   0   0   0 139   62    0    1
13:26:47 P   7  3 Dono T/T   0   0   0 148    0 222K 0.0   0   0   0 140   40    0    1

2/5/14
Measuring Max Replication Throughput
In another terminal you run myq_gadget and
when wsrep_local_recv_queue (Queue 
Dn) is back to 0 check again the value of
wsrep_last_committed. This is when FTWRL 
is released
LefredPXC / percona3 / Galera 2.8(r165)
Wsrep    Cluster  Node     Queue   Ops     Bytes     Flow    Conflct PApply        Commit 
    time P cnf  #  cmt sta  Up  Dn  Up  Dn   Up   Dn pau snt lcf bfa dst oooe oool wind
13:25:24 P   7  3 Dono T/T   0  8k   0   0    0    0 0.0   0   0   0 125    0    0    0
13:25:25 P   7  3 Dono T/T   0  8k   0 197    0 300K 0.0   0   0   0 145   90    0    2
...
13:26:46 P   7  3 Dono T/T   0   7   0 209    0 318K 0.0   0   0   0 139   62    0    1
13:26:47 P   7  3 Dono T/T   0   0   0 148    0 222K 0.0   0   0   0 140   40    0    1

2/5/14
Measuring Max Replication Throughput
In another terminal you run myq_gadget and
when wsrep_local_recv_queue (Queue 
This is when galera
Dn) is back to 0 check again the value of
catch up.
wsrep_last_committed = 165871
wsrep_last_committed. This is when FTWRL 
is released
LefredPXC / percona3 / Galera 2.8(r165)
Wsrep    Cluster  Node     Queue   Ops     Bytes     Flow    Conflct PApply        Commit 
    time P cnf  #  cmt sta  Up  Dn  Up  Dn   Up   Dn pau snt lcf bfa dst oooe oool wind
13:25:24 P   7  3 Dono T/T   0  8k   0   0    0    0 0.0   0   0   0 125    0    0    0
13:25:25 P   7  3 Dono T/T   0  8k   0 197    0 300K 0.0   0   0   0 145   90    0    2
...
13:26:46 P   7  3 Dono T/T   0   7   0 209    0 318K 0.0   0   0   0 139   62    0    1
13:26:47 P   7  3 Dono T/T   0   0   0 148    0 222K 0.0   0   0   0 140   40    0    1

2/5/14
Measuring Max Replication Throughput
In another terminal you run myq_gadget and
165871 ­ 145987  = 19884
when wsrep_local_recv_queue (Queue 
19884 / 82 = 242.48 tps
Dn) is back toWe're currently at 57% value of
0 check again the
of our capacity
wsrep_last_committed.
LefredPXC / percona3 / Galera 2.8(r165)
Wsrep    Cluster  Node     Queue   Ops     Bytes     Flow    Conflct PApply        Commit 
    time P cnf  #  cmt sta  Up  Dn  Up  Dn   Up   Dn pau snt lcf bfa dst oooe oool wind
13:25:24 P   7  3 Dono T/T   0  8k   0   0    0    0 0.0   0   0   0 125    0    0    0
13:25:25 P   7  3 Dono T/T   0  8k   0 197    0 300K 0.0   0   0   0 145   90    0    2
...
13:26:46 P   7  3 Dono T/T   0   7   0 209    0 318K 0.0   0   0   0 139   62    0    1
13:26:47 P   7  3 Dono T/T   0   0   0 148    0 222K 0.0   0   0   0 140   40    0    1

2/5/14
8
Multicast replication
By default, galera uses
unicast TCP
1 copy of the replication
message sent to all other
nodes in the cluster

2/5/14
Multicast replication (2)
By default, galera uses
unicast TCP
1 copy of the replication
message sent to all other
nodes in the cluster
More nodes, more
bandwidth

2/5/14
Multicast replication (3)

2/5/14
Multicast replication (3)
If your network supports it you
can use Multicast UDP for
replication
wsrep_provider_options = 
“gmacast.mcast_addr = 
239.192.0.11”
wsrep_cluster_cluster_addr
ess = gcomm://239.192.0.11

2/5/14
Multicast replication (3)
If your network supports it you
can use Multicast UDP for
replication
wsrep_provider_options = 
“gmacast.mcast_addr = 
239.192.0.11”
wsrep_cluster_cluster_addr
ess = gcomm://239.192.0.11

2/5/14
Multicast replication (3)
If your network supports it you
can use Multicast UDP for
replication
wsrep_provider_options = 
“gmacast.mcast_addr = 
239.192.0.11”
wsrep_cluster_cluster_addr
ess = gcomm://239.192.0.11

2/5/14
7
SSL everywhere !

2/5/14
SSL everywhere !
It's possible to have the Galera replication
encrypted via SSL
But now it's also possible to have SST over
SSL, with xtrabackup_v2 and with rsync
https://github.com/tobz/galera-secure-rsync

2/5/14
SSL everywhere !
It's possible to have the Galera replication
encrypted via SSL
But now it's also possible to have SST over
SSL, with xtrabackup_v2 and with rsync
https://github.com/tobz/galera-secure-rsync

2/5/14
SSL everywhere !
It's possible to have the Galera replication
encrypted via SSL
But now it's also possible to have SST over
SSL, with xtrabackup_v2 and with rsync
https://github.com/tobz/galera-secure-rsync

2/5/14
SSL everywhere : certs creation

2/5/14
SSL everywhere : certs creation
openssl req ­new ­x500 ­days 
365000 ­nodes ­keyout key.pem 
­out cert.pem
Same cert and key must be copied on all
nodes
Copy them in /etc/mysql for example and
let only mysql read them
2/5/14
SSL everywhere : certs creation
openssl req ­new ­x500 ­days 
365000 ­nodes ­keyout key.pem 
­out cert.pem
Same cert and key must be copied on all
nodes
Copy them in /etc/mysql for example and
let only mysql read them
2/5/14
SSL everywhere : certs creation
openssl req ­new ­x500 ­days 
365000 ­nodes ­keyout key.pem 
­out cert.pem
Same cert and key must be copied on all
nodes
Copy them in /etc/mysql for example and
let only mysql read them
2/5/14
SSL everywhere : galera configuration

2/5/14
SSL everywhere : galera configuration
wsrep_provider_options = 
“socket.ssl.cert=/etc/mysql/cert.pem; 
socket.ssl_key=/etc/mysql/key.pem”

It's possible to set a remote Certificate
Authority for validation (use
socket.ssl_ca)
All nodes must have SSL enabled

2/5/14
SSL everywhere : galera configuration
wsrep_provider_options = 
“socket.ssl.cert=/etc/mysql/cert.pem; 
socket.ssl_key=/etc/mysql/key.pem”

It's possible to set a remote Certificate
Authority for validation (use
socket.ssl_ca)
All nodes must have SSL enabled

2/5/14
SSL everywhere : galera configuration
wsrep_provider_options = 
“socket.ssl.cert=/etc/mysql/cert.pem; 
socket.ssl_key=/etc/mysql/key.pem”

It's possible to set a remote Certificate
Authority for validation (use
socket.ssl_ca)
All nodes must have SSL enabled

2/5/14
SSL everywhere : SST configuration

2/5/14
SSL everywhere : SST configuration
As Xtrabackup 2.1 supports encryption, it's now also
possible to use SSL for SST
Use wsrep_sst_method=xtrabackup­v2
[sst]
tkey=/etc/mysql/key.pem
tcert=/etc/mysql/cert.pem
encrypt=3

2/5/14
SSL everywhere : SST configuration
As Xtrabackup 2.1 supports encryption, it's now also
possible to use SSL for SST
Use wsrep_sst_method=xtrabackup­v2
[sst]
tkey=/etc/mysql/key.pem
tcert=/etc/mysql/cert.pem
encrypt=3

2/5/14
SSL everywhere : SST configuration
As Xtrabackup 2.1 supports encryption, it's now also
possible to use SSL for SST
Use wsrep_sst_method=xtrabackup­v2
[sst]
tkey=/etc/mysql/key.pem
tcert=/etc/mysql/cert.pem
encrypt=3

2/5/14
SSL everywhere : SST configuration
As Xtrabackup 2.1 supports encryption, it's now also
possible to use SSL for SST
Use wsrep_sst_method=xtrabackup­v2
[sst]
tkey=/etc/mysql/key.pem
tcert=/etc/mysql/cert.pem
encrypt=3

2/5/14
SSL everywhere : SST configuration
As Xtrabackup 2.1 supports encryption, it's now also
possible to use SSL for SST
Use wsrep_sst_method=xtrabackup­v2
[sst]
tkey=/etc/mysql/key.pem
tcert=/etc/mysql/cert.pem
encrypt=3

2/5/14
SSL everywhere : SST configuration
As Xtrabackup 2.1 supports encryption, it's now also
possible to use SSL for SST
Use wsrep_sst_method=xtrabackup­v2
[sst]
tkey=/etc/mysql/key.pem
tcert=/etc/mysql/cert.pem
encrypt=3

2/5/14
SSL everywhere : SST configuration

2/5/14
SSL everywhere : SST configuration
And for those using rsync ?
galera­secure­rsync acts like
wsrep_sst_rsync but secures the communication
with SSL using socat.
Uses also the same cert and key file
wsrep_sst_method=secure_rsync
https://github.com/tobz/galera-secure-rsync
2/5/14
SSL everywhere : SST configuration
And for those using rsync ?
galera­secure­rsync acts like
wsrep_sst_rsync but secures the communication
with SSL using socat.
Uses also the same cert and key file
wsrep_sst_method=secure_rsync
https://github.com/tobz/galera-secure-rsync
2/5/14
SSL everywhere : SST configuration
And for those using rsync ?
galera­secure­rsync acts like
wsrep_sst_rsync but secures the communication
with SSL using socat.
Uses also the same cert and key file
wsrep_sst_method=secure_rsync
https://github.com/tobz/galera-secure-rsync
2/5/14
SSL everywhere : SST configuration
And for those using rsync ?
galera­secure­rsync acts like
wsrep_sst_rsync but secures the communication
with SSL using socat.
Uses also the same cert and key file
wsrep_sst_method=secure_rsync
https://github.com/tobz/galera-secure-rsync
2/5/14
SSL everywhere : SST configuration
And for those using rsync ?
galera­secure­rsync acts like
wsrep_sst_rsync but secures the communication
with SSL using socat.
Uses also the same cert and key file
wsrep_sst_method=secure_rsync
https://github.com/tobz/galera-secure-rsync
2/5/14
SSL everywhere : SST configuration
And for those using rsync ?
galera­secure­rsync acts like
wsrep_sst_rsync but secures the communication
with SSL using socat.
Uses also the same cert and key file
wsrep_sst_method=secure_rsync
https://github.com/tobz/galera-secure-rsync
2/5/14
6
Decode GRA* files

2/5/14
Decode GRA* files
When a replication failure occurs, a
GRA_*.log file is created into the datadir
For each of those files, a corresponding
message is present in the mysql error log file
Can be a false positive (bad DDL statement)...
or not !
This is how you can decode the content of that
file
2/5/14
Decode GRA* files
When a replication failure occurs, a
GRA_*.log file is created into the datadir
For each of those files, a corresponding
message is present in the mysql error log file
Can be a false positive (bad DDL statement)...
or not !
This is how you can decode the content of that
file
2/5/14
Decode GRA* files
When a replication failure occurs, a
GRA_*.log file is created into the datadir
For each of those files, a corresponding
message is present in the mysql error log file
Can be a false positive (bad DDL statement)...
or not !
This is how you can decode the content of that
file
2/5/14
Decode GRA* files
When a replication failure occurs, a
GRA_*.log file is created into the datadir
For each of those files, a corresponding
message is present in the mysql error log file
Can be a false positive (bad DDL statement)...
or not !
This is how you can decode the content of that
file
2/5/14
Decode GRA* files (2)

2/5/14
Decode GRA* files (2)
Download a binlog header file (http://goo.gl/kYTkY2)
Join the header and one GRA_*.log file:
–

cat GRA­header > GRA_3_3­bin.log

–

cat GRA_3_3.log  >> GRA_3_3­bin.log

Now you can just use mysqlbinlog ­vvv and find
out what the problem was !

2/5/14
Decode GRA* files (2)
Download a binlog header file (http://goo.gl/kYTkY2)
Join the header and one GRA_*.log file:
–

cat GRA­header > GRA_3_3­bin.log

–

cat GRA_3_3.log  >> GRA_3_3­bin.log

Now you can just use mysqlbinlog ­vvv and find
out what the problem was !

2/5/14
Decode GRA* files (2)
Download a binlog header file (http://goo.gl/kYTkY2)
Join the header and one GRA_*.log file:
–

cat GRA­header > GRA_3_3­bin.log

–

cat GRA_3_3.log  >> GRA_3_3­bin.log

Now you can just use mysqlbinlog ­vvv and find
out what the problem was !

2/5/14
Decode GRA* files (2)
Download a binlog header file (http://goo.gl/kYTkY2)
Join the header and one GRA_*.log file:
–

cat GRA­header > GRA_3_3­bin.log

–

cat GRA_3_3.log  >> GRA_3_3­bin.log

Now you can just use mysqlbinlog ­vvv and find
out what the problem was !

2/5/14
Decode GRA* files (2)
Download a binlog header file (http://goo.gl/kYTkY2)
Join the header and one GRA_*.log file:
–

cat GRA­header > GRA_3_3­bin.log

–

cat GRA_3_3.log  >> GRA_3_3­bin.log

Now you can just use mysqlbinlog ­vvv and find
out what the problem was !

2/5/14
Decode GRA* files (2)
Download a binlog header file (http://goo.gl/kYTkY2)
Join the header and one GRA_*.log file:
–

cat GRA­header > GRA_3_3­bin.log

–

cat GRA_3_3.log  >> GRA_3_3­bin.log

Now you can just use mysqlbinlog ­vvv and find
out what the problem was !
wsrep_log_conflicts = 1
wsrep_debug = 1
wsrep_provider_options = “cert.log_conflicts=1”
2/5/14
5
Avoiding SST when adding a new node

2/5/14
Avoiding SST when adding a new node
It's possible to use a backup to prepare a new
node.
Those are the 3 prerequisites:
–
–
–

2/5/14

use XtraBackup >= 2.0.1
the backup needs to be performed with
­­galera­info
the gcache must be large enough
Avoiding SST when adding a new node
It's possible to use a backup to prepare a new
node.
Those are the 3 prerequisites:
–
–
–

2/5/14

use XtraBackup >= 2.0.1
the backup needs to be performed with
­­galera­info
the gcache must be large enough
Avoiding SST when adding a new node
It's possible to use a backup to prepare a new
node.
Those are the 3 prerequisites:
–
–
–

2/5/14

use XtraBackup >= 2.0.1
the backup needs to be performed with
­­galera­info
the gcache must be large enough
Avoiding SST when adding a new node
It's possible to use a backup to prepare a new
node.
Those are the 3 prerequisites:
–
–
–

2/5/14

use XtraBackup >= 2.0.1
the backup needs to be performed with
­­galera­info
the gcache must be large enough
Avoiding SST when adding a new node
It's possible to use a backup to prepare a new
node.
Those are the 3 prerequisites:
–
–
–

2/5/14

use XtraBackup >= 2.0.1
the backup needs to be performed with
­­galera­info
the gcache must be large enough
Avoiding SST when adding a new node (2)

2/5/14
Avoiding SST when adding a new node (2)
Restore the backup on the new node
Display the content of xtrabackup_galera_info:
5f22b204­dc6b­11e1­0800­7a9c9624dd66:23

Create the file called grastate.dat like this:
#GALERA saved state
version: 2.1
uuid:5f22b204­dc6b­11e1­0800­7a9c9624dd66
seqno: 23 
cert_index:
2/5/14
Avoiding SST when adding a new node (2)
Restore the backup on the new node
Display the content of xtrabackup_galera_info:
5f22b204­dc6b­11e1­0800­7a9c9624dd66:23

Create the file called grastate.dat like this:
#GALERA saved state
version: 2.1
uuid:5f22b204­dc6b­11e1­0800­7a9c9624dd66
seqno: 23 
cert_index:
2/5/14
Avoiding SST when adding a new node (2)
Restore the backup on the new node
Display the content of xtrabackup_galera_info:
5f22b204­dc6b­11e1­0800­7a9c9624dd66:23

Create the file called grastate.dat like this:
#GALERA saved state
version: 2.1
uuid:5f22b204­dc6b­11e1­0800­7a9c9624dd66
seqno: 23 
cert_index:
2/5/14
Avoiding SST when adding a new node (2)
Restore the backup on the new node
Display the content of xtrabackup_galera_info:
5f22b204­dc6b­11e1­0800­7a9c9624dd66:23

Create the file called grastate.dat like this:
#GALERA saved state
version: 2.1
uuid:5f22b204­dc6b­11e1­0800­7a9c9624dd66
seqno: 23 
cert_index:
2/5/14
Avoiding SST when adding a new node (2)
Restore the backup on the new node
Display the content of xtrabackup_galera_info:
5f22b204­dc6b­11e1­0800­7a9c9624dd66:23

Create the file called grastate.dat like this:
#GALERA saved state
version: 2.1
uuid:5f22b204­dc6b­11e1­0800­7a9c9624dd66
seqno: 23 
cert_index:
2/5/14
Avoiding SST when adding a new node (2)
Restore the backup on the new node
Display the content of xtrabackup_galera_info:
5f22b204­dc6b­11e1­0800­7a9c9624dd66:23
mysql> show global status like 'wsrep_provider_version';
+­­­­­­­­­­­­­­­­­­­­­­­­+­­­­­­­­­­­+
| Variable_name          | Value     |
+­­­­­­­­­­­­­­­­­­­­­­­­+­­­­­­­­­­­+
| wsrep_provider_version | 2.1(r113) |
+­­­­­­­­­­­­­­­­­­­­­­­­+­­­­­­­­­­­+

Create the file called grastate.dat like this:
#GALERA saved state
version: 2.1
uuid:5f22b204­dc6b­11e1­0800­7a9c9624dd66
seqno: 23 
cert_index:
2/5/14
4
Play with quorum and weight

2/5/14
Play with quorum and weight
Galera manages Quorum
If a node does not see more than 50% of the total
amount of nodes, reads/writes are not accepted
Split brain is prevented
This requires at least 3 nodes to work properly
Can be disabled (but be warned!)
You can cheat ;-)

2/5/14
Play with quorum and weight
Galera manages Quorum
If a node does not see more than 50% of the total
amount of nodes, reads/writes are not accepted
Split brain is prevented
This requires at least 3 nodes to work properly
Can be disabled (but be warned!)
You can cheat ;-)

2/5/14
Play with quorum and weight
Galera manages Quorum
If a node does not see more than 50% of the total
amount of nodes, reads/writes are not accepted
Split brain is prevented
This requires at least 3 nodes to work properly
Can be disabled (but be warned!)
You can cheat ;-)

2/5/14
Play with quorum and weight
Galera manages Quorum
If a node does not see more than 50% of the total
amount of nodes, reads/writes are not accepted
Split brain is prevented
This requires at least 3 nodes to work properly
Can be disabled (but be warned!)
You can cheat ;-)

2/5/14
Play with quorum and weight
Galera manages Quorum
If a node does not see more than 50% of the total
amount of nodes, reads/writes are not accepted
Split brain is prevented
This requires at least 3 nodes to work properly
Can be disabled (but be warned!)
You can cheat ;-)

2/5/14
Play with quorum and weight
Galera manages Quorum
If a node does not see more than 50% of the total
amount of nodes, reads/writes are not accepted
Split brain is prevented
This requires at least 3 nodes to work properly
Can be disabled (but be warned!)
You can cheat ;-)

2/5/14
Quorum: lost of connectivity

2/5/14
Quorum: lost of connectivity

Network Problem

2/5/14
Quorum: lost of connectivity

Network Problem

Does not accept Reads & Writes

2/5/14
Quorum: lost of connectivity

2/5/14
Quorum: lost of connectivity
Network Problem

2/5/14
Quorum: lost of connectivity
Network Problem

2/5/14
Quorum: lost of connectivity
Network Problem

2/5/14
Quorum: arbitrator

2/5/14
Quorum: arbitrator
It's possible to use an arbitrator (garbd) to play
an extra node. All traffic will pass through it
but it won't have any MySQL running.
Useful in case of storage available only for 2
nodes or if you have an even amount of
nodes.
Odd number of nodes is always advised
2/5/14
Quorum: arbitrator
It's possible to use an arbitrator (garbd) to play
an extra node. All traffic will pass through it
but it won't have any MySQL running.
Useful in case of storage available only for 2
nodes or if you have an even amount of
nodes.
Odd number of nodes is always advised
2/5/14
Quorum: arbitrator
It's possible to use an arbitrator (garbd) to play
an extra node. All traffic will pass through it
but it won't have any MySQL running.
Useful in case of storage available only for 2
nodes or if you have an even amount of
nodes.
Odd number of nodes is always advised
2/5/14
Quorum: cheat !

2/5/14
Quorum: cheat !
You can disable quorum but watch out ! (you have
been warned):
wsrep_provider_options = “pc.ignore_quorum=true”

You can define the weigth of a node to affect the
quorum calculation (default is 1):
wsrep_provider_options = “pc.weight=1”

2/5/14
Quorum: cheat !
You can disable quorum but watch out ! (you have
been warned):
wsrep_provider_options = “pc.ignore_quorum=true”

You can define the weigth of a node to affect the
quorum calculation (default is 1):
wsrep_provider_options = “pc.weight=1”

2/5/14
Quorum: cheat !
You can disable quorum but watch out ! (you have
been warned):
wsrep_provider_options = “pc.ignore_quorum=true”

You can define the weigth of a node to affect the
quorum calculation (default is 1):
wsrep_provider_options = “pc.weight=1”

2/5/14
Quorum: cheat !
You can disable quorum but watch out ! (you have
been warned):
wsrep_provider_options = “pc.ignore_quorum=true”

You can define the weigth of a node to affect the
quorum calculation (default is 1):
wsrep_provider_options = “pc.weight=1”

2/5/14
Quorum: cheat !
You can disable quorum but watch out ! (you have
been warned):
wsrep_provider_options = “pc.ignore_quorum=true”

You can define the weigth of a node to affect the
quorum calculation (default is 1):
wsrep_provider_options = “pc.weight=1”

2/5/14
3
How to optimize WAN replication?
Galera 2 requires all point-to-point connections for
replication

datacenter A
2/5/14

datacenter B
How to optimize WAN replication?
Galera 2 requires all point-to-point connections for
replication

datacenter A
2/5/14

datacenter B
How to optimize WAN replication?
Galera 2 requires all point-to-point connections for
replication

datacenter A
2/5/14

datacenter B
How to optimize WAN replication?
Galera 2 requires all point-to-point connections for
replication

datacenter A
2/5/14

datacenter B
How to optimize WAN replication?
Galera 2 requires all point-to-point connections for
replication

datacenter A
2/5/14

datacenter B
How to optimize WAN replication?
Galera 2 requires all point-to-point connections for
replication

datacenter A
2/5/14

datacenter B
How to optimize WAN replication?
Galera 2 requires all point-to-point connections for
replication

datacenter A
2/5/14

datacenter B
How to optimize WAN replication?
Galera 2 requires all point-to-point connections for
replication

datacenter A
2/5/14

datacenter B
How to optimize WAN replication? (2)
Galera 3 brings the notion of “cluster segments”

datacenter A
2/5/14

datacenter B
How to optimize WAN replication? (3)
Segments gateways can change per transaction

datacenter A
2/5/14

datacenter B
How to optimize WAN replication? (3)
Replication traffic between segments is mimized.
Writesets are relayed to the other segment
through one node
commit
WS

datacenter A
2/5/14

datacenter B
How to optimize WAN replication? (3)
Replication traffic between segments is mimized.
Writesets are relayed to the other segment
through one node
commit

WS

datacenter A
2/5/14

WS

WS

datacenter B
How to optimize WAN replication? (4)
From those local relays replication is propagated to
every nodes in the segment
commit

WS

datacenter A
2/5/14

datacenter B

WS
How to optimize WAN replication? (4)
From those local relays replication is propagated to
every nodes in the segment
commit

gmcasts.segment = 1...255
WS

datacenter A
2/5/14

datacenter B

WS
2
Load balancers

2/5/14
Load balancers
Galera is generally used in combination with a
load balancer
The most used is HA Proxy
Codership provides one with Galera: glbd

2/5/14
Load balancers
Galera is generally used in combination with a
load balancer
The most used is HA Proxy
Codership provides one with Galera: glbd

2/5/14
Load balancers
Galera is generally used in combination with a
load balancer
The most used is HA Proxy
Codership provides one with Galera: glbd

2/5/14
Load balancers
Galera is generally used in combination with a
load balancer
The most used is HA Proxy
Codership provides one with Galera: glbd
app 1

app 2

app 3

HA PROXY

2/5/14

node 1

node 2

node 3
Load balancers: myths, legends and reality

2/5/14
Load balancers: myths, legends and reality
TIME_WAIT
–

On heavy load, you may have an issue with a large amount of
TCP connections in TIME_WAIT state

–

This can leas to a TCP port exhaustion !

How to fix ?
–

–
2/5/14

Use nolinger option in HA Proxy (for glbd check
http://www.lefred.be/node/168), but this lead to an increase
of Aborted_clients is the client is connecting and
disconnecting to MySQL too fast
Modify the value of tcp_max_tw_buckets
Load balancers: myths, legends and reality
TIME_WAIT
–

On heavy load, you may have an issue with a large amount of
TCP connections in TIME_WAIT state

–

This can leas to a TCP port exhaustion !

How to fix ?
–

–
2/5/14

Use nolinger option in HA Proxy (for glbd check
http://www.lefred.be/node/168), but this lead to an increase
of Aborted_clients is the client is connecting and
disconnecting to MySQL too fast
Modify the value of tcp_max_tw_buckets
Load balancers: myths, legends and reality
TIME_WAIT
–

On heavy load, you may have an issue with a large amount of
TCP connections in TIME_WAIT state

–

This can leas to a TCP port exhaustion !

How to fix ?
–

–
2/5/14

Use nolinger option in HA Proxy (for glbd check
http://www.lefred.be/node/168), but this lead to an increase
of Aborted_clients is the client is connecting and
disconnecting to MySQL too fast
Modify the value of tcp_max_tw_buckets
Load balancers: myths, legends and reality
TIME_WAIT
–

On heavy load, you may have an issue with a large amount of
TCP connections in TIME_WAIT state

–

This can leas to a TCP port exhaustion !

How to fix ?
–

–
2/5/14

Use nolinger option in HA Proxy (for glbd check
http://www.lefred.be/node/168), but this lead to an increase
of Aborted_clients is the client is connecting and
disconnecting to MySQL too fast
Modify the value of tcp_max_tw_buckets
Load balancers: myths, legends and reality
TIME_WAIT
–

On heavy load, you may have an issue with a large amount of
TCP connections in TIME_WAIT state

–

This can leas to a TCP port exhaustion !

How to fix ?
–

–
2/5/14

Use nolinger option in HA Proxy (for glbd check
http://www.lefred.be/node/168), but this lead to an increase
of Aborted_clients is the client is connecting and
disconnecting to MySQL too fast
Modify the value of tcp_max_tw_buckets
Load balancers: myths, legends and reality
TIME_WAIT
–

On heavy load, you may have an issue with a large amount of
TCP connections in TIME_WAIT state

–

This can leas to a TCP port exhaustion !

How to fix ?
–

–
2/5/14

Use nolinger option in HA Proxy (for glbd check
http://www.lefred.be/node/168), but this lead to an increase
of Aborted_clients is the client is connecting and
disconnecting to MySQL too fast
Modify the value of tcp_max_tw_buckets
Load balancers: common issues
Persitent Connections
–

Many people expects the following scenario:
app 1

app 2

HA PROXY

node 1

2/5/14

node 2

node 3

app 3
Load balancers: common issues
Persitent Connections
–

When the node that was specified to receive the persistent
write fails for example
app 1

app 2

HA PROXY

node 1

2/5/14

node 2

node 3

app 3
Load balancers: common issues
Persitent Connections
–

When the node is back on-line...
app 1

app 2

HA PROXY

node 1

2/5/14

node 2

node 3

app 3
Load balancers: common issues
Persitent Connections
–

Only the new connections will use again the preferred node
app 1

app 2

HA PROXY

node 1

2/5/14

node 2

node 3

app 3
Load balancers: common issues
Persitent Connections
–

Only the new connections will use again the preferred node
app 1

app 2

HA PROXY

node 1

2/5/14

node 2

node 3

app 3
Load balancers: common issues
Persitent Connections
–

Only the new connections will use again the preferred node
app 1

app 2

HA PROXY

node 1

2/5/14

node 2

node 3

app 3
Load balancers: common issues

2/5/14
Load balancers: common issues
Persitent Connections
–
–

HA Proxy decides where the connection will go at TCP
handshake
Once the TCP session is established, the sessions will stay
where they are !

Solution ?
–

With HA Proxy 1.5 you can now specify the following option :

on­marked­down shutdown­sessions

2/5/14
Load balancers: common issues
Persitent Connections
–
–

HA Proxy decides where the connection will go at TCP
handshake
Once the TCP session is established, the sessions will stay
where they are !

Solution ?
–

With HA Proxy 1.5 you can now specify the following option :

on­marked­down shutdown­sessions

2/5/14
Load balancers: common issues
Persitent Connections
–
–

HA Proxy decides where the connection will go at TCP
handshake
Once the TCP session is established, the sessions will stay
where they are !

Solution ?
–

With HA Proxy 1.5 you can now specify the following option :

on­marked­down shutdown­sessions

2/5/14
Load balancers: common issues
Persitent Connections
–
–

HA Proxy decides where the connection will go at TCP
handshake
Once the TCP session is established, the sessions will stay
where they are !

Solution ?
–

With HA Proxy 1.5 you can now specify the following option :

on­marked­down shutdown­sessions

2/5/14
Load balancers: common issues
Persitent Connections
–
–

HA Proxy decides where the connection will go at TCP
handshake
Once the TCP session is established, the sessions will stay
where they are !

Solution ?
–

With HA Proxy 1.5 you can now specify the following option :

on­marked­down shutdown­sessions

2/5/14
Load balancers: common issues
Persitent Connections
–
–

HA Proxy decides where the connection will go at TCP
handshake
Once the TCP session is established, the sessions will stay
where they are !

Solution ?
–

With HA Proxy 1.5 you can now specify the following option :

on­marked­down shutdown­sessions

2/5/14
1
Taking backups without stalls

2/5/14
Taking backups without stalls
When you want to perform a consistent
backup, you need to take a FLUSH TABLES
WITH READ LOCK (FTWRL)
By default even with Xtrabackup
This causes a Flow Control in galera
So how can we deal with that ?

2/5/14
Taking backups without stalls
When you want to perform a consistent
backup, you need to take a FLUSH TABLES
WITH READ LOCK (FTWRL)
By default even with Xtrabackup
This causes a Flow Control in galera
So how can we deal with that ?

2/5/14
Taking backups without stalls
When you want to perform a consistent
backup, you need to take a FLUSH TABLES
WITH READ LOCK (FTWRL)
By default even with Xtrabackup
This causes a Flow Control in galera
So how can we deal with that ?

2/5/14
Taking backups without stalls
When you want to perform a consistent
backup, you need to take a FLUSH TABLES
WITH READ LOCK (FTWRL)
By default even with Xtrabackup
This causes a Flow Control in galera
So how can we deal with that ?

2/5/14
Taking backups without stalls

2/5/14
Taking backups without stalls
Choose the node from which you want to take the backup
Change the state to 'Donor/Desynced' (see tip 9)
set global wsrep_desync=ON
Take the backup
Wait that wsrep_local_recv_queue is back
down to 0
Change back the state to 'Joined'
set global wsrep_desync=OFF
2/5/14
Taking backups without stalls
Choose the node from which you want to take the backup
Change the state to 'Donor/Desynced' (see tip 9)
set global wsrep_desync=ON
Take the backup
Wait that wsrep_local_recv_queue is back
down to 0
Change back the state to 'Joined'
set global wsrep_desync=OFF
2/5/14
Taking backups without stalls
Choose the node from which you want to take the backup
Change the state to 'Donor/Desynced' (see tip 9)
set global wsrep_desync=ON
Take the backup
Wait that wsrep_local_recv_queue is back
down to 0
Change back the state to 'Joined'
set global wsrep_desync=OFF
2/5/14
Taking backups without stalls
Choose the node from which you want to take the backup
Change the state to 'Donor/Desynced' (see tip 9)
set global wsrep_desync=ON
Take the backup
Wait that wsrep_local_recv_queue is back
down to 0
Change back the state to 'Joined'
set global wsrep_desync=OFF
2/5/14
Taking backups without stalls
Choose the node from which you want to take the backup
Change the state to 'Donor/Desynced' (see tip 9)
set global wsrep_desync=ON
Take the backup
Wait that wsrep_local_recv_queue is back
down to 0
Change back the state to 'Joined'
set global wsrep_desync=OFF
2/5/14
Taking backups without stalls
Choose the node from which you want to take the backup
Change the state to 'Donor/Desynced' (see tip 9)
set global wsrep_desync=ON
Take the backup
Wait that wsrep_local_recv_queue is back
down to 0
Change back the state to 'Joined'
set global wsrep_desync=OFF
2/5/14
Taking backups without stalls
Choose the node from which you want to take the backup
Change the state to 'Donor/Desynced' (see tip 9)
set global wsrep_desync=ON
Take the backup
Wait that wsrep_local_recv_queue is back
down to 0
Change back the state to 'Joined'
set global wsrep_desync=OFF
2/5/14
0
GO !

Mais conteúdo relacionado

Mais procurados

Galera Cluster - Node Recovery - Webinar slides
Galera Cluster - Node Recovery - Webinar slidesGalera Cluster - Node Recovery - Webinar slides
Galera Cluster - Node Recovery - Webinar slidesSeveralnines
 
Webinar slides: Introducing Galera 3.0 - Now supporting MySQL 5.6
Webinar slides: Introducing Galera 3.0 - Now supporting MySQL 5.6Webinar slides: Introducing Galera 3.0 - Now supporting MySQL 5.6
Webinar slides: Introducing Galera 3.0 - Now supporting MySQL 5.6Severalnines
 
Percona XtraDB Cluster
Percona XtraDB ClusterPercona XtraDB Cluster
Percona XtraDB ClusterKenny Gryp
 
Repair & Recovery for your MySQL, MariaDB & MongoDB / TokuMX Clusters - Webin...
Repair & Recovery for your MySQL, MariaDB & MongoDB / TokuMX Clusters - Webin...Repair & Recovery for your MySQL, MariaDB & MongoDB / TokuMX Clusters - Webin...
Repair & Recovery for your MySQL, MariaDB & MongoDB / TokuMX Clusters - Webin...Severalnines
 
MySQL HA with PaceMaker
MySQL HA with  PaceMakerMySQL HA with  PaceMaker
MySQL HA with PaceMakerKris Buytaert
 
Galera cluster for MySQL - Introduction Slides
Galera cluster for MySQL - Introduction SlidesGalera cluster for MySQL - Introduction Slides
Galera cluster for MySQL - Introduction SlidesSeveralnines
 
Plny12 galera-cluster-best-practices
Plny12 galera-cluster-best-practicesPlny12 galera-cluster-best-practices
Plny12 galera-cluster-best-practicesDimas Prasetyo
 
合并到 XtraDB 存储引擎集群
合并到 XtraDB 存储引擎集群合并到 XtraDB 存储引擎集群
合并到 XtraDB 存储引擎集群YUCHENG HU
 
Webinar Slides: Migrating to Galera Cluster
Webinar Slides: Migrating to Galera ClusterWebinar Slides: Migrating to Galera Cluster
Webinar Slides: Migrating to Galera ClusterSeveralnines
 
Introduction to Galera
Introduction to GaleraIntroduction to Galera
Introduction to GaleraHenrik Ingo
 
Reducing Risk When Upgrading MySQL
Reducing Risk When Upgrading MySQLReducing Risk When Upgrading MySQL
Reducing Risk When Upgrading MySQLKenny Gryp
 
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...Severalnines
 
2014 OSDC Talk: Introduction to Percona XtraDB Cluster and HAProxy
2014 OSDC Talk: Introduction to Percona XtraDB Cluster and HAProxy2014 OSDC Talk: Introduction to Percona XtraDB Cluster and HAProxy
2014 OSDC Talk: Introduction to Percona XtraDB Cluster and HAProxyBo-Yi Wu
 
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group ReplicationPercona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group ReplicationKenny Gryp
 
High Availability with Galera Cluster - SkySQL Road Show 2013 in Berlin
High Availability with Galera Cluster - SkySQL Road Show 2013 in BerlinHigh Availability with Galera Cluster - SkySQL Road Show 2013 in Berlin
High Availability with Galera Cluster - SkySQL Road Show 2013 in BerlinMariaDB Corporation
 
Galera 3.0 Webinar Slides: Galera Monitoring & Management
Galera 3.0 Webinar Slides: Galera Monitoring & ManagementGalera 3.0 Webinar Slides: Galera Monitoring & Management
Galera 3.0 Webinar Slides: Galera Monitoring & ManagementSeveralnines
 

Mais procurados (20)

Galera Cluster - Node Recovery - Webinar slides
Galera Cluster - Node Recovery - Webinar slidesGalera Cluster - Node Recovery - Webinar slides
Galera Cluster - Node Recovery - Webinar slides
 
Webinar slides: Introducing Galera 3.0 - Now supporting MySQL 5.6
Webinar slides: Introducing Galera 3.0 - Now supporting MySQL 5.6Webinar slides: Introducing Galera 3.0 - Now supporting MySQL 5.6
Webinar slides: Introducing Galera 3.0 - Now supporting MySQL 5.6
 
Percona XtraDB Cluster
Percona XtraDB ClusterPercona XtraDB Cluster
Percona XtraDB Cluster
 
Repair & Recovery for your MySQL, MariaDB & MongoDB / TokuMX Clusters - Webin...
Repair & Recovery for your MySQL, MariaDB & MongoDB / TokuMX Clusters - Webin...Repair & Recovery for your MySQL, MariaDB & MongoDB / TokuMX Clusters - Webin...
Repair & Recovery for your MySQL, MariaDB & MongoDB / TokuMX Clusters - Webin...
 
How to understand Galera Cluster - 2013
How to understand Galera Cluster - 2013How to understand Galera Cluster - 2013
How to understand Galera Cluster - 2013
 
MySQL HA with PaceMaker
MySQL HA with  PaceMakerMySQL HA with  PaceMaker
MySQL HA with PaceMaker
 
Galera cluster for MySQL - Introduction Slides
Galera cluster for MySQL - Introduction SlidesGalera cluster for MySQL - Introduction Slides
Galera cluster for MySQL - Introduction Slides
 
Galera Cluster Best Practices for DBA's and DevOps Part 1
Galera Cluster Best Practices for DBA's and DevOps Part 1Galera Cluster Best Practices for DBA's and DevOps Part 1
Galera Cluster Best Practices for DBA's and DevOps Part 1
 
Plny12 galera-cluster-best-practices
Plny12 galera-cluster-best-practicesPlny12 galera-cluster-best-practices
Plny12 galera-cluster-best-practices
 
合并到 XtraDB 存储引擎集群
合并到 XtraDB 存储引擎集群合并到 XtraDB 存储引擎集群
合并到 XtraDB 存储引擎集群
 
Galera Cluster 3.0 Features
Galera Cluster 3.0 FeaturesGalera Cluster 3.0 Features
Galera Cluster 3.0 Features
 
Webinar Slides: Migrating to Galera Cluster
Webinar Slides: Migrating to Galera ClusterWebinar Slides: Migrating to Galera Cluster
Webinar Slides: Migrating to Galera Cluster
 
Introduction to Galera
Introduction to GaleraIntroduction to Galera
Introduction to Galera
 
Reducing Risk When Upgrading MySQL
Reducing Risk When Upgrading MySQLReducing Risk When Upgrading MySQL
Reducing Risk When Upgrading MySQL
 
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...
Webinar Slides : Migrating to MySQL, MariaDB Galera and/or Percona XtraDB Clu...
 
2014 OSDC Talk: Introduction to Percona XtraDB Cluster and HAProxy
2014 OSDC Talk: Introduction to Percona XtraDB Cluster and HAProxy2014 OSDC Talk: Introduction to Percona XtraDB Cluster and HAProxy
2014 OSDC Talk: Introduction to Percona XtraDB Cluster and HAProxy
 
Introducing Galera 3.0
Introducing Galera 3.0Introducing Galera 3.0
Introducing Galera 3.0
 
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group ReplicationPercona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
Percona XtraDB Cluster vs Galera Cluster vs MySQL Group Replication
 
High Availability with Galera Cluster - SkySQL Road Show 2013 in Berlin
High Availability with Galera Cluster - SkySQL Road Show 2013 in BerlinHigh Availability with Galera Cluster - SkySQL Road Show 2013 in Berlin
High Availability with Galera Cluster - SkySQL Road Show 2013 in Berlin
 
Galera 3.0 Webinar Slides: Galera Monitoring & Management
Galera 3.0 Webinar Slides: Galera Monitoring & ManagementGalera 3.0 Webinar Slides: Galera Monitoring & Management
Galera 3.0 Webinar Slides: Galera Monitoring & Management
 

Semelhante a Fosdem 2014 - MySQL & Friends Devroom: 15 tips galera cluster

Not Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in ActionNot Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in ActionParis Carbone
 
Have you been stalking your servers?
Have you been stalking your servers?Have you been stalking your servers?
Have you been stalking your servers?morpht
 
Intro to Kernel Debugging - Just make the crashing stop!
Intro to Kernel Debugging - Just make the crashing stop!Intro to Kernel Debugging - Just make the crashing stop!
Intro to Kernel Debugging - Just make the crashing stop!All Things Open
 
Ruby and Rails Packaging to Production
Ruby and Rails Packaging to ProductionRuby and Rails Packaging to Production
Ruby and Rails Packaging to ProductionFabio Kung
 
Game Changing Dependency Management
Game Changing Dependency ManagementGame Changing Dependency Management
Game Changing Dependency ManagementJeremy Kendall
 
Chef Workshop: Setup Environment with Chef,Vagrant, and Berkshelf
Chef Workshop: Setup Environment with Chef,Vagrant, and BerkshelfChef Workshop: Setup Environment with Chef,Vagrant, and Berkshelf
Chef Workshop: Setup Environment with Chef,Vagrant, and BerkshelfJun Sakata
 
Percona Live 2022 - The Evolution of a MySQL Database System
Percona Live 2022 - The Evolution of a MySQL Database SystemPercona Live 2022 - The Evolution of a MySQL Database System
Percona Live 2022 - The Evolution of a MySQL Database SystemFrederic Descamps
 
Hooks and Events in Drupal 8
Hooks and Events in Drupal 8Hooks and Events in Drupal 8
Hooks and Events in Drupal 8Nida Ismail Shah
 
Git inter-snapshot public
Git  inter-snapshot publicGit  inter-snapshot public
Git inter-snapshot publicSeongJae Park
 
Exadata db node update
Exadata db node updateExadata db node update
Exadata db node updatepat2001
 
PuppetConf 2016: Deconfiguration Management: Making Puppet Clean Up Its Own M...
PuppetConf 2016: Deconfiguration Management: Making Puppet Clean Up Its Own M...PuppetConf 2016: Deconfiguration Management: Making Puppet Clean Up Its Own M...
PuppetConf 2016: Deconfiguration Management: Making Puppet Clean Up Its Own M...Puppet
 
MariaDB10.7_install_Ubuntu.docx
MariaDB10.7_install_Ubuntu.docxMariaDB10.7_install_Ubuntu.docx
MariaDB10.7_install_Ubuntu.docxNeoClova
 
Install tomcat 5.5 in debian os and deploy war file
Install tomcat 5.5 in debian os and deploy war fileInstall tomcat 5.5 in debian os and deploy war file
Install tomcat 5.5 in debian os and deploy war fileNguyen Cao Hung
 
Creating "Secure" PHP applications, Part 2, Server Hardening
Creating "Secure" PHP applications, Part 2, Server HardeningCreating "Secure" PHP applications, Part 2, Server Hardening
Creating "Secure" PHP applications, Part 2, Server Hardeningarchwisp
 
Scalable Deployment Architectures with TYPO3 Surf, Git and Jenkins
Scalable Deployment Architectures with TYPO3 Surf, Git and JenkinsScalable Deployment Architectures with TYPO3 Surf, Git and Jenkins
Scalable Deployment Architectures with TYPO3 Surf, Git and Jenkinsmhelmich
 
MySQL Galera 集群
MySQL Galera 集群MySQL Galera 集群
MySQL Galera 集群YUCHENG HU
 
Kernel Recipes 2015 - Kernel dump analysis
Kernel Recipes 2015 - Kernel dump analysisKernel Recipes 2015 - Kernel dump analysis
Kernel Recipes 2015 - Kernel dump analysisAnne Nicolas
 

Semelhante a Fosdem 2014 - MySQL & Friends Devroom: 15 tips galera cluster (20)

Not Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in ActionNot Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
Not Less, Not More: Exactly Once, Large-Scale Stream Processing in Action
 
Have you been stalking your servers?
Have you been stalking your servers?Have you been stalking your servers?
Have you been stalking your servers?
 
Intro to Kernel Debugging - Just make the crashing stop!
Intro to Kernel Debugging - Just make the crashing stop!Intro to Kernel Debugging - Just make the crashing stop!
Intro to Kernel Debugging - Just make the crashing stop!
 
Ruby and Rails Packaging to Production
Ruby and Rails Packaging to ProductionRuby and Rails Packaging to Production
Ruby and Rails Packaging to Production
 
Game Changing Dependency Management
Game Changing Dependency ManagementGame Changing Dependency Management
Game Changing Dependency Management
 
Chef Workshop: Setup Environment with Chef,Vagrant, and Berkshelf
Chef Workshop: Setup Environment with Chef,Vagrant, and BerkshelfChef Workshop: Setup Environment with Chef,Vagrant, and Berkshelf
Chef Workshop: Setup Environment with Chef,Vagrant, and Berkshelf
 
Backups
BackupsBackups
Backups
 
Percona Live 2022 - The Evolution of a MySQL Database System
Percona Live 2022 - The Evolution of a MySQL Database SystemPercona Live 2022 - The Evolution of a MySQL Database System
Percona Live 2022 - The Evolution of a MySQL Database System
 
infra-as-code
infra-as-codeinfra-as-code
infra-as-code
 
Hooks and Events in Drupal 8
Hooks and Events in Drupal 8Hooks and Events in Drupal 8
Hooks and Events in Drupal 8
 
Git inter-snapshot public
Git  inter-snapshot publicGit  inter-snapshot public
Git inter-snapshot public
 
Exadata db node update
Exadata db node updateExadata db node update
Exadata db node update
 
PuppetConf 2016: Deconfiguration Management: Making Puppet Clean Up Its Own M...
PuppetConf 2016: Deconfiguration Management: Making Puppet Clean Up Its Own M...PuppetConf 2016: Deconfiguration Management: Making Puppet Clean Up Its Own M...
PuppetConf 2016: Deconfiguration Management: Making Puppet Clean Up Its Own M...
 
MariaDB10.7_install_Ubuntu.docx
MariaDB10.7_install_Ubuntu.docxMariaDB10.7_install_Ubuntu.docx
MariaDB10.7_install_Ubuntu.docx
 
Install tomcat 5.5 in debian os and deploy war file
Install tomcat 5.5 in debian os and deploy war fileInstall tomcat 5.5 in debian os and deploy war file
Install tomcat 5.5 in debian os and deploy war file
 
Creating "Secure" PHP applications, Part 2, Server Hardening
Creating "Secure" PHP applications, Part 2, Server HardeningCreating "Secure" PHP applications, Part 2, Server Hardening
Creating "Secure" PHP applications, Part 2, Server Hardening
 
Scalable Deployment Architectures with TYPO3 Surf, Git and Jenkins
Scalable Deployment Architectures with TYPO3 Surf, Git and JenkinsScalable Deployment Architectures with TYPO3 Surf, Git and Jenkins
Scalable Deployment Architectures with TYPO3 Surf, Git and Jenkins
 
MySQL Galera 集群
MySQL Galera 集群MySQL Galera 集群
MySQL Galera 集群
 
Fun with Ruby and Cocoa
Fun with Ruby and CocoaFun with Ruby and Cocoa
Fun with Ruby and Cocoa
 
Kernel Recipes 2015 - Kernel dump analysis
Kernel Recipes 2015 - Kernel dump analysisKernel Recipes 2015 - Kernel dump analysis
Kernel Recipes 2015 - Kernel dump analysis
 

Mais de Frederic Descamps

MySQL Innovation & Cloud Day - Document Store avec MySQL HeatWave Database Se...
MySQL Innovation & Cloud Day - Document Store avec MySQL HeatWave Database Se...MySQL Innovation & Cloud Day - Document Store avec MySQL HeatWave Database Se...
MySQL Innovation & Cloud Day - Document Store avec MySQL HeatWave Database Se...Frederic Descamps
 
MySQL Day Roma - MySQL Shell and Visual Studio Code Extension
MySQL Day Roma - MySQL Shell and Visual Studio Code ExtensionMySQL Day Roma - MySQL Shell and Visual Studio Code Extension
MySQL Day Roma - MySQL Shell and Visual Studio Code ExtensionFrederic Descamps
 
RivieraJUG - MySQL Indexes and Histograms
RivieraJUG - MySQL Indexes and HistogramsRivieraJUG - MySQL Indexes and Histograms
RivieraJUG - MySQL Indexes and HistogramsFrederic Descamps
 
RivieraJUG - MySQL 8.0 - What's new for developers.pdf
RivieraJUG - MySQL 8.0 - What's new for developers.pdfRivieraJUG - MySQL 8.0 - What's new for developers.pdf
RivieraJUG - MySQL 8.0 - What's new for developers.pdfFrederic Descamps
 
MySQL User Group NL - MySQL 8
MySQL User Group NL - MySQL 8MySQL User Group NL - MySQL 8
MySQL User Group NL - MySQL 8Frederic Descamps
 
State of the Dolphin - May 2022
State of the Dolphin - May 2022State of the Dolphin - May 2022
State of the Dolphin - May 2022Frederic Descamps
 
Percona Live 2022 - MySQL Shell for Visual Studio Code
Percona Live 2022 - MySQL Shell for Visual Studio CodePercona Live 2022 - MySQL Shell for Visual Studio Code
Percona Live 2022 - MySQL Shell for Visual Studio CodeFrederic Descamps
 
Percona Live 2022 - MySQL Architectures
Percona Live 2022 - MySQL ArchitecturesPercona Live 2022 - MySQL Architectures
Percona Live 2022 - MySQL ArchitecturesFrederic Descamps
 
LinuxFest Northwest 2022 - The Evolution of a MySQL Database System
LinuxFest Northwest 2022 - The Evolution of a MySQL Database SystemLinuxFest Northwest 2022 - The Evolution of a MySQL Database System
LinuxFest Northwest 2022 - The Evolution of a MySQL Database SystemFrederic Descamps
 
Open Source 101 2022 - MySQL Indexes and Histograms
Open Source 101 2022 - MySQL Indexes and HistogramsOpen Source 101 2022 - MySQL Indexes and Histograms
Open Source 101 2022 - MySQL Indexes and HistogramsFrederic Descamps
 
Pi Day 2022 - from IoT to MySQL HeatWave Database Service
Pi Day 2022 -  from IoT to MySQL HeatWave Database ServicePi Day 2022 -  from IoT to MySQL HeatWave Database Service
Pi Day 2022 - from IoT to MySQL HeatWave Database ServiceFrederic Descamps
 
Confoo 2022 - le cycle d'une instance MySQL
Confoo 2022  - le cycle d'une instance MySQLConfoo 2022  - le cycle d'une instance MySQL
Confoo 2022 - le cycle d'une instance MySQLFrederic Descamps
 
FOSDEM 2022 MySQL Devroom: MySQL 8.0 - Logical Backups, Snapshots and Point-...
FOSDEM 2022 MySQL Devroom:  MySQL 8.0 - Logical Backups, Snapshots and Point-...FOSDEM 2022 MySQL Devroom:  MySQL 8.0 - Logical Backups, Snapshots and Point-...
FOSDEM 2022 MySQL Devroom: MySQL 8.0 - Logical Backups, Snapshots and Point-...Frederic Descamps
 
Les nouveautés de MySQL 8.0
Les nouveautés de MySQL 8.0Les nouveautés de MySQL 8.0
Les nouveautés de MySQL 8.0Frederic Descamps
 
Les nouveautés de MySQL 8.0
Les nouveautés de MySQL 8.0Les nouveautés de MySQL 8.0
Les nouveautés de MySQL 8.0Frederic Descamps
 
State of The Dolphin - May 2021
State of The Dolphin - May 2021State of The Dolphin - May 2021
State of The Dolphin - May 2021Frederic Descamps
 
Deploying Magento on OCI with MDS
Deploying Magento on OCI with MDSDeploying Magento on OCI with MDS
Deploying Magento on OCI with MDSFrederic Descamps
 
From single MySQL instance to High Availability: the journey to MySQL InnoDB ...
From single MySQL instance to High Availability: the journey to MySQL InnoDB ...From single MySQL instance to High Availability: the journey to MySQL InnoDB ...
From single MySQL instance to High Availability: the journey to MySQL InnoDB ...Frederic Descamps
 

Mais de Frederic Descamps (20)

MySQL Innovation & Cloud Day - Document Store avec MySQL HeatWave Database Se...
MySQL Innovation & Cloud Day - Document Store avec MySQL HeatWave Database Se...MySQL Innovation & Cloud Day - Document Store avec MySQL HeatWave Database Se...
MySQL Innovation & Cloud Day - Document Store avec MySQL HeatWave Database Se...
 
MySQL Day Roma - MySQL Shell and Visual Studio Code Extension
MySQL Day Roma - MySQL Shell and Visual Studio Code ExtensionMySQL Day Roma - MySQL Shell and Visual Studio Code Extension
MySQL Day Roma - MySQL Shell and Visual Studio Code Extension
 
RivieraJUG - MySQL Indexes and Histograms
RivieraJUG - MySQL Indexes and HistogramsRivieraJUG - MySQL Indexes and Histograms
RivieraJUG - MySQL Indexes and Histograms
 
RivieraJUG - MySQL 8.0 - What's new for developers.pdf
RivieraJUG - MySQL 8.0 - What's new for developers.pdfRivieraJUG - MySQL 8.0 - What's new for developers.pdf
RivieraJUG - MySQL 8.0 - What's new for developers.pdf
 
MySQL User Group NL - MySQL 8
MySQL User Group NL - MySQL 8MySQL User Group NL - MySQL 8
MySQL User Group NL - MySQL 8
 
State of the Dolphin - May 2022
State of the Dolphin - May 2022State of the Dolphin - May 2022
State of the Dolphin - May 2022
 
Percona Live 2022 - MySQL Shell for Visual Studio Code
Percona Live 2022 - MySQL Shell for Visual Studio CodePercona Live 2022 - MySQL Shell for Visual Studio Code
Percona Live 2022 - MySQL Shell for Visual Studio Code
 
Percona Live 2022 - MySQL Architectures
Percona Live 2022 - MySQL ArchitecturesPercona Live 2022 - MySQL Architectures
Percona Live 2022 - MySQL Architectures
 
LinuxFest Northwest 2022 - The Evolution of a MySQL Database System
LinuxFest Northwest 2022 - The Evolution of a MySQL Database SystemLinuxFest Northwest 2022 - The Evolution of a MySQL Database System
LinuxFest Northwest 2022 - The Evolution of a MySQL Database System
 
Open Source 101 2022 - MySQL Indexes and Histograms
Open Source 101 2022 - MySQL Indexes and HistogramsOpen Source 101 2022 - MySQL Indexes and Histograms
Open Source 101 2022 - MySQL Indexes and Histograms
 
Pi Day 2022 - from IoT to MySQL HeatWave Database Service
Pi Day 2022 -  from IoT to MySQL HeatWave Database ServicePi Day 2022 -  from IoT to MySQL HeatWave Database Service
Pi Day 2022 - from IoT to MySQL HeatWave Database Service
 
Confoo 2022 - le cycle d'une instance MySQL
Confoo 2022  - le cycle d'une instance MySQLConfoo 2022  - le cycle d'une instance MySQL
Confoo 2022 - le cycle d'une instance MySQL
 
FOSDEM 2022 MySQL Devroom: MySQL 8.0 - Logical Backups, Snapshots and Point-...
FOSDEM 2022 MySQL Devroom:  MySQL 8.0 - Logical Backups, Snapshots and Point-...FOSDEM 2022 MySQL Devroom:  MySQL 8.0 - Logical Backups, Snapshots and Point-...
FOSDEM 2022 MySQL Devroom: MySQL 8.0 - Logical Backups, Snapshots and Point-...
 
Les nouveautés de MySQL 8.0
Les nouveautés de MySQL 8.0Les nouveautés de MySQL 8.0
Les nouveautés de MySQL 8.0
 
Les nouveautés de MySQL 8.0
Les nouveautés de MySQL 8.0Les nouveautés de MySQL 8.0
Les nouveautés de MySQL 8.0
 
State of The Dolphin - May 2021
State of The Dolphin - May 2021State of The Dolphin - May 2021
State of The Dolphin - May 2021
 
MySQL Shell for DBAs
MySQL Shell for DBAsMySQL Shell for DBAs
MySQL Shell for DBAs
 
Deploying Magento on OCI with MDS
Deploying Magento on OCI with MDSDeploying Magento on OCI with MDS
Deploying Magento on OCI with MDS
 
MySQL Router REST API
MySQL Router REST APIMySQL Router REST API
MySQL Router REST API
 
From single MySQL instance to High Availability: the journey to MySQL InnoDB ...
From single MySQL instance to High Availability: the journey to MySQL InnoDB ...From single MySQL instance to High Availability: the journey to MySQL InnoDB ...
From single MySQL instance to High Availability: the journey to MySQL InnoDB ...
 

Último

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 

Último (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Fosdem 2014 - MySQL & Friends Devroom: 15 tips galera cluster

  • 1. #Fosdem 2014 #MySQL & Friends #devroom 15 Tips to improve your Galera Cluster Experience
  • 2. Who am I ? Frédéric Descamps “lefred” @lefred http://about.me/lefred Percona Consultant since 2011 Managing MySQL since 3.23 devops believer I installed my first galera cluster in feb 2010 2/5/14
  • 3. Who am I ? Frédéric Descamps “lefred” @lefred http://about.me/lefred Percona Consultant since 2011 Managing MySQL since 3.23 devops believer I installed my first galera cluster in feb 2010 2/5/14
  • 5. 15
  • 6. How to perform point in time recovery ? Binary log must be enabled log­slave­updates should be enabled 2/5/14
  • 9. Suddenly ! Oups ! Dim0 truncated a production table... :-S We can have 2 scenarios : – – 2/5/14 The application can keep running even without that table The application musts be stopped !
  • 10. Suddenly ! Oups ! Dim0 truncated a production table... :-S We can have 2 scenarios : – – 2/5/14 The application can keep running even without that table The application musts be stopped !
  • 11. Suddenly ! Oups ! Dim0 truncated a production table... :-S We can have 2 scenarios : – – 2/5/14 The application can keep running even without that table The application musts be stopped !
  • 12. Suddenly ! Oups ! Dim0 truncated a production table... :-S We can have 2 scenarios : – – 2/5/14 The application can keep running even without that table The application musts be stopped !
  • 13. Scenario 1: application must be stopped ! 2/5/14
  • 14. Scenario 1: application must be stopped ! We have Xtrabackup (and it creates daily backups!) We have binary logs These are the steps : – Stop the each node of the cluster 2/5/14
  • 15. Scenario 1: application must be stopped ! We have Xtrabackup (and it creates daily backups!) We have binary logs These are the steps : – Stop the each node of the cluster 2/5/14
  • 16. Scenario 1: application must be stopped ! We have Xtrabackup (and it creates daily backups!) We have binary logs These are the steps : – Stop the each node of the cluster 2/5/14
  • 17. Scenario 1: application must be stopped ! We have Xtrabackup (and it creates daily backups!) We have binary logs These are the steps : – Stop the each node of the cluster 2/5/14
  • 18. Scenario 1: application must be stopped ! We have Xtrabackup (and it creates daily backups!) We have binary logs These are the steps : – Stop the each node of the cluster /etc/init.d/mysql stop or service mysql stop 2/5/14
  • 19. Scenario 1: application must be stopped ! We have Xtrabackup (and it creates daily backups!) We have binary logs These are the steps : – Stop the each node of the cluster – Find the binlog file and position before “the event” happened 2/5/14
  • 20. Scenario 1: application must be stopped ! We have Xtrabackup (and it creates daily backups!) We have binary logs These are the steps : – Stop the each node of the cluster – Find the binlog file and position before “the event” happened # mysqlbinlog binlog.000001 | grep truncate ­B 2 #140123 23:37:03 server id 1  end_log_pos 1224  Query thread_id=4 exec_time=0 error_code=0 SET TIMESTAMP=1390516623/*!*/; truncate table speakers 2/5/14
  • 21. Scenario 1: application must be stopped ! We have Xtrabackup (and it creates daily backups!) We have binary logs These are the steps : – Stop the each node of the cluster – Find the binlog file and position before “the event” happened – Restore the backup on one node 2/5/14
  • 22. Scenario 1: application must be stopped ! We have Xtrabackup (and it creates daily backups!) # cp binlog.00001 ~ # innobackupex ­­apply­log .   We etc..have binary logs These are the steps : – Stop the each node of the cluster – Find the binlog file and position before “the event” happened – Restore the backup on one node 2/5/14
  • 23. Scenario 1: application must be stopped ! We have Xtrabackup (and it creates daily backups!) # /etc/init.d/mysql bootstrap­pxc We have binary logs These are the steps : – Stop the each node of the cluster – Find the binlog file and position before “the event” happened – Restore the backup on one node – Restart that node (being sure the application doesn't connect to it) 2/5/14
  • 24. Scenario 1: application must be stopped ! (2) 2/5/14
  • 25. Scenario 1: application must be stopped ! (2) – 2/5/14 Replay all the binary logs since the backup BUT the position of the event
  • 26. Scenario 1: application must be stopped ! (2) – Replay all the binary logs since the backup BUT the position of the event # cat xtrabackup_binlog_info Binlog.000001   565 # mysqlbinlog binlog.000001 | grep end_log_pos |  grep 1224 ­B 1 #140123 23:36:53 server id 1  end_log_pos 1136  #140123 23:37:03 server id 1  end_log_pos 1224  # mysqlbinlog binlog.000001 ­j 565   ­­stop­position 1136 | mysql 2/5/14
  • 27. Scenario 1: application must be stopped ! (2) 2/5/14
  • 28. Scenario 1: application must be stopped ! (2) – – – 2/5/14 Replay all the binary logs since the backup BUT the position of the event Start other nodes 1 by 1 and let them perform SST Enable connections from the application
  • 29. Scenario 1: application must be stopped ! (2) – – – 2/5/14 Replay all the binary logs since the backup BUT the position of the event Start other nodes 1 by 1 and let them perform SST Enable connections from the application
  • 30. Scenario 1: application must be stopped ! (2) – – – 2/5/14 Replay all the binary logs since the backup BUT the position of the event Start other nodes 1 by 1 and let them perform SST Enable connections from the application
  • 31. Scenario 2: application can keep running 2/5/14
  • 32. Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!) We have binary logs These are the steps : – Take care of quorum (add garbd, change pc.weight, pc.ignore_quorum) – – 2/5/14 Find the binlog file and position before “the event” happened (thank you dim0!) Remove one node from the cluster (and be sure the app doesn't connect to it, load-balancer...)
  • 33. Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!) We have binary logs These are the steps : – Take care of quorum (add garbd, change pc.weight, pc.ignore_quorum) – – 2/5/14 Find the binlog file and position before “the event” happened (thank you dim0!) Remove one node from the cluster (and be sure the app doesn't connect to it, load-balancer...)
  • 34. Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!) We have binary logs These are the steps : – Take care of quorum (add garbd, change pc.weight, pc.ignore_quorum) – – 2/5/14 Find the binlog file and position before “the event” happened (thank you dim0!) Remove one node from the cluster (and be sure the app doesn't connect to it, load-balancer...)
  • 35. Scenario 2: application can keep running We have Xtrabackup (and it creates daily backups!) We have binary logs These are the steps : – Take care of quorum (add garbd, change pc.weight, pc.ignore_quorum) – – 2/5/14 Find the binlog file and position before “the event” happened (thank you dim0!) Remove one node from the cluster (and be sure the app doesn't connect to it, load-balancer...)
  • 36. Scenario 2: application can keep running (2) 2/5/14
  • 37. Scenario 2: application can keep running (2) – – – – – – 2/5/14 Restore the backup on the node we stopped Start mysql without joining the cluster (--wsrepcluster-address=dummy://) Replay the binary log until the position of “the event” Export the table we need (mysqldump) Import it on the cluster Restart mysql on the off-line node and let it perform SST
  • 38. Scenario 2: application can keep running (2) – – – – – – 2/5/14 Restore the backup on the node we stopped Start mysql without joining the cluster (--wsrepcluster-address=dummy://) Replay the binary log until the position of “the event” Export the table we need (mysqldump) Import it on the cluster Restart mysql on the off-line node and let it perform SST
  • 39. Scenario 2: application can keep running (2) – – – – – – 2/5/14 Restore the backup on the node we stopped Start mysql without joining the cluster (--wsrepcluster-address=dummy://) Replay the binary log until the position of “the event” Export the table we need (mysqldump) Import it on the cluster Restart mysql on the off-line node and let it perform SST
  • 40. Scenario 2: application can keep running (2) – – – – – – 2/5/14 Restore the backup on the node we stopped Start mysql without joining the cluster (--wsrepcluster-address=dummy://) Replay the binary log until the position of “the event” Export the table we need (mysqldump) Import it on the cluster Restart mysql on the off-line node and let it perform SST
  • 41. Scenario 2: application can keep running (2) – – – – – – 2/5/14 Restore the backup on the node we stopped Start mysql without joining the cluster (--wsrepcluster-address=dummy://) Replay the binary log until the position of “the event” Export the table we need (mysqldump) Import it on the cluster Restart mysql on the off-line node and let it perform SST
  • 42. Scenario 2: application can keep running (2) – – – – – – 2/5/14 Restore the backup on the node we stopped Start mysql without joining the cluster (--wsrepcluster-address=dummy://) Replay the binary log until the position of “the event” Export the table we need (mysqldump) Import it on the cluster Restart mysql on the off-line node and let it perform SST
  • 43. 14
  • 44. Reduce “donation” time during XtraBackup SST When performing SST with Xtrabackup the donor can still be active by default this is disabled in clustercheck (AVAILABLE_WHEN_DONOR=0) Running Xtrabackup can increase the load (CPU / IO) on the server 2/5/14
  • 45. Reduce “donation” time during XtraBackup SST (2) Using Xtrabackup 2.1 features helps to reduce the time of backup on the donor [mysqld] wsrep_sst_method=xtrabackup­v2 wsrep_sst_auth=root:dim0DidItAgain [sst] streamfmt=xbstream [xtrabackup] compress compact parallel=8 compress­threads=8 2/5/14 rebuild­threads=8 compress & compact can reduce the size of payload transferred among nodes but in general it slows down the process
  • 46. 13
  • 47. Move asynchronous slave to a new master in 5.5 binlog.000039 102 writes B A binlog.000002 102 as yn c writes master_host=A master_log_file=binlog.000002 master_pos=102 2/5/14 C writes binlog.000001 402
  • 48. Move asynchronous slave to a new master in 5.5 (2) binlog.000039 102 writes B writes as yn c C master_host=B master_log_file=binlog.000002 master_pos=102 2/5/14 writes binlog.000001 402
  • 49. Move asynchronous slave to a new master in 5.5 (2) binlog.000039 102 writes B writes as yn c C master_host=B master_log_file=binlog.000002 master_pos=102 2/5/14 writes binlog.000001 402
  • 50. Move asynchronous slave to a new master in 5.5 (3) 2/5/14
  • 51. Move asynchronous slave to a new master in 5.5 (3) How can we know which file and position need to be used by the async slave ? Find the last received Xid in the relay log on the async slave (using mysqlbinlog) 2/5/14
  • 52. Move asynchronous slave to a new master in 5.5 (3) How can we know which file and position need to be used by the async slave ? Find the last received Xid in the relay log on the async slave (using mysqlbinlog) 2/5/14
  • 53. Move asynchronous slave to a new master in 5.5 (3) How can we know which file and position need to be used by the async slave ? Find the last received Xid in the relay log on the async slave (using mysqlbinlog) # mysqlbinlog  percona4­relay­bin.000002 | tail MjM5NDMxMDMxOTEtNTI4NzYxMTUxMDctMTM3NTAyNTI2NjUtNTc1ODY3MTc0MTg= '/*!*/; # at 14611057 #140131 12:48:12 server id 1  end_log_pos 29105924  Xid = 30097 COMMIT/*!*/; DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/; 2/5/14
  • 54. Move asynchronous slave to a new master in 5.5 (3) 2/5/14
  • 55. Move asynchronous slave to a new master in 5.5 (3) How can we know which file and position need to be used by the async slave ? Find the last received Xid in the relay log on the async slave (using mysqlbinlog) Find in the new master which binary position matches that same Xid Use the binary log file and the position for your CHANGE MASTER statement 2/5/14
  • 56. Move asynchronous slave to a new master in 5.5 (3) How can we know which file and position need to be used by the async slave ? Find the last received Xid in the relay log on the async slave (using mysqlbinlog) Find in the new master which binary position matches that same Xid Use the binary log file and the position for your CHANGE MASTER statement 2/5/14
  • 57. Move asynchronous slave to a new master in 5.5 (3) How can we know which file and position need to be used by the async slave ? # mysqlbinlog percona3­bin.000004 | grep 'Xid = 30097' #140131 12:48:12 server id 1  end_log_pos 28911093  Xid = 30097 Find the last received Xid in the relay log on the async slave (using mysqlbinlog) Find in the new master which binary position matches that same Xid Use the binary log file and the position for your CHANGE MASTER statement 2/5/14
  • 58. Move asynchronous slave to a new master in 5.5 (3) How can we know which file and position need to be used by the async slave ? Find the last received Xid in the relay log on the async slave (using mysqlbinlog) Find in the new master which binary position matches that same Xid Use the binary log file and the position for your CHANGE MASTER statement 2/5/14
  • 59. Move asynchronous slave to a new master in 5.5 (3) How can we know which file and position need to be used by the async slave ? Async mysql> slave stop; Async mysql> change master to master_host='percona3',            ­> master_log_file='percona3­bin.000004',           ­> master_log_pos=28911093; Find the last received Xid in the relay log on the async slave (using mysqlbinlog) Async mysql> start slave; Find in the new master which binary position matches that same Xid Use the binary log file and the position for your CHANGE MASTER statement 2/5/14
  • 60. 12
  • 61. Move asynchronous slave to a new master in 5.6 2/5/14
  • 62. Move asynchronous slave to a new master in 5.6 With 5.6 and GTID it's easier ! ... but ... It requires rsync SST (binlogs are needed) Or since Jan 30th wsrep_sst_xtrabackup­v2 supports Xtrabackup 2.1.7 that makes is possible !!! Just change master ;-) 2/5/14
  • 63. Move asynchronous slave to a new master in 5.6 With 5.6 and GTID it's easier ! ... but ... It requires rsync SST (binlogs are needed) Or since Jan 30th wsrep_sst_xtrabackup­v2 supports Xtrabackup 2.1.7 that makes is possible !!! Just change master ;-) 2/5/14
  • 64. Move asynchronous slave to a new master in 5.6 With 5.6 and GTID it's easier ! ... but ... It requires rsync SST (binlogs are needed) Or since Jan 30th wsrep_sst_xtrabackup­v2 supports Xtrabackup 2.1.7 that makes is possible !!! Just change master ;-) 2/5/14
  • 65. Move asynchronous slave to a new master in 5.6 With 5.6 and GTID it's easier ! ... but ... It requires rsync SST (binlogs are needed) Or since Jan 30th wsrep_sst_xtrabackup­v2 supports Xtrabackup 2.1.7 that makes is possible !!! Just change master ;-) 2/5/14
  • 66. Move asynchronous slave to a new master in 5.6 With 5.6 and GTID it's easier ! ... but ... It requires rsync SST (binlogs are needed) Or since Jan 30th wsrep_sst_xtrabackup­v2 supports Xtrabackup 2.1.7 that makes is possible !!! Just change master ;-) 2/5/14
  • 67. 11
  • 68. Allow longer downtime for a node 2/5/14
  • 69. Allow longer downtime for a node When a node goes off-line, when it joins again the cluster, it sends its last replicated event to the donor If the donor can send all next events, IST will be performed (very fast) If not... SST is mandatory 2/5/14
  • 70. Allow longer downtime for a node When a node goes off-line, when it joins again the cluster, it sends its last replicated event to the donor If the donor can send all next events, IST will be performed (very fast) If not... SST is mandatory 2/5/14
  • 71. Allow longer downtime for a node When a node goes off-line, when it joins again the cluster, it sends its last replicated event to the donor If the donor can send all next events, IST will be performed (very fast) If not... SST is mandatory 2/5/14
  • 72. Allow longer downtime for a node (2) 2/5/14
  • 73. Allow longer downtime for a node (2) Those events are stored on a cache on disk: galera.cache The size of the cache is 128Mb by default It can be increased using gcache.size provider option: 2/5/14
  • 74. Allow longer downtime for a node (2) Those events are stored on a cache on disk: galera.cache The size of the cache is 128Mb by default It can be increased using gcache.size provider option: 2/5/14
  • 75. Allow longer downtime for a node (2) Those events are stored on a cache on disk: galera.cache The size of the cache is 128Mb by default It can be increased using gcache.size provider option: 2/5/14
  • 76. Allow longer downtime for a node (2) Those events are stored on a cache on disk: galera.cache The size of the cache is 128Mb by default It can be increased using gcache.size provider option: In /etc/my.cnf: wsrep_provider_options = “gcache.size=1G” 2/5/14
  • 77. 10
  • 78. Then choose the right donor ! Let's imagine this: Event 1 B C A writes 2/5/14 Event 1 Event 1
  • 79. Then choose the right donor ! (2) Let's imagine this: Event 1 Event 2 B C A writes 2/5/14 Event 1 Event 1 Event 2
  • 80. Then choose the right donor ! (3) Let's imagine this: Event 1 Event 2 Join: last event = 1 B C A writes 2/5/14 Event 1 Event 2 Event 1
  • 81. Then choose the right donor ! (4) Let's imagine this: Event 1 Event 2 IST B C A writes 2/5/14 Event 1 Event 2 Event 1
  • 82. Then choose the right donor ! (5) Let's imagine this: Event 1 Event 2 B C A writes 2/5/14 Event 1 Event 2 Event 1 Event 2
  • 83. Then choose the right donor ! (6) Let's imagine this: Event 1 Event 2 Let's format the disk B C A writes 2/5/14 Event 1 Event 2
  • 84. Then choose the right donor ! (7) Let's imagine this: Event 1 Event 2 Join: no cluster info B C A writes 2/5/14 Event 1 Event 2
  • 85. Then choose the right donor ! (8) Full SST needed Event 1 Event 2 SST B C A writes 2/5/14 Event 1 Event 2
  • 86. Then choose the right donor ! (9) This is what we have now: Event 1 Event 2 Event 3 B C A writes 2/5/14 Event 1 Event 2 Event 3 Event 3
  • 87. Then choose the right donor ! (10) Let's remove node B for maintenance Event 1 Event 2 Event 3 B C A writes 2/5/14 Event 1 Event 2 Event 3 Event 4 Event 3 Event 4
  • 88. Then choose the right donor ! (11) Now let's remove node C to replace a disk :-( Event 1 Event 2 Event 3 B C A writes 2/5/14 Event 1 Event 2 Event 3 Event 4 Event 5
  • 89. Then choose the right donor ! (12) Node C joins again and performs SST Event 1 Event 2 Event 3 B C A writes 2/5/14 Event 1 Event 2 Event 3 Event 4 Event 5
  • 90. Then choose the right donor ! (12) Node C joins again and performs SST Event 1 Event 2 Event 3 B C A writes 2/5/14 Event 1 Event 2 Event 3 Event 4 Event 5 Event 6 Event 6
  • 91. Then choose the right donor ! (13) Node B joins again but donor selection is not clever yet... Event 1 Event 2 Event 3 Join: last event = 3 B C A writes 2/5/14 Event 1 Event 2 Event 3 Event 4 Event 5 Event 6 Event 6
  • 92. Then choose the right donor ! (13) Node B joins again but donor selection is not clever yet... Event 1 Event 2 Event 3 Join: last event = 3 B SST will be needed ! C A writes 2/5/14 Event 1 Event 2 Event 3 Event 4 Event 5 Event 6 Event 6
  • 93. Then choose the right donor ! (14) So how to tell node B that it needs to use node A? Event 1 Event 2 Event 3 Join: last event = 3 B # /etc/init.d/mysql start ­­wsrep­sst_donor=nodeA C A writes 2/5/14 Event 1 Event 2 Event 3 Event 4 Event 5 Event 6 Event 6
  • 94. Then choose the right donor ! (15) 2/5/14
  • 95. Then choose the right donor ! (15) With 5.6 you have now the possibility to know the lowest sequence number in gcache using wsrep_local_cached_downto To know the latest event's sequence number on the node that joins the cluster, you have two possibilities: 2/5/14
  • 96. Then choose the right donor ! (15) With 5.6 you have now the possibility to know the lowest sequence number in gcache using wsrep_local_cached_downto To know the latest event's sequence number on the node that joins the cluster, you have two possibilities: 2/5/14
  • 97. Then choose the right donor ! (15) With 5.6 you have now the possibility to know the lowest sequence number in gcache using wsrep_local_cached_downto To know the latest event's sequence number on the node that joins the cluster, you have two possibilities: # cat grasdate.dat # GALERA saved state version: 2.1 uuid:    41920174­7ec6­11e3­a05a­6a2ab4033f05 seqno:   11 cert_index: 2/5/14
  • 98. Then choose the right donor ! (15) With 5.6 you have now the possibility to know the lowest sequence number in gcache using wsrep_local_cached_downto To know the latest event's sequence number on the node that joins the cluster, you have two possibilities: # cat grasdate.dat # GALERA saved state version: 2.1 uuid:    41920174­7ec6­11e3­a05a­6a2ab4033f05 seqno:   11 cert_index: 2/5/14
  • 99. Then choose the right donor ! (15) With 5.6 you have now the possibility to know the lowest sequence number in gcache using wsrep_local_cached_downto To know the latest event's sequence number on the node that joins the cluster, you have two possibilities: # mysqld_safe ­­wsrep­recover 140124 10:46:32 mysqld_safe Logging to '/var/lib/mysql/percona1_error.log'. 140124 10:46:32 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql 140124 10:46:32 mysqld_safe Skipping wsrep­recover  for 41920174­7ec6­11e3­a05a­6a2ab4033f05:11 pair 140124 10:46:32 mysqld_safe Assigning 41920174­7ec6­11e3­a05a­6a2ab4033f05:11  to wsrep_start_position 140124 10:46:34 mysqld_safe mysqld from pid file /var/lib/mysql/percona1.pid ended 2/5/14
  • 100. Then choose the right donor ! (15) With 5.6 you have now the possibility to know the lowest sequence number in gcache using wsrep_local_cached_downto To know the latest event's sequence number on the node that joins the cluster, you have two possibilities: # mysqld_safe ­­wsrep­recover 140124 10:46:32 mysqld_safe Logging to '/var/lib/mysql/percona1_error.log'. 140124 10:46:32 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql 140124 10:46:32 mysqld_safe Skipping wsrep­recover  for 41920174­7ec6­11e3­a05a­6a2ab4033f05:11 pair 140124 10:46:32 mysqld_safe Assigning 41920174­7ec6­11e3­a05a­6a2ab4033f05:11  to wsrep_start_position 140124 10:46:34 mysqld_safe mysqld from pid file /var/lib/mysql/percona1.pid ended 2/5/14
  • 101. 9
  • 102. Measuring Max Replication Throughput 2/5/14
  • 103. Measuring Max Replication Throughput Since (5.5.33) wsrep_desync can be used to find out how fast a node can replicate The process is to collect the amount of transactions (events) during peak time for a define time range (let's take 1 min) 2/5/14
  • 104. Measuring Max Replication Throughput Since (5.5.33) wsrep_desync can be used to find out how fast a node can replicate The process is to collect the amount of transactions (events) during peak time for a define time range (let's take 1 min) 2/5/14
  • 105. Measuring Max Replication Throughput Since (5.5.33) wsrep_desync can be used to find out how fast a node can replicate The process is to collect the amount of transactions (events) during peak time for a define time range (let's take 1 min) mysql> pager grep wsrep mysql> show global status like 'wsrep_last_committed';      ­> select sleep(60);      ­> show global status like 'wsrep_last_committed'; | wsrep_last_committed | 61472 | | wsrep_last_committed | 69774 | 2/5/14
  • 106. Measuring Max Replication Throughput Since (5.5.33) wsrep_desync can be used to find out how fast a node can replicate The process is to collect the amount of transactions (events) during peak69774 – 61472 = 8302 time for a define 8302 / 60 = 138.36 tps time range (let's take 1 min) mysql> pager grep wsrep mysql> show global status like 'wsrep_last_committed';      ­> select sleep(60);      ­> show global status like 'wsrep_last_committed'; | wsrep_last_committed | 61472 | | wsrep_last_committed | 69774 | 2/5/14
  • 107. Measuring Max Replication Throughput 2/5/14
  • 108. Measuring Max Replication Throughput Since (5.5.33) wsrep_desync can be used to find out how fast a node can replicate The process is to collect the amount of transactions (events) during peak time for a define time range (let's take 1 min) Then collect the amount of transactions and the duration to process them after the node was in desync mode and not allowing writes In desync mode, the node doesn't sent flow control messages to the cluster 2/5/14
  • 109. Measuring Max Replication Throughput Since (5.5.33) wsrep_desync can be used to find out how fast a node can replicate The process is to collect the amount of transactions (events) during peak time for a define time range (let's take 1 min) Then collect the amount of transactions and the duration to process them after the node was in desync mode and not allowing writes In desync mode, the node doesn't sent flow control messages to the cluster 2/5/14
  • 110. Measuring Max Replication Throughput Since (5.5.33) wsrep_desync can be used to find out how fast a node can replicate The process is to collect the amount of transactions (events) during peak time for a define time range (let's take 1 min) Then collect the amount of transactions and the duration to process them after the node was in desync mode and not allowing writes In desync mode, the node doesn't sent flow control messages to the cluster 2/5/14
  • 111. Measuring Max Replication Throughput set global wsrep_desync=ON; flush tables with read lock;  show global status like 'wsrep_last_committed';  select sleep( 60 ); unlock tables; Since (5.5.33) wsrep_desync can be used to find out how fast a node can replicate +­­­­­­­­­­­­­­­­­­­­­­+­­­­­­­­+ | Variable_name        | Value  | +­­­­­­­­­­­­­­­­­­­­­­+­­­­­­­­+ | wsrep_last_committed | 145987 | +­­­­­­­­­­­­­­­­­­­­­­+­­­­­­­­+ The process is to collect the amount of transactions (events) during peak time for a define time range (let's take 1 min) Then collect the amount of transactions and the duration to process them after the node was in desync mode and not allowing writes In desync mode, the node doesn't sent flow control messages to the cluster 2/5/14
  • 112. Measuring Max Replication Throughput In another terminal you run myq_gadget and when wsrep_local_recv_queue (Queue  Dn) is back to 0 check again the value of wsrep_last_committed. 2/5/14
  • 113. Measuring Max Replication Throughput In another terminal you run myq_gadget and when wsrep_local_recv_queue (Queue  Dn) is back to 0 check again the value of wsrep_last_committed. LefredPXC / percona3 / Galera 2.8(r165) Wsrep    Cluster  Node     Queue   Ops     Bytes     Flow    Conflct PApply        Commit      time P cnf  #  cmt sta  Up  Dn  Up  Dn   Up   Dn pau snt lcf bfa dst oooe oool wind 13:25:24 P   7  3 Dono T/T   0  8k   0   0    0    0 0.0   0   0   0 125    0    0    0 13:25:25 P   7  3 Dono T/T   0  8k   0 197    0 300K 0.0   0   0   0 145   90    0    2 ... 13:26:46 P   7  3 Dono T/T   0   7   0 209    0 318K 0.0   0   0   0 139   62    0    1 13:26:47 P   7  3 Dono T/T   0   0   0 148    0 222K 0.0   0   0   0 140   40    0    1 2/5/14
  • 114. Measuring Max Replication Throughput In another terminal you run myq_gadget and when wsrep_local_recv_queue (Queue  Dn) is back to 0 check again the value of wsrep_last_committed. This is when FTWRL  is released LefredPXC / percona3 / Galera 2.8(r165) Wsrep    Cluster  Node     Queue   Ops     Bytes     Flow    Conflct PApply        Commit      time P cnf  #  cmt sta  Up  Dn  Up  Dn   Up   Dn pau snt lcf bfa dst oooe oool wind 13:25:24 P   7  3 Dono T/T   0  8k   0   0    0    0 0.0   0   0   0 125    0    0    0 13:25:25 P   7  3 Dono T/T   0  8k   0 197    0 300K 0.0   0   0   0 145   90    0    2 ... 13:26:46 P   7  3 Dono T/T   0   7   0 209    0 318K 0.0   0   0   0 139   62    0    1 13:26:47 P   7  3 Dono T/T   0   0   0 148    0 222K 0.0   0   0   0 140   40    0    1 2/5/14
  • 115. Measuring Max Replication Throughput In another terminal you run myq_gadget and when wsrep_local_recv_queue (Queue  This is when galera Dn) is back to 0 check again the value of catch up. wsrep_last_committed = 165871 wsrep_last_committed. This is when FTWRL  is released LefredPXC / percona3 / Galera 2.8(r165) Wsrep    Cluster  Node     Queue   Ops     Bytes     Flow    Conflct PApply        Commit      time P cnf  #  cmt sta  Up  Dn  Up  Dn   Up   Dn pau snt lcf bfa dst oooe oool wind 13:25:24 P   7  3 Dono T/T   0  8k   0   0    0    0 0.0   0   0   0 125    0    0    0 13:25:25 P   7  3 Dono T/T   0  8k   0 197    0 300K 0.0   0   0   0 145   90    0    2 ... 13:26:46 P   7  3 Dono T/T   0   7   0 209    0 318K 0.0   0   0   0 139   62    0    1 13:26:47 P   7  3 Dono T/T   0   0   0 148    0 222K 0.0   0   0   0 140   40    0    1 2/5/14
  • 116. Measuring Max Replication Throughput In another terminal you run myq_gadget and 165871 ­ 145987  = 19884 when wsrep_local_recv_queue (Queue  19884 / 82 = 242.48 tps Dn) is back toWe're currently at 57% value of 0 check again the of our capacity wsrep_last_committed. LefredPXC / percona3 / Galera 2.8(r165) Wsrep    Cluster  Node     Queue   Ops     Bytes     Flow    Conflct PApply        Commit      time P cnf  #  cmt sta  Up  Dn  Up  Dn   Up   Dn pau snt lcf bfa dst oooe oool wind 13:25:24 P   7  3 Dono T/T   0  8k   0   0    0    0 0.0   0   0   0 125    0    0    0 13:25:25 P   7  3 Dono T/T   0  8k   0 197    0 300K 0.0   0   0   0 145   90    0    2 ... 13:26:46 P   7  3 Dono T/T   0   7   0 209    0 318K 0.0   0   0   0 139   62    0    1 13:26:47 P   7  3 Dono T/T   0   0   0 148    0 222K 0.0   0   0   0 140   40    0    1 2/5/14
  • 117. 8
  • 118. Multicast replication By default, galera uses unicast TCP 1 copy of the replication message sent to all other nodes in the cluster 2/5/14
  • 119. Multicast replication (2) By default, galera uses unicast TCP 1 copy of the replication message sent to all other nodes in the cluster More nodes, more bandwidth 2/5/14
  • 121. Multicast replication (3) If your network supports it you can use Multicast UDP for replication wsrep_provider_options =  “gmacast.mcast_addr =  239.192.0.11” wsrep_cluster_cluster_addr ess = gcomm://239.192.0.11 2/5/14
  • 122. Multicast replication (3) If your network supports it you can use Multicast UDP for replication wsrep_provider_options =  “gmacast.mcast_addr =  239.192.0.11” wsrep_cluster_cluster_addr ess = gcomm://239.192.0.11 2/5/14
  • 123. Multicast replication (3) If your network supports it you can use Multicast UDP for replication wsrep_provider_options =  “gmacast.mcast_addr =  239.192.0.11” wsrep_cluster_cluster_addr ess = gcomm://239.192.0.11 2/5/14
  • 124. 7
  • 126. SSL everywhere ! It's possible to have the Galera replication encrypted via SSL But now it's also possible to have SST over SSL, with xtrabackup_v2 and with rsync https://github.com/tobz/galera-secure-rsync 2/5/14
  • 127. SSL everywhere ! It's possible to have the Galera replication encrypted via SSL But now it's also possible to have SST over SSL, with xtrabackup_v2 and with rsync https://github.com/tobz/galera-secure-rsync 2/5/14
  • 128. SSL everywhere ! It's possible to have the Galera replication encrypted via SSL But now it's also possible to have SST over SSL, with xtrabackup_v2 and with rsync https://github.com/tobz/galera-secure-rsync 2/5/14
  • 129. SSL everywhere : certs creation 2/5/14
  • 130. SSL everywhere : certs creation openssl req ­new ­x500 ­days  365000 ­nodes ­keyout key.pem  ­out cert.pem Same cert and key must be copied on all nodes Copy them in /etc/mysql for example and let only mysql read them 2/5/14
  • 131. SSL everywhere : certs creation openssl req ­new ­x500 ­days  365000 ­nodes ­keyout key.pem  ­out cert.pem Same cert and key must be copied on all nodes Copy them in /etc/mysql for example and let only mysql read them 2/5/14
  • 132. SSL everywhere : certs creation openssl req ­new ­x500 ­days  365000 ­nodes ­keyout key.pem  ­out cert.pem Same cert and key must be copied on all nodes Copy them in /etc/mysql for example and let only mysql read them 2/5/14
  • 133. SSL everywhere : galera configuration 2/5/14
  • 134. SSL everywhere : galera configuration wsrep_provider_options =  “socket.ssl.cert=/etc/mysql/cert.pem;  socket.ssl_key=/etc/mysql/key.pem” It's possible to set a remote Certificate Authority for validation (use socket.ssl_ca) All nodes must have SSL enabled 2/5/14
  • 135. SSL everywhere : galera configuration wsrep_provider_options =  “socket.ssl.cert=/etc/mysql/cert.pem;  socket.ssl_key=/etc/mysql/key.pem” It's possible to set a remote Certificate Authority for validation (use socket.ssl_ca) All nodes must have SSL enabled 2/5/14
  • 136. SSL everywhere : galera configuration wsrep_provider_options =  “socket.ssl.cert=/etc/mysql/cert.pem;  socket.ssl_key=/etc/mysql/key.pem” It's possible to set a remote Certificate Authority for validation (use socket.ssl_ca) All nodes must have SSL enabled 2/5/14
  • 137. SSL everywhere : SST configuration 2/5/14
  • 138. SSL everywhere : SST configuration As Xtrabackup 2.1 supports encryption, it's now also possible to use SSL for SST Use wsrep_sst_method=xtrabackup­v2 [sst] tkey=/etc/mysql/key.pem tcert=/etc/mysql/cert.pem encrypt=3 2/5/14
  • 139. SSL everywhere : SST configuration As Xtrabackup 2.1 supports encryption, it's now also possible to use SSL for SST Use wsrep_sst_method=xtrabackup­v2 [sst] tkey=/etc/mysql/key.pem tcert=/etc/mysql/cert.pem encrypt=3 2/5/14
  • 140. SSL everywhere : SST configuration As Xtrabackup 2.1 supports encryption, it's now also possible to use SSL for SST Use wsrep_sst_method=xtrabackup­v2 [sst] tkey=/etc/mysql/key.pem tcert=/etc/mysql/cert.pem encrypt=3 2/5/14
  • 141. SSL everywhere : SST configuration As Xtrabackup 2.1 supports encryption, it's now also possible to use SSL for SST Use wsrep_sst_method=xtrabackup­v2 [sst] tkey=/etc/mysql/key.pem tcert=/etc/mysql/cert.pem encrypt=3 2/5/14
  • 142. SSL everywhere : SST configuration As Xtrabackup 2.1 supports encryption, it's now also possible to use SSL for SST Use wsrep_sst_method=xtrabackup­v2 [sst] tkey=/etc/mysql/key.pem tcert=/etc/mysql/cert.pem encrypt=3 2/5/14
  • 143. SSL everywhere : SST configuration As Xtrabackup 2.1 supports encryption, it's now also possible to use SSL for SST Use wsrep_sst_method=xtrabackup­v2 [sst] tkey=/etc/mysql/key.pem tcert=/etc/mysql/cert.pem encrypt=3 2/5/14
  • 144. SSL everywhere : SST configuration 2/5/14
  • 145. SSL everywhere : SST configuration And for those using rsync ? galera­secure­rsync acts like wsrep_sst_rsync but secures the communication with SSL using socat. Uses also the same cert and key file wsrep_sst_method=secure_rsync https://github.com/tobz/galera-secure-rsync 2/5/14
  • 146. SSL everywhere : SST configuration And for those using rsync ? galera­secure­rsync acts like wsrep_sst_rsync but secures the communication with SSL using socat. Uses also the same cert and key file wsrep_sst_method=secure_rsync https://github.com/tobz/galera-secure-rsync 2/5/14
  • 147. SSL everywhere : SST configuration And for those using rsync ? galera­secure­rsync acts like wsrep_sst_rsync but secures the communication with SSL using socat. Uses also the same cert and key file wsrep_sst_method=secure_rsync https://github.com/tobz/galera-secure-rsync 2/5/14
  • 148. SSL everywhere : SST configuration And for those using rsync ? galera­secure­rsync acts like wsrep_sst_rsync but secures the communication with SSL using socat. Uses also the same cert and key file wsrep_sst_method=secure_rsync https://github.com/tobz/galera-secure-rsync 2/5/14
  • 149. SSL everywhere : SST configuration And for those using rsync ? galera­secure­rsync acts like wsrep_sst_rsync but secures the communication with SSL using socat. Uses also the same cert and key file wsrep_sst_method=secure_rsync https://github.com/tobz/galera-secure-rsync 2/5/14
  • 150. SSL everywhere : SST configuration And for those using rsync ? galera­secure­rsync acts like wsrep_sst_rsync but secures the communication with SSL using socat. Uses also the same cert and key file wsrep_sst_method=secure_rsync https://github.com/tobz/galera-secure-rsync 2/5/14
  • 151. 6
  • 153. Decode GRA* files When a replication failure occurs, a GRA_*.log file is created into the datadir For each of those files, a corresponding message is present in the mysql error log file Can be a false positive (bad DDL statement)... or not ! This is how you can decode the content of that file 2/5/14
  • 154. Decode GRA* files When a replication failure occurs, a GRA_*.log file is created into the datadir For each of those files, a corresponding message is present in the mysql error log file Can be a false positive (bad DDL statement)... or not ! This is how you can decode the content of that file 2/5/14
  • 155. Decode GRA* files When a replication failure occurs, a GRA_*.log file is created into the datadir For each of those files, a corresponding message is present in the mysql error log file Can be a false positive (bad DDL statement)... or not ! This is how you can decode the content of that file 2/5/14
  • 156. Decode GRA* files When a replication failure occurs, a GRA_*.log file is created into the datadir For each of those files, a corresponding message is present in the mysql error log file Can be a false positive (bad DDL statement)... or not ! This is how you can decode the content of that file 2/5/14
  • 157. Decode GRA* files (2) 2/5/14
  • 158. Decode GRA* files (2) Download a binlog header file (http://goo.gl/kYTkY2) Join the header and one GRA_*.log file: – cat GRA­header > GRA_3_3­bin.log – cat GRA_3_3.log  >> GRA_3_3­bin.log Now you can just use mysqlbinlog ­vvv and find out what the problem was ! 2/5/14
  • 159. Decode GRA* files (2) Download a binlog header file (http://goo.gl/kYTkY2) Join the header and one GRA_*.log file: – cat GRA­header > GRA_3_3­bin.log – cat GRA_3_3.log  >> GRA_3_3­bin.log Now you can just use mysqlbinlog ­vvv and find out what the problem was ! 2/5/14
  • 160. Decode GRA* files (2) Download a binlog header file (http://goo.gl/kYTkY2) Join the header and one GRA_*.log file: – cat GRA­header > GRA_3_3­bin.log – cat GRA_3_3.log  >> GRA_3_3­bin.log Now you can just use mysqlbinlog ­vvv and find out what the problem was ! 2/5/14
  • 161. Decode GRA* files (2) Download a binlog header file (http://goo.gl/kYTkY2) Join the header and one GRA_*.log file: – cat GRA­header > GRA_3_3­bin.log – cat GRA_3_3.log  >> GRA_3_3­bin.log Now you can just use mysqlbinlog ­vvv and find out what the problem was ! 2/5/14
  • 162. Decode GRA* files (2) Download a binlog header file (http://goo.gl/kYTkY2) Join the header and one GRA_*.log file: – cat GRA­header > GRA_3_3­bin.log – cat GRA_3_3.log  >> GRA_3_3­bin.log Now you can just use mysqlbinlog ­vvv and find out what the problem was ! 2/5/14
  • 163. Decode GRA* files (2) Download a binlog header file (http://goo.gl/kYTkY2) Join the header and one GRA_*.log file: – cat GRA­header > GRA_3_3­bin.log – cat GRA_3_3.log  >> GRA_3_3­bin.log Now you can just use mysqlbinlog ­vvv and find out what the problem was ! wsrep_log_conflicts = 1 wsrep_debug = 1 wsrep_provider_options = “cert.log_conflicts=1” 2/5/14
  • 164. 5
  • 165. Avoiding SST when adding a new node 2/5/14
  • 166. Avoiding SST when adding a new node It's possible to use a backup to prepare a new node. Those are the 3 prerequisites: – – – 2/5/14 use XtraBackup >= 2.0.1 the backup needs to be performed with ­­galera­info the gcache must be large enough
  • 167. Avoiding SST when adding a new node It's possible to use a backup to prepare a new node. Those are the 3 prerequisites: – – – 2/5/14 use XtraBackup >= 2.0.1 the backup needs to be performed with ­­galera­info the gcache must be large enough
  • 168. Avoiding SST when adding a new node It's possible to use a backup to prepare a new node. Those are the 3 prerequisites: – – – 2/5/14 use XtraBackup >= 2.0.1 the backup needs to be performed with ­­galera­info the gcache must be large enough
  • 169. Avoiding SST when adding a new node It's possible to use a backup to prepare a new node. Those are the 3 prerequisites: – – – 2/5/14 use XtraBackup >= 2.0.1 the backup needs to be performed with ­­galera­info the gcache must be large enough
  • 170. Avoiding SST when adding a new node It's possible to use a backup to prepare a new node. Those are the 3 prerequisites: – – – 2/5/14 use XtraBackup >= 2.0.1 the backup needs to be performed with ­­galera­info the gcache must be large enough
  • 171. Avoiding SST when adding a new node (2) 2/5/14
  • 172. Avoiding SST when adding a new node (2) Restore the backup on the new node Display the content of xtrabackup_galera_info: 5f22b204­dc6b­11e1­0800­7a9c9624dd66:23 Create the file called grastate.dat like this: #GALERA saved state version: 2.1 uuid:5f22b204­dc6b­11e1­0800­7a9c9624dd66 seqno: 23  cert_index: 2/5/14
  • 173. Avoiding SST when adding a new node (2) Restore the backup on the new node Display the content of xtrabackup_galera_info: 5f22b204­dc6b­11e1­0800­7a9c9624dd66:23 Create the file called grastate.dat like this: #GALERA saved state version: 2.1 uuid:5f22b204­dc6b­11e1­0800­7a9c9624dd66 seqno: 23  cert_index: 2/5/14
  • 174. Avoiding SST when adding a new node (2) Restore the backup on the new node Display the content of xtrabackup_galera_info: 5f22b204­dc6b­11e1­0800­7a9c9624dd66:23 Create the file called grastate.dat like this: #GALERA saved state version: 2.1 uuid:5f22b204­dc6b­11e1­0800­7a9c9624dd66 seqno: 23  cert_index: 2/5/14
  • 175. Avoiding SST when adding a new node (2) Restore the backup on the new node Display the content of xtrabackup_galera_info: 5f22b204­dc6b­11e1­0800­7a9c9624dd66:23 Create the file called grastate.dat like this: #GALERA saved state version: 2.1 uuid:5f22b204­dc6b­11e1­0800­7a9c9624dd66 seqno: 23  cert_index: 2/5/14
  • 176. Avoiding SST when adding a new node (2) Restore the backup on the new node Display the content of xtrabackup_galera_info: 5f22b204­dc6b­11e1­0800­7a9c9624dd66:23 Create the file called grastate.dat like this: #GALERA saved state version: 2.1 uuid:5f22b204­dc6b­11e1­0800­7a9c9624dd66 seqno: 23  cert_index: 2/5/14
  • 177. Avoiding SST when adding a new node (2) Restore the backup on the new node Display the content of xtrabackup_galera_info: 5f22b204­dc6b­11e1­0800­7a9c9624dd66:23 mysql> show global status like 'wsrep_provider_version'; +­­­­­­­­­­­­­­­­­­­­­­­­+­­­­­­­­­­­+ | Variable_name          | Value     | +­­­­­­­­­­­­­­­­­­­­­­­­+­­­­­­­­­­­+ | wsrep_provider_version | 2.1(r113) | +­­­­­­­­­­­­­­­­­­­­­­­­+­­­­­­­­­­­+ Create the file called grastate.dat like this: #GALERA saved state version: 2.1 uuid:5f22b204­dc6b­11e1­0800­7a9c9624dd66 seqno: 23  cert_index: 2/5/14
  • 178. 4
  • 179. Play with quorum and weight 2/5/14
  • 180. Play with quorum and weight Galera manages Quorum If a node does not see more than 50% of the total amount of nodes, reads/writes are not accepted Split brain is prevented This requires at least 3 nodes to work properly Can be disabled (but be warned!) You can cheat ;-) 2/5/14
  • 181. Play with quorum and weight Galera manages Quorum If a node does not see more than 50% of the total amount of nodes, reads/writes are not accepted Split brain is prevented This requires at least 3 nodes to work properly Can be disabled (but be warned!) You can cheat ;-) 2/5/14
  • 182. Play with quorum and weight Galera manages Quorum If a node does not see more than 50% of the total amount of nodes, reads/writes are not accepted Split brain is prevented This requires at least 3 nodes to work properly Can be disabled (but be warned!) You can cheat ;-) 2/5/14
  • 183. Play with quorum and weight Galera manages Quorum If a node does not see more than 50% of the total amount of nodes, reads/writes are not accepted Split brain is prevented This requires at least 3 nodes to work properly Can be disabled (but be warned!) You can cheat ;-) 2/5/14
  • 184. Play with quorum and weight Galera manages Quorum If a node does not see more than 50% of the total amount of nodes, reads/writes are not accepted Split brain is prevented This requires at least 3 nodes to work properly Can be disabled (but be warned!) You can cheat ;-) 2/5/14
  • 185. Play with quorum and weight Galera manages Quorum If a node does not see more than 50% of the total amount of nodes, reads/writes are not accepted Split brain is prevented This requires at least 3 nodes to work properly Can be disabled (but be warned!) You can cheat ;-) 2/5/14
  • 186. Quorum: lost of connectivity 2/5/14
  • 187. Quorum: lost of connectivity Network Problem 2/5/14
  • 188. Quorum: lost of connectivity Network Problem Does not accept Reads & Writes 2/5/14
  • 189. Quorum: lost of connectivity 2/5/14
  • 190. Quorum: lost of connectivity Network Problem 2/5/14
  • 191. Quorum: lost of connectivity Network Problem 2/5/14
  • 192. Quorum: lost of connectivity Network Problem 2/5/14
  • 194. Quorum: arbitrator It's possible to use an arbitrator (garbd) to play an extra node. All traffic will pass through it but it won't have any MySQL running. Useful in case of storage available only for 2 nodes or if you have an even amount of nodes. Odd number of nodes is always advised 2/5/14
  • 195. Quorum: arbitrator It's possible to use an arbitrator (garbd) to play an extra node. All traffic will pass through it but it won't have any MySQL running. Useful in case of storage available only for 2 nodes or if you have an even amount of nodes. Odd number of nodes is always advised 2/5/14
  • 196. Quorum: arbitrator It's possible to use an arbitrator (garbd) to play an extra node. All traffic will pass through it but it won't have any MySQL running. Useful in case of storage available only for 2 nodes or if you have an even amount of nodes. Odd number of nodes is always advised 2/5/14
  • 198. Quorum: cheat ! You can disable quorum but watch out ! (you have been warned): wsrep_provider_options = “pc.ignore_quorum=true” You can define the weigth of a node to affect the quorum calculation (default is 1): wsrep_provider_options = “pc.weight=1” 2/5/14
  • 199. Quorum: cheat ! You can disable quorum but watch out ! (you have been warned): wsrep_provider_options = “pc.ignore_quorum=true” You can define the weigth of a node to affect the quorum calculation (default is 1): wsrep_provider_options = “pc.weight=1” 2/5/14
  • 200. Quorum: cheat ! You can disable quorum but watch out ! (you have been warned): wsrep_provider_options = “pc.ignore_quorum=true” You can define the weigth of a node to affect the quorum calculation (default is 1): wsrep_provider_options = “pc.weight=1” 2/5/14
  • 201. Quorum: cheat ! You can disable quorum but watch out ! (you have been warned): wsrep_provider_options = “pc.ignore_quorum=true” You can define the weigth of a node to affect the quorum calculation (default is 1): wsrep_provider_options = “pc.weight=1” 2/5/14
  • 202. Quorum: cheat ! You can disable quorum but watch out ! (you have been warned): wsrep_provider_options = “pc.ignore_quorum=true” You can define the weigth of a node to affect the quorum calculation (default is 1): wsrep_provider_options = “pc.weight=1” 2/5/14
  • 203. 3
  • 204. How to optimize WAN replication? Galera 2 requires all point-to-point connections for replication datacenter A 2/5/14 datacenter B
  • 205. How to optimize WAN replication? Galera 2 requires all point-to-point connections for replication datacenter A 2/5/14 datacenter B
  • 206. How to optimize WAN replication? Galera 2 requires all point-to-point connections for replication datacenter A 2/5/14 datacenter B
  • 207. How to optimize WAN replication? Galera 2 requires all point-to-point connections for replication datacenter A 2/5/14 datacenter B
  • 208. How to optimize WAN replication? Galera 2 requires all point-to-point connections for replication datacenter A 2/5/14 datacenter B
  • 209. How to optimize WAN replication? Galera 2 requires all point-to-point connections for replication datacenter A 2/5/14 datacenter B
  • 210. How to optimize WAN replication? Galera 2 requires all point-to-point connections for replication datacenter A 2/5/14 datacenter B
  • 211. How to optimize WAN replication? Galera 2 requires all point-to-point connections for replication datacenter A 2/5/14 datacenter B
  • 212. How to optimize WAN replication? (2) Galera 3 brings the notion of “cluster segments” datacenter A 2/5/14 datacenter B
  • 213. How to optimize WAN replication? (3) Segments gateways can change per transaction datacenter A 2/5/14 datacenter B
  • 214. How to optimize WAN replication? (3) Replication traffic between segments is mimized. Writesets are relayed to the other segment through one node commit WS datacenter A 2/5/14 datacenter B
  • 215. How to optimize WAN replication? (3) Replication traffic between segments is mimized. Writesets are relayed to the other segment through one node commit WS datacenter A 2/5/14 WS WS datacenter B
  • 216. How to optimize WAN replication? (4) From those local relays replication is propagated to every nodes in the segment commit WS datacenter A 2/5/14 datacenter B WS
  • 217. How to optimize WAN replication? (4) From those local relays replication is propagated to every nodes in the segment commit gmcasts.segment = 1...255 WS datacenter A 2/5/14 datacenter B WS
  • 218. 2
  • 220. Load balancers Galera is generally used in combination with a load balancer The most used is HA Proxy Codership provides one with Galera: glbd 2/5/14
  • 221. Load balancers Galera is generally used in combination with a load balancer The most used is HA Proxy Codership provides one with Galera: glbd 2/5/14
  • 222. Load balancers Galera is generally used in combination with a load balancer The most used is HA Proxy Codership provides one with Galera: glbd 2/5/14
  • 223. Load balancers Galera is generally used in combination with a load balancer The most used is HA Proxy Codership provides one with Galera: glbd app 1 app 2 app 3 HA PROXY 2/5/14 node 1 node 2 node 3
  • 224. Load balancers: myths, legends and reality 2/5/14
  • 225. Load balancers: myths, legends and reality TIME_WAIT – On heavy load, you may have an issue with a large amount of TCP connections in TIME_WAIT state – This can leas to a TCP port exhaustion ! How to fix ? – – 2/5/14 Use nolinger option in HA Proxy (for glbd check http://www.lefred.be/node/168), but this lead to an increase of Aborted_clients is the client is connecting and disconnecting to MySQL too fast Modify the value of tcp_max_tw_buckets
  • 226. Load balancers: myths, legends and reality TIME_WAIT – On heavy load, you may have an issue with a large amount of TCP connections in TIME_WAIT state – This can leas to a TCP port exhaustion ! How to fix ? – – 2/5/14 Use nolinger option in HA Proxy (for glbd check http://www.lefred.be/node/168), but this lead to an increase of Aborted_clients is the client is connecting and disconnecting to MySQL too fast Modify the value of tcp_max_tw_buckets
  • 227. Load balancers: myths, legends and reality TIME_WAIT – On heavy load, you may have an issue with a large amount of TCP connections in TIME_WAIT state – This can leas to a TCP port exhaustion ! How to fix ? – – 2/5/14 Use nolinger option in HA Proxy (for glbd check http://www.lefred.be/node/168), but this lead to an increase of Aborted_clients is the client is connecting and disconnecting to MySQL too fast Modify the value of tcp_max_tw_buckets
  • 228. Load balancers: myths, legends and reality TIME_WAIT – On heavy load, you may have an issue with a large amount of TCP connections in TIME_WAIT state – This can leas to a TCP port exhaustion ! How to fix ? – – 2/5/14 Use nolinger option in HA Proxy (for glbd check http://www.lefred.be/node/168), but this lead to an increase of Aborted_clients is the client is connecting and disconnecting to MySQL too fast Modify the value of tcp_max_tw_buckets
  • 229. Load balancers: myths, legends and reality TIME_WAIT – On heavy load, you may have an issue with a large amount of TCP connections in TIME_WAIT state – This can leas to a TCP port exhaustion ! How to fix ? – – 2/5/14 Use nolinger option in HA Proxy (for glbd check http://www.lefred.be/node/168), but this lead to an increase of Aborted_clients is the client is connecting and disconnecting to MySQL too fast Modify the value of tcp_max_tw_buckets
  • 230. Load balancers: myths, legends and reality TIME_WAIT – On heavy load, you may have an issue with a large amount of TCP connections in TIME_WAIT state – This can leas to a TCP port exhaustion ! How to fix ? – – 2/5/14 Use nolinger option in HA Proxy (for glbd check http://www.lefred.be/node/168), but this lead to an increase of Aborted_clients is the client is connecting and disconnecting to MySQL too fast Modify the value of tcp_max_tw_buckets
  • 231. Load balancers: common issues Persitent Connections – Many people expects the following scenario: app 1 app 2 HA PROXY node 1 2/5/14 node 2 node 3 app 3
  • 232. Load balancers: common issues Persitent Connections – When the node that was specified to receive the persistent write fails for example app 1 app 2 HA PROXY node 1 2/5/14 node 2 node 3 app 3
  • 233. Load balancers: common issues Persitent Connections – When the node is back on-line... app 1 app 2 HA PROXY node 1 2/5/14 node 2 node 3 app 3
  • 234. Load balancers: common issues Persitent Connections – Only the new connections will use again the preferred node app 1 app 2 HA PROXY node 1 2/5/14 node 2 node 3 app 3
  • 235. Load balancers: common issues Persitent Connections – Only the new connections will use again the preferred node app 1 app 2 HA PROXY node 1 2/5/14 node 2 node 3 app 3
  • 236. Load balancers: common issues Persitent Connections – Only the new connections will use again the preferred node app 1 app 2 HA PROXY node 1 2/5/14 node 2 node 3 app 3
  • 237. Load balancers: common issues 2/5/14
  • 238. Load balancers: common issues Persitent Connections – – HA Proxy decides where the connection will go at TCP handshake Once the TCP session is established, the sessions will stay where they are ! Solution ? – With HA Proxy 1.5 you can now specify the following option : on­marked­down shutdown­sessions 2/5/14
  • 239. Load balancers: common issues Persitent Connections – – HA Proxy decides where the connection will go at TCP handshake Once the TCP session is established, the sessions will stay where they are ! Solution ? – With HA Proxy 1.5 you can now specify the following option : on­marked­down shutdown­sessions 2/5/14
  • 240. Load balancers: common issues Persitent Connections – – HA Proxy decides where the connection will go at TCP handshake Once the TCP session is established, the sessions will stay where they are ! Solution ? – With HA Proxy 1.5 you can now specify the following option : on­marked­down shutdown­sessions 2/5/14
  • 241. Load balancers: common issues Persitent Connections – – HA Proxy decides where the connection will go at TCP handshake Once the TCP session is established, the sessions will stay where they are ! Solution ? – With HA Proxy 1.5 you can now specify the following option : on­marked­down shutdown­sessions 2/5/14
  • 242. Load balancers: common issues Persitent Connections – – HA Proxy decides where the connection will go at TCP handshake Once the TCP session is established, the sessions will stay where they are ! Solution ? – With HA Proxy 1.5 you can now specify the following option : on­marked­down shutdown­sessions 2/5/14
  • 243. Load balancers: common issues Persitent Connections – – HA Proxy decides where the connection will go at TCP handshake Once the TCP session is established, the sessions will stay where they are ! Solution ? – With HA Proxy 1.5 you can now specify the following option : on­marked­down shutdown­sessions 2/5/14
  • 244. 1
  • 245. Taking backups without stalls 2/5/14
  • 246. Taking backups without stalls When you want to perform a consistent backup, you need to take a FLUSH TABLES WITH READ LOCK (FTWRL) By default even with Xtrabackup This causes a Flow Control in galera So how can we deal with that ? 2/5/14
  • 247. Taking backups without stalls When you want to perform a consistent backup, you need to take a FLUSH TABLES WITH READ LOCK (FTWRL) By default even with Xtrabackup This causes a Flow Control in galera So how can we deal with that ? 2/5/14
  • 248. Taking backups without stalls When you want to perform a consistent backup, you need to take a FLUSH TABLES WITH READ LOCK (FTWRL) By default even with Xtrabackup This causes a Flow Control in galera So how can we deal with that ? 2/5/14
  • 249. Taking backups without stalls When you want to perform a consistent backup, you need to take a FLUSH TABLES WITH READ LOCK (FTWRL) By default even with Xtrabackup This causes a Flow Control in galera So how can we deal with that ? 2/5/14
  • 250. Taking backups without stalls 2/5/14
  • 251. Taking backups without stalls Choose the node from which you want to take the backup Change the state to 'Donor/Desynced' (see tip 9) set global wsrep_desync=ON Take the backup Wait that wsrep_local_recv_queue is back down to 0 Change back the state to 'Joined' set global wsrep_desync=OFF 2/5/14
  • 252. Taking backups without stalls Choose the node from which you want to take the backup Change the state to 'Donor/Desynced' (see tip 9) set global wsrep_desync=ON Take the backup Wait that wsrep_local_recv_queue is back down to 0 Change back the state to 'Joined' set global wsrep_desync=OFF 2/5/14
  • 253. Taking backups without stalls Choose the node from which you want to take the backup Change the state to 'Donor/Desynced' (see tip 9) set global wsrep_desync=ON Take the backup Wait that wsrep_local_recv_queue is back down to 0 Change back the state to 'Joined' set global wsrep_desync=OFF 2/5/14
  • 254. Taking backups without stalls Choose the node from which you want to take the backup Change the state to 'Donor/Desynced' (see tip 9) set global wsrep_desync=ON Take the backup Wait that wsrep_local_recv_queue is back down to 0 Change back the state to 'Joined' set global wsrep_desync=OFF 2/5/14
  • 255. Taking backups without stalls Choose the node from which you want to take the backup Change the state to 'Donor/Desynced' (see tip 9) set global wsrep_desync=ON Take the backup Wait that wsrep_local_recv_queue is back down to 0 Change back the state to 'Joined' set global wsrep_desync=OFF 2/5/14
  • 256. Taking backups without stalls Choose the node from which you want to take the backup Change the state to 'Donor/Desynced' (see tip 9) set global wsrep_desync=ON Take the backup Wait that wsrep_local_recv_queue is back down to 0 Change back the state to 'Joined' set global wsrep_desync=OFF 2/5/14
  • 257. Taking backups without stalls Choose the node from which you want to take the backup Change the state to 'Donor/Desynced' (see tip 9) set global wsrep_desync=ON Take the backup Wait that wsrep_local_recv_queue is back down to 0 Change back the state to 'Joined' set global wsrep_desync=OFF 2/5/14
  • 258. 0
  • 259. GO !