MariaDB provides Multi-Source Replication, and this webinar aims to show the main characteristics of the
feature, which was launched with MariaDB 10.0.1.
3. Europe Roadshow
Locations in May & June
• Berlin: May 18
• Munich: May 11
• Bonn: June 30
• Amsterdam: May 23
• Paris: June 30
• Madrid: May 18
• Milan: May 11
Details and registration:
http://go.mariadb.com/Europe-Roadshow-LP.html
More dates and locations will be announced soon!
4. Presenter
Wagner Bianchi, or just Bianchi, as he likes to be called, has been a MySQL DBA for
more than 10 years. He is versed in Database Operations and Services, which he calls
the Database War or just DBOps, and he has worked with some of the
biggest companies in the world, focusing on data infrastructure, performance,
scale-out and HA for MySQL and its ecosystem. Before joining MariaDB as a
Principal Remote DBA, Bianchi worked at Percona, Pythian, Splunk, Oracle and IBM.
Additionally, Bianchi has been an Oracle ACE Director since 2014.
Email: wagner.bianchi@mariadb.com
LinkedIn: https://www.linkedin.com/in/wagnerbianchi/
Twitter: @wagnerbianchijr
5. Webinar’s Agenda
• The rise of Multi-Source Replication in MariaDB 10.0.1;
• How it works and what the main features added to MariaDB are;
• What sets MariaDB Multi-Source Replication apart;
• Benefitting from Multi-Source Replication and Parallel Threads;
• Talking about cases and scenarios of Multi-Source Replication;
• Features evolving with MariaDB releases and versions;
• Questions and Answers.
7. Multi-Source Replication
• Typical use cases described by the manual:
• Centralize data from shards
• Bring all data together on one server for online backups
• Read/write on database instances in multiple locations
8. Multi-Source Replication
• With this new feature, MariaDB supports the following:
• A replica can replicate from many masters;
• Each replication connection has a name chosen by the administrator;
• Each named connection has its own structure to handle replication;
• Each one creates the two regular replication threads, IO_THREAD and SQL_THREAD;
• The administrator can choose which named connection to work with;
• set default_master_connection="myNamedConnection101";
• Variables can be set per named connection;
• sql_slave_skip_counter is supported when running single-threaded replication;
• Many existing commands were extended to better manage connection names;
• One can create up to 64 named connections;
9. Multi-Source Replication
• The following files are created after adding sources to the multi-source slave:
• An entry in the multi-master-info-file is created with the chosen names;
• A master-info-file-connection_name.info is created, like the regular master.info;
• A set of relay log files following the pattern relay-log-connection_name.xxxxxx;
• A relay-log-index-connection_name.info with the names of the active relay logs;
• A relay-log-info-file-connection_name.info containing the current master position;
[root@box03 mysql]# ls -lh | egrep "us_east|us_west"
-rw-rw---- 1 mysql mysql 306 Apr 3 16:07 box03-relay-bin-us_east.000001
-rw-rw---- 1 mysql mysql 619 Apr 3 16:07 box03-relay-bin-us_east.000002
-rw-rw---- 1 mysql mysql 66 Apr 3 16:07 box03-relay-bin-us_east.index
-rw-rw---- 1 mysql mysql 306 Apr 3 16:07 box03-relay-bin-us_west.000001
-rw-rw---- 1 mysql mysql 619 Apr 3 16:07 box03-relay-bin-us_west.000002
-rw-rw---- 1 mysql mysql 66 Apr 3 16:07 box03-relay-bin-us_west.index
-rw-rw---- 1 mysql mysql 155 Apr 3 16:07 master-us_east.info
-rw-rw---- 1 mysql mysql 155 Apr 3 16:07 master-us_west.info
-rw-rw---- 1 mysql mysql 61 Apr 3 16:07 relay-log-us_east.info
-rw-rw---- 1 mysql mysql 60 Apr 3 16:07 relay-log-us_west.info
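As a rough illustration of the naming pattern visible in the listing above, the Python sketch below derives a per-connection file name by lowercasing the connection name and placing it, prefixed with a dash, before the file extension. The helper name per_connection_file is hypothetical, and the sketch assumes connection names made only of letters, digits and underscores; it does not try to reproduce how the server encodes other characters.

```python
import os.path


def per_connection_file(base_name: str, connection_name: str) -> str:
    """Return base_name with '-<lowercased connection name>' before its extension.

    Hypothetical helper: it only mirrors the pattern seen in the directory listing.
    """
    if not connection_name:
        # The default connection ('') keeps the regular file names.
        return base_name
    stem, ext = os.path.splitext(base_name)
    return f"{stem}-{connection_name.lower()}{ext}"


print(per_connection_file("master.info", "US_EAST"))
print(per_connection_file("box03-relay-bin.000001", "US_WEST"))
```

With the names used in this deck, the two calls print master-us_east.info and box03-relay-bin-us_west.000001, matching the files shown in the listing.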
10. Implementing Multi-Source Replication
• The familiar commands now support connection names, as shown below:
• CHANGE MASTER 'connection_name' TO…
• FLUSH RELAY LOGS ['connection_name']
• MASTER_POS_WAIT(....,['connection_name'])
• RESET SLAVE ['connection_name'] [ALL]
• SHOW RELAYLOG ['connection_name'] EVENTS
• SHOW SLAVE ['connection_name'] STATUS
• SHOW ALL SLAVES STATUS
• START SLAVE ['connection_name'] ...
• START ALL SLAVES ...
• STOP SLAVE ['connection_name'] ...
• STOP ALL SLAVES …
• Commands that omit the connection_name part act on the default connection, which is '' (the empty name) unless default_master_connection has been changed
11. Implementing Multi-Source Replication
• Considering a scenario where data should be centralized:
CHANGE MASTER 'US_EAST' TO MASTER_HOST…
CHANGE MASTER 'US_WEST' TO MASTER_HOST…
• After configuring multi-source replication, you can check the status like this:
SHOW SLAVE 'US_EAST' STATUS\G
SHOW SLAVE 'US_WEST' STATUS\G
SHOW ALL SLAVES STATUS\G
12. Implementing Multi-Source Replication
• Let’s streamline it to work with an individual connection name:
SET default_master_connection='US_EAST';
SHOW SLAVE STATUS\G
SHOW STATUS LIKE 'Slave_running'\G
• This lets you set variables per connection (each change below must be enclosed by STOP SLAVE/START SLAVE):
SET default_master_connection='US_WEST';
STOP SLAVE;
SET GLOBAL sql_slave_skip_counter=1; #: only with single-threaded replication
SET GLOBAL replicate_ignore_table='mydb.table01'; #: Replicate_Ignore_Table: mydb.table01
SET GLOBAL replicate_ignore_db='mydb'; #: Replicate_Ignore_DB: mydb
START SLAVE;
14. Multi-Source Replication and GTIDs
• MariaDB implements GTID differently from the upstream (MySQL) version;
• gtid_domain_id is a 32-bit integer that separates transactions into groups;
• server_id is also a 32-bit integer, and it uniquely identifies a server;
• trx_id is a 64-bit integer that is incremented on every transaction executed by:
• the same server_id;
• within the same gtid_domain_id;
• The Domain ID is especially important for Multi-Source because:
• the binary log can have multiple streams, each stream identified by a Domain ID;
• if an application updates multiple servers, there is no ordering issue between their streams;
• the streams are better organized within the replication topology;
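The three parts of a MariaDB GTID can be made concrete with a small Python sketch that splits a string such as 200-200-3 into its fields and checks the integer widths listed above. The names Gtid, parse_gtid and parse_gtid_pos are illustrative assumptions, not server APIs; the third field is called trx_id here to match the slides, while the MariaDB documentation calls it the sequence number.

```python
from typing import Dict, NamedTuple


class Gtid(NamedTuple):
    domain_id: int  # gtid_domain_id, 32-bit
    server_id: int  # server_id, 32-bit
    trx_id: int     # 64-bit counter (the "sequence number" in the MariaDB docs)


def parse_gtid(text: str) -> Gtid:
    """Parse a single 'domain-server-trx' GTID string."""
    parts = text.strip().split("-")
    if len(parts) != 3:
        raise ValueError(f"expected domain-server-trx, got {text!r}")
    domain_id, server_id, trx_id = (int(p) for p in parts)
    if not (0 <= domain_id < 2**32 and 0 <= server_id < 2**32):
        raise ValueError("domain_id and server_id must fit in 32 bits")
    if not 0 <= trx_id < 2**64:
        raise ValueError("trx_id must fit in 64 bits")
    return Gtid(domain_id, server_id, trx_id)


def parse_gtid_pos(text: str) -> Dict[int, Gtid]:
    """Parse a comma-separated position into a {domain_id: Gtid} mapping."""
    return {g.domain_id: g for g in map(parse_gtid, text.split(","))}


position = parse_gtid_pos("100-100-2,200-200-3")
for domain_id, gtid in sorted(position.items()):
    print(domain_id, gtid)
```

Keying the position by domain mirrors the idea that each Domain ID is an independent stream with its own ordering.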
15. Multi-Source Replication and GTIDs
[Diagram: a CENTRAL DATA REPO replica receiving data from two regions, US-EAST (DBEAST01, DBEAST02) and US-WEST (DBWEST01, DBWEST02)]
MaxScale can help replace intermediate masters
https://mariadb.com/kb/en/mariadb-enterprise/mariadb-maxscale-as-a-binlog-server/
16. Multi-Source Replication and GTIDs
• Basic configuration file for Multi-Source Replica instance (/etc/my.cnf.d/server.cnf):
[mariadb]
server_id=300
report_host=multisource_slave
log_bin=mariadb-bin
log_bin_index=mariadb-bin.index
log_slave_updates=true
gtid_domain_id=300
slave_parallel_threads=4
slave_domain_parallel_threads=2
[mariadb-10.1]
US_WEST.slave_parallel_mode=optimistic
US_EAST.slave_parallel_mode=optimistic
One implication of using multi-threaded replication is
that sql_slave_skip_counter cannot be used
to fix replication errors, even in cases where skipping an event would be OK
17. Multi-Source Replication and GTIDs
• Things to watch out for in a multi-master environment:
• Auto-increment conflicts are not handled by default; you need to set this up:
• auto_increment_increment=<total_#_of_servers>
• auto_increment_offset=<offset#_to_start/increment>
• All servers are expected to run with log_slave_updates;
• gtid_domain_id can be the same as server_id;
• Parallel mode in MariaDB 10.1 improves things (when slave_parallel_threads > 0);
• It is possible to configure up to X threads per domain (slave_domain_parallel_threads);
• It is possible to increase parallelism when applying the relay log (slave_parallel_mode);
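To see why the two auto-increment settings in the list above avoid collisions, the Python sketch below enumerates the values each server would generate, assuming the usual rule that generated values follow offset + N × increment. The function name and the two-server setup are illustrative assumptions, not part of the webinar.

```python
from itertools import count, islice
from typing import List


def generated_ids(offset: int, increment: int, how_many: int) -> List[int]:
    """First auto-increment values for one server: offset, offset + increment, ..."""
    return list(islice(count(start=offset, step=increment), how_many))


# Two writable masters: increment = number of servers, one distinct offset each.
east = generated_ids(offset=1, increment=2, how_many=5)
west = generated_ids(offset=2, increment=2, how_many=5)

print("US_EAST:", east)  # [1, 3, 5, 7, 9]
print("US_WEST:", west)  # [2, 4, 6, 8, 10]
print("overlap:", set(east) & set(west))  # set()
```

Because each server draws from its own residue class, rows inserted on either side never compete for the same primary key when they meet on the multi-source replica.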
18. Multi-Source Replication
[Diagram: a Multi-Source Replica fed by the US_WEST and US_EAST connections]
box03 [(none)]> start all slaves;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
box03 [(none)]> show warnings;
+-------+------+-------------------------+
| Level | Code | Message |
+-------+------+-------------------------+
| Note | 1937 | SLAVE 'US_WEST' started |
| Note | 1937 | SLAVE 'US_EAST' started |
+-------+------+-------------------------+
2 rows in set (0.00 sec)
box03 [(none)]> pager grep "Connection_name:"
PAGER set to 'grep "Connection_name:"'
box03 [(none)]> show all slaves status\G
Connection_name: US_EAST
Connection_name: US_WEST
2 rows in set (0.00 sec)
19. Multi-Source Replication Implementation
• Let’s set up the source, or connection name, for US_WEST:
#: CREATING THE REPLICATION USER
box01 [(none)]> CREATE USER rpl@'192.168.0.13' IDENTIFIED BY 'xyz007';
Query OK, 0 rows affected (0.00 sec)
box01 [(none)]> GRANT REPLICATION SLAVE ON *.* TO rpl@'192.168.0.13';
Query OK, 0 rows affected (0.00 sec)
#: SETTING UP THE CONNECTION NAME
box03 [(none)]> CHANGE MASTER 'US_WEST' TO
-> MASTER_HOST='192.168.0.11',
-> MASTER_USER='rpl',
-> MASTER_PASSWORD='xyz007',
-> MASTER_USE_GTID=SLAVE_POS;
Query OK, 0 rows affected (0.02 sec)
20. Multi-Source Replication Implementation
• Let’s set up the source, or connection name, for US_EAST:
#: CREATING THE REPLICATION USER
box02 [(none)]> SET SQL_LOG_BIN=0; CREATE USER rpl@'192.168.0.13' IDENTIFIED BY 'xyz007';
Query OK, 0 rows affected (0.00 sec)
box02 [(none)]> GRANT REPLICATION SLAVE ON *.* TO rpl@'192.168.0.13'; SET SQL_LOG_BIN=1;
Query OK, 0 rows affected (0.00 sec)
#: SETTING UP THE CONNECTION NAME
box03 [(none)]> CHANGE MASTER 'US_EAST' TO
-> MASTER_HOST='192.168.0.12',
-> MASTER_USER='rpl',
-> MASTER_PASSWORD='xyz007',
-> MASTER_USE_GTID=SLAVE_POS;
Query OK, 0 rows affected (0.02 sec)
Here, I set SQL_LOG_BIN to 0
to avoid breaking replication when starting up
the threads for the connection name US_EAST.
If possible, one can run RESET MASTER on both sides
to clean up the binary logs before running START ALL SLAVES.
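The CHANGE MASTER statements for US_WEST and US_EAST differ only in connection name and host, so they lend themselves to being generated. The Python sketch below builds the statement text from the values shown on these slides; change_master_sql and quote are hypothetical helpers, and the quoting is deliberately naive, so treat this as an illustration rather than something to run against untrusted input.

```python
VALID_GTID_MODES = {"slave_pos", "current_pos", "no"}


def quote(value: str) -> str:
    """Naive SQL string quoting, good enough for a sketch only."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def change_master_sql(name: str, host: str, user: str, password: str,
                      use_gtid: str = "slave_pos") -> str:
    """Build a CHANGE MASTER statement for one named connection."""
    if use_gtid not in VALID_GTID_MODES:
        raise ValueError(f"unsupported MASTER_USE_GTID value: {use_gtid!r}")
    return (
        f"CHANGE MASTER {quote(name)} TO "
        f"MASTER_HOST={quote(host)}, "
        f"MASTER_USER={quote(user)}, "
        f"MASTER_PASSWORD={quote(password)}, "
        f"MASTER_USE_GTID={use_gtid};"
    )


# Connection names and hosts as used on the slides.
SOURCES = {"US_WEST": "192.168.0.11", "US_EAST": "192.168.0.12"}

for connection_name, host in SOURCES.items():
    print(change_master_sql(connection_name, host, "rpl", "xyz007"))
```

The printed statements can be pasted into the client on the replica; a real tool would pass the values through a database driver instead of formatting strings.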
21. Multi-Source Replication and GTIDs
• MASTER_USE_GTID supports the following values:
• SLAVE_POS:
• Use this option when you don’t intend to write to the replica;
• You don’t want the trx_id part of the GTID to increment on the slave;
• You’re adding a new slave to the rotation and would like to start replicating from GLOBAL.gtid_slave_pos;
• CURRENT_POS:
• Use this when you write to the slave and would like to increment the trx_id part of the GTID;
• When a master needs to become a slave (GLOBAL.gtid_current_pos);
• NO:
• Use this option when you want to start classic, position-based replication.
22. Multi-Source Replication Common Break/Fix
• When you have a bunch of connection names, you have the following options to fix errors:
• With single-threaded replication, use sql_slave_skip_counter per connection name!
box03 [(none)]> show all slaves status\G
*************************** 1. row ***************************
Connection_name: US_EAST
Slave_SQL_State:
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.0.12
Master_User: rpl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mariadb-bin.000003
Read_Master_Log_Pos: 480
Relay_Log_File: box03-relay-bin-us_east.000002
Relay_Log_Pos: 619
Relay_Master_Log_File: mariadb-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 1062
Last_Error: Error 'Duplicate entry '1' for key 'PRIMARY'' on query. Default database: ''. Query:
'insert into test.t1 set i=1'
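With many named connections, scanning this vertical output by eye gets tedious. As a sketch, the Python below parses \G-style output into one dictionary per row and lists the connections whose IO or SQL thread is not running. The function names are hypothetical, and the parser assumes the simple "Key: value" layout shown above; wrapped continuation lines without a colon are ignored.

```python
import re
from typing import Dict, List

ROW_MARKER = re.compile(r"^\*+ \d+\. row \*+$")


def parse_vertical(output: str) -> List[Dict[str, str]]:
    """Turn '\\G' formatted client output into a list of {column: value} rows."""
    rows: List[Dict[str, str]] = []
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if ROW_MARKER.match(line):
            current = {}
            rows.append(current)
            continue
        if current is None or ":" not in line:
            continue  # header noise or a wrapped continuation line
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    return rows


def broken_connections(rows: List[Dict[str, str]]) -> List[str]:
    """Connection names whose IO or SQL thread is not reported as running."""
    return [
        row.get("Connection_name", "")
        for row in rows
        if row.get("Slave_IO_Running") != "Yes" or row.get("Slave_SQL_Running") != "Yes"
    ]


sample = """\
*************************** 1. row ***************************
Connection_name: US_EAST
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 1062
*************************** 2. row ***************************
Connection_name: US_WEST
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
"""

print(broken_connections(parse_vertical(sample)))
```

For the abbreviated sample, only US_EAST is reported, which is the connection that then needs the per-connection fix.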
23. Multi-Source Replication Common Break/Fix
• Using sql_slave_skip_counter with single-threaded replication to get replication resumed:
box03 [(none)]> set default_master_connection=US_EAST;
Query OK, 0 rows affected (0.00 sec)
box03 [(none)]> stop slave; set global sql_slave_skip_counter=1; start slave;
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
24. Multi-Source Replication Common Break/Fix
• When you have a bunch of connection names, you have the following options to fix errors:
• sql_slave_skip_counter: when using parallel replication and GTID with
multiple replication domains, @@sql_slave_skip_counter cannot be used. Instead,
@@gtid_slave_pos can be set explicitly to skip past a given GTID
position.
box03 [test]> show all slaves status\G
*************************** 1. row ***************************
Connection_name: US_EAST
Slave_SQL_State:
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.0.12
Master_User: rpl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mariadb-bin.000001
Read_Master_Log_Pos: 899
Relay_Log_File: box03-relay-bin-us_east.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: mariadb-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Errno: 1062
Last_Error: Error 'Duplicate entry '1' for key 'PRIMARY'' on query. Default database: ''. Query:
'insert into test.t1 set i=1'
25. Multi-Source Replication Common Break/Fix
• Using the right variable, gtid_slave_pos, we need to do the following:
#: current gtid_slave_pos
box03 [test]> select @@global.gtid_slave_pos\G
*************************** 1. row ***************************
@@global.gtid_slave_pos: 100-100-2,200-200-3
1 row in set (0.00 sec)
#: we need to increment one transaction for US_EAST which is gtid_domain_id 200
box03 [test]> stop all slaves;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
box03 [test]> stop slave; set global gtid_slave_pos='100-100-2,200-200-4'; start slave;
Query OK, 0 rows affected, 1 warning (0.00 sec)
Query OK, 0 rows affected (0.02 sec)
Query OK, 0 rows affected (0.00 sec)
box03 [test]> start all slaves;
Query OK, 0 rows affected, 1 warning (0.01 sec)
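The new position used above is just the old one with the trx_id of domain 200 advanced by one. A minimal Python sketch of that arithmetic follows; skip_one_transaction is a hypothetical helper, and it assumes the failing transaction is the very next one in that domain and comes from the same server_id, which should be checked against the master's binary log before using the result.

```python
def skip_one_transaction(gtid_pos: str, domain_id: int) -> str:
    """Return gtid_pos with the trx_id of the given domain advanced by one."""
    updated = []
    found = False
    for gtid in gtid_pos.split(","):
        domain, server, trx = (int(part) for part in gtid.strip().split("-"))
        if domain == domain_id:
            trx += 1
            found = True
        updated.append(f"{domain}-{server}-{trx}")
    if not found:
        raise ValueError(f"domain {domain_id} not present in {gtid_pos!r}")
    return ",".join(updated)


new_pos = skip_one_transaction("100-100-2,200-200-3", domain_id=200)
print(f"SET GLOBAL gtid_slave_pos='{new_pos}';")
```

Run with the position from this slide, it prints the same SET GLOBAL gtid_slave_pos statement used in the session, leaving the other domain untouched.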
26. Removing Multi-Source Connection Names
• To remove the connection names created:
#: deletes the master.info and relay-log.info files and all the relay log files, and starts a new relay log file
#: stop all slaves first; the connection names will still appear in the output of show all slaves status
box03 [(none)]> reset slave 'US_WEST';
Query OK, 0 rows affected (0.00 sec)
box03 [(none)]> reset slave 'US_EAST';
Query OK, 0 rows affected (0.00 sec)
#: checking how it is right now
box03 [(none)]> show all slaves status\G
Connection_name: US_EAST
Slave_IO_Running: No
Slave_SQL_Running: No
Connection_name: US_WEST
Slave_IO_Running: No
Slave_SQL_Running: No
2 rows in set (0.00 sec)
27. Removing Multi-Source Connection Names
• To remove the connection names created:
#: permanently removes a connection name
#: ALL also resets the PORT, HOST, USER and PASSWORD parameters for the slave
box03 [(none)]> reset slave 'US_EAST' all;
Query OK, 0 rows affected (0.01 sec)
box03 [(none)]> reset slave 'US_WEST' all;
Query OK, 0 rows affected (0.00 sec)
#: checking how it is right now
box03 [(none)]> show all slaves status\G
Empty set (0.00 sec)