We will show how Galera Cluster executes DDLs in a safe, consistent manner across all the nodes in the cluster, and how this differs from stand-alone MySQL. We will discuss how to prepare for and successfully carry out a schema upgrade, and the considerations to take into account during the process.
Galera Cluster DDL and Schema Upgrades 220217
1. Galera Cluster Best Practices Part 3:
Schema Changes and DDL
Philip Stoev
Codership Oy
2. Agenda
• A very quick overview of Galera Cluster
• DDL handling in Galera Cluster
• Preparing for a schema upgrade
• Execution strategies for DDL
• Recent Developments and Future Improvements
• Q/A
3. Galera Cluster Overview
Synchronous
– each transaction is immediately replicated on all nodes at commit
– no stale slaves
Multi-Master
– read from and write to any node
– automatic transaction conflict detection
Replication
– a copy of the entire dataset is available on all nodes
– new nodes can join automatically
For MySQL
– based on a modified version of MySQL (5.5, 5.6, 5.7)
– InnoDB storage engine
4. And more …
• Recovers from node failures within seconds
• Data consistency protections
– avoids reading stale data
– prevents unsafe data modifications
• Cloud and WAN support
5. DDL in Galera
• DDL statements are handled differently in Galera
– this is to ensure maximum data consistency in a distributed
environment
– The “online”, “in-place” and “non-blocking” terms from the
MySQL documentation do not apply directly
• DDL execution must be thought out in advance
6. DDL Execution Methods
• Total Order Isolation (TOI) - the default
– the DDL is run on all nodes at the same time
– the cluster cannot commit other transactions while the DDL is
running
• Rolling Schema Upgrade (RSU)
– the DDL is run on one node at a time
7. The Application and DDLs
• Check for DDLs executed by the application/framework:
– some applications run a lot of
CREATE TABLE [IF NOT EXISTS] at connection time
– some run ALTER TABLE when started,
if they feel they need to upgrade the schema
– TEMPORARY tables are OK
• Take control over DDLs:
– Revoke ALTER, INDEX privileges
– A SQL-aware proxy / load balancer can also intercept such
queries
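To find out which DDLs the application or framework issues, one option is to scan a captured list of statements (for example, from a query log) for DDL prefixes. The following is a minimal sketch; the `find_ddl` helper, the prefix list and the sample statements are assumptions made for illustration, and TEMPORARY tables are skipped because they are OK as noted above.

```python
import re

# Statement prefixes treated as DDL for this audit (an illustrative list).
# CREATE/DROP TEMPORARY ... is excluded by the negative lookahead.
DDL_PATTERN = re.compile(
    r"^\s*(CREATE|ALTER|DROP|TRUNCATE|RENAME)\b(?!\s+TEMPORARY\b)",
    re.IGNORECASE,
)

def find_ddl(statements):
    """Return the statements that look like DDL on non-temporary objects."""
    return [s for s in statements if DDL_PATTERN.match(s)]

# Hypothetical statements captured from an application at startup.
sample = [
    "SELECT * FROM orders",
    "CREATE TABLE IF NOT EXISTS sessions (id INT PRIMARY KEY)",
    "create temporary table tmp_ids (id INT)",
    "ALTER TABLE orders ADD COLUMN note VARCHAR(64)",
    "INSERT INTO orders VALUES (1)",
]
for stmt in find_ddl(sample):
    print(stmt)
```

A prefix match like this is deliberately crude: it will miss DDL hidden inside stored procedures, so treat the result as a starting point for the audit.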
8. The DDL Statement
• CREATE, DROP [PARTITION]
– usually fast enough, no need for special planning
– unless executed repeatedly by multiple connections
• ALTER TABLE or CREATE INDEX
– some operations have different execution speed depending on
MySQL version
– some statements operate on metadata only, so are fast
– some DDLs support ALGORITHM=INPLACE for faster execution
• Will still cause locking in Galera under TOI
– some DDLs require creating a complete copy of the entire table
9. OPTIMIZE TABLE, etc.
• If a statement can be given the
NO_WRITE_TO_BINLOG modifier, it will not be
replicated by Galera
• Such a statement may fail if concurrent updates against
the same table are going on elsewhere in the cluster.
• If you experience deadlock errors:
– do not perform concurrent updates against the table, or
– make the updates only on the node running the statement
11. General Principles for TOI
• No other write transactions can commit anywhere on the
cluster while a TOI DDL is in progress
• Even if the DDL is “online”, “in-place” or allows
concurrent table access in stand-alone MySQL, it is still
fully blocking for writes
• DML transactions operating on the same table may get a
deadlock error
• wsrep_sync_wait queries may time out
12. General Principles for TOI (#2)
• DDL statements cannot be killed once started
• If a node dies during DDL, it may need to rejoin via SST
• In Galera 3.x, DDL execution errors are ignored
– so check the server error log
– a GRA*.log file will also be created for each failure
13. How Galera runs TOI DDL
1. The DDL statement is sent to all nodes
2. All transactions in the cluster that committed prior to the
DDL are replicated and applied first; new commits are
blocked
3. The DDL is run on all nodes at exactly the same place
in the logical sequence of events
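The idea of running the DDL at the same place in the logical sequence can be illustrated with a toy model. This is not how Galera is implemented; it only assumes that every event carries a cluster-wide sequence number (the `seqno` values below are made up) and that each node applies events in that order regardless of arrival order.

```python
def apply_in_total_order(events):
    """events: list of (seqno, statement) pairs; apply them sorted by seqno."""
    return [stmt for _, stmt in sorted(events)]

# The same three events arrive in a different order on two nodes.
node_a = [(1, "INSERT row 1"), (2, "ALTER TABLE"), (3, "INSERT row 2")]
node_b = [(3, "INSERT row 2"), (1, "INSERT row 1"), (2, "ALTER TABLE")]

order_a = apply_in_total_order(node_a)
order_b = apply_in_total_order(node_b)
print(order_a == order_b)
```

Both nodes end up applying the ALTER TABLE after the first insert and before the second, which is why the table definition stays consistent across the cluster.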
14. Procedure for TOI
1. Practice the DDL on a test cluster, if possible
2. Ensure enough free disk space is available on all nodes
3. Schedule a maintenance window / put application in read-only mode
– hangs or deadlock errors will occur for all DML transactions
4. Run the DDL on one node only; it will be replicated to the
rest
5. Examine SHOW PROCESSLIST, SHOW CREATE TABLE
on all nodes to confirm successful execution
6. Check error logs on all nodes for errors
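The comparison of SHOW CREATE TABLE output across nodes can be scripted once the output has been collected from each node. A minimal sketch follows; the `group_nodes_by_schema` helper, the node names and the table definitions are hypothetical, and stripping the table-level `AUTO_INCREMENT=N` counter is an assumption that this value may differ between nodes without indicating a problem.

```python
import re
from collections import defaultdict

def group_nodes_by_schema(create_statements):
    """Group node names by the SHOW CREATE TABLE text each node returned."""
    groups = defaultdict(list)
    for node, ddl in create_statements.items():
        # Assumption: the table-level AUTO_INCREMENT counter is not a schema difference.
        ddl = re.sub(r"\s*AUTO_INCREMENT=\d+", "", ddl)
        # Collapse whitespace so formatting alone does not count as a mismatch.
        groups[" ".join(ddl.split())].append(node)
    return list(groups.values())

# Hypothetical output collected from three nodes after the ALTER TABLE.
collected = {
    "node1": "CREATE TABLE `t` (`id` int NOT NULL AUTO_INCREMENT, `note` varchar(64),"
             " PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=5",
    "node2": "CREATE TABLE `t` (`id` int NOT NULL AUTO_INCREMENT, `note` varchar(64),"
             " PRIMARY KEY (`id`)) ENGINE=InnoDB AUTO_INCREMENT=9",
    "node3": "CREATE TABLE `t` (`id` int NOT NULL AUTO_INCREMENT,"
             " PRIMARY KEY (`id`)) ENGINE=InnoDB",
}
groups = group_nodes_by_schema(collected)
if len(groups) > 1:
    print("schema mismatch between node groups:", groups)
```

More than one group means the DDL did not take effect everywhere, which is the case the error-log check is meant to explain.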
15. Potential Failure Scenarios
• MySQL returns a SQL error locally
– statement still ran on all nodes even if it failed locally;
– it may have succeeded elsewhere
• Statement fails to complete successfully on other nodes
– disk space issues
– constraint violation due to data inconsistency
ALTER TABLE ADD UNIQUE KEY may expose inconsistencies that
have not been noticed previously
• Statement takes longer than expected
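For the ALTER TABLE ADD UNIQUE KEY case, a duplicate check run on each node beforehand can surface the inconsistency before the DDL does. The sketch below uses Python's built-in SQLite as a stand-in so that it runs anywhere; the table, column and `duplicate_values` helper are hypothetical. The GROUP BY ... HAVING query itself is standard SQL and should carry over to MySQL.

```python
import sqlite3

def duplicate_values(conn, table, column):
    """Return (value, count) pairs that would violate a unique key on column."""
    # Identifiers are interpolated directly, so pass trusted names only.
    query = (
        f"SELECT {column}, COUNT(*) FROM {table} "
        f"GROUP BY {column} HAVING COUNT(*) > 1 ORDER BY {column}"
    )
    return conn.execute(query).fetchall()

# SQLite stands in for one cluster node in this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",), ("a@example.com",)],
)
dupes = duplicate_values(conn, "users", "email")
print(dupes)  # [('a@example.com', 2)]
```

An empty result on every node suggests the unique key can be added without a constraint violation on any of them.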
17. Basic Principles for RSU
• Statement is manually run on one node at a time
• Node will temporarily fall behind the cluster for the
duration of the DDL
• Standard MySQL locking rules apply on local node
• Nothing is locked on remote nodes
• Other transactions can continue unaffected
• The binary log on each node will contain events in a
different order, which matters when using async replication
18. Your Application and RSU
• During a Rolling Schema Upgrade:
– a cluster contains some nodes with old schema
and some nodes with the new one
– the node that is currently running the DDL may temporarily
fall behind
• Remove it from load balancer if data freshness is important
– RSU is a global setting, so application should not attempt to
run other DDLs while you execute the RSU procedure
19. Coexistence of Two Schemas
• INSERT queries should not attempt to insert into a
column that does not yet exist everywhere.
– INSERT INTO table (old_col1, new_col2) VALUES (123, 'abc');
• Column count and position may be different in old and
new schema:
– SELECT * may return differently-shaped result sets
• SELECT old_col1, old_col2 is better
– INSERT INTO table VALUES (123, 'abc') may fail or put
data in the wrong column
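The difference between positional and explicit column lists can be reproduced with two databases standing in for an old-schema node and a new-schema node. This is a sketch using Python's built-in SQLite, not MySQL, and the table and column names are made up. It only shows the column-count failure; when the counts happen to match, a positional INSERT can instead succeed and place data in the wrong column.

```python
import sqlite3

old = sqlite3.connect(":memory:")  # stand-in for a node still on the old schema
new = sqlite3.connect(":memory:")  # stand-in for a node already upgraded
old.execute("CREATE TABLE t (old_col1 INTEGER, old_col2 TEXT)")
new.execute("CREATE TABLE t (old_col1 INTEGER, new_col TEXT, old_col2 TEXT)")

def try_insert(conn, sql):
    """Run one INSERT and report whether the node accepted it."""
    try:
        conn.execute(sql, (123, "abc"))
        return "ok"
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"

positional = "INSERT INTO t VALUES (?, ?)"
explicit = "INSERT INTO t (old_col1, old_col2) VALUES (?, ?)"

results = {
    ("old", "positional"): try_insert(old, positional),
    ("new", "positional"): try_insert(new, positional),
    ("old", "explicit"): try_insert(old, explicit),
    ("new", "explicit"): try_insert(new, explicit),
}
for key, outcome in results.items():
    print(key, outcome)
```

Only the statement that names old columns explicitly is accepted by both schemas, which is the form the application should use for the duration of the upgrade.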
20. Preparation
1. Practice the DDL on a test machine, if possible
2. Practice taking nodes out of the load balancer, as this
needs to be done repeatedly
3. Do one DDL at a time, to avoid confusion
– multiple operations can be combined in a single DDL
statement
21. Step-By-Step Procedure
1. On every node, one at a time:
2. Remove node from load balancer if data freshness is important
3. Run DDL:
SET GLOBAL wsrep_osu_method=RSU;
ALTER TABLE …
SET GLOBAL wsrep_osu_method=TOI;
4. Wait for node to catch up: monitor the wsrep_local_recv_queue status variable
5. Restore node to load balancer
6. Check for application errors
7. Repeat procedure on the other nodes
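The per-node part of the procedure can be sketched as two small helpers that a deployment script might call. Both helper names are hypothetical, and the catch-up criterion (an empty receive queue) is an assumption; the slide only names the variable to watch. The statements are the ones from the Run DDL step.

```python
def rsu_statements(ddl):
    """Wrap one DDL statement in the RSU/TOI method switch from the Run DDL step."""
    return [
        "SET GLOBAL wsrep_osu_method=RSU;",
        ddl.rstrip(";") + ";",
        "SET GLOBAL wsrep_osu_method=TOI;",
    ]

def has_caught_up(status_rows):
    """status_rows: (Variable_name, Value) pairs as returned by SHOW GLOBAL STATUS."""
    status = dict(status_rows)
    # Status values arrive as strings, so convert before comparing.
    # Assumption: an empty receive queue means the node has caught up.
    return int(status["wsrep_local_recv_queue"]) == 0

for line in rsu_statements("ALTER TABLE t ADD COLUMN note VARCHAR(64)"):
    print(line)

print(has_caught_up([("wsrep_local_recv_queue", "17")]))  # False
print(has_caught_up([("wsrep_local_recv_queue", "0")]))   # True
```

A script would poll `has_caught_up` with fresh status rows before putting the node back into the load balancer and moving on to the next node.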
22. Current and Future Improvements
• In recently-released Galera Cluster 5.7:
certain DDL statements are now much faster or instantaneous:
– ALTER TABLE ADD KEY
– ALTER TABLE CHANGE COLUMN for some VARCHAR types
– OPTIMIZE TABLE
• In upcoming Galera Replication Library 4.x:
– a new schema upgrade method, NBO, will allow ALTER
statements to run without blocking the entire cluster
– a new consistency mechanism will check if a DDL succeeded
or failed equally on all nodes
23. Questions
• Please use the Question/Chat box in the GoToWebinar
panel
• Ideas welcome for future webinars