MySQL X Protocol
Talking to MySQL Directly over the Wire
Simon J Mudd <simon.mudd@booking.com>
Oracle Open World − 22nd September 2016
Content
● What is MySQL X protocol
● How does it work
● Building Drivers
● Pipelining
● Why we need a proper protocol specification
● X thoughts – things I noticed
● Conclusion
Disclaimer
● Not involved in the design
● I have not looked at how the old protocol works
● Information obtained from docs, code and observation
● Incorrect descriptions of behaviour are my own
Focus
● A developer should not have to care about this as he or
she will be using a driver and will therefore not see the
details
● The focus of this presentation is for a driver writer or
someone interested in knowing how the communication
between client and server works
Focus
● Booking.com uses 2 languages which do not currently
have X protocol support: perl and Go
● We already do special things with MySQL
● Process binlogs, both with the binlog router and to send data to
Hadoop
● We wanted to see if the new protocol would be beneficial to use
in our current use cases
What is the MySQL X Protocol
What is the MySQL X Protocol?
In April 2016 MySQL 5.7.12 introduced MySQL DocumentStore
● noSQL API to access JSON data in MySQL
● MySQL x plugin in the server
● MySQL shell to provide command line access
● X DevAPI client libraries for: Java, C, .NET, Node.js
and Python
What is the MySQL X Protocol?
X protocol: more flexible connectivity between client and
server
● asynchronous API, command pipelining
● uses tcp port 33060 rather than 3306
● transport uses wrapped Google protobuf messages
● Supports both SQL and new noSQL API
● meant as a foundation for future features
What is the MySQL X Protocol?
Are there any “buts”?
● The name: Wikipedia says the X protocol was created in
1984… I tend to use MySQL X protocol
● Support missing in any other “MySQL-like” products
● Drivers missing for other languages
● Network-based protocol specification: users can use it to write their
own drivers
● However, this is still very new…
How does it work?
How does the X protocol work?
Messages exchanged between client and server are
wrapped Google protobuf messages
● Wrapped means prefixing each message with a 4-byte
length and a 1-byte message type indicator
● Protobuf descriptions are buried in the server code!
● mysqlx_max_allowed_packet: default 1MB
● Limits the size of a query or single row returned to client
● In practice this setting may need to be increased
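The wrapping described above can be sketched in a few lines of Python. This is a minimal sketch, not the reference implementation: it assumes the 4-byte length is little-endian and counts the type byte plus the payload, which should be checked against the server source. The names `wrap` and `unwrap` are hypothetical helpers, and the payload is treated as an already-serialised protobuf message.

```python
import struct
from typing import Tuple

# 4-byte little-endian length followed by a 1-byte message type (5 bytes total).
HEADER = struct.Struct("<IB")


def wrap(msg_type: int, payload: bytes) -> bytes:
    """Prefix a serialised protobuf payload with the 5-byte header."""
    # Assumption: the length field covers the type byte plus the payload.
    return HEADER.pack(len(payload) + 1, msg_type) + payload


def unwrap(frame: bytes) -> Tuple[int, bytes]:
    """Split one complete frame into (message type, payload)."""
    length, msg_type = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length - 1]
    if len(payload) != length - 1:
        raise ValueError("truncated frame")
    return msg_type, payload


frame = wrap(1, b"")   # a message of type 1 with an empty body
print(frame.hex())     # 0100000001
```

A driver would compare the decoded length against mysqlx_max_allowed_packet before reading the payload, so that an oversized message is rejected early.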
How does the X protocol work?
Message flow consists of the following phases
● Connect to server
● Capabilities exchange (optional)
● Authentication
● Querying server (optional)
● Disconnect from server
Capabilities Exchange
Name/value based configuration exchange
● Request/Set some server settings prior to authentication
● Used to initiate TLS
● Used to determine which authentication mechanisms are
available to the client
● “value” can in theory be any arbitrary type, though
currently only single scalar values or lists of scalars are used
● This should be formally restricted to keep things simple
Capabilities Exchange
client → server: CapabilitiesGet
server → client: Capabilities
client → server: CapabilitiesSet
server → client: Ok
Current capabilities:
• tls (if TLS is configured)
• authentication.mechanisms
• doc.formats
• node_type
• plugin.version
• client.pwd_expire_ok
Can be used before authenticating the client
Authentication
● MYSQL41 by default
● If using TLS other options are available:
● PLAIN (safe as transport is encrypted)
● EXTERNAL
● It would be good to define which authentication options
are available when and why
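One way a driver might pick a mechanism from the authentication.mechanisms capability is sketched below. The preference order (PLAIN only over TLS, otherwise MYSQL41) is an assumption of this sketch rather than documented policy, `choose_mechanism` is a hypothetical helper, and the capability value is assumed to have been decoded into a list of strings already.

```python
def choose_mechanism(offered, tls_active):
    """Pick an authentication mechanism from the server's advertised list."""
    # Assumed policy: only send a PLAIN password when the transport is encrypted.
    preferred = ["PLAIN", "MYSQL41"] if tls_active else ["MYSQL41"]
    for mech in preferred:
        if mech in offered:
            return mech
    raise RuntimeError("no usable authentication mechanism offered")


print(choose_mechanism(["MYSQL41"], tls_active=False))                       # MYSQL41
print(choose_mechanism(["MYSQL41", "PLAIN", "EXTERNAL"], tls_active=True))   # PLAIN
```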
Authentication
client → server: AuthenticateStart(mech=“MYSQL41”)
server → client: AuthenticateContinue
client → server: AuthenticateContinue
server → client: Notice
server → client: AuthenticateOk
● AuthenticateStart in this case just provides the mech name
● The second AuthenticateContinue (from the client) provides the
username plus scrambled password, and also the database to connect to
● The Notice provides a CLIENT_ID
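The “scrambled password” in the MYSQL41 exchange can be sketched as follows, assuming it uses the same SHA1 challenge-response scheme as classic MySQL 4.1 password authentication. The exact byte layout of the client's AuthenticateContinue payload (how schema, user name and scramble are joined) is not shown here and should be taken from the server source. Function names are hypothetical.

```python
import hashlib


def mysql41_scramble(password: bytes, salt: bytes) -> bytes:
    """Compute SHA1(password) XOR SHA1(salt + SHA1(SHA1(password)))."""
    stage1 = hashlib.sha1(password).digest()
    stage2 = hashlib.sha1(stage1).digest()
    mask = hashlib.sha1(salt + stage2).digest()
    return bytes(a ^ b for a, b in zip(stage1, mask))


def server_accepts(scramble: bytes, salt: bytes, stored_stage2: bytes) -> bool:
    """Server-side check; the server only stores SHA1(SHA1(password))."""
    mask = hashlib.sha1(salt + stored_stage2).digest()
    stage1 = bytes(a ^ b for a, b in zip(scramble, mask))
    return hashlib.sha1(stage1).digest() == stored_stage2


salt = bytes(range(20))  # stand-in for the challenge sent by the server
stored = hashlib.sha1(hashlib.sha1(b"secret").digest()).digest()
print(server_accepts(mysql41_scramble(b"secret", salt), salt, stored))  # True
```

The point of the scheme is that neither the password nor its stored hash crosses the wire, which is why MYSQL41 is usable without TLS while PLAIN is not.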
Query Server (noSQL)
● DocumentStore stuff
● JSON stored in tables and use of CRUD type messages
● Find, Insert, Update, Delete messages
● Not covered in this presentation
Query Server (SQL)
Client requests data from the server.
● Prepared statements are not available (5.7.15)
● Documentation indicates they are available in sample
message flows (see Figure 15.11 Messages for SQL)
● The messages sql::StmtPrepare and
PreparedStmt::ExecuteIntoCursorIt do not appear to
exist, but there is a StmtExecute
● Future functionality? Should be indicated more clearly
Query Server (SQL)
client → server: StmtExecute
server → client: ColumnMetaData*
server → client: Row*
server → client: Notice
server → client: StmtExecuteOk
Query:
● Contains the query and, optionally, parameters to be used with
placeholders
Results:
● One ColumnMetaData message per column in the result set
● One Row message per row in the result set
● The Notice returns rows affected
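The response side of this flow can be sketched as a small loop that gathers messages until StmtExecuteOk arrives. This is a sketch under stated assumptions: messages are taken to be already decoded into (name, body) pairs, `collect_result` is a hypothetical helper, and only the message types named on the slide plus Error are handled. A real server may send further message types, which this sketch deliberately rejects so that gaps are visible.

```python
def collect_result(messages):
    """Group the messages answering one StmtExecute into a result dict."""
    result = {"columns": [], "rows": [], "notices": []}
    for name, body in messages:
        if name == "ColumnMetaData":
            result["columns"].append(body)      # one per column
        elif name == "Row":
            result["rows"].append(body)         # one per row
        elif name == "Notice":
            result["notices"].append(body)      # e.g. rows affected
        elif name == "StmtExecuteOk":
            return result                       # end of this response
        elif name == "Error":
            raise RuntimeError(f"server error: {body}")
        else:
            raise ValueError(f"unexpected message {name}")
    raise EOFError("stream ended before StmtExecuteOk")


stream = [
    ("ColumnMetaData", "id"),
    ("ColumnMetaData", "name"),
    ("Row", (1, "a")),
    ("Notice", {"rows_affected": 0}),
    ("StmtExecuteOk", None),
]
result = collect_result(stream)
print(len(result["columns"]), len(result["rows"]))  # 2 1
```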
Disconnect
● Tell MySQL we have finished and then disconnect
Disconnect
client → server: Session::Close
server → client: Ok
Not much to say. The client is free to disconnect from the
server after receiving Ok.
Building Drivers
Building Drivers
Usually drivers are built below a standard high-level
interface for the language concerned
● e.g. Go: database/sql, Perl: DBI
● Client can only use API provided by high-level driver
● X protocol wants to use pipelining: may not be available
● To get “all features”: need full custom driver
Building Drivers
● We had a look at Go and Perl
● Harder than expected
● Documentation was not as complete as desired
● Protobuf files are not enough
● No explanation of expected behaviour under error conditions
● Few examples of complete message exchanges
● Incorrect or misleading documentation
● Resorted to reading source code or source code tests
Building Drivers
Results of our proof of concept:
● Learnt about message flows
● Achieved authentication
● Able to send queries to the server and get back results
● Look at edge cases
● Work in progress
Building Drivers
Results of what we did can be seen here:
● Go driver: https://github.com/sjmudd/go-mysqlx-driver
● Perl: https://github.com/slanning/perl-mysql-xprotocol
● But more work to do
Pipelining
Pipelining
[Figure: two client/server timelines, labelled synchronous and pipelined, each showing Requests 1 to 3 and Responses 1 to 3 against time]
Each X protocol response consists of one or more messages
Pipelining
● Most MySQL X messages are quite small
● Network layer can piggyback more than one message
into a single packet when sending
● Useful for session startup as several messages
exchanged
● Helpful if you have several independent queries to send
● Avoids the synchronous round trip time wait
● But pipelined messages are not queued on the server
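Because several small messages may travel in one packet, a receiver cannot assume that one read returns exactly one message. The sketch below splits a byte buffer into complete frames and keeps any trailing partial frame for the next read. It assumes the header is a little-endian 4-byte length that counts the type byte plus the payload; `frame` and `split_frames` are hypothetical helpers.

```python
import struct


def frame(msg_type, payload=b""):
    """Build one frame: 4-byte length, 1-byte type, payload."""
    return struct.pack("<IB", len(payload) + 1, msg_type) + payload


def split_frames(buffer):
    """Return (complete frames, leftover bytes) for back-to-back frames."""
    frames, offset = [], 0
    while len(buffer) - offset >= 5:
        (length,) = struct.unpack_from("<I", buffer, offset)
        if length < 1:
            raise ValueError("frame length must include the type byte")
        end = offset + 4 + length
        if end > len(buffer):
            break  # last frame is incomplete; wait for more bytes
        frames.append((buffer[offset + 4], buffer[offset + 5:end]))
        offset = end
    return frames, buffer[offset:]


# Three requests pipelined into one write, followed by a partial header.
batch = frame(12, b"q1") + frame(12, b"q2") + frame(12, b"q3")
frames, leftover = split_frames(batch + b"\x05\x00")
print(len(frames), leftover)  # 3 b'\x05\x00'
```

On the sending side the same idea applies in reverse: joining the frames and issuing a single write gives the network layer the chance to put them in one packet.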
Pipelining
Servers in more than one data centre:
● cross-dc latency is higher (e.g. ~15 ms vs < 1ms)
● Applications which serialise access to the db may run fine
against a local db but have problems when accessing a
remote one
● MySQL X protocol here looks interesting
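A first-order model shows why serialised access suffers across data centres. It assumes each synchronous query costs one round trip plus its execution time, while a fully pipelined batch costs roughly one round trip plus the sum of the execution times. The 300 µs execution time is an illustrative assumption, not a measurement, and the model ignores bandwidth, connection setup and server-side effects.

```python
def synchronous_us(n, rtt_us, exec_us):
    """Each query waits for the previous response: n round trips."""
    return n * (rtt_us + exec_us)


def pipelined_us(n, rtt_us, exec_us):
    """All queries are sent up front: roughly one round trip in total."""
    return rtt_us + n * exec_us


# 100 queries, 15 ms cross-DC round trip, assumed 300 us per query.
print(synchronous_us(100, 15_000, 300) // 1000, "ms")  # 1530 ms
print(pipelined_us(100, 15_000, 300) // 1000, "ms")    # 45 ms
```

With a sub-millisecond local round trip the two figures converge, which is why the difference mostly shows up on remote links.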
Pipelining
Results of some SQL benchmarking in Perl [1]
• 100 primary key SELECTs
Benchmark               Same DC   Cross DC   Latency effect
Perl DBI                34 ms     1248 ms    36x
MySQL X pipelined       44 ms     59 ms      1.34x
MySQL X non-pipelined   89 ms     982 ms     11x
Conclusion
● Same DC: DBI still faster
● Cross DC: pipelining much faster
● Change application logic to remove serialisation
[1] Scott Lanning: https://github.com/slanning/perl-mysql-xprotocol
Pipelining
Example: Orchestrator
● Currently uses “legacy” driver: go-sql-driver/mysql
● Driver by default sends prepared statements (2x slower)
● We have had to disable prepared statements for
performance reasons.
● With MySQL X protocol the pipelining would allow the
client to send the prepared statement and execute it
together by default – so simpler
Pipelining
● Pipelining will work quite well on higher latency links
● Depends on query execution time vs network latency
time
● X protocol is quite noisy (many messages): could be
optimised further
● No current support (yet?) for asynchronous queries
Why we need a protocol specification
Why we need a protocol specification
First: Oracle have made a very solid first implementation
● Server side X plugin
● Client libraries
● New shell
● Documentation
● Supports both SQL and noSQL access
● Intended to be production quality on release
Why we need a protocol specification
The MySQL ecosystem is very large
● Everyone using the classic or legacy protocol
● Moving to a new protocol will only work if it is worthwhile
and if players see the benefit
● The benefit can only be gained if everyone jumps on
board
Why we need a protocol specification
● Today we have complex use cases:
● Sharding
● “external” connectivity (Hadoop, Vitess, Kafka, …)
● “proxy” connectivity (MySQL router, MaxScale, ProxySQL)
● may not be in languages supported by Oracle
● Current documentation while improving is still incomplete
● Migration to the X protocol needs to be easy
● Other MySQL-like vendors must come on board
Why we need a protocol specification
● Reading the source code of the current X plugin or client
libraries does not count as documentation as this is a
moving target
● The docs are only available online or as EPUB; my
request for a PDF failed (bug#81128)
Why we need a protocol specification
● Easy to download document showing full specification
● Driver writers only have one place to look
● Examples and test cases included
● Should avoid the need to look at source code
● Ensures that enhancements will not break backwards
compatibility
● More likely to get buy-in from the community
● Helps avoid fragmentation
Why we need a protocol specification
I have tried to start writing one myself
● Very much work in progress
● RFC style
● I would appreciate support from others who might be
interested in helping
● See: https://github.com/sjmudd/xprotocol-notes
X thoughts - things I noticed
X plugin variable names
mysqlx_max_connections vs max_connections
● I might prefer to limit all connections globally
● mysqlx_max_connections = 0 disable MySQL X
connections ?
● mysqlx_max_connections = -1 use max_connections to
limit connections ?
X plugin variable names
mysqlx_min_worker_threads vs thread_cache_size
● Inconsistent naming: maybe better
mysqlx_thread_cache_size ?
X plugin variable names
mysqlx_ssl_* vs ssl_*
● mysqlx_ssl_* settings need to go away.
● See bug#81528
X plugin
Need more information for monitoring
● Counters for normal and error conditions
● Session and global metrics
● Need timing metrics
● Probably work in progress
X protocol
Initial State
● Character set being used: client and server
● Minimum/default mysqlx_max_allowed_packet
Error Handling
● definition of behaviour under different error conditions?
Missing
● Optional idle heartbeat (in both directions), timers
● Checksums: needed? binlog events?
X protocol
Character sets
● Expectation of client/server character sets is unclear
● Expectation of data transferred over the wire? Utf8?
Which one?
● What about storage and retrieval with character set
based columns?
● Where is conversion done when there is a difference?
X protocol
Character sets
● MySQL 5.7 by default uses utf8 (3-bytes)
● MySQL 8.0 to use utf8mb4 (4-bytes) by default?
● Many people configure things other than the default
● Column data can use different character sets
● Needs to fit together in an unambiguous way
X protocol
Initial session setup lengthy
● If you want to check and set things mentioned before
and go to TLS then the session setup is lengthy
● Not good for “fast” single query connection types
● Pipelining can help but does not completely solve the
problem
● Capabilities exchange “during” authentication?
X protocol
Capabilities
● Values can be any type (avoid nested types)
● No way to see which capabilities are readable or
writeable
● No specification of the values that can be applied
● Limit the capabilities which are exposed prior to
authentication
● Remove version specific values (e.g. plugin.version)
X protocol
ColumnMetaData received in each query response
● The X protocol sends one message per column rather
than a single message including all column metadata
● This generates a network overhead of 5 bytes per
column (length plus message type) and adds to code
complexity, as each message is processed separately
● Inefficient for single row responses
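The framing cost can be put into numbers with a short calculation. It counts only the 5-byte header per message; a combined message would still carry some protobuf encoding overhead per column, which this sketch ignores. `metadata_header_overhead` is a hypothetical helper.

```python
HEADER_BYTES = 5  # 4-byte length plus 1-byte message type


def metadata_header_overhead(columns):
    """Header bytes for per-column messages, one combined message, and the saving."""
    per_column = HEADER_BYTES * columns
    combined = HEADER_BYTES
    return per_column, combined, per_column - combined


print(metadata_header_overhead(20))  # (100, 5, 95)
```

For a single-row, 20-column result those 95 bytes may rival the size of the row data itself, which is the inefficiency the last bullet points at.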
X protocol
Notice messages too overloaded
● Unsolicited messages from server to client
● Responses of change
● Warnings response to queries
● Session variable changes
● Binlog events …
● In theory can be ignored (according to documentation)
● Sometimes unneeded (so overhead)
X protocol
Idle behaviour issues:
● “Server gone away” errors
● “Server still executing query” of a disconnected client
Solution
● Optional heartbeat server to client and/or client to server
● Used only if no activity in the direction concerned
● Tcp keepalive might not see a stuck mysqld
X protocol
No unique message ids
● Client and server message ids overlap
● “Sniffers” need to know the direction of the message to
be able to decode.
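The consequence for a sniffer is that the lookup key has to be the pair (direction, type id) rather than the id alone. The numeric ids below are illustrative values as I recall them from mysqlx.proto and should be verified against the server source before use; `message_name` is a hypothetical helper.

```python
# Assumed ids, to be checked against mysqlx.proto in the server source.
CLIENT_MESSAGES = {1: "CapabilitiesGet", 2: "CapabilitiesSet", 12: "StmtExecute"}
SERVER_MESSAGES = {0: "Ok", 1: "Error", 2: "Capabilities", 12: "ColumnMetaData"}


def message_name(from_client, type_id):
    """Resolve a type id, which is only meaningful together with its direction."""
    table = CLIENT_MESSAGES if from_client else SERVER_MESSAGES
    return table.get(type_id, f"unknown({type_id})")


print(message_name(True, 12))   # StmtExecute
print(message_name(False, 12))  # ColumnMetaData
```

With unique ids a single table would do, and a capture could be decoded without tracking which endpoint sent each packet.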
X protocol
Performance given as focus: see WL#8639
● Would be good to see comparisons from the MySQL
team.
● Under some use cases the X protocol can be faster.
(high latency between client and server with high
message rate)
X protocol
Extensibility
● Likely to undergo rapid change
● Binlog routing? (docs imply this)
● Sharding (comments from this week’s keynote)
● Do not forget backward compatibility
● Will help early adopters, driver writers etc
● Proper specifications help
Conclusion
Conclusion
● MySQL X protocol looks good and stable
● If you use the supported languages you will be fine
● A formal specification will ease adoption by third parties,
clarify current behaviour and ensure compatibility as the
protocol evolves
● Simple drivers to hook into existing SQL infrastructure
should be easier to write, but if you want to use the new
features such as pipelining more specialised drivers will
be needed
References
Oracle
• http://dev.mysql.com/doc/internals/en/x-protocol.html
My work:
● https://github.com/sjmudd/go-mysqlx-driver
● https://github.com/sjmudd/mysql-x-protocol-specification
Thank you

More Related Content

What's hot

What's hot (20)

Using Kafka to scale database replication
Using Kafka to scale database replicationUsing Kafka to scale database replication
Using Kafka to scale database replication
 
Percona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL AdministrationPercona Toolkit for Effective MySQL Administration
Percona Toolkit for Effective MySQL Administration
 
Backup para MySQL
Backup para MySQLBackup para MySQL
Backup para MySQL
 
Building a Unified Logging Layer with Fluentd, Elasticsearch and Kibana
Building a Unified Logging Layer with Fluentd, Elasticsearch and KibanaBuilding a Unified Logging Layer with Fluentd, Elasticsearch and Kibana
Building a Unified Logging Layer with Fluentd, Elasticsearch and Kibana
 
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and Beyond
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and BeyondScylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and Beyond
Scylla Summit 2022: The Future of Consensus in ScyllaDB 5.0 and Beyond
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 
Elasticsearch in Netflix
Elasticsearch in NetflixElasticsearch in Netflix
Elasticsearch in Netflix
 
Redis cluster
Redis clusterRedis cluster
Redis cluster
 
Apache Solr-Webinar
Apache Solr-WebinarApache Solr-Webinar
Apache Solr-Webinar
 
Building Your Data Streams for all the IoT
Building Your Data Streams for all the IoTBuilding Your Data Streams for all the IoT
Building Your Data Streams for all the IoT
 
Что нужно знать о трёх топовых фичах MySQL
Что нужно знать  о трёх топовых фичах  MySQLЧто нужно знать  о трёх топовых фичах  MySQL
Что нужно знать о трёх топовых фичах MySQL
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
 
Demystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash SafetyDemystifying MySQL Replication Crash Safety
Demystifying MySQL Replication Crash Safety
 
ONNX and MLflow
ONNX and MLflowONNX and MLflow
ONNX and MLflow
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservices
 
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)
MySQL Parallel Replication: All the 5.7 and 8.0 Details (LOGICAL_CLOCK)
 
Iceberg: a fast table format for S3
Iceberg: a fast table format for S3Iceberg: a fast table format for S3
Iceberg: a fast table format for S3
 
Tzu-Li (Gordon) Tai - Stateful Stream Processing with Apache Flink
Tzu-Li (Gordon) Tai - Stateful Stream Processing with Apache FlinkTzu-Li (Gordon) Tai - Stateful Stream Processing with Apache Flink
Tzu-Li (Gordon) Tai - Stateful Stream Processing with Apache Flink
 
Apache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native EraApache Flink in the Cloud-Native Era
Apache Flink in the Cloud-Native Era
 
Tuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBaseTuning Apache Phoenix/HBase
Tuning Apache Phoenix/HBase
 

Similar to MySQL X protocol - Talking to MySQL Directly over the Wire

kranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High loadkranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High load
Krivoy Rog IT Community
 

Similar to MySQL X protocol - Talking to MySQL Directly over the Wire (20)

USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a MonthUSENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
USENIX LISA15: How TubeMogul Handles over One Trillion HTTP Requests a Month
 
Automating using Ansible
Automating using AnsibleAutomating using Ansible
Automating using Ansible
 
Netty training
Netty trainingNetty training
Netty training
 
Netty training
Netty trainingNetty training
Netty training
 
kranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High loadkranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High load
 
Level 101 for Presto: What is PrestoDB?
Level 101 for Presto: What is PrestoDB?Level 101 for Presto: What is PrestoDB?
Level 101 for Presto: What is PrestoDB?
 
Eko10 Workshop Opensource Database Auditing
Eko10  Workshop Opensource Database AuditingEko10  Workshop Opensource Database Auditing
Eko10 Workshop Opensource Database Auditing
 
WebCamp Ukraine 2016: Instant messenger with Python. Back-end development
WebCamp Ukraine 2016: Instant messenger with Python. Back-end developmentWebCamp Ukraine 2016: Instant messenger with Python. Back-end development
WebCamp Ukraine 2016: Instant messenger with Python. Back-end development
 
Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009
 
Eko10 workshop - OPEN SOURCE DATABASE MONITORING
Eko10 workshop - OPEN SOURCE DATABASE MONITORINGEko10 workshop - OPEN SOURCE DATABASE MONITORING
Eko10 workshop - OPEN SOURCE DATABASE MONITORING
 
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
WebCamp 2016: Python. Вячеслав Каковский: Real-time мессенджер на Python. Осо...
 
BlackRay - The open Source Data Engine
BlackRay - The open Source Data EngineBlackRay - The open Source Data Engine
BlackRay - The open Source Data Engine
 
Gluster dev session #6 understanding gluster's network communication layer
Gluster dev session #6  understanding gluster's network   communication layerGluster dev session #6  understanding gluster's network   communication layer
Gluster dev session #6 understanding gluster's network communication layer
 
The new (is it really ) api stack
The new (is it really ) api stackThe new (is it really ) api stack
The new (is it really ) api stack
 
Python And The MySQL X DevAPI - PyCaribbean 2019
Python And The MySQL X DevAPI - PyCaribbean 2019Python And The MySQL X DevAPI - PyCaribbean 2019
Python And The MySQL X DevAPI - PyCaribbean 2019
 
"Clouds on the Horizon Get Ready for Drizzle" by David Axmark @ eLiberatica 2009
"Clouds on the Horizon Get Ready for Drizzle" by David Axmark @ eLiberatica 2009"Clouds on the Horizon Get Ready for Drizzle" by David Axmark @ eLiberatica 2009
"Clouds on the Horizon Get Ready for Drizzle" by David Axmark @ eLiberatica 2009
 
SPDY and What to Consider for HTTP/2.0
SPDY and What to Consider for HTTP/2.0SPDY and What to Consider for HTTP/2.0
SPDY and What to Consider for HTTP/2.0
 
Truemotion Adventures in Containerization
Truemotion Adventures in ContainerizationTruemotion Adventures in Containerization
Truemotion Adventures in Containerization
 
Massively Scaled High Performance Web Services with PHP
Massively Scaled High Performance Web Services with PHPMassively Scaled High Performance Web Services with PHP
Massively Scaled High Performance Web Services with PHP
 
Java one2013
Java one2013Java one2013
Java one2013
 

Recently uploaded

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Recently uploaded (20)

Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 

MySQL X protocol - Talking to MySQL Directly over the Wire

  • 1. MySQL X Protocol Talking to MySQL Directly over the Wire Simon J Mudd <simon.mudd@booking.com> Oracle Open World − 22nd September 2016
  • 2. Content ● What is MySQL X protocol ● How does it work ● Building Drivers ● Pipelining ● Why we need a proper protocol specification ● X thoughts – things I noticed ● Conclusion
  • 3. Disclaimer ● Not involved in the design ● I have not looked at how the old protocol works ● Information obtained from docs, code and observation ● Incorrect descriptions of behaviour are my own
  • 4. Focus ● A developer should not have to care about this as he or she will be using a driver and will therefore not see the details ● The focus of this presentation is for a driver writer or someone interested in knowing how the communication between client and server works
  • 5. Focus ● Booking.com uses 2 languages which do not currently have X protocol support: perl and Go ● We already do special things with MySQL ● Process binlogs with the binlog router and for sending data to Hadoop ● We wanted to see if the new protocol would be beneficial to use in our current use cases
  • 6. What is the MySQL X Protocol 6
  • 7. What is the MySQL X Protocol? In April MySQL 5.7.12 introduces MySQL DocumentStore ● noSQL API to access JSON data in MySQL ● MySQL x plugin in the server ● MySQL shell to provide command line access ● X DevAPI client libraries for: Java, C, dot Net, node.js and Python
  • 8. What is the MySQL X Protocol? X protocol: more flexible connectivity between client and server ● asynchronous API, command pipelining ● uses tcp port 33060 rather than 3306 ● transport uses wrapped Google protobuf messages ● Supports both SQL and new noSQL API ● meant as a foundation for future features
  • 9. What is the MySQL X Protocol? Are there any ”buts”? ● The name: wikipedia says the X protocol was created in 1984… I tend to use MySQL X protocol ● Support missing in any other “MySQL-like” products ● Drivers missing for other languages ● network-based protocol specification: users can use to write their own drivers ● However, this is still very new…
  • 10. How does it work? 10
  • 11. How does the X protocol work? Messages exchanged between client and server are wrapped Google protobuf messages ● Wrapped means prefixing each message with a 4-byte length and a 1-byte message type indicator ● Protobuf descriptions are buried in the server code! ● Mysqlx_max_allowed_packet: default 1MB ● Limits the size of a query or single row returned to client ● In practice this setting may need to be increased 11
  • 12. How does the X protocol work? Message flow consists of the following phases ● Connect to server ● Capabilities exchange (optional) ● Authentication ● Querying server (optional) ● Disconnect from server 12
  • 13. Capabilities Exchange 13 Name/value based configuration exchange ● Request/Set some server settings prior to authentication ● Used to initiate TLS ● Used to determine which authentication mechanisms are available to the client ● “value” can in theory be any arbitrary type though currently single scalar values or a list of scalars ● this should be formally restrained to keep things simple
  • 14. Capabilities Exchange 14 client server CapabilitiesGet Capabilities Current Capabilities: • tls (if TLS is configured) • authentication.mechanisms • doc.formats • node_type • plugin.version • client.pwd_expire_ok Can be used before authenticating client CapabilitiesSet Ok
  • 15. Authentication 15 ● MYSQL41 by default ● If using TLS other options are available: ● PLAIN (safe as transport is encrypted) ● EXTERNAL ● It would be good to define which authentication options are available when and why
  • 16. Authentication 16 client server AuthenticateStart(mech=“MYSQL41”) AuthenticateContinue AuthenticateStart in this case just provides the mech name Second AuthenticateContinue provides username plus scrambled password but also database to connect to Notice provides a CLIENT_ID AuthenticateContinue Notice AuthenticateOk
  • 17. Query Server (noSQL) 17 ● DocumentStore stuff ● JSON stored in tables and use of CRUD type messages ● Find, Insert, Update, Delete messages ● Not covered in this presentation
  • 18. Query Server (SQL) 18 Client requests data from the server. ● Prepared statements are not available (5.7.15) ● Documentation indicates they are available in sample message flows (see Figure 15.11 Messages for SQL) ● The messages sql::StmtPrepare, and PreparedStmt::ExecuteIntoCursorIt do not appear to exist, but there is a StmtExecute ● Future functionality? Should be indicated more clearly
  • 19. Query Server (SQL) 19 client server StmtExecute ColumnMetaData* Query: Contains query and optionally parameters to be used with placeholders Results: One ColumnMetaData message per column in result set One Row message per row in result set Notice returns rows affected Row* Notice StmtExecuteOk
  • 20. Disconnect 20 ● Tell MySQL we have finished and then disconnect
  • 21. Disconnect 21 client server Session::Close Ok Not much to say. Client free to disconnect from server after receiving Ok
  • 23. Building Drivers Usually drivers are built below a standard high-level interface for the language concerned ● e.g. Go: database/sql, Perl: DBI ● Client can only use API provided by high-level driver ● X protocol wants to use pipelining: may not be available ● To get “all features”: need full custom driver
  • 24. Building Drivers ● We had a look at Go and Perl ● Harder than expected ● Documentation was not as complete as desired ● Protobuf files are not enough ● No explanation of expected behaviour under error conditions ● Few examples of complete message exchanges ● Incorrect or misleading documentation ● Resorted to reading source code or source code tests
  • 25. Building Drivers Results of our proof of concept: ● Learnt about message flows ● Achieved authentication ● Able to send queries to the server and get back results ● Look at edge cases ● Work in progress
  • 26. Building Drivers Results of what we did can be seen here: ● Go driver: https://github.com/sjmudd/go-mysqlx-driver ● Perl: https://github.com/slanning/perl-mysql-xprotocol ● But more work to do
  • 28. Pipelining [Figure: two client/server sequence diagrams over time, synchronous vs. pipelined, each exchanging Requests 1 to 3 and Responses 1 to 3] ● X protocol message responses are one or more messages
  • 29. Pipelining ● Most MySQL X messages are quite small ● Network layer can piggy back more than one message into a single packet when sending ● Useful for session startup as several messages exchanged ● Helpful if you have several independent queries to send ● Avoids the synchronous round trip time wait ● But pipelined messages are not queued on the server
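Piggybacking several messages into one send can be sketched as below. This assumes the X protocol frame layout of a 4-byte little-endian length followed by a 1-byte message type and the payload, with the length counting the type byte plus payload; check that detail against the documentation. The helper names and the type numbers in the example are placeholders.

```python
import struct


def pack_frame(msg_type: int, payload: bytes) -> bytes:
    """Build one frame: 4-byte length, 1-byte type, payload."""
    # Assumption: the length field covers the type byte plus the payload.
    return struct.pack("<IB", len(payload) + 1, msg_type) + payload


def split_frames(buffer: bytes):
    """Split a receive buffer into complete (type, payload) frames plus leftover bytes."""
    frames = []
    offset = 0
    while len(buffer) - offset >= 4:
        (length,) = struct.unpack_from("<I", buffer, offset)
        if length < 1:
            raise ValueError("frame length must include the type byte")
        end = offset + 4 + length
        if end > len(buffer):
            break  # incomplete frame: wait for more data
        frames.append((buffer[offset + 4], buffer[offset + 5:end]))
        offset = end
    return frames, buffer[offset:]


# Three requests queued into a single write (placeholder type numbers).
pipelined = pack_frame(1, b"a") + pack_frame(2, b"bc") + pack_frame(3, b"")
frames, rest = split_frames(pipelined)
```

A client would pass `pipelined` to a single socket send and then read the responses in order, keeping `rest` around until the remainder of a partial frame arrives.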
  • 30. Pipelining Servers in more than one data centre: ● cross-dc latency is higher (e.g. ~15 ms vs < 1ms) ● Applications which serialise access to the db may have problems if accessing a remote db when talking locally runs fine ● MySQL X protocol here looks interesting
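A rough way to see why serialised access suffers across data centres is the simplified model below. It is not a measurement: it assumes each synchronous query pays one full round trip, that a pipelined batch pays roughly one round trip in total, and that server execution time per query (0.5 ms here) is an invented figure.

```python
def synchronous_ms(queries: int, rtt_ms: float, exec_ms: float) -> float:
    """Each query waits for its own round trip before the next is sent."""
    return queries * (rtt_ms + exec_ms)


def pipelined_ms(queries: int, rtt_ms: float, exec_ms: float) -> float:
    """All queries are sent back to back, so only one round trip is paid."""
    return rtt_ms + queries * exec_ms


# 100 queries, using the ~15 ms cross-DC latency from the slide.
cross_dc_sync = synchronous_ms(100, 15.0, 0.5)
cross_dc_pipelined = pipelined_ms(100, 15.0, 0.5)
```

In this model the synchronous cost grows with queries times latency, while the pipelined cost is dominated by execution time once latency exceeds it.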
  • 31. Pipelining Results of some SQL benchmarking in Perl [1], 100 primary key SELECTs (same DC / cross DC / latency effect): ● Perl DBI: 34ms / 1248ms / 36x ● MySQL X pipelined: 44ms / 59ms / 1.34x ● MySQL X non-pipelined: 89ms / 982ms / 11x Conclusion ● Same DC: DBI still faster ● Cross DC: pipelining much faster ● Change application logic to remove serialisation [1] Scott Lanning: https://github.com/slanning/perl-mysql-xprotocol
  • 32. Pipelining Example: Orchestrator ● Currently uses the “legacy” driver: go-sql-driver/mysql ● Driver by default sends prepared statements (2x slower) ● We have had to disable prepared statements for performance reasons ● With the MySQL X protocol, pipelining would allow the client to send the prepared statement and execute it together by default, which is simpler
  • 33. Pipelining ● Pipelining will work quite well on higher latency links ● Depends on query execution time vs network latency time ● X protocol is quite noisy (many messages): could be optimised further ● No current support (yet?) for asynchronous queries
  • 34. Why we need a protocol specification
  • 35. Why we need a protocol specification First: Oracle have made a very solid first implementation ● Server side X plugin ● Client libraries ● New shell ● Documentation ● Supports both SQL and noSQL access ● Intended to be production quality on release
  • 36. Why we need a protocol specification The MySQL ecosystem is very large ● Everyone using the classic or legacy protocol ● Moving to a new protocol will only work if it is worthwhile and if players see the benefit ● The benefit can only be gained if everyone jumps on board
  • 37. Why we need a protocol specification ● Today we have complex use cases: ● Sharding ● “external” connectivity (Hadoop, Vitess, Kafka, …) ● “proxy” connectivity (MySQL router, MaxScale, ProxySQL) ● may not be in languages supported by Oracle ● Current documentation while improving is still incomplete ● Migration to the X protocol needs to be easy ● Other MySQL-like vendors must come on board
  • 38. Why we need a protocol specification ● Reading the source code of the current X plugin or client libraries does not count as documentation, as this is a moving target ● The docs are only available online or in EPUB, as my request for a PDF failed (bug#81128)
  • 39. Why we need a protocol specification ● Easy to download document showing full specification ● Driver writers only have one place to look ● Examples and test cases included ● Should avoid the need to look at source code ● Ensures that enhancements will not break backwards compatibility ● More likely to get buy-in from the community ● Helps avoid fragmentation
  • 40. Why we need a protocol specification I have tried to start writing one myself ● Very much work in progress ● RFC style ● I would appreciate support from others who might be interested in helping ● See: https://github.com/sjmudd/xprotocol-notes
  • 41. X thoughts - things I noticed
  • 42. X plugin variable names mysqlx_max_connections vs max_connections ● I might prefer to limit all connections globally ● mysqlx_max_connections = 0 disable MySQL X connections ? ● mysqlx_max_connections = -1 use max_connections to limit connections ?
  • 43. X plugin variable names mysqlx_min_worker_threads vs thread_cache_size ● Inconsistent naming: maybe better mysqlx_thread_cache_size ?
  • 44. X plugin variable names mysqlx_ssl_* vs ssl_* ● mysqlx_ssl_* settings need to go away. ● See bug#81528
  • 45. X plugin Need more information for monitoring ● Counters for normal and error conditions ● Session and global metrics ● Need timing metrics ● Probably work in progress
  • 46. X protocol Initial State ● Character set being used: client and server ● Minimum/default mysqlx_max_allowed_packet Error Handling ● definition of behaviour under different error conditions? Missing ● Optional idle heartbeat (in both directions), timers ● Checksums: needed? binlog events?
  • 47. X protocol Character sets ● Expectation of client/server character sets is unclear ● Expectation of data transferred over the wire? Utf8? Which one? ● What about storage and retrieval with character set based columns? ● Where is conversion done when there is a difference?
  • 48. X protocol Character sets ● MySQL 5.7 by default uses utf8 (3-bytes) ● MySQL 8.0 to use utf8mb4 (4-bytes) by default? ● Many people configure things other than the default ● Column data can use different character sets ● Needs to fit together in an unambiguous way
  • 49. X protocol Initial session setup lengthy ● If you want to check and set things mentioned before and go to TLS then the session setup is lengthy ● Not good for “fast” single query connection types ● Pipelining can help but does not completely solve the problem ● Capabilities exchange “during” authentication?
  • 50. X protocol Capabilities ● Values can be any type, (avoid nested types) ● No way to see which capabilities are readable or writeable ● No specification of the values that can be applied ● Limit the capabilities which are exposed prior to authentication ● Remove version specific values (e.g. plugin.version)
  • 51. X protocol ColumnMetaData received in each query response ● the X protocol sends one message per column rather than a single message including all column meta data ● This generates a network overhead of 5 bytes per column (length plus message type) and adds to code complexity as each message is processed separately ● Inefficient for single row responses
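The header overhead mentioned above is simple arithmetic. The sketch below assumes the 5 bytes are a 4-byte length plus a 1-byte message type, and compares the current per-column framing with a hypothetical single message carrying all column metadata (not something the protocol offers today).

```python
FRAME_HEADER_BYTES = 4 + 1  # assumed: 4-byte length + 1-byte message type


def per_column_header_bytes(columns: int) -> int:
    """Header bytes when every column gets its own ColumnMetaData message."""
    return columns * FRAME_HEADER_BYTES


def combined_header_bytes(columns: int) -> int:
    """Header bytes for a hypothetical single message describing all columns."""
    return FRAME_HEADER_BYTES if columns else 0


def extra_header_bytes(columns: int) -> int:
    return per_column_header_bytes(columns) - combined_header_bytes(columns)


# A 10-column result set, repeated for every query response.
extra = extra_header_bytes(10)
```

The absolute numbers are small, but for single-row responses the metadata headers are paid again on every query.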
  • 52. X protocol Notice messages too overloaded ● Unsolicited messages from server to client ● Responses of change ● Warnings in response to queries ● Session variable changes ● Binlog events … ● In theory can be ignored (according to documentation) ● Sometimes unneeded (so overhead)
  • 53. X protocol Idle behaviour issues: ● “Server gone away” errors ● “Server still executing query” of a disconnected client Solution ● Optional heartbeat server to client and/or client to server ● Used only if no activity in the direction concerned ● TCP keepalive might not see a stuck mysqld
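The "only if no activity in the direction concerned" rule could look like the sketch below. The heartbeat is a proposal on this slide, not an existing X protocol feature, so the class, its names and the 30-second interval are all hypothetical. One timer instance would be kept per direction.

```python
class IdleTimer:
    """Tracks activity in one direction and says when a heartbeat would be due."""

    def __init__(self, interval_s: float):
        self.interval_s = interval_s
        self.last_activity = 0.0

    def record_activity(self, now: float) -> None:
        # Any real message in this direction resets the timer.
        self.last_activity = now

    def heartbeat_due(self, now: float) -> bool:
        # A heartbeat is only needed after a full idle interval.
        return now - self.last_activity >= self.interval_s


# Hypothetical client-to-server timer with a 30 second idle interval.
to_server = IdleTimer(30.0)
to_server.record_activity(100.0)
```

Because ordinary traffic resets the timer, a busy connection never sends heartbeats and pays no extra overhead.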
  • 54. X protocol No unique message ids ● Client and server message ids overlap ● “Sniffers” need to know the direction of the message to be able to decode.
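The consequence for a sniffer or proxy is that the message type alone is ambiguous, so decoding needs one lookup table per direction. The ids below are a small subset as I recall them from the mysqlx.proto file; treat them as illustrative and verify them against the protobuf definitions before relying on them.

```python
# Assumed ids, to be verified against mysqlx.proto.
CLIENT_TYPES = {
    1: "CON_CAPABILITIES_GET",
    4: "SESS_AUTHENTICATE_START",
    12: "SQL_STMT_EXECUTE",
}
SERVER_TYPES = {
    1: "ERROR",
    4: "SESS_AUTHENTICATE_OK",
    12: "RESULTSET_COLUMN_META_DATA",
}


def message_name(msg_type: int, from_client: bool) -> str:
    """Resolve a message type id; the same id means different things per direction."""
    table = CLIENT_TYPES if from_client else SERVER_TYPES
    return table.get(msg_type, f"UNKNOWN({msg_type})")


sent = message_name(12, from_client=True)
received = message_name(12, from_client=False)
```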
  • 55. X protocol Performance given as focus: see WL#8639 ● Would be good to see comparisons from the MySQL team ● In some use cases the X protocol can be faster (high latency between client and server with a high message rate)
  • 56. X protocol Extensibility ● Likely to undergo rapid change ● Binlog routing? (docs imply this) ● Sharding (comments from this week’s keynote) ● Do not forget backward compatibility ● Will help early adopters, driver writers etc ● Proper specifications help
  • 58. Conclusion ● MySQL X protocol looks good and stable ● If you use the supported languages you will be fine ● A formal specification will ease adoption by third parties, clarify current behaviour and ensure compatibility as the protocol evolves ● Simple drivers to hook into existing SQL infrastructure should be easier to write, but if you want to use the new features such as pipelining more specialised drivers will be needed
  • 59. References Oracle • http://dev.mysql.com/doc/internals/en/x-protocol.html My work: ● https://github.com/sjmudd/go-mysqlx-driver ● https://github.com/sjmudd/mysql-x-protocol-specification