How THINQ runs both transactions and analytics at scale

MySQL Triangle Meetup | 2019-01-24
MariaDB OpenWorks Conference | 2019-02-26

Open Source Enthusiast
• I am the most senior DBA at thinQ, fortunate to be among the early
MySQL users
• One of my use cases was listed in An Open Letter to the Community from
MySQL Founders David Axmark & Michael "Monty" Widenius on 2 August
2005
• This year, we are celebrating ten years of MySQL…
• Major free software projects and hugely-popular Web sites such as the Sahana
(disaster recovery system for the tsunami), Ensembl.org and Human Genome Project
(used for cancer research), Wikipedia, Bugzilla, Craigslist, Feedster, Flickr, Freshmeat,
LiveJournal, Neopets, Slashdot, SugarCRM, Technorati, Wordpress, CERN's
ATLAS Experiment -- all taking advantage of MySQL's speed, ease of use, flexibility,
scalability and ecosystem

Outline
• thinQ provides a cloud-based Communications-Platform-as-a-Service
(CPaaS) that routes tens of millions of phone calls per day for
customers in enterprise and telecommunications
• I will present how thinQ combined MariaDB ColumnStore/InfiniDB
and MariaDB Galera Cluster to support both high-performance
transaction processing and scalable analytics
• In addition, I will share some of our best practices and lessons learned
from supporting an ever-increasing database workload
• a consequence of continued growth

What is VoIP and LCR?
• Since 2010, thinQ has provided Least Cost Routing for VoIP phone calls
• VoIP stands for Voice over Internet Protocol
• An example: Skype
• Uses a proprietary protocol to set up calls
• thinQ and most phone carriers use the standard Session Initiation Protocol (SIP)
• thinQ and many others use the open-source application OpenSIPS to set up calls
• LCR stands for Least Cost Routing
• While setting up a VoIP phone call, choose the carrier providing the least expensive rate
• Providing Least Cost Routing is not trivial – see next slide

Least Cost Routing: Solutions by Others
[Screenshot: "About CGRateS (2)": actively maintained; stats provided by openhub.net]
Least Cost Routing: the science of finding the most
cost-effective path to connect your customers’ calls

Least Cost Routing: Solutions by thinQ
• Industry's only toll-free LCR engine
• 170+ countries
• 60+ carriers

thinQ data types, retention periods and volumes
Data type      Retention period   Volume
SIP Messages   days               Terabytes
CDRs           weeks or months    Terabytes
Aggregates     permanent          Gigabytes

Setting up Phone Calls
• thinQ processes dozens of SIP messages (text format, like HTTP/1.1) to set up one call
• For debugging (SIP tracing), all SIP messages are captured in the MariaDB Server database
• As a consequence of our continued growth, this results in an ever-increasing database workload: capturing more than 75,000 SIP messages per second

Is there a way to scale up SIP messages inserts in RDBMS rather than inserting rows one by one?

Scaling up Insert Rate: Multi-Row Inserts
• Instead of flushing inserted rows one by one into the database, OpenSIPS will now store
rows in memory, and only flush them to DB when a certain number of rows have piled up
in memory
• The flushing of the rows will be done within a single SQL command
1. On a constant rate of 1500 CPS pushed through the proxy, the load on the mysql daemon
was :
• OpenSIPS 1.6.4 : 71.5 %
• OpenSIPS 1.7.0 : 35.1 %
-> BOOST = ~50 % lower load on mysql
http://www.opensips.org/About/PerformanceTests-InsertBuffering
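The buffering idea described on this slide can be sketched in Python. This is an illustration only, not the OpenSIPS implementation: the class name InsertBuffer, the table sip_trace and its columns are hypothetical, and the "?" placeholder style is an assumption (MySQL/MariaDB client libraries often use "%s" instead).

```python
from typing import List, Optional, Sequence, Tuple


class InsertBuffer:
    """Collects rows in memory and emits one multi-row INSERT per batch."""

    def __init__(self, table: str, columns: Sequence[str], flush_rows: int):
        self.table = table
        self.columns = list(columns)
        self.flush_rows = flush_rows  # batch size that triggers a flush
        self.rows: List[Tuple] = []

    def add(self, row: Tuple) -> Optional[Tuple[str, list]]:
        """Buffer a row; return (sql, params) once the batch is full."""
        if len(row) != len(self.columns):
            raise ValueError("row does not match the column list")
        self.rows.append(tuple(row))
        if len(self.rows) >= self.flush_rows:
            return self.flush()
        return None

    def flush(self) -> Optional[Tuple[str, list]]:
        """Build a single INSERT covering all buffered rows, then clear."""
        if not self.rows:
            return None
        one_row = "(" + ", ".join(["?"] * len(self.columns)) + ")"
        sql = "INSERT INTO {} ({}) VALUES {}".format(
            self.table,
            ", ".join(self.columns),
            ", ".join([one_row] * len(self.rows)),
        )
        params = [value for row in self.rows for value in row]
        self.rows = []
        return sql, params


# Hypothetical table and columns, used for illustration only.
buf = InsertBuffer("sip_trace", ["callid", "msg"], flush_rows=3)
first = buf.add(("a1", "INVITE"))
second = buf.add(("a1", "100 Trying"))
batch = buf.add(("a1", "200 OK"))  # third row fills the batch
```

The first two calls return nothing; the third returns one statement with three value groups, so the database sees one command instead of three.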

The transactions per second rate has been lowered by inserting multiple rows at once
Does the data rate (Mbps) remain the same?
How to scale up the I/O rate further?

Scaling up I/O Rate
• As a consequence of our growth, we now
• use a my.cnf configuration optimized for InnoDB disk I/O
• use faster disks
• split (“shard”) the SIP message capture across several MariaDB Servers
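One possible way to split the capture is sketched below. The slides do not say how the shards are keyed, so hashing the SIP Call-ID is an assumption, as are the function name shard_for and the host names.

```python
import zlib


def shard_for(callid: str, shard_count: int) -> int:
    """Map a SIP Call-ID to a shard index in the range [0, shard_count)."""
    # crc32 is stable across processes and restarts, unlike Python's hash().
    return zlib.crc32(callid.encode("utf-8")) % shard_count


# Hypothetical host names, for illustration only.
servers = ["db-sip-0", "db-sip-1", "db-sip-2"]
target = servers[shard_for("a84b4c76e66710@pc33.example.com", len(servers))]
```

Keying on the Call-ID keeps all messages of one call on the same server, which would keep a SIP trace of a single call a single-server query.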

Scaling up Insert Rate: Multi-Row Inserts
2. In terms of raw CPS, the maximum number of calls per second that the proxy could handle was :
• OpenSIPS 1.6.4 : 1700 CPS
• OpenSIPS 1.7.0 : 3000 CPS
-> BOOST = ~75 % higher CPS
http://www.opensips.org/About/PerformanceTests-InsertBuffering

Decoupling Database and Application
• The increase in OpenSIPS throughput from using multi-row inserts
indicates a tight coupling between the database and the application
• Decoupling the application from the database requires
• a non-blocking MariaDB client library
• application re-architecture
• The latest version of OpenSIPS has been re-architected to decouple
the application from the database
• This presentation shares our experience in the case of tight coupling
between the database and the application

High Availability
• A critical application tightly coupled to a database requires High Availability of the database server
• For High Availability we use MariaDB Galera Cluster
Galera Cluster
➢ Synchronous multi-master cluster
➢ no data loss
➢ no slave lag
➢ no slave failover
➢ For MySQL/InnoDB
➢ 3 or more nodes needed for HA
➢ No single point of failure

Call Detail Records
• After the phone call is complete, telecommunications providers
must store a Call Detail Record (CDR)
• In our case, upon the phone call completion the OpenSIPS application
generates a corresponding record for CDR accounting
• A database query for the predefined accounting table:
• INSERT INTO acc …

CDR Processing
• Records from the acc table are processed to generate CDRs and aggregate data further
• Traditional solution:
• Queue records for processing
• Works well on a small scale
Flavio E. Goncalves: Building Telephony Systems with OpenSIPS 1.6

Do you expect a problem with this approach when the CDR processing rate increases
as a consequence of our growth?

Scaling Up RDBMS Queues
• https://www.engineyard.com/blog/5-subtle-ways-youre-using-mysql-as-a-queue-and-why-itll-bite-you
• https://www.eschrade.com/page/why-mysql-is-not-a-queue

Is there a better way to process records that scales better than a queue implemented in RDBMS?

Triggers!
• Scales much better
• For processing and aggregation, we use a trigger on the acc table
• Upon each acc row INSERT, the trigger increments the corresponding rows in the billing table using ON DUPLICATE KEY UPDATE with a unique index on the columns used for aggregation, such as customer_id, carrier_id
https://www.xaprb.com/blog/2007/01/11/how-to-implement-a-queue-in-sql
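The counting behaviour of such a trigger can be mimicked in memory with a dictionary keyed by the aggregation columns. This is only an analogy for the ON DUPLICATE KEY UPDATE semantics, not database code; the function name apply_acc_row and the sample rows are hypothetical, and the rule that only SIP code 200 counts as a completed call is taken as an assumption.

```python
from collections import defaultdict
from typing import Dict, Tuple

Key = Tuple[str, int, int]  # (date, customer_id, carrier_id)


def apply_acc_row(aggregates: Dict[Key, int], row: dict) -> None:
    """Mimic the trigger: add 1 to 'calls' for a completed call, else 0."""
    key = (row["date"], row["customer_id"], row["carrier_id"])
    # A missing key starts at 0 (the INSERT case); an existing key is
    # incremented (the ON DUPLICATE KEY UPDATE case).
    aggregates[key] += 1 if row["sip_code"] == 200 else 0


aggregates: Dict[Key, int] = defaultdict(int)
for acc_row in [
    {"date": "2019-01-24", "customer_id": 1, "carrier_id": 7, "sip_code": 200},
    {"date": "2019-01-24", "customer_id": 1, "carrier_id": 7, "sip_code": 486},
    {"date": "2019-01-24", "customer_id": 1, "carrier_id": 7, "sip_code": 200},
    {"date": "2019-01-24", "customer_id": 2, "carrier_id": 7, "sip_code": 200},
]:
    apply_acc_row(aggregates, acc_row)
```

Every row for the same (date, customer_id, carrier_id) key touches the same aggregate entry, which is why the unique index on those columns is central to this approach.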

Next Scalability Limit
We use:
• OpenSIPS multi-row inserts to scale up inserts from numerous calls
• Triggers to scale up CDR processing
• MariaDB Galera cluster for High Availability (HA)
• Trigger support is straightforward
• Not so trivial to support EVENTs
• As a consequence of growth, these solutions hit another scalability limit:
• We encountered an increasing rate of deadlocks

Increasing Rate of Deadlocks Like:
• Retry logic is of little or no help here as the deadlock may happen again
• Consolidating writes to a single Galera Cluster node is of no help either

Let us review our aggregation approach
CREATE TRIGGER aggregate BEFORE INSERT ON acc FOR EACH ROW BEGIN
  INSERT INTO billing.aggregates (date, customer_id, carrier_id, calls, ...) VALUES
    (NEW.date, NEW.customer_id, NEW.carrier_id,
     IF(NEW.sip_code=200,1,0), # count completed calls
     ...)
  ON DUPLICATE KEY UPDATE
    calls = calls + VALUES(calls),
    ...;
END
CREATE TABLE `acc` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `date` DATE,
  `customer_id` INT,
  `carrier_id` INT,
  `sip_code` INT,
  ...
  PRIMARY KEY (`id`)
);
CREATE TABLE `billing`.`aggregates` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `date` DATE,
  `customer_id` INT,
  `carrier_id` INT,
  `calls` INT,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uk_customer_carrier` (`date`,`customer_id`,`carrier_id`)
);

Is it clear why these deadlocks started to happen as a consequence of our growth?

What is causing deadlocks?
• Concurrent multi-row INSERT queries may cause a deadlock when two transactions touch the same sets of data in a different order
• Such as locking index rows in the opposite order
[Diagram: transactions (1) and (2) and the UNIQUE KEY index rows they lock, e.g. Call 1: customer-1, carrier-1]

Why did deadlocks present a problem?
• A deadlocked query on Galera Cluster is like a huge transaction
• All later INSERT queries are waiting for it to complete
• or to be rolled back
• An increasing rate of deadlocks creates backpressure for our OpenSIPS application
• Due to the tight coupling between the application and the database server
[Chart: Impact of Huge Transaction (Huge Transaction Slave Lag): trx in master 24 secs, trx in slave 9 secs]

Is there a way to avoid such deadlocks?

Locking in the same order
• When modifying different sets of rows in the same table, do those operations in a consistent order each time
[MySQL Reference Manual]
• Application developers can eliminate all risk of enqueue deadlocks by ensuring that transactions requiring multiple resources always lock them in the same order
[Steve Adams]
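The rule quoted above can be checked with a small Python sketch. It models each transaction as the sequence of index rows it locks and reports whether two transactions lock some pair of rows in opposite order, which is the precondition for this kind of deadlock. The function names and the sample keys are hypothetical, and the model (locks held until the transaction ends) is a simplification.

```python
from itertools import combinations
from typing import Dict, Hashable, Sequence


def first_positions(locks: Sequence[Hashable]) -> Dict[Hashable, int]:
    """Position at which each row is locked for the first time."""
    positions: Dict[Hashable, int] = {}
    for index, key in enumerate(locks):
        positions.setdefault(key, index)
    return positions


def locks_in_opposite_order(tx1: Sequence[Hashable],
                            tx2: Sequence[Hashable]) -> bool:
    """True if some pair of shared rows is locked in opposite order."""
    pos1, pos2 = first_positions(tx1), first_positions(tx2)
    shared = [key for key in pos1 if key in pos2]
    return any(
        (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
        for a, b in combinations(shared, 2)
    )


# Two multi-row inserts touching the same (customer_id, carrier_id) rows.
batch1 = [(1, 1), (2, 5)]
batch2 = [(2, 5), (1, 1)]
unordered_conflict = locks_in_opposite_order(batch1, batch2)
ordered_conflict = locks_in_opposite_order(sorted(batch1), sorted(batch2))
```

When both batches are sorted by the same key, every shared pair is locked in the same order, so the opposite-order condition cannot occur.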

We need ordered sets of rows
• Since the OpenSIPS application is open source, we can modify the INSERT statement generated by the OpenSIPS application
[Diagram: transactions (1) and (2) and the UNIQUE KEY index rows]

We would like to delegate row ordering to MariaDB, for example
1. Insert unordered data into a temporary table tmp
2. Insert into acc
select from tmp
order by customer_id, carrier_id
Is there a better way?

Avoiding Deadlocks
• We order INSERT rows according to the index columns on-the-fly
• by modifying the prepared statement generated by the OpenSIPS application for the acc table INSERT query
• With UNION ALL, the modified prepared statement creates a set, which is then ordered by the unique index columns
• A use case for the MariaDB UNION ALL optimization
OpenSIPS prepared statement (rows 1 to 100):
insert into acc
(method, customer_id, carrier_id, callid, sip_code, sip_reason, time, duration, setuptime, created)
values
(?,?,?,?,?,?,?,?,?,?),
(?,?,?,?,?,?,?,?,?,?),
…
(?,?,?,?,?,?,?,?,?,?)
Modified prepared statement (derived table: select without from; rows 1 to 100):
insert into acc
(method, customer_id, carrier_id, callid, sip_code, sip_reason, time, duration, setuptime, created)
(select ? AS method, ? AS customer_id, ? AS carrier_id, ? AS callid, ? AS sip_code, ? AS sip_reason, ? AS time, ? AS duration, ? AS setuptime, ? AS created)
union all
(select ?,?,?,?,?,?,?,?,?,?)
union all
…
(select ?,?,?,?,?,?,?,?,?,?)
order by customer_id, carrier_id
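A statement of this shape can be generated programmatically. The Python sketch below builds it for a hypothetical three-column subset of acc and runs it against an in-memory SQLite database as a stand-in for MariaDB. Two assumptions are worth flagging: the parentheses around each select are omitted because SQLite does not accept them, and the helper name build_ordered_insert is invented for illustration.

```python
import sqlite3
from typing import List, Sequence, Tuple

COLUMNS = ["customer_id", "carrier_id", "callid"]  # hypothetical subset of acc
ORDER_BY = ["customer_id", "carrier_id"]           # unique-index columns


def build_ordered_insert(table: str, columns: Sequence[str],
                         order_by: Sequence[str], row_count: int) -> str:
    """Build INSERT ... SELECT ? AS col ... UNION ALL ... ORDER BY ..."""
    # Only the first SELECT names its columns; the others inherit the names.
    first = "SELECT " + ", ".join("? AS " + c for c in columns)
    other = "SELECT " + ", ".join("?" for _ in columns)
    selects = [first] + [other] * (row_count - 1)
    return "INSERT INTO {} ({}) {} ORDER BY {}".format(
        table, ", ".join(columns), " UNION ALL ".join(selects),
        ", ".join(order_by))


# Rows arrive from the application in arbitrary order.
rows: List[Tuple] = [(2, 9, "c"), (1, 5, "a"), (2, 3, "d"), (1, 2, "b")]
sql = build_ordered_insert("acc", COLUMNS, ORDER_BY, len(rows))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE acc (id INTEGER PRIMARY KEY AUTOINCREMENT, "
             "customer_id INT, carrier_id INT, callid TEXT)")
conn.execute(sql, [value for row in rows for value in row])

# Reading back by id shows the order in which the rows were inserted.
inserted = conn.execute(
    "SELECT customer_id, carrier_id, callid FROM acc ORDER BY id").fetchall()
```

The application keeps binding parameters in arrival order; the ordering is delegated to the database through the trailing ORDER BY.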

Three use cases for ColumnStore/InfiniDB
1. Conventional use case: analytics on the Call Detail Records
• The CDRs are wide rows with 50-100 columns, while most analytics queries need just a few of those columns
• Works best for approximately incremental data, such as time series

2. Data Aggregation with ColumnStore
How many calls happened during the last hour?
• A considerable problem here is that this number may change, as Call Detail Records for the past hour may arrive with a delay
• Under those conditions, with InfiniDB/ColumnStore we are able to repeat simple SQL queries for data aggregation
• These SQL queries work well, since the latest data (for the few selected columns) fits in the memory of the ColumnStore/InfiniDB cluster nodes
• In a traditional row-based MariaDB Server this simple approach would require prohibitively more memory to fit whole InnoDB rows
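The repeat-the-query approach can be illustrated with a short Python sketch. SQLite serves only as a stand-in here, so the sketch shows the logic of recomputing the aggregate from raw rows, not the columnar memory behaviour of ColumnStore. The table cdr, its columns and the function name calls_per_hour are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical narrow slice of a CDR table: only the columns the query reads.
conn.execute("CREATE TABLE cdr (call_end TEXT, customer_id INT)")

HOURLY_CALLS = (
    "SELECT substr(call_end, 1, 13) AS hour, COUNT(*) AS calls "
    "FROM cdr WHERE call_end >= ? GROUP BY hour ORDER BY hour"
)


def calls_per_hour(since: str):
    """Recompute the aggregate from the raw rows on every invocation."""
    return conn.execute(HOURLY_CALLS, (since,)).fetchall()


conn.executemany("INSERT INTO cdr VALUES (?, ?)", [
    ("2019-01-24 10:05:00", 1),
    ("2019-01-24 10:40:00", 2),
    ("2019-01-24 11:02:00", 1),
])
before = calls_per_hour("2019-01-24 10:00:00")

# A CDR for the 10:00 hour arrives late; repeating the query picks it up.
conn.execute("INSERT INTO cdr VALUES (?, ?)", ("2019-01-24 10:59:30", 3))
after = calls_per_hour("2019-01-24 10:00:00")
```

No incremental bookkeeping is needed for late records, at the price of rescanning the recent data on each run.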

3. Redundancy for Critical Data
• As statistical findings tolerate limited data losses, ACID compliance
of the MariaDB ColumnStore (the InfiniDB legacy) is often overlooked
• In contrast, the third use case requires both products
Telecommunications (IP telephony – SaaS): hybrid database for transactions and analytics
• Transactional: capture call detail records, charge by call/message, generate bills
• Analytical: monitor usage, identify peak periods, estimate costs, self-service analytics
Slide by Shane K Johnson, MariaDB

Double-Entry Accounting
• Since the Middle Ages, accounting has used the double-entry system, where at least two accounting entries are required to record each financial transaction
• In telecommunications these financial transactions are a part of the Call Detail Record
• In 1494 Fra Luca Bartolomeo de Pacioli published a book on the double-entry system of accounting

Double-Entry Accounting for CDRs
• Thanks to the ACID compliance of MariaDB ColumnStore/InfiniDB
transactions, thinQ was able to implement in practice double-entry
accounting for Call Detail Records
• The ColumnStore/InfiniDB transactions enable us to verify/audit customers’
billing done through Call Detail Records stored in MariaDB Server
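An audit of this kind amounts to comparing totals computed independently in the two systems. The Python sketch below shows one way such a comparison could look; the slides do not describe the actual audit procedure, so the function name reconcile, the billing key and the sample totals are all hypothetical.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, int]  # (date, customer_id), a hypothetical billing key


def reconcile(server_totals: Dict[Key, int],
              columnstore_totals: Dict[Key, int]) -> List[Tuple[Key, int, int]]:
    """Return (key, server_value, columnstore_value) for every mismatch."""
    mismatches = []
    for key in sorted(set(server_totals) | set(columnstore_totals)):
        server_value = server_totals.get(key, 0)
        columnstore_value = columnstore_totals.get(key, 0)
        if server_value != columnstore_value:
            mismatches.append((key, server_value, columnstore_value))
    return mismatches


# Illustrative totals, as if produced by the two independent pipelines.
server = {("2019-01-24", 1): 120, ("2019-01-24", 2): 75}
columnstore = {("2019-01-24", 1): 120, ("2019-01-24", 2): 74}
diff = reconcile(server, columnstore)
```

An empty result means both pipelines agree; any entry points to the customer and date that need a closer look.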

Traditional InfiniDB/CS Data Processing Pipeline
[Diagram: "#6 Load New Data with Minimal Impact" (Data Warehouse and Metadata Management). Operational source data (OLTP, files/XML, log files) -> staging or ODS/ETL (staging area) -> high-speed load utility -> data warehouse (InfiniDB) -> users (ad-hoc, dashboards, reports, notifications). Annotations: HA staging area for raw data (e.g. data streams); HA staging area for processed data local to InfiniDB (e.g. costly HA storage).]

Heterogeneous Redundancy Assures Scaling
• Our experience with using both MariaDB Server and MariaDB
ColumnStore/InfiniDB proved crucial in solving business challenges
• as two distinct data processing pipelines built with two different technologies
encountered scalability limits at different loads
• Similar to both mechanical and hydraulic braking in a car
• Combining mechanical drum brakes with hydraulic brakes to offer backup
braking support in case the car’s hydraulic system fails

• To assure resilience, thinQ infrastructure is geo-redundant
• Our call data is collected in geo-distributed data centers, which complicates analytics
• Note the extra cost of large data transfers between data centers
• According to the InfiniDB Concepts Guide, User and Performance Modules can be
separated out in different data centers and geographic locations
• We are pleased that MariaDB ColumnStore retained this feature
• The beauty of this is that the analytical data aggregation queries are executed locally, and
only small aggregated data is transferred between data centers

Tips for Geo-Distributed MariaDB ColumnStore
• Three combined UM/PM nodes: PM3 in a different geo-location than PM1 & PM2
• Watch for idle TCP/IP connections dropped by data center firewalls
• Implement a keep-alive ping such as periodic execution of a test data aggregation query
• Watch for automatic round-robin distribution of queries:
• First query execution is fast (0.3s) – logged on PM1 node
• Second execution is fast (0.3s) – logged on PM2 node
• Third execution is slow (18s) – logged on PM3 node in a data center separated by 25 ms RTT from PM1 & PM2
• David Thompson (MariaDB VP) kindly provided a workaround for round-robin query distribution:
• Change the Columnstore.xml ExeMgr IP addresses to 127.0.0.1 on all three nodes
• With these, the geo-distributed MariaDB ColumnStore system operates stably
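The keep-alive tip can be sketched as a small loop that periodically runs a cheap aggregation query. In the Python sketch below, SQLite is only a stand-in for a ColumnStore connection, and the function name keep_alive, the table ping_test and the parameters are hypothetical.

```python
import sqlite3
import time
from typing import Callable


def keep_alive(run_query: Callable[[], object], interval_s: float,
               pings: int, sleep: Callable[[float], None] = time.sleep) -> int:
    """Run a cheap test query `pings` times, pausing between executions.

    Returns the number of successful pings; a failed ping is skipped so
    that one dropped connection does not stop the loop.
    """
    succeeded = 0
    for attempt in range(pings):
        try:
            run_query()
            succeeded += 1
        except Exception:
            pass  # a real job would reconnect and log the failure here
        if attempt < pings - 1:
            sleep(interval_s)
    return succeeded


conn = sqlite3.connect(":memory:")  # stand-in for a ColumnStore connection
conn.execute("CREATE TABLE ping_test (x INT)")  # hypothetical test table
ok_count = keep_alive(
    lambda: conn.execute("SELECT COUNT(*) FROM ping_test").fetchone(),
    interval_s=0.0, pings=3)
```

In practice the interval would be chosen shorter than the idle timeout of the firewall, which has to be found out per data center.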

We look forward to the new remote mcsimport capabilities of the
ColumnStore Bulk Write SDK

New ColumnStore Data Processing Pipeline
• Improvements in the data processing pipeline provided by the remote mcsimport
[Diagram: "#6 Load New Data with Minimal Impact". Operational source data (OLTP, files/XML, log files) -> staging or ODS/ETL (staging area) -> remote HA bulk data loaders -> data warehouse (MariaDB ColumnStore) -> users (ad-hoc, dashboards, reports, notifications). Annotation: HA staging area for raw data (e.g. data streams).]

vs. Traditional InfiniDB/CS Data Processing Pipeline
[Diagram: the traditional pipeline, with an HA staging area for raw data (e.g. data streams) and an additional HA staging area for processed data local to InfiniDB (e.g. costly HA storage).]

Online Schema Change while Streaming
• Tight coupling of streaming applications to CS schema complicates schema changes
• Redundant applications for data streaming enable schema changes without data loss
• Prepare new application (e.g. mcsimport job.xml) for the new schema
• e.g. describe new columns as <DefaultColumn colName="col7"/>
• Stop the old application (mcsimport cron job)
• Provide enough buffer (staging area) for the data
• Use ColumnStore function (it takes time)
select calonlinealter('alter table foo add column col7 int;');
alter table foo add column col7 int comment 'schema sync only';
• Start the new application (mcsimport cron job)

High Availability with ColumnStore Bulk Write SDK
• By their nature, data streaming applications run continuously
• Redundant applications could increase data streaming uptime, since if one
application fails, a second application would still be running
• How do you implement HA/failover between data streaming applications
using bulk write SDK remotely?
• MariaDB developers provide functions to view and clear table locks remotely
• In the case of MariaDB Server, a transaction is rolled back upon client failure
• Perhaps the MariaDB Platform X3 may implement a similar behavior for ColumnStore

More Open Source Benefits
• Some ColumnStore features are documented as open source code
• A failed cpimport may result in locks that can not be cleared with
cleartablelock. Andrew Hutchings (MariaDB CS Lead) pointed to one
useful option documented in such a way:
• If your ColumnStore installation is running fine now these locks can be
removed using a hidden cleartablelock option, '-l'. For example:
/usr/local/mariadb/columnstore/bin/cleartablelock -l 1
• The downside is if the table really does exist and there was data to rollback, it cannot
be rolled back any more. This is why we don't really publish this option.
• I used this option for the table with reference data like customers
https://groups.google.com/d/msg/mariadb-columnstore/B0fDukIgUzM/FUGBiZR7AgAJ

MariaDB ColumnStore Summary
• Most analytics queries read only a few of the table columns
• Works best for approximately incremental data, such as time series
• Repeated aggregation works well since data could fit in memory
• Features often overlooked:
• ACID transactions enable heterogeneous redundancy for critical data
• Geo-distributed cluster works stably
• While streaming, you may change the schema without data loss
• With caution, you may use hidden cleartablelock option
• Latest bulk write SDK enables remote data upload

Conclusion
• Wise past technology choices (MariaDB/Galera and InfiniDB) provided thinQ with a consolidated roadmap for future upgrades
[Diagram: MariaDB TX 3.0 (MariaDB Server 10.3, MariaDB MaxScale 2.2, InnoDB/MyRocks) and MariaDB AX 2.0 (MariaDB Server 10.2, ColumnStore 1.2) alongside MariaDB Platform X3 (MariaDB Server 10.3 with InnoDB/MyRocks; MariaDB Server 10.3 with ColumnStore 1.3)]
