4. Today’s agenda
BlaBlaCar - Facts & Figures
Infrastructure Ecosystem - 100% containers powered carpooling
Backend High Availability Pillars - MariaDB as an example
Database as a Service - Building a frictionless infrastructure
What’s next?
6. Facts and Figures
60 million members
Founded in 2006
1 million tonnes less CO2 in the past year
30 million mobile app downloads (iPhone and Android)
5 million monthly travellers
Currently in 22 countries: France, Spain, UK, Italy, Poland, Hungary, Croatia, Serbia, Romania,
Germany, Belgium, India, Mexico, The Netherlands, Luxembourg,
Portugal, Ukraine, Czech Republic, Slovakia, Russia, Brazil and Turkey.
9. Infrastructure Ecosystem
Hardware (host): bare-metal servers, 1 type of hardware, 3 disk profiles
CoreOS fleet cluster: fleet + etcd, a “distributed init system”
dgr (build): builds the service codebase
Container Registry (store)
ggn (run): creates rkt pods
rkt pod mysql-main1: mysqld, monitoring, nerve
rkt pod front1: php, nginx, monitoring, nerve, synapse
Service Discovery: zookeeper, with nerve and synapse
10. Service Discovery
Backend pod: go-nerve does health checks and reports to Zookeeper in service keys (for example /database/node1).
Client pod: go-synapse watches Zookeeper service keys and reloads HAProxy if changes are detected.
Applications hit their local HAProxy to access backends.
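The flow on this slide can be sketched in a few lines of Python. This is only an illustration, not go-nerve or go-synapse code: a plain dict stands in for Zookeeper, and the function names `report` and `render_backends`, the addresses, and the key layout are assumptions made for the example.

```python
# Illustrative only: a dict stands in for Zookeeper, and the function
# names are hypothetical, not part of go-nerve or go-synapse.

registry = {}  # service key -> node payload, e.g. "/database/node1"


def report(service_path, node, host, port, healthy):
    """Nerve side: publish the node key while healthy, remove it otherwise."""
    key = f"{service_path}/{node}"
    if healthy:
        registry[key] = {"host": host, "port": port}
    else:
        registry.pop(key, None)


def render_backends(service_path):
    """Synapse side: build the haproxy server lines for one service path."""
    prefix = service_path + "/"
    lines = []
    for key in sorted(registry):
        if key.startswith(prefix):
            node = key[len(prefix):]
            entry = registry[key]
            lines.append(f"server {node} {entry['host']}:{entry['port']}")
    return lines


report("/database", "node1", "192.168.1.1", 3306, healthy=True)
report("/database", "node2", "192.168.1.2", 3306, healthy=True)
before = render_backends("/database")

# node2 fails its health check, so its key disappears.
report("/database", "node2", "192.168.1.2", 3306, healthy=False)
after = render_backends("/database")

# A change in the rendered list is what would trigger an haproxy reload.
needs_reload = before != after
```

The point of the design is that applications never learn about the change: they keep talking to their local HAProxy, and only its server list moves.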
13. Asynchronous vs. Synchronous
Asynchronous: one master replicating to three slaves
Synchronous: MariaDB Cluster, every node running wsrep
MariaDB Cluster means:
No single point of failure
No replication lag
Automatic state transfers
As fast as the slowest node
14. MySQL at BlaBlaCar?
MariaDB Cluster (every node running wsrep)
Our prerequisites are containers
Writes go to one node
Reads are balanced across the others
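A minimal sketch of that routing rule, in Python. The slides do not say how the write node is chosen, so taking the first node of the list is an assumption, as are the function name `split_roles` and the round-robin policy for reads.

```python
from itertools import cycle

# Node names follow the naming used later in the deck; the selection
# rule itself is an assumption made for illustration.
nodes = ["mysql-main1", "mysql-main2", "mysql-main3"]


def split_roles(cluster_nodes):
    """Send writes to the first node and balance reads over the rest."""
    writer = cluster_nodes[0]
    readers = cluster_nodes[1:] or [writer]  # single-node fallback
    return writer, readers


writer, readers = split_roles(nodes)
next_reader = cycle(readers)  # simple round-robin over the read nodes

read_targets = [next(next_reader) for _ in range(4)]
```

Because every node in the cluster holds the same data, the role of a node is only a routing decision and can change without any replication reconfiguration.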
19. go-nerve: call the API on /enable or /weight/:weight
Zookeeper: stores the current weight
go-synapse: updates the weight on HAProxy via its socket
HAProxy: set weight <backend>/<server> <weight>
If enableCheckStableCommand is set: the command is run at each weight increase; if it returns a non-zero status, the current weight restarts from 1.
When the weight value is reached: the service is fully in production.
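The ramp-up rule can be sketched as follows. This is not go-nerve's implementation: the step size, the point at which the check runs, the `max_checks` guard, and the names `warm_up` and `haproxy_command` are assumptions. Only the restart-from-1 behaviour and the `set weight` command format come from the slide.

```python
def warm_up(target_weight, is_stable, step=1, max_checks=1000):
    """Raise the weight step by step; restart from 1 when a check fails.

    `is_stable` stands in for enableCheckStableCommand: it returns an exit
    code, where 0 means the node copes with its current share of traffic.
    Returns the list of weights that were applied, in order.
    """
    applied = []
    weight = 1
    for _ in range(max_checks):
        applied.append(weight)      # stored, then pushed to haproxy
        if weight >= target_weight:
            return applied          # fully in production
        if is_stable() != 0:
            weight = 1              # unstable: restart the ramp from 1
        else:
            weight = min(weight + step, target_weight)
    raise RuntimeError("node never stabilised")


def haproxy_command(backend, server, weight):
    """Format the haproxy socket command shown on the slide."""
    return f"set weight {backend}/{server} {weight}"


# One failing check (exit code 1) in the middle of the ramp.
exit_codes = iter([0, 0, 1, 0, 0, 0])
history = warm_up(target_weight=4, is_stable=lambda: next(exit_codes))
command = haproxy_command("mysql", "mysql-main1", history[-1])
```

In this run the weights applied are 1, 2, 3, then 1 again after the failed check, then 2, 3, 4.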
20. # cat /report_slow_queries.sh
#!/dgr/bin/busybox sh
. /dgr/bin/functions.sh
isLevelEnabled "debug" && set -x
slwq=$(/usr/bin/timeout 1 /usr/bin/mysql -h127.0.0.1 -ulocal_mon -plocal_mon information_schema -e "SELECT COUNT(1) FROM processlist WHERE user
LIKE '%rd' AND LOWER(command) <> 'sleep' AND time > 1" -BN)
if [ $? -eq 0 ] && [ $slwq -eq 0 ]; then
return 0
else
return 1
fi
MySQL’s warm up in nerve
# cat env/prod-dc1/services/mysql-main/attributes/nerve.yml
---
override:
  nerve:
    services:
      - name: "mysql-main"
        port: 3306
        reporters:
          - {type: zookeeper, path: /services/mysql/main}
        checks:
          - type: sql
            driver: mysql
            datasource: "local_mon:local_mon@tcp(127.0.0.1:3306)/"
        enableCheckStableCommand: ["/report_slow_queries.sh"]
23. Gracefully Disabling Pipeline
Call /disable on Nerve’s API: the weight is set to 0, so no new sessions go to the service.
If disableGracefullyDoneCommand is set: the command is run in a loop until it returns 0.
The /disable API call returns: the service can be shut down without risk.
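The disabling pipeline can be sketched the same way. Again this is an illustration under assumptions, not go-nerve code: the function name `disable_gracefully`, the polling interval, and the `max_polls` guard are all hypothetical.

```python
import time


def disable_gracefully(set_weight, done_command=None, poll_interval=0.0, max_polls=100):
    """Drain a service: weight 0 first, then wait for the done command."""
    set_weight(0)                    # no new sessions reach the service
    polls = 0
    if done_command is not None:     # disableGracefullyDoneCommand is set
        while True:
            polls += 1
            if done_command() == 0:  # exit code 0: nothing left to drain
                break
            if polls >= max_polls:
                raise TimeoutError("service still busy")
            time.sleep(poll_interval)
    return polls                     # at this point /disable can return


weights = []
remaining = iter([1, 1, 0])          # busy twice, then drained
polls = disable_gracefully(weights.append, lambda: next(remaining))
```

The caller of /disable blocks until the loop ends, which is what makes it safe to stop the service right after the call returns.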
24. # cat /report_remaining_processes.sh
#!/dgr/bin/busybox sh
. /dgr/bin/functions.sh
isLevelEnabled "debug" && set -x
procs=$(/usr/bin/timeout 1 /usr/bin/mysql -h127.0.0.1 -ulocal_mon -plocal_mon information_schema -e "SELECT COUNT(1) FROM processlist WHERE user
LIKE '%rd' OR user LIKE '%wr'" -BN)
if [ $? -eq 0 ] && [ $procs -eq 0 ]; then
return 0
else
return 1
fi
MySQL’s graceful shutdown in nerve
# cat env/prod-dc1/services/mysql-main/attributes/nerve.yml
---
override:
  nerve:
    services:
      - name: "mysql-main"
        port: 3306
        reporters:
          - {type: zookeeper, path: /services/mysql/main}
        checks:
          - type: sql
            driver: mysql
            datasource: "local_mon:local_mon@tcp(127.0.0.1:3306)/"
        enableCheckStableCommand: ["/report_slow_queries.sh"]
        disableGracefullyDoneCommand: ["/root/report_remaining_processes.sh"]
25. Be Quiet!
Come gently into prod
Abolish Slavery
Every node is the same
Die in Peace...
Get out when you are ready
Graceful restart
Service Discovery (nerve/synapse)
Weight system
Slow query tracking
Graceful restart
Service Discovery (nerve/synapse)
Weight system
No master/slave
Automatic state transfers
Service Discovery (nerve/synapse)
Backend High Availability Pillars
26. Database as a Service
Building a frictionless infrastructure
27. Easy deployment
Pull request on a services repository
No technical parameters to override
The services are auto-initialized
32. Prometheus with Nerve integration
$ cat pod-mysql/pod-manifest.yml
name: aci.blbl.cr/pod-mysql:10.1-33
pod:
  apps:
    - dependencies:
        - aci.blbl.cr/aci-mariadb:10.1-29
      app:
        mountPoints:
          - {name: mysql-data, path: /var/lib/mysql}
          - {name: mysql-log, path: /var/log/mysql}
    - name: aci-nerve
      dependencies:
        - aci.blbl.cr/aci-go-nerve:21-23
        - aci.blbl.cr/aci-mariadb:10.1-29
    - dependencies:
        - aci.blbl.cr/aci-prometheus-mysql-exporter:0.10.0-1
# cat env/prod-dc1/services/mysql-main/attributes/nerve.yml
---
override:
  nerve:
    services:
      - name: "{{.hostname}}"
        port: 3306
        reporters:
          - {type: zookeeper, path: /services/mysql/main}
        checks:
          - type: sql
            driver: mysql
            datasource: "local_mon:local_mon@tcp(127.0.0.1:3306)/"
      - name: "{{.hostname}}_prometheus"
        port: 9104
        reporters:
          - {type: zookeeper, path: /monitoring/mysql/main}
# curl mysql-main1.prod.dc1.com:9104/metrics | head
# HELP mysql_exporter_last_scrape_duration_seconds Duration of the last scrape of metrics from MySQL.
# TYPE mysql_exporter_last_scrape_duration_seconds gauge
mysql_exporter_last_scrape_duration_seconds 0.056807316
# HELP mysql_exporter_last_scrape_error Whether the last scrape of metrics from MySQL resulted in an error (1 for error, 0 for success).
# TYPE mysql_exporter_last_scrape_error gauge
mysql_exporter_last_scrape_error 0
[...]
# cat env/prod-dc1/services/prometheus/attributes/prometheus.yml
[...]
ranged_targets:
- type: zk
job_name: discovery_prod-dc1
scrape_interval: 20s
metrics_path: /metrics
zk:
hosts: '{{ toJson .zk.hosts }}'
zkpaths:
- /monitoring
[...]
33. Prometheus relabeling
# [zk: localhost:2181(CONNECTED) 1] get /monitoring/mysql/main/mysql-main1_prometheus_192.168.1.2_ba0f1f8b
{"available":true,"host":"192.168.1.2","port":9104,"name":"mysql-main1","weight":255,"labels":{"host":"r11-srv1"}}
We push services info with Nerve into Zookeeper
And Prometheus does the magic
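To make the "magic" a little more concrete, here is a rough Python sketch of how one Zookeeper service key could be turned into a scrape target with labels. The payload is the one shown on this slide; everything else is an assumption: the function name `to_target`, the path layout, and the way the `service` label is derived are hypothetical, and the actual relabeling configuration is not shown in the deck.

```python
import json

# Node payload copied from the Zookeeper example on this slide.
znode = (
    '{"available":true,"host":"192.168.1.2","port":9104,'
    '"name":"mysql-main1","weight":255,"labels":{"host":"r11-srv1"}}'
)


def to_target(zk_path, payload):
    """Turn one service key into a scrape target plus labels."""
    data = json.loads(payload)
    if not data.get("available"):
        return None  # skip nodes that are not available
    # Path layout assumed: /monitoring/<kind>/<cluster>/<node key>
    _, _, kind, cluster, _ = zk_path.split("/")
    labels = {"name": data["name"], "service": f"{kind}-{cluster}"}
    labels.update(data.get("labels", {}))
    return {"targets": [f"{data['host']}:{data['port']}"], "labels": labels}


target = to_target(
    "/monitoring/mysql/main/mysql-main1_prometheus_192.168.1.2_ba0f1f8b", znode
)
```

Labels such as `name` and `service` are what alert annotations and dashboard URLs can later be templated on.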
35. $ cat prometheus-rules/alert.mysql.rules
# Alert: Galera node state is not synced.
ALERT MySQLGaleraStateIsNotSynced
  IF (mysql_global_status_wsrep_local_state != 4 AND mysql_global_variables_wsrep_desync == 0)
  FOR 2m
  LABELS {
    severity = "warning", team="data_infrastructure"
  }
  ANNOTATIONS {
    summary = "Galera node {{ $labels.name }} state is not in “Synced” (state={{$value}}).",
    dashboard = "https://promgrafana.blabla.com/dashboard/db/mysql-cluster-view?var-cluster={{$labels.service}}&var-ds=prom-dc1&from=now-1h&to=now",
    runbook="https://ops-run-book.blabla.com/mysql/operational-tasks#MySQLGaleraOutOfSync",
  }
Alerting
PromQL to find unhealthy services
Labeling for routing to Slack & PagerDuty
Annotations with templating for clear descriptions and URLs to dashboards and ops runbooks
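The semantics of the IF and FOR clauses in the rule above can be illustrated with a small Python sketch. It only mimics the idea (condition true continuously for at least two minutes) on a list of samples; it is not how Prometheus evaluates rules internally, and the function name `is_firing` and the sample layout are assumptions.

```python
SYNCED = 4  # wsrep_local_state value reported by a synced Galera node


def is_firing(samples, for_seconds=120):
    """Fire when the IF condition has held continuously for `for_seconds`.

    `samples` is a time-ordered list of
    (timestamp, wsrep_local_state, wsrep_desync) tuples.
    """
    if not samples:
        return False
    pending_since = None
    for ts, state, desync in samples:
        if state != SYNCED and desync == 0:
            if pending_since is None:
                pending_since = ts   # condition just became true
        else:
            pending_since = None     # condition cleared: reset the timer
    return pending_since is not None and samples[-1][0] - pending_since >= for_seconds


# One sample every 20 s, matching the scrape_interval shown earlier.
stuck = [(t, 2, 0) for t in range(0, 140, 20)]            # not synced for 120 s
brief = [(0, 4, 0), (20, 2, 0), (40, 2, 0), (60, 4, 0)]   # recovered in time
desynced = [(t, 2, 1) for t in range(0, 140, 20)]         # desync set on purpose
```

The `wsrep_desync == 0` term keeps the alert quiet for nodes that were desynchronised deliberately, for example during maintenance.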
37. Easy troubleshooting with “bbc” command
A set of bash scripts
Does the basic health checks quickly
Manages all backends the same way
Can be used by non-specialists
Plugged into the service discovery
Designed for our needs
38. # bbc mysql list
pp-dc2 mysql-main
pp-dc2 mysql-user
pp-dc2 mysql-trip
pp-dc2 mysql-payment
prod-dc1 mysql-main
prod-dc1 mysql-user
prod-dc1 mysql-trip
prod-dc1 mysql-payment
[...]
bbc command examples
# bbc mysql overview prod-dc1 mysql-main
=== Service Overview 'prod-dc1 mysql-main' ===
mysql-main1 (192.168.1.1) PING, PORT, Synced
---
mysql-main1 (3306) - enabled - weight = 255/255
mysql-main1_prometheus (9104) - enabled - weight = 255/255
mysql-main2 (192.168.1.2) PING, PORT, Synced
---
mysql-main2 (3306) - enabled - weight = 255/255
mysql-main2_prometheus (9104) - enabled - weight = 255/255
mysql-main3 (192.168.1.3) PING, PORT, Synced
---
mysql-main3 (3306) - enabled - weight = 255/255
mysql-main3_prometheus (9104) - enabled - weight = 255/255
# bbc mysql connect prod-dc1 mysql-main
env: prod-dc1
service: mysql-main
host: mysql-main1
ip: 192.168.1.1
Enter the username [ENTER]: team_data
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 2887129
Server version: 10.1.28-MariaDB-1~jessie mariadb.org binary distribution
Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]>
# bbc mysql monitor prod-dc1 mysql-main mysql-main1
Weight: 255/255 Processes: 88 Slow: 0
Weight: 255/255 Processes: 75 Slow: 0
Weight: 255/255 Processes: 89 Slow: 0
Weight: 255/255 Processes: 99 Slow: 0
Weight: 255/255 Processes: 79 Slow: 0
Weight: 255/255 Processes: 65 Slow: 0
Weight: 255/255 Processes: 86 Slow: 0
Weight: 255/255 Processes: 93 Slow: 0
Weight: 255/255 Processes: 88 Slow: 0
Weight: 255/255 Processes: 96 Slow: 0
Weight: 255/255 Processes: 77 Slow: 0
Weight: 255/255 Processes: 73 Slow: 0
39. # bbc postgresql overview prod-dc1 postgresql-corridoring
Service Overview 'prod-dc1 postgresql-corridoring'
-- USING BDR --
postgresql-corridoring1 (192.168.1.10) PING, PORT
postgresql-corridoring2 (192.168.1.11) PING, PORT
postgresql-corridoring3 (192.168.1.12) PING, PORT
postgresql-corridoring4 (192.168.1.13) PING, PORT
postgresql-corridoring5 (192.168.1.14) PING, PORT
# bbc postgresql list
pp-dc2 postgresql-airflow
pp-dc2 postgresql-corridoring
pp-dc2 postgresql-redash
pp-dc2 postgresql-trip-pricing
prod-dc1 postgresql-corridoring
prod-dc1 postgresql-redash
bbc command examples
# bbc postgresql connect prod-dc1 postgresql-corridoring
env: prod-dc1
service: postgresql-corridoring - database : corridoring
host: postgresql-corridoring1
ip: 192.168.1.10
Enter the username [ENTER]: team_data
Password for user team_arch:
psql (9.6.6, server 9.4.12)
Type "help" for help.
corridoring=#
# bbc redis overview prod-dc1 redis-main
=== Service 'prod-dc1' 'redis-main' ===
Redis elector master: redis-main1.prod.dc-1.blabla.com
redis-main1 (192.168.1.20): PING, PORT, role:master, clients:255
redis-main2 (192.168.1.21): PING, PORT, role:slave, clients:2, slaveof:192.168.1.20
redis-main3 (192.168.1.22): PING, PORT, role:slave, clients:2, slaveof:192.168.1.20
# bbc redis list
pp-dc2 redis-main
pp-dc2 redis-quota
pp-dc2 redis-translation
pp-dc2 redis-user
prod-dc1 redis-main
prod-dc1 redis-quota
# bbc redis connect prod-dc1 redis-main
env: prod-dc1
service: redis-main
host: redis-main1
ip: 192.168.1.20
role: slave
192.168.1.20:6379>
40. # bbc cassandra ping prod-dc1 cassandra-user
cassandra-user1 (192.168.1.30) PING, CQL, JMX
---
cassandra-user2 (192.168.1.31) PING, CQL, JMX
---
cassandra-user3 (192.168.1.32) PING, CQL, JMX
---
bbc command examples
# bbc cassandra overview prod-dc1 cassandra-user
=== Service 'prod-dc1 cassandra-user' ===
Datacenter: prod-dc1
====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 192.168.1.30 6.01 GB 256 33.3% bef39dd5-d4e5-4733-93e5-75904b6d556a r10
UN 192.168.1.31 5.89 GB 256 33.3% 23b77937-2177-4638-b860-e73e4bb913d2 r10
UN 192.168.1.32 5.12 GB 256 33.3% de0f4ed1-1241-499d-9485-e73e4bb913d2 r10
Datacenter: prod-dc2
====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 192.168.2.10 15.69 GB 256 100.0% 3ca1e862-f3e2-4fbf-a6c1-4d7d5a3e70ec r14
UN 192.168.2.11 14.99 GB 256 100.0% de0f4ed1-1241-499d-9485-2f8196aa7425 r13
UN 192.168.2.12 16.1 GB 256 100.0% 7e5fee00-052f-4546-973d-befaebbe604b r15
Today, 32 subcommands are available on bbc...
42. Moving to Kubernetes
From a simple “distributed init system” to the standard for container orchestration.
Fleet is deprecated
Fleet is no longer developed or maintained by CoreOS.
What does the future look like?
43. Ownership
Move ownership of backends to the developer teams.
Moving to the cloud?
Extend this idea of “expendable” services to hardware resources.
Docker?
Kubernetes + rkt (rktnetes, rktlet) has seen poor adoption.