This is the document for my presentation at AnsibleFest 2018 in Austin, Texas.
The topic is ‘Database Automation with thousands of databases, monitoring and backup’.
In this document I describe how we automate database operations using Ansible.
I hope it gives more confidence to infrastructure engineers like me.
1. DATABASE AUTOMATION with Thousands of Databases, Monitoring and Backup
October 2, 2018
SaeWoong LEE, LINE
# ANSIBLEFEST 2018 AUSTIN
2. ABOUT ME
SaeWoong LEE (REX)
Database Administrator,
Database team,
LINE corporation.
saewoong.lee@linecorp.com
3. Agenda
4. Based in Japan, LINE Corporation (NYSE:LN/TSE:3938) is dedicated to the mission
of “Closing the Distance,” bringing together information, services and people. The
LINE messaging app launched in June 2011 and since then has grown into a
diverse, global ecosystem that includes AI technology, FinTech and more.
MISSION: CLOSING THE DISTANCE
Bringing people, information and services closer together
6. Arabic, Brazilian Portuguese, English, French, German, Indonesian, Italian, Japanese, Korean,
Malay, Portuguese, Russian, Spanish for Spain, Spanish for Latin America, Simplified Chinese,
Thai, Traditional Chinese, Turkish, Vietnamese
Supports 19 languages and dialects
As of Q2 2018
Highly Engaged Users
Top 4 MAU: 164 million Top 4 DAU/MAU: 77%
Japan MAU: 76 million Japan DAU/MAU: 85%
Japan ・ Taiwan ・ Thailand ・Indonesia
MAU: Monthly Active Users, DAU: Daily Active Users
8. Strategic Business: AI
LINE Clova: Cloud AI Platform
Taking LINE into the post-smartphone era, in partnership with NAVER Corporation
Clova continues growing as an open platform
2017.3 Clova announced
2017.10 Clova-connected concept car announced
2017.10 Radiko added
2017.10 Clova WAVE released
2017.12 Clova Friends released
2018.3 IFTTT support began
2018.4 Clova Friends Hoodies launched
2018.6 Clova Friends mini
10. • We have many kinds of databases.
• MySQL, Oracle, MS-SQL, Redis, HBase, MongoDB, CUBRID and Elasticsearch
• We operate all databases of LINE Corp., except for special cases.
• We do not use AWS.
• There is no cost advantage for LINE.
• Compared to machines of the same spec, performance is not good and management is inconvenient.
• We have a private cloud IaaS system built with OpenStack, similar to AWS.
• We call it Verda (LINE IaaS service).
• Verda is developed by another dev team.
• But we are ready to be serviced at any time when we need to open a service in an area with no internet POD.
• We have our own Internet Data Center (IDC).
• There are no developers on the database team, but we build and use the tools we need ourselves.
• MonDB+: database admin web tool for DBAs
• DBONE: multi-database monitoring tool combining several open source projects
• MySS: MySQL slow query analysis system
• And more…
18. ✓ The job is very simple.
1. Make a folder
2. Move into the folder
3. Download the upgrade script
4. Execute the upgrade script on the local machine
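The four steps of this job can be sketched in Python as a small helper that builds the shell commands for one machine. This is only an illustration: the slide does not say how the job was scripted, and the folder path, script URL and function name below are hypothetical.

```python
import shlex


def build_upgrade_commands(work_dir, script_url):
    """Return the shell commands for the four-step upgrade job.

    work_dir and script_url are hypothetical placeholders, not values
    taken from the presentation.
    """
    script_name = script_url.rsplit("/", 1)[-1]
    return [
        f"mkdir -p {shlex.quote(work_dir)}",      # 1. make a folder
        f"cd {shlex.quote(work_dir)}",            # 2. move into the folder
        f"curl -fsSLO {shlex.quote(script_url)}",  # 3. download the upgrade script
        f"sh ./{shlex.quote(script_name)}",       # 4. execute it locally
    ]


commands = build_upgrade_commands("/tmp/upgrade", "https://example.com/upgrade.sh")
one_liner = " && ".join(commands)
```

Joining the commands with `&&` keeps the `cd` in effect for the later steps and stops the chain as soon as one step fails.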
27. Ansible in a Nutshell
✓ Simple, agentless
✓ Written in Python
✓ Easy to develop and manage
✓ Idempotent (using Ansible modules)
✓ Good configuration management
✓ Uses SSH
✓ Easy to learn
31. Install & Setup | DDL & DML | DBACL | Monitoring | Backup | HA
MySQL
Mongodb
Auto DBACL system
Add user on all servers
MEB upgrade
MEB fadeout
Database count: 4096
8 sets of database servers (1 set of servers: 256 DBs)
Manage DBONE
MEM fadeout
MMM re-configuration
Create slave server
MMM: Multi-Master Replication Manager
33. Install & Setup
❑ MySQL installation takes a long time.
❑ We'd like to install in parallel, not sequentially.
❑ It is difficult to set dynamic variables that depend on the server.
❑ We'd like to make it easy to add new features or modify the source.
❑ If we change the DBACL or the default schema, we have to run the whole test again. We'd like to reduce test time.
❑ We'd like to handle exceptions when a MySQL install fails.
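To illustrate the "parallel, not sequential" requirement together with the wish to handle install failures, here is a minimal sketch in plain Python. It is not the team's implementation and not how Ansible works internally (Ansible already runs tasks across hosts in parallel, limited by its `forks` setting). The `install_mysql` function and the host names are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def install_mysql(host):
    """Hypothetical stand-in for the real per-host install work."""
    if not host:
        raise ValueError("empty host name")
    return f"{host}: installed"


def install_in_parallel(hosts, install=install_mysql, max_workers=4):
    """Run install() for every host concurrently.

    Returns two dicts: results of successful hosts and exceptions of
    failed hosts, so a failure on one host is reported, not swallowed.
    """
    done, failed = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(install, host): host for host in hosts}
        for future in as_completed(futures):
            host = futures[future]
            try:
                done[host] = future.result()
            except Exception as exc:  # record the error per host
                failed[host] = exc
    return done, failed


done, failed = install_in_parallel(["db01", "db02", "db03", "db04"])
```

With one master and three slaves installed at the same time, the total time is roughly that of the slowest host instead of the sum of all four.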
38. Install & Setup
Type1 | Type2 | AS-IS | TO-BE | Description
common | install time | 30 mins | 3 mins | 10 times faster than before (1 master, 3 slaves)
 | parallel support | X | O |
 | scalability | Python full source | Ansible module | Simple source code, easy try/catch error handling, function modules
MySQL | port | fixed | dynamic |
 | server-id | fixed | automatic | Sets a unique server-id from the IP address using the bit left shift operator
 | innodb_buffer_pool_size | fixed | dynamic |
 | innodb_log_file_size | fixed 16G, into tar file | dynamic 2G/16G, no logfile in tar file | VM: 1G, PM: 8G - calculated from server memory size
 | dbacl | set in advance | ansible files | If you have dbacl changes, you have to re-package.
 | replication master log pos | fixed (154) | automatic |
 | replication master log file | fixed (3) | automatic |
common | change source code | difficult | easy | With Ansible it is very easy to add a new MySQL version.
 | error handling | SKIP | STOP | Before, when a problem happened during MySQL installation, the install kept going. The Ansible install STOPS.
 | people needed for deployment | N persons | 1 person |
 | security | RSH | SSH | SSH is safer than RSH because it can be managed.
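The automatic server-id row can be illustrated with a short Python sketch. The slides only say that the server-id is derived from the IP address with the bit left shift operator; the exact shifts are not given, so packing all four octets into one 32-bit number is an assumption made here for illustration. The result always fits MySQL's `server_id` range (a 32-bit unsigned integer).

```python
def server_id_from_ip(ip):
    """Derive a server-id from an IPv4 address using left shifts.

    Assumption: all four octets are packed into a 32-bit value, so two
    hosts with different IPv4 addresses never share a server-id.
    """
    octets = [int(part) for part in ip.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not an IPv4 address: {ip!r}")
    a, b, c, d = octets
    return (a << 24) | (b << 16) | (c << 8) | d


master_id = server_id_from_ip("10.0.0.1")
slave_id = server_id_from_ip("10.0.0.2")
```

Because the value depends only on the host's own address, each server can compute its server-id during installation without a central registry.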
39. Install & Setup
Check preparation
MySQL install
set replication
set DBACL
set mmm
set MEM
set MEB
set MySS
remove install file
set path
execute backup
set Root pw
set mmm monitor
set Yum
templates
✓ MySQL 5.7, MySQL 5.6, MySQL 5.5
✓ With Ansible it is easy to add new features and to maintain them.
• The 5.7 install took 5 months to complete.
• It has been in use for 2 years.
• Python API for Ansible
• Changed ‘return value’ of MySQL start
• SSH to all servers
• The 5.6 install took 3 days to complete.
• The 5.5 install took 2 days to complete.
41. DDL & DML
Q. Can I make it easy to apply a DDL or DML job to 4096 databases?
Q. Can I make it easy to check the results after the jobs?
✓ Issue: The query is the same, but the database name is different.
[Query] :
use db0001;
alter table `TENNIS` add column `js` int after saein;
use db0002;
alter table `TENNIS` add column `js` int after saein;
……..
……..
use db4096;
alter table `TENNIS` add column `js` int after saein;
db0001, db0002, db0003, db0004, …, db4094, db4095, db4096
shard1 ~ shard16 server
I can’t check the job log
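Since the query is identical and only the database name changes, the per-database statements can be generated instead of written by hand, and a result can be recorded for each database so the job can be checked afterwards. The sketch below assumes the `db0001` to `db4096` naming shown on the slide; the function names and the `execute` callback are hypothetical and stand in for whatever actually sends SQL to a shard.

```python
def database_names(count=4096, prefix="db", width=4):
    """Return db0001 .. db4096 style names."""
    return [f"{prefix}{i:0{width}d}" for i in range(1, count + 1)]


def build_ddl_jobs(ddl, names):
    """Map each database name to the SQL text that applies ddl to it."""
    return {name: f"USE {name};\n{ddl}" for name in names}


def run_jobs(jobs, execute):
    """Apply every job with execute(sql) and keep a per-database job log."""
    log = {}
    for name, sql in jobs.items():
        try:
            execute(sql)
            log[name] = "OK"
        except Exception as exc:  # keep going, but record the failure
            log[name] = f"FAILED: {exc}"
    return log


names = database_names()
jobs = build_ddl_jobs("ALTER TABLE `TENNIS` ADD COLUMN `js` INT AFTER saein;", names)
job_log = run_jobs(jobs, execute=lambda sql: None)  # no-op executor for the demo
failed = [name for name, status in job_log.items() if status != "OK"]
```

The job log answers the second question on the slide: after the run, the list of failed databases shows exactly where the change still has to be applied.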
44. DBACL
Workflow system
developer
DBA
Database
45. DBACL
Check preparation
MySQL install
set replication
set DBACL
set mmm
set MEM
set MEB
set MySS
remove install file
set path
execute backup
set Root pw
set mmm monitor
set Yum
set 2ndBackup
46. DBACL
3000 +
MySQL 5.7
MySQL 5.6
MySQL 5.5
MySQL 5.1
CentOS 5.X
CentOS 6.X
CentOS 7.X
48. Monitoring
• Conditions needed for monitoring
✓ Integrated monitoring solution for multiple databases
✓ Easy to change features and source
✓ Low license cost
• Databases managed by the database team
• Database monitoring tools
MySQL Enterprise Monitor
Enterprise Monitor
Remin/Relumin
MMS, Cloudera
Kibana,
nPod
• Is there one?
49. Monitoring
✓ So we developed our own tool using many open source technologies to meet our goals and stay flexible.
✓ Its name is DBONE.
50. Monitoring
✓ It is not easy to manage many open source tools.
✓ Workset
- Install
- Configure
- agent, Prometheus
- Remove agent
- Manage alert values
- Upgrade open source
- Prometheus
- Grafana
- Fluentd
- Elasticsearch
- HAProxy
52. HA
Normal MMM
Active Master
Slave1 (Standby Master)
Slave2
Slave3
VIP
AP
RW
Health Check
MMM Monitor
• MySQL has a very strong replication feature.
• MMM means Multi-Master Replication Manager.
• MMM is an online failover solution for MySQL.
• AP servers access the database via the MMM IP, which is a VIP.
• The MMM monitor checks database status.
• OS ping
• MySQL status
✓ Access, update query, MySQL replication
53. HA
SPOF MMM (single point of failure)
Active Master
Active Master
Slave2
Slave3
VIP
AP
RW
Health Check
MMM Monitor
• Failover
• Change master
• Set ‘read_only off’ on the new active master
• -- Session kill on the old active master
• Add the VIP on the new active master
54. NO SPOF MMM
Active Master
Active Master
Slave2 (Standby Master)
Slave3
VIP
AP
RW
Health Check
MMM Monitor
✓ Re-configure steps
1. Stop the mmm agent servers and the mmm monitor server
2. Modify the mmm configuration file
db01 active master --> slave1
db02 slave1 (ST master) --> slave2
db03 slave2 --> slave3
db04 slave3
3. Start the mmm agent servers and the mmm monitor server
4. Check
HA
55. Active Master
Slave1
(Standby Master)
Normal MMM
VIP
AP
RW
Health Check
MMM Monitor
New Slave2
Backup file
MySQL install
using Ansible
HA
Ansible
57. Ansible MySQL modules
• mysql_db - Add or remove MySQL databases from a remote host.
• mysql_replication - Manage MySQL replication.
• mysql_user - Add or remove a user from a MySQL database.
• mysql_variables - Manage MySQL global variables.
• mysql_install - Set MySQL community version
• mysql_backup - Set MySQL backup with Percona XtraBackup
• mysql_ddl - Set MySQL DDL with schema_online_change
• mysql_status - Report MySQL status such as OS, QPS, locks, …
58. Backup + HA
1. Install MySQL using Verda (LINE IaaS)
2. Recover MySQL from the remote backup file
3. Set up replication with the Active Master server and wait until there is no delay
4. Set up replication with the Standby Master on the Active Master
5. Reconfigure MMM
6. Check MMM status
SPOF MMM (single point of failure)
Active Master
Active Master
AP
RW
Health Check
MMM Monitor
DNS
New Server
Remote backup
NO SPOF MMM
Active Master
Slave1 (Standby Master)
AP
RW
Health Check
MMM Monitor
DNS
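The six recovery steps on this slide form a strictly ordered pipeline: each step only makes sense if the previous one succeeded. A minimal Python sketch of such a runner is shown below. It stops at the first failing step, which mirrors the STOP-on-error behaviour described for the Ansible install. The step actions are hypothetical no-op placeholders; the real work (Verda provisioning, restore, replication, MMM) is not shown in the slides.

```python
STEP_NAMES = [
    "install MySQL using Verda",
    "recover MySQL from remote backup",
    "replicate from Active Master until no delay",
    "set up Standby Master replication",
    "reconfigure MMM",
    "check MMM status",
]


def run_steps(steps):
    """Run (name, action) pairs in order and stop at the first failure.

    Returns the list of completed step names and an error message
    (None when every step succeeded).
    """
    completed = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            return completed, f"{name}: {exc}"
        completed.append(name)
    return completed, None


# Hypothetical no-op actions stand in for the real work of each step.
steps = [(name, lambda: None) for name in STEP_NAMES]
completed, error = run_steps(steps)
```

Returning the completed steps alongside the error makes it clear where to resume after the problem is fixed.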