1. Questions & Answers
1) Q: Describe how Replicon utilized MySQL and what your role entailed in supporting
it, i.e. what kind of environment was it supported in (testing, production)?
A: At Replicon, MySQL was part of the company's web gateway solution. The front-office
web site used a customized open-source Drupal CMS. MySQL was the Oracle distribution,
version 5.6. For scalability and availability, the web site was distributed across 3 servers.
Two servers were identical and on the same subnet (emulating a local MySQL cluster), and the
third was a DR server located in the cloud (Amazon). All servers ran the same software stack
(Drupal, PHP, Apache, MySQL). MySQL was set up with circular replication.
The data connection from the data center to the cloud was sufficient; the connection from the
cloud to the data center was prone to hiccups and lower performance due to speed/performance
asymmetry.
My responsibilities were to set up MySQL (installation, initial settings, replication settings),
monitor it, identify and set alert thresholds, fix replication, troubleshoot MySQL/web site
issues, back up and restore databases/schemata, and tune MySQL. I used standard MySQL commands
and the Unix/Ubuntu command line via ssh. I developed custom monitoring agents similar to
Percona's pt-heartbeat tool (https://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html)
to check replication delays. To work around the slower and less reliable connection from the DR
site to Replicon's data center, I enabled replication data compression from MySQL 3 to MySQL 1.
To simplify maintenance and backup/restore, I created simple bash scripts available to all DBAs
and coached them in learning sessions.
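A minimal sketch of the heartbeat idea behind such a monitoring agent, not the original agent itself. It assumes the source server periodically writes its current timestamp into a replicated table, and the replica compares the last replicated timestamp with its own clock (so clocks must be synchronized). The function names and the threshold values are hypothetical; the actual alert thresholds are not stated here.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical alert thresholds in seconds (assumed for illustration only).
WARN_SECONDS = 30
CRIT_SECONDS = 120


def replication_lag_seconds(heartbeat_ts: datetime, now: datetime) -> float:
    """Lag is the replica's current time minus the last heartbeat
    timestamp that has arrived through replication."""
    return max(0.0, (now - heartbeat_ts).total_seconds())


def classify_lag(lag: float, warn: float = WARN_SECONDS,
                 crit: float = CRIT_SECONDS) -> str:
    """Map a lag value to an alert level."""
    if lag >= crit:
        return "CRITICAL"
    if lag >= warn:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    now = datetime(2015, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    last_heartbeat = now - timedelta(seconds=45)
    lag = replication_lag_seconds(last_heartbeat, now)
    print(lag, classify_lag(lag))
```

In a real agent the heartbeat timestamp would be read from the replicated table on each replica, and the result would be pushed to the alerting system.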
The environment consisted of staging/testing and production. They were exact copies, except
that the staging/testing servers were micro versions of the production servers. The staging
environment was used to evaluate different architectural and Drupal settings. As no identical
hardware was available to test changes to MySQL configuration parameters, I had to rely
on best practices, logic and common sense.
2) Q: What kind of replication topology was used?
A: A circular replication model was used:
[MySQL 1 (slave, master) → MySQL 2 (slave, master), data center] → [MySQL 3 - DR (slave, master), cloud] → MySQL 1
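The ring above can be sketched as a list of (source, replica) pairs. The second helper shows a common way to avoid auto-increment key collisions when every node in a ring accepts writes, using MySQL's auto_increment_increment and auto_increment_offset variables. Whether these settings were used in this setup is an assumption; the helper names are hypothetical.

```python
def ring_topology(nodes):
    """Return (source, replica) pairs for a circular replication ring:
    each node replicates to the next one, and the last back to the first."""
    return [(nodes[i], nodes[(i + 1) % len(nodes)]) for i in range(len(nodes))]


def auto_increment_settings(nodes):
    """Assumed convention: increment equals the ring size and each node
    gets a distinct 1-based offset, so generated keys never overlap."""
    size = len(nodes)
    return {
        name: {"auto_increment_increment": size,
               "auto_increment_offset": position}
        for position, name in enumerate(nodes, start=1)
    }


if __name__ == "__main__":
    servers = ["MySQL 1", "MySQL 2", "MySQL 3"]
    for source, replica in ring_topology(servers):
        print(f"{source} -> {replica}")
    print(auto_increment_settings(servers))
```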
3) Q: How often were you called upon to do query optimization?
A: There were minimal performance issues on MySQL. I converted all tables to the InnoDB
storage engine, added extra indexes, and adjusted other MySQL settings (caches, file system
sync) to improve reliability and performance. As the queries were mostly standard Drupal
queries, there was not much room for rewriting them. I was involved in query optimization 1
or 2 times a month, when a new release was imminent. I was also responsible for other RDBMS
systems (PostgreSQL, MSSQL), and that involved almost daily query tweaking and
optimization suggestions (rewriting queries/views, redesigning indexes, reindexing,
maintenance settings, etc.).
4) Q: What did they use to instrument / collect metrics?
A: Metrics were collected on multiple levels. On MySQL, the slow query log was used
to identify long-running queries and the error log to find other issues. Local Unix tools
(mysqladmin, mysql commands/system views, Unix commands, custom shell scripts) were used to
troubleshoot problems. To collect longer-term metrics, Nimsoft was used, and Replicon later
switched to Zabbix. On other production systems (PostgreSQL, MSSQL), the RDBMS system views
were heavily used to collect metrics, and I created PostgreSQL and MSSQL query samplers to log
queries during high CPU usage. I also used the application log analysis tools Splunk and Sumo
Logic in combination with database logs. During the last 6 months I was involved in a project
to use Graphite as the central location for collecting all metrics.
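As an illustration of feeding database metrics into Graphite, here is a minimal sketch using Graphite's plaintext protocol, where each metric is one line of the form "path value timestamp" sent over TCP (port 2003 by default). The metric path and host name are hypothetical, and this is not the collector used in the project.

```python
import socket
import time


def graphite_line(path: str, value, timestamp: float) -> str:
    """Format one metric in Graphite's plaintext protocol."""
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host: str, port: int = 2003, timeout: float = 5.0) -> None:
    """Send already formatted metric lines to a Carbon listener over TCP."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


if __name__ == "__main__":
    # Hypothetical metric path; prints the line instead of sending it.
    print(graphite_line("db.mysql1.threads_running", 7, time.time()), end="")
    # send_metrics([...], host="graphite.example.internal")  # assumed host name
```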
P.S. I am currently looking at using pt-stalk, mysqlsampler and mysql_logger to log data about
MySQL. Amazon provides its own set of monitoring/collection graphs.