3. RDBMS – START FROM THE END
• MySQL
• RDBMS
• Relational Database Management System
• How does it scale?
• Read Replica
• Pros (In terms of scalability)
• Simple to do
• Simple management
• Cons
• You can scale only read operations
• The master instance has to handle all write operations (bottleneck on writes)
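The read/write split described above can be sketched as a small router. This is only an illustration: the class name `ReadWriteRouter`, the host names and the list of "read" verbs are assumptions, not part of the original slides.

```python
import itertools


class ReadWriteRouter:
    """Send read statements to replicas in turn, everything else to the master."""

    # Assumption: only these statement types are treated as reads.
    READ_VERBS = ("SELECT", "SHOW")

    def __init__(self, master, replicas):
        self.master = master
        # Fall back to the master when no replica is configured.
        self._replicas = itertools.cycle(replicas or [master])

    def route(self, sql):
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb in self.READ_VERBS:
            return next(self._replicas)
        # INSERT, UPDATE, DELETE, DDL... all land on the single master.
        return self.master


router = ReadWriteRouter("master-db", ["replica-1", "replica-2"])
```

Adding replicas spreads the `SELECT` traffic, but every write still goes through `router.master`, which is the bottleneck listed in the cons.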
CORLEY SRL – WWW.CORLEY.IT
4. READ REPLICA ON AWS
• From the RDS service tab of the AWS console, right-click a running instance and create a Read Replica DB instance
• Configure the read replica and create it through the graphical console.
5. HOW TO PROMOTE A SLAVE TO MASTER?
Similar to master creation
• Select a read-replica
• Right-click and promote Read Replica
Discover more on RDS:
• http://aws.typepad.com/aws/amazon-rds/
6. NOW HAVE A LOOK AT WEB INSTANCES
• All web instances scale out instead of scaling up
• Scale out? What does it mean?
• Instead of increasing VM performance (more RAM, more CPU, more I/O, etc.), start new VMs and serve requests from these instances
• The load balancer routes incoming connections to VMs using common algorithms
• Round-robin techniques
• Based on the VMs' average load
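The two routing strategies named above can be sketched in a few lines. The function names and the backend names are hypothetical; a real load balancer also tracks health and open connections.

```python
import itertools


def round_robin(backends):
    """Return an iterator that hands out backends in a fixed rotation."""
    return itertools.cycle(backends)


def least_loaded(load_by_backend):
    """Pick the backend whose reported average load is the lowest."""
    return min(load_by_backend, key=load_by_backend.get)


rotation = round_robin(["web-1", "web-2", "web-3"])
first_four = [next(rotation) for _ in range(4)]
calmest = least_loaded({"web-1": 0.9, "web-2": 0.2, "web-3": 0.5})
```

Round robin needs no feedback from the VMs, while the load-based choice requires each VM to report its load to the balancer.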
7. PROBLEMS… WE NEVER TALK ABOUT…
• Session management
• If we start and stop servers at runtime, we have to preserve PHP sessions in order to handle user logins and other session-related features
• Database connections
• All MySQL connectors handle just one connection… No "x" RDB connections at the same time…
• Software and plugin maintenance
• How can we keep the same version of WordPress and WP plugins if VMs start and stop continuously? How can we handle software updates?
• What about logs? How can we centralize log management?
8. DELEGATE SESSION MANAGEMENT TO MEMCACHE
• Memcache(d) servers are not only useful distributed in-RAM caching servers; they can also manage PHP sessions for us.
• Memcache infrastructure is simple to create and maintain
• ElastiCache service of AWS
• No software modification
• We just have to configure the PHP interpreter (compiled with memcache/memcached support)
session.save_handler = memcache
session.save_path = "tcp://1.cache.group.domain.tld:11211"
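Why a shared session store helps can be shown with a toy model. The sketch below does not talk to memcached: `SharedSessionStore` is a hypothetical in-process stand-in for the cache cluster, and `WebInstance` stands in for a PHP web server.

```python
import uuid


class SharedSessionStore:
    """Stand-in for the memcached cluster: one store shared by all instances."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class WebInstance:
    """A web server that keeps no session data on its own disk."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, user):
        session_id = uuid.uuid4().hex
        self.store.set("session:" + session_id, {"user": user})
        return session_id

    def current_user(self, session_id):
        session = self.store.get("session:" + session_id)
        return session["user"] if session else None


store = SharedSessionStore()
web_1 = WebInstance("web-1", store)
web_2 = WebInstance("web-2", store)
session_id = web_1.login("walter")
```

The login happens on `web-1`, yet `web_2.current_user(session_id)` still finds the user, so an instance can be terminated without logging anyone out.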
9. DELEGATE CONNECTIONS TO MYSQL NATIVE DRIVER
• MySQL native driver?
• Available from PHP >= 5.3
• Compile PHP with mysqlnd support
• --with-mysqli=mysqlnd --with-pdo-mysql=mysqlnd --with-mysql=mysqlnd
• WARNING: the mysql extension is deprecated as of PHP 5.5.0
• Delegate the master/slave management to "mysqlnd_ms"
• http://www.php.net/manual/en/book.mysqlnd-ms.php
10. DELEGATE CONNECTIONS TO MYSQL NATIVE DRIVER
The simple JSON configuration is divided in two main sections:
• Master
• Slaves

{
    "myapp": {
        "master": {
            "master_0": {
                "host": "localhost",
                "port": "3306"
            }
        },
        "slave": {
            "slave_0": {
                "host": "192.168.2.27",
                "port": "3306"
            }
        }
    }
}

"myapp" is the hostname that we use instead of the real MySQL host address. E.g.
• mysql_connect("myapp", "user", "passwd");
• new mysqli("myapp", "user", "passwd");
• new PDO("mysql:dbname=testdb;host=myapp");
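When replicas come and go, the two-section configuration above could be generated instead of edited by hand. A minimal sketch, assuming a hypothetical helper `build_mysqlnd_ms_config`; only the key layout (`master`, `slave`, `master_0`, `slave_0`, `host`, `port`) is taken from the slide.

```python
import json


def build_mysqlnd_ms_config(app_name, master_host, slave_hosts, port="3306"):
    """Return the config for one logical host name (hypothetical helper)."""
    slaves = {
        "slave_%d" % index: {"host": host, "port": port}
        for index, host in enumerate(slave_hosts)
    }
    return {
        app_name: {
            "master": {"master_0": {"host": master_host, "port": port}},
            "slave": slaves,
        }
    }


config = build_mysqlnd_ms_config("myapp", "localhost", ["192.168.2.27"])
# Serialized form, ready to be written to the plugin's configuration file.
config_json = json.dumps(config, indent=4)
```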
11. START TALKING ABOUT ELASTIC COMPUTE CLOUD
• ELB – Elastic Load Balancer
• Load balancer distributed across the availability zones of an AWS region (e.g. eu-west-1a, 1b, 1c; you have to select in how many zones you are available)
• Watches EC2 instance status thanks to a ping strategy
• Page check every x minutes/seconds
• Turn EC2 instances on/off automatically thanks to alarms (CloudWatch raises alarms)
• Receive alarms from CloudWatch and trigger scale operations
• You can raise CPU alarms, network alarms, VM status alarms and many others in order to increase or decrease the current number of EC2 instances
• The scaling strategy is not simple and you have to understand how your application works
• CPU is the simplest metric, but remember that bandwidth is limited by network interfaces, and such bottlenecks can hide the CPU alarm and leave your application stuck in strange situations.
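The alarm-driven scaling described above boils down to a decision like the one below. This is a simplified sketch and not the AWS API: the function name, the thresholds (70% and 30%) and the group limits are all hypothetical values chosen for illustration.

```python
def desired_capacity(current, cpu_percent, min_size=2, max_size=10,
                     scale_out_above=70.0, scale_in_below=30.0):
    """Return how many instances should run after one CPU alarm evaluation."""
    if cpu_percent > scale_out_above:
        target = current + 1      # high-CPU alarm: add one instance
    elif cpu_percent < scale_in_below:
        target = current - 1      # low-CPU alarm: remove one instance
    else:
        target = current          # no alarm: keep the group as it is
    # Never leave the configured bounds of the group.
    return max(min_size, min(max_size, target))
```

A network-bound application may never cross `scale_out_above` even while it is saturated, which is why CPU alone can be a misleading trigger.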
12. AUTOSCALING WITH ELB + EC2 + CLOUDWATCH
• If servers start and stop continuously, we have to find a way to keep their software fresh and up to date
• When a server starts, it has to create a valid environment in order to serve web pages. Strategies?
• Compile and bundle all software in one instance image
• It is very simple, but all software becomes old very quickly, and when you have to release an update you have to build a new image and update all load balancer configurations. It is a long and complex operation
• Use the EC2_USER_DATA feature provided by AWS
• You can run a shell script when your instances boot. It is more flexible because you can create a skeleton (PHP + libraries) and download all software at runtime during the boot operation
13. THE PROBLEM WITH SOFTWARE MANAGEMENT
Use SVN (Subversion) to download the latest version of WordPress
It is probably not a good idea to use the "trunk", but you can use tags in order to stay aligned on all VMs
svn checkout http://core.svn.wordpress.org/tags/3.5.1/ mywebsite
http://codex.wordpress.org/Installing/Updating_WordPress_with_Subversion
Use SVN externals to download your plugins
cd mywebsite/wp-content/plugins/
svn propset svn:externals 'akismet http://plugins.svn.wordpress.org/akismet/tags/2.5.7/' .
svn up
Create/download your WordPress configuration file during VM bootstrap
14. HOW CAN WE DOWNLOAD WP AND PLUGINS?
• If you run 10 servers, executing commands by hand could be hard. You can use tools to run a command on a list of servers
• Capistrano (Ruby)
• https://github.com/capistrano/capistrano
• Fabric (Python)
• https://github.com/fabric/fabric
• Use CLOTH for AWS EC2 instances
• https://github.com/garethr/cloth
15. HOW TO UPDATE CONFIGURATIONS AT RUNTIME?
EC2 instances are dynamic and we don't know their addresses; for that reason we can use the tagging system to execute commands on a group of instances.

#! /usr/bin/env python
from __future__ import with_statement
from fabric.api import *
from fabric.contrib.console import confirm
from cloth.tasks import *

env.user = "root"
env.directory = '/mnt/wordpress'
env.key_filename = ['/home/walter/Amazon/wp-cms.pem']

@task
def reload():
    "Reload Apache configuration"
    run('/etc/init.d/apache2 reload')

@task
def tail():
    "Tail Apache logs"
    run('tail /var/log/syslog')

fab nodes:"^production.*" tail

Execute the "tail" command on all instances with a name that starts with "production."
E.g.
• production.web-1
• production.log
• production.mongodb
16. EXAMPLE OF FABRIC – USAGE WITH CLOTH
• We create and destroy instances thanks to alarms, but when we terminate an instance we immediately lose all its Apache logs (or equivalent)
• How can we manage logs?
• The simplest way is to use Rsyslog clusters
• Rsyslog is open-source software that forwards log messages over an IP network
• Rsyslog implements the basic syslog protocol
• That means that we can configure Apache to log to "syslog" instead of using normal text files.
• In this way we can collect all logs in one group of VMs and work on these files later thanks to other technologies.
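The idea of shipping log lines over the syslog protocol instead of writing local files can be tried with the Python standard library. In this sketch a UDP socket on the loopback interface plays the role of the central Rsyslog VM; the logger name and the log line are invented for the example.

```python
import logging
import logging.handlers
import socket

# Stand-in for the central rsyslog VM: a UDP socket on the loopback interface.
collector = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
collector.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
collector.settimeout(2.0)
collector_address = collector.getsockname()

# The web instance side: log records leave the machine as syslog datagrams.
logger = logging.getLogger("access")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = logging.handlers.SysLogHandler(address=collector_address)
logger.addHandler(handler)

logger.info('production.web-1 "GET /index.php" 200')
datagram, _ = collector.recvfrom(4096)    # what the collector receives

logger.removeHandler(handler)
handler.close()
collector.close()
```

Because the line already reached the collector, terminating the web instance afterwards does not lose it.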
17. ALSO LOG MANAGEMENT IS NOT SIMPLE…
• Collecting logs is not the last step, because you have to analyse and reduce the information
• Move logs to an S3 bucket – time based
• Analyze logs with Hadoop
• MapReduce in the cloud with the Elastic MapReduce service (EMR)
• Use scripting languages on top of Hadoop in order to simplify the log analysis
• Hive – data warehouse infrastructure (data summarization)
• Pig – high-level platform for creating MapReduce programs
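What "analyse and reduce" means for access logs can be shown with a tiny in-memory map/reduce. This sketch does not use Hadoop or EMR; the function names and the three sample log lines are invented for the example.

```python
from collections import defaultdict


def map_line(line):
    """Map step: emit (status_code, 1) for one access-log line."""
    parts = line.split('"')
    if len(parts) < 3:
        return []                      # malformed line: emit nothing
    fields = parts[2].split()          # text after the quoted request
    return [(fields[0], 1)] if fields else []


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


LOG_LINES = [
    '10.0.0.1 - - [01/Mar/2013:10:00:00 +0000] "GET / HTTP/1.1" 200 5120',
    '10.0.0.2 - - [01/Mar/2013:10:00:01 +0000] "GET /missing HTTP/1.1" 404 312',
    '10.0.0.1 - - [01/Mar/2013:10:00:02 +0000] "POST /wp-login.php HTTP/1.1" 200 2048',
]
pairs = [pair for line in LOG_LINES for pair in map_line(line)]
status_counts = reduce_counts(pairs)
```

On a real cluster the map step runs in parallel over the log files stored in S3 and the framework groups the pairs by key before the reduce step; Hive and Pig generate this kind of job from a higher-level query.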