3. Think about your app as a worker, not a single instance
[Diagram: a single App running on one OS, contrasted with a Load balancer in front of Server #1 (App #1, App #2), Server #2 (App #3, App #4) and Server #3 (App #5)]
4. Think about your app as a worker, not a single instance
[Diagram: Load balancers in front of Server #1, Server #2, Server #3 ... Server #n, which run App #1 to App #5]
7. Sessions - Redis
• In-memory key-value database (hash table based)
• Scalable up to 1k nodes
• Partitioning with query routing
• Non-blocking master-slave replication between nodes
• Clustered (currently not production ready)
http://athlan.pl/symfony2-redis-session-handler/
8. Redis - Partitioning with Query routing
[Diagram: the Query is sent to a random node; Node #1 misses, Node #2 hits and the lookup is aborted before reaching Node #3]
Also supported:
• Client-side partitioning (the app calls the appropriate node)
• Proxy-assisted partitioning (a proxy selects the appropriate node)
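Client-side partitioning can be sketched in a few lines. This is a minimal illustration, not Redis client code: the node addresses and the function name `node_for_key` are assumptions made up for the example, and a plain modulo over a CRC32 hash stands in for whatever scheme a real client uses.

```python
import zlib

# Hypothetical node addresses, for illustration only.
NODES = ["redis-1:6379", "redis-2:6379", "redis-3:6379"]


def node_for_key(key: str, nodes: list[str] = NODES) -> str:
    """Pick the node responsible for a key using a stable hash."""
    # zlib.crc32 is deterministic across processes, unlike the built-in hash().
    index = zlib.crc32(key.encode("utf-8")) % len(nodes)
    return nodes[index]


target = node_for_key("session:42")
```

With plain modulo hashing, adding or removing a node remaps most keys, which is why consistent hashing is often preferred in practice.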
9. Centralized Logging
• Logs should be centralized so you do not have to
check each node separately
• Approaches:
– File replication (rsync + cron)
– syslog (easy to integrate with log4j)
• syslogd over UDP, port 514
• rsyslog over TCP, stores data in a database
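As a rough sketch of the syslog approach, an app can forward its log records to a syslog daemon over UDP port 514 with Python's standard library. The logger name `webapp`, the default host `127.0.0.1` and the message format are assumptions for the example; in a real setup the host would be the central log server.

```python
import logging
import logging.handlers


def build_logger(host: str = "127.0.0.1", port: int = 514) -> logging.Logger:
    """Return a logger that forwards records to a syslog daemon over UDP."""
    logger = logging.getLogger("webapp")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    # SysLogHandler uses UDP by default when given a (host, port) tuple.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("webapp: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

Every node then calls `build_logger("<central log host>")` once at startup and logs as usual, for example `logger.info("user logged in")`.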
10. Common storage, no local changes!
• Keep storage available to all nodes
– Symfony2 Gaufrette Bundle
• FTP
• Amazon S3
• OpenCloud
• AzureBlobStorage
• Rackspace
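The idea behind a storage abstraction can be sketched as follows. This is not Gaufrette's API (Gaufrette is a PHP library); the `Storage` interface, `InMemoryStorage` and `save_avatar` are hypothetical names used only to show app code writing through an adapter instead of the local disk.

```python
from typing import Protocol


class Storage(Protocol):
    def write(self, key: str, data: bytes) -> None: ...
    def read(self, key: str) -> bytes: ...


class InMemoryStorage:
    """Stand-in for a shared backend such as FTP or Amazon S3."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def write(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def read(self, key: str) -> bytes:
        return self._blobs[key]


def save_avatar(storage: Storage, user_id: int, image: bytes) -> str:
    """Store an upload through the adapter, never on the node's own disk."""
    key = f"avatars/{user_id}.png"
    storage.write(key, image)
    return key
```

Swapping the backend then means passing a different adapter object; the calling code stays the same on every node.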
12. Continuous Integration
• To keep all nodes up to date, you need CI
• Automate disabling nodes, building, and
deploying
– Jenkins CI
13. Continuous Integration
1. Disable the service on the node
2. Deploy/build the app
   1. Copy files
   2. Update the db schema (liquibase, ORM schema update)
   3. Execute scripts
3. Restart the service
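The three steps can be sketched as a rolling deploy loop. This is a minimal sketch under assumptions: `disable`, `deploy` and `enable` are hypothetical callbacks that a CI job (for example a Jenkins build step) would supply, and nodes are updated strictly one at a time.

```python
from typing import Callable

Step = Callable[[str], None]


def rolling_deploy(nodes: list[str], disable: Step, deploy: Step, enable: Step) -> None:
    """Update nodes one at a time so the remaining nodes keep serving traffic."""
    for node in nodes:
        disable(node)  # 1. take the node out of rotation
        # 2. copy files, update schema, run scripts; an exception here stops
        #    the rollout and leaves the failed node disabled
        deploy(node)
        enable(node)   # 3. bring the service back
```

Stopping at the first failure keeps a broken build from spreading to the nodes that have not been touched yet.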
14. Balance the payload - HAProxy
[Image: HAProxy logo]
Yeah guys, this is the logo :)
But no diagram is needed,
just imagine how it works.
• Very, very fast proxy!
• Software TCP/HTTP load balancer
• Different node selection algorithms:
– roundrobin (limit 4128)
– static-rr
– leastconn (lowest number of connections)
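To make the difference between the algorithms concrete, here is a simplified sketch of round robin and least-connections selection. It illustrates the idea only and is not how HAProxy implements them; the node names and connection counts are made up.

```python
import itertools


def round_robin(nodes: list[str]):
    """Return an iterator that hands out nodes in a repeating cycle."""
    return itertools.cycle(nodes)


def least_conn(active: dict[str, int]) -> str:
    """Return the node with the fewest active connections."""
    return min(active, key=lambda node: active[node])


rr = round_robin(["webA", "webB"])
first_three = [next(rr) for _ in range(3)]
quietest = least_conn({"webA": 5, "webB": 2})
```

Round robin ignores how busy a node is, while least-connections favours nodes that finish their requests quickly, which tends to suit long-lived connections better.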
15. Balance the payload - HAProxy
• You can check a node’s status with a periodic health check (here an HTTP HEAD request)
• A dead node is excluded from the balancing strategy
vi /etc/haproxy/haproxy.cfg
option httpchk HEAD /check.txt HTTP/1.0
server webA 192.168.0.102:80 check
server webB 192.168.0.103:80 check
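What such a check does can be sketched in Python. This is an illustration, not HAProxy code: `http_head_ok` and `alive_nodes` are hypothetical helper names, the 2 second timeout is an assumption, and a 2xx or 3xx answer is treated as "alive".

```python
import http.client
from typing import Callable


def http_head_ok(host: str, port: int = 80, path: str = "/check.txt") -> bool:
    """Return True if a HEAD request to the node answers with 2xx or 3xx."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=2)
        try:
            conn.request("HEAD", path)
            return 200 <= conn.getresponse().status < 400
        finally:
            conn.close()
    except (OSError, http.client.HTTPException):
        return False


def alive_nodes(nodes: list[str], probe: Callable[[str], bool] = http_head_ok) -> list[str]:
    """Keep only the nodes whose probe succeeds."""
    return [node for node in nodes if probe(node)]
```

The balancer would then pick targets only from `alive_nodes([...])`; the probe is injectable so the filtering logic can be tried without a network.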
16. Balance the payload - HAProxy
• Monitor a node’s status by reading stats from
the socket via socat.
echo "show stat" | socat /tmp/haproxy.sock stdio
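The "show stat" command returns CSV text whose header line starts with "# ". A small sketch of parsing it by column name follows; the sample below is abbreviated and made up (real output has many more columns), and `parse_show_stat` is a hypothetical helper name.

```python
import csv
import io


def parse_show_stat(raw: str) -> list[dict[str, str]]:
    """Turn the CSV text returned by "show stat" into one dict per row."""
    # The header line starts with "# "; strip it to get clean column names.
    text = raw.lstrip("# ").strip()
    return list(csv.DictReader(io.StringIO(text)))


# Abbreviated, made-up sample for illustration.
SAMPLE = """# pxname,svname,scur,status
web,webA,3,UP
web,webB,0,DOWN
"""

down = [row["svname"] for row in parse_show_stat(SAMPLE) if row["status"] == "DOWN"]
```

Reading columns by name rather than by position keeps the script working if other columns are added or reordered.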
17. Balance the payload - HAProxy
• Monitor a node’s status with the built-in stats web
console
28. Varnish and ESI
<!DOCTYPE html>
<html>
<body>
<!-- ... some content -->
<!-- Embed the content of another page here -->
<esi:include src="http://..." />
<!-- ... more content -->
</body>
</html>
29. Scaling databases - Master-slave
[Diagram: writes go to the Master; reads are served by the Slaves]
• Full data redundancy: every node holds all the data
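On the application side, master-slave scaling means routing writes and reads to different nodes. A minimal sketch, assuming a hypothetical `ReplicatedDB` router and a deliberately naive rule that only statements starting with SELECT count as reads:

```python
import itertools


class ReplicatedDB:
    """Route writes to the master and spread reads over the slaves."""

    def __init__(self, master: str, slaves: list[str]) -> None:
        self.master = master
        self._slaves = itertools.cycle(slaves)

    def node_for(self, sql: str) -> str:
        # Naive classification: anything that is not a SELECT counts as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._slaves) if is_read else self.master
```

With asynchronous replication a slave can lag slightly behind the master, so a read issued right after a write may still return the old value.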
30. MongoDB scaling
• Common models to spread data over nodes:
– range keys
– hash keys
• Many nodes on cheap machines
• No node holds all the data (no full redundancy on each node)
31. MongoDB – range-based keys
http://docs.mongodb.org
• Great for range queries (data is fetched from a minimal number of nodes:
query isolation)
• Distributes data poorly across nodes when keys are
monotonically increasing
32. MongoDB – hash-based keys
http://docs.mongodb.org
• Note: not good for range queries, because results from many nodes
have to be merge-sorted; no query isolation in this case
• Write scaling: writes go to many nodes simultaneously (keep in
mind the readers-writer lock, where a write is exclusive)
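The trade-off between range-based and hash-based keys can be shown with a toy example. Everything here is an assumption for illustration: three shards, range boundaries at 1000 and 2000, and SHA-256 as a simple stand-in for the hash function (this is not MongoDB's actual hashing).

```python
import bisect
import hashlib

# Hypothetical layout: 3 shards; range shards split at keys 1000 and 2000.
RANGE_BOUNDS = [1000, 2000]
SHARD_COUNT = 3


def range_shard(key: int) -> int:
    """Shard 0 holds keys < 1000, shard 1 holds 1000..1999, shard 2 the rest."""
    return bisect.bisect_right(RANGE_BOUNDS, key)


def hash_shard(key: int) -> int:
    """Spread keys by hashing them first (SHA-256 as a simple stand-in)."""
    digest = hashlib.sha256(str(key).encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big") % SHARD_COUNT


new_keys = range(5000, 5100)  # monotonically increasing ids
range_targets = {range_shard(k) for k in new_keys}
hash_targets = {hash_shard(k) for k in new_keys}
```

With range keys every new id lands on the last shard, so a single node takes all the writes, while the hashed keys are scattered over several shards. The price is that a query such as "ids 5000 to 5100" has to ask every shard in the hashed layout.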
34. CQRS
• Command Query Responsibility Segregation
– separate application service layers for writing to and
reading from the DB (possibility to use different data
sources like RAM or DB)
35. CQRS
• Examples
– post-insert cache population
• all SELECTs are served from cache (even if stale)
• consider LFU instead of LRU to invalidate the cache
– pre-insert into memory
• dump results periodically
In both approaches it is convenient to use
queues or a data bus!
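The post-insert cache population variant can be sketched with two small service classes. This is a minimal illustration under assumptions: plain dicts stand in for the DB and the cache, and `CommandService`, `QueryService` and their methods are hypothetical names.

```python
from typing import Optional


class CommandService:
    """Write side: persists data, then populates the read cache."""

    def __init__(self, db: dict[int, str], cache: dict[int, str]) -> None:
        self.db = db
        self.cache = cache

    def add_post(self, post_id: int, title: str) -> None:
        self.db[post_id] = title     # write to the primary store
        self.cache[post_id] = title  # post-insert cache population


class QueryService:
    """Read side: answers only from the cache, never from the DB."""

    def __init__(self, cache: dict[int, str]) -> None:
        self.cache = cache

    def get_post(self, post_id: int) -> Optional[str]:
        return self.cache.get(post_id)
```

In a larger system the cache update in `add_post` would typically be triggered through a queue or data bus rather than called inline.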
36. Queues, RabbitMQ
• RabbitMQ is based on AMQP (Advanced
Message Queuing Protocol)
– point-to-point
– publish-and-subscribe
– queueing, routing
• AMQP is not JMS (Java Message Service is an
API, not a protocol)
• A happy Rabbit is an empty Rabbit
– do not store data (messages) in the queue
system in persistent mode if you want to keep HA
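The difference between point-to-point and publish-and-subscribe delivery can be shown with a toy in-memory broker. This is a stand-in for illustration only, not RabbitMQ's or AMQP's API; the `Broker` class and its method names are hypothetical.

```python
from collections import deque
from typing import Optional


class Broker:
    """Toy in-memory broker, only to contrast the two delivery models."""

    def __init__(self) -> None:
        self.queues: dict[str, deque] = {}

    def declare(self, name: str) -> None:
        self.queues.setdefault(name, deque())

    def send(self, queue: str, message: str) -> None:
        """Point-to-point: the message lands in exactly one queue."""
        self.queues[queue].append(message)

    def publish(self, message: str) -> None:
        """Publish-and-subscribe: every declared queue gets its own copy."""
        for q in self.queues.values():
            q.append(message)

    def consume(self, queue: str) -> Optional[str]:
        """Pop the oldest message, or return None when the queue is empty."""
        q = self.queues[queue]
        return q.popleft() if q else None
```

Consumers that drain their queues quickly keep the broker empty, which matches the "happy Rabbit" advice above.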
38. Box vs spread architecture.
• Box architecture
– no scaling
– easy to maintain
[Diagram: one Server running Webapp, Redis, RabbitMQ, Varnish and DB]
39. Box vs spread architecture.
• Spread architecture
– High availability
– more integration work, more administration
[Diagram: Server #1 runs RabbitMQ, Redis and HAProxy; Server #2 and Server #3 each run a Webapp, a DB shard and Varnish]