This document summarizes the evolution of Docker infrastructure at Pipedrive to meet performance requirements. Key optimizations included reducing Docker build/deploy times to under 5 minutes by optimizing Dockerfiles and switching from Devicemapper to AUFS storage. Health checks and unique check paths were implemented to ensure only healthy services were accessed. The infrastructure was tuned through Linux kernel upgrades, memory limits, load spreading, and security measures to support 10,000 connections and constant high load.
7. Deployment process optimizations
NB! https://docs.docker.com/engine/userguide/storagedriver/selectadriver/
Replacing Devicemapper with AUFS as the storage driver made the deployment process 10x faster.
There are still improvements possible:
● Handle Linux signals
● Parallel rolling updates
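The "Handle Linux signals" item can be sketched as follows. `docker stop` sends SIGTERM and sends SIGKILL only after a grace period, so a process that ignores SIGTERM makes every container stop wait out that timeout during a rolling update. This is a minimal sketch, not Pipedrive's code: `serve`, `request_shutdown` and `install_handlers` are hypothetical names, and it assumes the service runs as the container's main process so that it actually receives the signal.

```python
import signal
import threading

# Set once the process has been asked to stop.
shutdown_requested = threading.Event()


def request_shutdown(signum, frame):
    """Signal handler: record the request and return quickly."""
    shutdown_requested.set()


def install_handlers():
    # `docker stop` sends SIGTERM first; SIGINT covers Ctrl+C in a foreground run.
    signal.signal(signal.SIGTERM, request_shutdown)
    signal.signal(signal.SIGINT, request_shutdown)


def serve(poll_seconds=0.1):
    """Hypothetical main loop: work until a stop signal arrives, then return."""
    while not shutdown_requested.wait(poll_seconds):
        pass  # handle one unit of work here
    # Close listeners and flush buffers here before returning.


if __name__ == "__main__":
    install_handlers()
    serve()
```

With a handler like this the process exits as soon as it has finished its current unit of work, instead of being killed when the grace period runs out.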
9. Beware of service discovery corruption
● Always enable health checks
● Use unique health checks
SERVICE_CHECK_HTTP=/health
vs
SERVICE_CHECK_HTTP=/v1/companyStatistics/health
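One reading of the comparison above: if the service registry ends up pointing at the wrong container, a generic `/health` path still answers 200 there, while a service-specific path answers 404 and the check fails. A minimal sketch of the service side, assuming a plain HTTP service; the port `8080` and the names `health_status` and `HealthHandler` are hypothetical, and only the path comes from the slide.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: the path mirrors the slide's SERVICE_CHECK_HTTP value.
HEALTH_PATH = "/v1/companyStatistics/health"


def health_status(request_path, health_path=HEALTH_PATH):
    """Return 200 only for this service's own health path, else 404."""
    return 200 if request_path == health_path else 404


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status = health_status(self.path)
        self.send_response(status)
        self.end_headers()
        self.wfile.write(b"ok" if status == 200 else b"not found")


if __name__ == "__main__":
    # Hypothetical port; a real service would mount the check next to its API routes.
    HTTPServer(("0.0.0.0", 8080), HealthHandler).serve_forever()
```

A check aimed at `/health` on this service fails with 404, which is the behavior that exposes a misrouted registration.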
13. Issues
● Linux kernel 3.13
● Fluentd logging agent
● Graylog logging driver
● Kernel sysctl parameters
● Swap usage
● PEBKAC
○ "net.ipv4.ip_forward" => 0
● WARNING: No memory limit support
● WARNING: No swap limit support
● WARNING: No kernel memory limit support
● WARNING: No oom kill disable support
● WARNING: No cpu cfs quota support
● WARNING: No cpu cfs period support
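The warnings listed above are the kind that `docker info` prints, and the `net.ipv4.ip_forward` value can be read from `/proc`. Both are easy to check automatically on every host before it joins the cluster. The following is a rough sketch under those assumptions; `find_warnings` and `ip_forwarding_enabled` are hypothetical helper names, and the script assumes the `docker` CLI is on the PATH.

```python
import subprocess


def find_warnings(docker_info_text):
    """Return the WARNING lines from `docker info` output, without the prefix."""
    warnings = []
    for line in docker_info_text.splitlines():
        line = line.strip()
        if line.startswith("WARNING:"):
            warnings.append(line[len("WARNING:"):].strip())
    return warnings


def ip_forwarding_enabled(path="/proc/sys/net/ipv4/ip_forward"):
    """True when the kernel forwards IPv4 packets (the file contains 1)."""
    with open(path) as f:
        return f.read().strip() == "1"


if __name__ == "__main__":
    result = subprocess.run(["docker", "info"], capture_output=True, text=True)
    # Warnings may land on stdout or stderr depending on the Docker version.
    for warning in find_warnings(result.stdout + "\n" + result.stderr):
        print("docker info warning:", warning)
    if not ip_forwarding_enabled():
        print("net.ipv4.ip_forward is 0")
```

Running a check like this from provisioning or monitoring turns a silent misconfiguration into a visible failure.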
15. Service risk mitigation
● Number of nodes in cluster
○ If in doubt, increase the number
● Spreading policies
● Multiple instances
● Memory limitations
● Healing policies
○ Autorestart
○ Reschedule
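To illustrate what a spreading policy does, here is a toy placement function that puts each new instance on the node currently running the fewest copies of the same service. It is only a sketch of the idea, not the scheduler used in the talk: `pick_node` and its data shapes are hypothetical, and a real scheduler would also account for the memory limits and node health mentioned above.

```python
from collections import Counter


def pick_node(nodes, placements, service):
    """Choose the node running the fewest instances of `service`.

    nodes: list of node names.
    placements: list of (node, service) pairs for running containers.
    Ties go to the node listed first.
    """
    if not nodes:
        raise ValueError("cluster has no nodes")
    counts = Counter(node for node, svc in placements if svc == service)
    return min(nodes, key=lambda node: counts[node])


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]
    running = [("node-a", "api"), ("node-b", "api"), ("node-a", "web")]
    print(pick_node(nodes, running, "api"))
```

Spreading instances this way means the loss of one node takes out at most a share of any service, which is why it pairs naturally with running multiple instances.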
17. Recommendations for going live with Docker
● You still need to take care of OS
● Read Github issues
● Read from the source
● Keep it up to date
● (Performance) Test it