Backing up thousands of
containers
OR
How to fail miserably at
copying data
OpenFest 2015
Talk about backup systems...Why?
➢First backup system built in 1999
➢Since then, 10 different systems
➢But why build your own?
➢ simple: SCALE
➢I'm very proud of the design of the last two
systems my team and I built
Backup considerations
➢Storage capacity
➢Number of backup copies
➢HDD and RAID speeds
➢Almost never the network
Networking....
➢typical transfer speed over 1Gbit/s ~ 24MB/s
➢typical transfer speed over 10Gbit/s ~ 110MB/s
➢Restoring an 80% full 2TB drive
➢ ~21h over 1Gbit/s with 24MB/s
➢ ~4h and a half over 10Gbit/s with 110MB/s
➢Overlapping backups on the same network
equipment
➢Overlapping backups and restores
➢Switch uplinks
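The restore estimates above come down to simple division. Here is a minimal sketch, assuming decimal units (1 TB = 1,000,000 MB) and the sustained throughput figures quoted on the slide; the function name is illustrative. The raw arithmetic gives somewhat lower values than the slide's figures, which may include overhead not listed here.

```python
def restore_hours(drive_tb: float, fill_ratio: float, mb_per_s: float) -> float:
    """Hours needed to copy the used part of a drive at a sustained rate."""
    used_mb = drive_tb * 1_000_000 * fill_ratio  # decimal TB -> MB (assumption)
    return used_mb / mb_per_s / 3600


if __name__ == "__main__":
    # Throughput figures taken from the slide: 24 MB/s and 110 MB/s.
    for label, rate in (("1Gbit/s", 24), ("10Gbit/s", 110)):
        print(f"{label}: {restore_hours(2, 0.8, rate):.1f} h")
```

Overlapping backups and restores share the same links, so the real figure for a single restore can only be worse than this estimate.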
Architecture of container backups
➢Designed for 100,000 containers
➢back up each container at least once a day
➢30 incremental copies
➢Now I'll explain HOW :)
Host machine architecture
➢We use LVM
➢RAID array which exposes a single drive
➢set up a single Physical Volume on that drive
➢set up a single Volume Group using the above
PV
➢Thin provisioned VG
➢Each container with its own Logical Volume
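The host layout above could be created with a handful of LVM commands. The sketch below only builds the command lines, it does not run them. The device path, the VG and pool names, the 20G virtual size and the 90%FREE pool size are all assumptions for illustration, not values from the talk.

```python
def host_lvm_commands(device, vg="vg0", pool="pool0", containers=(), size="20G"):
    """Build the LVM command lines for one host node (nothing is executed)."""
    cmds = [
        ["pvcreate", device],                # single PV on the RAID device
        ["vgcreate", vg, device],            # single VG on top of that PV
        # Thin pool inside the VG; some extents stay free (assumed 90%FREE).
        ["lvcreate", "--type", "thin-pool", "-l", "90%FREE", "-n", pool, vg],
    ]
    for name in containers:
        # One thin LV per container, with a virtual size (assumed value).
        cmds.append(["lvcreate", "-V", size, "--thinpool", pool, "-n", name, vg])
    return cmds


if __name__ == "__main__":
    for cmd in host_lvm_commands("/dev/sdb", containers=["ct1", "ct2"]):
        print(" ".join(cmd))
```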
Backup node architecture
➢Again we use LVM
➢RAID array which exposes a single drive
➢5 equally big Physical Volumes
➢on each PV we create a VG with thin pool
➢each container has a single LV
➢each incremental backup is a new snapshot
from the LV
➢when the max number of incremental backups
is reached, we remove the oldest snapshot
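The rotation described above can be sketched as a small pure function plus the LVM commands it implies. The snapshot naming scheme and the helper names are hypothetical; the limit of 30 comes from the "30 incremental copies" design goal.

```python
MAX_COPIES = 30  # from the slides: 30 incremental copies


def rotate(snapshots, new_name, max_copies=MAX_COPIES):
    """Return (kept, removed); `snapshots` is ordered oldest first."""
    kept = list(snapshots) + [new_name]
    excess = max(0, len(kept) - max_copies)
    return kept[excess:], kept[:excess]


def rotation_commands(vg, lv, snapshots, new_name, max_copies=MAX_COPIES):
    """Build the commands for one rotation step (nothing is executed)."""
    _, removed = rotate(snapshots, new_name, max_copies)
    # A snapshot of a thin LV is itself a thin LV in the same pool.
    cmds = [["lvcreate", "-s", "-n", new_name, f"{vg}/{lv}"]]
    cmds += [["lvremove", "-y", f"{vg}/{old}"] for old in removed]
    return cmds
```

Keeping the decision (`rotate`) separate from the commands makes the retention rule easy to test without touching a real volume group.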
For now, there is nothing really
new or very interesting here.
So let me start with the fun
part.
➢We use rsync (nothing revolutionary here)
➢We need the size of the deleted files
➢ https://github.com/kyupltd/rsync/tree/deleted-stats
➢Restore files directly into clients' containers, without
SSH-ing into them
➢ https://github.com/kyupltd/rsync/tree/mount-ns
Backup system architecture
➢ One central database
➢ Public/Private IP addresses
➢ Maximum slots per machine
➢ Gearman for messaging layer
➢ Scheduler for backups
➢ Backup worker
The Scheduler
➢ Check if we have to back up the container
➢ Get the last backup timestamp
➢ Check if the host node has available backup
slots
➢ Schedule a 'start-backup' job on the Gearman
instance of the backup node
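The scheduler checks above can be sketched as one loop. This is an illustration only: the `Container` fields, the `free_slots` mapping and the `submit` callable (standing in for a Gearman client) are assumptions, and the 24-hour interval is taken from the "at least once a day" goal.

```python
import time
from dataclasses import dataclass
from typing import Optional

BACKUP_INTERVAL = 24 * 3600  # back up each container at least once a day


@dataclass
class Container:
    name: str
    host: str                      # host node the container lives on
    backup_node: str               # backup node that stores its copies
    last_backup: Optional[float]   # UNIX timestamp, None if never backed up


def schedule(containers, free_slots, submit, now=None):
    """Submit 'start-backup' jobs for due containers whose host has a free slot."""
    now = time.time() if now is None else now
    scheduled = []
    for c in containers:
        due = c.last_backup is None or now - c.last_backup >= BACKUP_INTERVAL
        if not due:
            continue
        if free_slots.get(c.host, 0) <= 0:
            continue  # host node is busy, try again on the next pass
        free_slots[c.host] -= 1
        submit(c.backup_node, "start-backup", c.name)
        scheduled.append(c.name)
    return scheduled
```

Passing `submit` in as a parameter keeps the decision logic independent of the messaging layer.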
start-backup worker
➢ Runs on each backup node
➢ Started as many times as the Backup server
can handle
➢ handles the actual backup
➢ creates snapshots
➢ monitors rsync
➢ removes snapshots
➢ updates the database
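The order of the worker's steps can be sketched as follows. The slides do not say which snapshots are created or removed at each step, so the four callables are hypothetical placeholders; the sketch only fixes the sequence and assumes that cleanup and the database update should happen even when rsync fails.

```python
def run_backup(container, create_snapshot, run_rsync, remove_snapshot, update_db):
    """Run one backup job and return True if rsync exited with status 0."""
    snapshot = create_snapshot(container)
    ok = False
    try:
        # run_rsync is assumed to block until rsync ends and return its exit code.
        ok = run_rsync(container, snapshot) == 0
    finally:
        # Always clean up and record the outcome, even on failure.
        remove_snapshot(snapshot)
        update_db(container, ok)
    return ok
```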
No problems... they say :)
➢ We lost ALL of our backups from TWO nodes
➢ corrupted VG metadata
➢ the VG metadata area is not big enough for more than 2000
LVs
➢ create the VGs a little bit smaller than the total size
of the PV
➢ separate the VGs to lose less
No problems... they say :)
➢ LV creation becomes sluggish because LVM tries to
scan for devices in /dev
➢ obtain_device_list_from_udev = 1
➢ write_cache_state = 0
➢ specify the devices in scan = [ "/dev" ]
➢lvmetad and dmeventd break...
➢ when they break, they corrupt the metadata of all currently
opened containers
➢lvcreate leaks file descriptors
➢ once lvmetad or dmeventd are out of FDs everything breaks
Then the Avatar came
➢ We wanted to reduce the restore time from 4h to
under 1h, even under 30min
➢ So instead of backing up whole containers...
➢ We now back up accounts
➢ Soon we will be able to do distributed restore
➢ single host node backup
➢ from multiple backup nodes
➢ to multiple host nodes
Layered backups
Sparse File
Physical Volume
Volume Group
ThinPool
Logical Volume
Snapshot6
Snapshot5
Snapshot4
Snapshot3
Snapshot2
Snapshot1
Snapshot0
Loop mount
Issues here
➢ We can't keep a machine UP for more than 19
hours, LVM kernel BUG
➢ kernels 2.6 through 4.3 crash when discarding data
➢ Removing old snapshots does not discard the
data
➢ LVM umounts a volume when dmeventd
reaches the limit of FDs
➢ It does umount -l, the bastard
Issues here
➢ LVM's dmeventd tries to extend the volume, but
if you don't have free extents it will silently
umount -l your LV
➢ Monitor your thinpool metadata
➢ Make your thinpool smaller than the VG and
always plan to have a few spare PEs for
extending the pool
➢ kabbi__ irc.freenode.net #lvm
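One way to monitor thin pool usage is to parse the output of `lvs --noheadings --separator , -o lv_name,data_percent,metadata_percent`. The sketch below works on that text; the 80% thresholds and the function name are assumptions, not values from the talk.

```python
def pools_over_threshold(lvs_output, data_limit=80.0, meta_limit=80.0):
    """Return (name, data%, metadata%) for thin pools at or above a limit."""
    alerts = []
    for line in lvs_output.splitlines():
        fields = [f.strip() for f in line.split(",")]
        # Only thin pools report both data and metadata usage.
        if len(fields) != 3 or not fields[1] or not fields[2]:
            continue
        name, data, meta = fields[0], float(fields[1]), float(fields[2])
        if data >= data_limit or meta >= meta_limit:
            alerts.append((name, data, meta))
    return alerts


if __name__ == "__main__":
    sample = "  pool0,91.20,35.10\n  ct1,12.00,\n  pool1,40.00,85.50\n"
    for name, data, meta in pools_over_threshold(sample):
        print(f"{name}: data {data}% metadata {meta}%")
```

Alerting well before 100% leaves time to extend the pool using the spare extents kept free in the VG.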
Any Questions?
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Palanpur 7001035870 Whatsapp Number, 24/07 Booking
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bangalore ☎ 7737669865 🥵 Book Your One night Stand
 
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Wakad Call Me 7737669865 Budget Friendly No Advance Booking
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 

Backing up thousands of containers

  • 1. Backing up thousands of containers OR How to fail miserably at copying data OpenFest 2015
  • 2.
  • 3. Talk about backup systems... Why? ➢ First backup system built in 1999 ➢ Since then, 10 different systems ➢ But why build your own? ➢ Simple: SCALE ➢ I'm very proud of the design of the last two systems my team and I built
  • 4. Backup considerations ➢ Storage capacity ➢ Number of backup copies ➢ HDD and RAID speeds ➢ Almost never the network
  • 5. Networking... ➢ Typical transfer speed over 1Gbit/s ~ 24MB/s ➢ Typical transfer speed over 10Gbit/s ~ 110MB/s ➢ Restoring an 80% full 2TB drive ➢ ~21h over 1Gbit/s at 24MB/s ➢ ~4.5h over 10Gbit/s at 110MB/s ➢ Overlapping backups on the same network equipment ➢ Overlapping backups and restores ➢ Switch uplinks
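The restore-time figures on this slide can be sanity-checked with a few lines of arithmetic. The sketch below assumes decimal units (1TB = 10^12 bytes, 1MB = 10^6 bytes) and a perfectly sustained transfer rate; the function name is hypothetical. Under those assumptions the raw division comes out somewhat below the slide's rounded ~21h and ~4.5h, which presumably leave room for real-world overhead.

```python
def restore_hours(drive_bytes: float, fill_ratio: float, rate_bytes_per_s: float) -> float:
    """Hours needed to stream the used part of a drive at a sustained rate."""
    used_bytes = drive_bytes * fill_ratio
    return used_bytes / rate_bytes_per_s / 3600


TB = 10**12  # assumption: decimal terabyte
MB = 10**6   # assumption: decimal megabyte

# 80% full 2TB drive, at the two typical speeds quoted on the slide.
hours_1g = restore_hours(2 * TB, 0.80, 24 * MB)
hours_10g = restore_hours(2 * TB, 0.80, 110 * MB)

print(f"1Gbit/s at 24MB/s:    {hours_1g:.1f} h")
print(f"10Gbit/s at 110MB/s:  {hours_10g:.1f} h")
```

Either way, a single-drive restore is measured in hours, which is the motivation for the restore-time work later in the talk.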
  • 6. Architecture of container backups ➢ Designed for 100,000 containers ➢ Back up each container at least once a day ➢ 30 incremental copies ➢ Now I'll explain HOW :)
  • 7. Host machine architecture ➢ We use LVM ➢ RAID array which exposes a single drive ➢ Set up a single Physical Volume on that drive ➢ Set up a single Volume Group using the above PV ➢ Thin provisioned VG ➢ Each container with its own Logical Volume
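The host layout above maps onto a short sequence of standard LVM commands. The sketch below only builds and prints the command lines (a dry run, nothing is executed). The device path, VG, pool, and container names and the sizes are hypothetical, and the talk does not show the exact options the author used.

```python
from typing import List


def host_setup_commands(device: str, vg: str, pool: str, pool_size: str) -> List[List[str]]:
    """Commands for one PV, one VG on it, and one thin pool inside the VG."""
    return [
        ["pvcreate", device],
        ["vgcreate", vg, device],
        ["lvcreate", "--type", "thin-pool", "-L", pool_size, "-n", pool, vg],
    ]


def container_volume_command(vg: str, pool: str, container: str, virtual_size: str) -> List[str]:
    """Command for a thin Logical Volume dedicated to one container."""
    return ["lvcreate", "--type", "thin", "-V", virtual_size,
            "--thinpool", pool, "-n", container, vg]


if __name__ == "__main__":
    # Hypothetical names and sizes, for illustration only.
    for cmd in host_setup_commands("/dev/sdb", "vg_containers", "pool0", "1T"):
        print(" ".join(cmd))
    print(" ".join(container_volume_command("vg_containers", "pool0", "ct1001", "20G")))
```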
  • 8. Backup node architecture ➢ Again we use LVM ➢ RAID array which exposes a single drive ➢ 5 equally sized Physical Volumes ➢ On each PV we create a VG with a thin pool ➢ Each container has a single LV ➢ Each incremental backup is a new snapshot of the LV ➢ When the max number of incremental backups is reached, we remove the first (oldest) snapshot LV
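The snapshot rotation described on this slide can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes snapshot names are tracked oldest-first in a list, and it only returns the `lvcreate -s` and `lvremove` command lines instead of running them. The limit of 30 comes from the "30 incremental copies" design goal; all names are hypothetical.

```python
from typing import List, Tuple

MAX_SNAPSHOTS = 30  # "30 incremental copies" from the design goals


def rotate(existing: List[str], new_name: str, vg: str, origin: str,
           max_snapshots: int = MAX_SNAPSHOTS) -> Tuple[List[str], List[List[str]]]:
    """Add one snapshot and drop the oldest ones beyond the limit.

    `existing` is ordered oldest first. Returns the updated snapshot list
    and the LVM commands that would have to be run.
    """
    commands = [["lvcreate", "-s", "-n", new_name, f"{vg}/{origin}"]]
    snapshots = existing + [new_name]
    while len(snapshots) > max_snapshots:
        oldest = snapshots.pop(0)
        commands.append(["lvremove", "-y", f"{vg}/{oldest}"])
    return snapshots, commands


if __name__ == "__main__":
    # Small limit so the rotation is visible in the output.
    snaps, cmds = rotate(["snap0", "snap1", "snap2"], "snap3",
                         "vg_backup1", "ct1001", max_snapshots=3)
    print(snaps)
    for cmd in cmds:
        print(" ".join(cmd))
```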
  • 9. For now, there is nothing really new or very interesting here. So let me start with the fun part.
  • 10. ➢We use rsync (nothing revolutionary here) ➢We need the size of the deleted files ➢ https://github.com/kyupltd/rsync/tree/deleted-stats ➢Restore files directly in client's containers, no SSH into them ➢ https://github.com/kyupltd/rsync/tree/mount-ns
  • 11. Backup system architecture ➢ One central database ➢ Public/Private IP addresses ➢ Maximum slots per machine ➢ Gearman for messaging layer ➢ Scheduler for backups ➢ Backup worker
  • 12. The Scheduler ➢ Check if we have to back up the container ➢ Get the last backup timestamp ➢ Check if the host node has available backup slots ➢ Schedule a 'start-backup' job on the Gearman server of the backup node
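The scheduler's decision steps can be sketched roughly as below. This is an illustration under stated assumptions: the `Container` record, the in-memory slot map, and the oldest-backup-first ordering are hypothetical, and instead of submitting to Gearman the function just returns the `start-backup` jobs it would queue. The one-day interval follows the "at least once a day" design goal.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

BACKUP_INTERVAL_S = 24 * 3600  # each container at least once a day


@dataclass
class Container:
    name: str
    host_node: str
    backup_node: str
    last_backup_ts: float  # Unix timestamp of the last backup


def schedule(containers: List[Container], free_slots: Dict[str, int],
             now: float) -> List[Tuple[str, str, str]]:
    """Return (backup_node, job_name, container) tuples to be queued."""
    slots = dict(free_slots)  # do not mutate the caller's map
    jobs = []
    # Assumption: containers with the oldest backup are served first.
    for c in sorted(containers, key=lambda c: c.last_backup_ts):
        if now - c.last_backup_ts < BACKUP_INTERVAL_S:
            continue  # backed up recently enough
        if slots.get(c.host_node, 0) <= 0:
            continue  # host node has no free backup slot right now
        slots[c.host_node] -= 1
        jobs.append((c.backup_node, "start-backup", c.name))
    return jobs


if __name__ == "__main__":
    now = 1_000_000.0
    cts = [
        Container("ct1", "host1", "backup1", now - 30 * 3600),
        Container("ct2", "host1", "backup1", now - 26 * 3600),
        Container("ct3", "host2", "backup2", now - 2 * 3600),
    ]
    print(schedule(cts, {"host1": 1, "host2": 4}, now))
```

With one free slot on `host1`, only the container with the oldest backup is scheduled; the other waits for the next scheduler pass.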
  • 13. start-backup worker ➢ Runs on each backup node ➢ Started as many times as the backup server can handle ➢ Handles the actual backup ➢ Creates snapshots ➢ Monitors rsync ➢ Removes snapshots ➢ Updates the database
  • 14. No problems... they say :) ➢ We lost ALL of our backups from TWO nodes ➢ Corrupted VG metadata ➢ VG metadata space is not enough (more than 2000 LVs) ➢ Create the VGs a little bit smaller than the total size of the PV ➢ Separate the VGs to lose less
  • 15. No problems... they say :) ➢ LV creation becomes sluggish because LVM tries to scan for devices in /dev ➢ obtain_device_list_from_udev = 1 ➢ write_cache_state = 0 ➢ specify the devices in scan = [ "/dev" ] ➢ lvmetad and dmeventd break... ➢ when they break, they corrupt the metadata of all currently opened containers ➢ lvcreate leaks file descriptors ➢ once lvmetad or dmeventd are out of FDs everything breaks
  • 16. Then the Avatar came ➢ We wanted to reduce the restore time from 4h to under 1h, even under 30min ➢ So instead of backing up whole containers... ➢ We now back up accounts ➢ Soon we will be able to do distributed restore ➢ single host node backup ➢ from multiple backup nodes ➢ to multiple host nodes
  • 17. Layered backups (diagram labels): Sparse File, Physical Volume, Volume Group, ThinPool, Logical Volume, Snapshot6, Snapshot5, Snapshot4, Snapshot3, Snapshot2, Snapshot1, Snapshot0, Loop mount
  • 18. Issues here ➢ We can't keep a machine UP for more than 19 hours, LVM kernel BUG ➢ Kernels 2.6 through 4.3: it crashes when discarding data ➢ Removing old snapshots does not discard the data ➢ LVM umounts a volume when dmeventd reaches the limit of FDs ➢ It does umount -l, the bastard
  • 19. Issues here ➢ LVM dmeventd tries to extend the volume, but if you don't have free extents it will silently umount -l your LV ➢ Monitor your thinpool metadata ➢ Make your thinpool smaller than the VG and always plan to have a few spare PEs for extending the pool ➢ kabbi__ irc.freenode.net #lvm
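One way to follow the "monitor your thinpool metadata" advice is to poll `lvs` for the pool's `data_percent` and `metadata_percent` fields. The sketch below is an assumption-laden illustration: the 80% threshold, the pool and VG names, and the sample output line are hypothetical, and the parser expects a single comma-separated line as produced with `--noheadings --separator ,`.

```python
from typing import Dict, List


def lvs_command(vg: str, pool: str) -> List[str]:
    """lvs invocation that reports data and metadata usage of a thin pool."""
    return ["lvs", "--noheadings", "--separator", ",",
            "-o", "lv_name,data_percent,metadata_percent", f"{vg}/{pool}"]


def parse_usage(output: str) -> Dict[str, object]:
    """Parse one 'name,data%,metadata%' line of lvs output."""
    name, data, meta = [field.strip() for field in output.strip().split(",")]
    return {"name": name,
            "data_percent": float(data),
            "metadata_percent": float(meta)}


def needs_attention(usage: Dict[str, object], threshold: float = 80.0) -> bool:
    """True when either data or metadata usage reaches the threshold."""
    return (usage["data_percent"] >= threshold
            or usage["metadata_percent"] >= threshold)


if __name__ == "__main__":
    print(" ".join(lvs_command("vg_backup1", "pool0")))
    # Hypothetical sample line; on a real node this would come from
    # subprocess.run(lvs_command(...), capture_output=True, text=True).stdout
    sample = "  pool0,42.17,91.50\n"
    usage = parse_usage(sample)
    print(usage, "ALERT" if needs_attention(usage) else "ok")
```

Alerting before the pool fills up leaves time to extend it from the spare PEs, rather than finding out after dmeventd has already unmounted the volume.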