Automation at Brainly
… or how to enter the world of automation in a “different way”.
OPS stack:
● ~80 servers, heavy usage of LXC containers
(~1000)
● 99.9% Debian, 1 Ubuntu host :)
● Nginx / Apache2, 2k reqs per sec
● 200 million page views monthly
● 700Mbps peak traffic
● Python is dominant
About Brainly
World’s largest homework help social network, connecting over 40 million users monthly
DEV stack:
● PHP
- Symfony 2
- SOA projects
- 200 reqs per sec on the Russian version
● Erlang
- 55k concurrent users
- 22k events per sec
● Native Apps
- iOS
- Android
● Puppet was not feasible for us
- *lots* of dependencies which make containers bigger/heavier
- problems with Puppet's declarative language
- seemed incoherent, lacking integration of orchestration
- steep learning curve
- YMMV
● "packaging as automation" as an intermediate solution
- dependency hell, installing one package could result in uninstalling others
- inflexible, lots of code duplication in debian/rules file
- LOTS of custom bash and PHP scripts, usually very hard to reuse
and not standardized
- this was a dead end :(
● Ansible
- initially used only for orchestration
- maintaining it required keeping the inventory up to date, which later
simplified and helped with lots of things
Starting point
● we decided to move forward with Ansible and use it for setting up machines as
well
● first project was nagios monitoring plugins setup
● turned out to be ideal for containers and our needs in general
- very few dependencies to begin with (python2, python-apt),
and small footprint - "configured" Python modules are transferred
directly to machine, no need for local repositories
- very light, no compilation on the destination host is needed
- easy to understand. Tasks/playbooks map directly to the actions
an ops/devops engineer would perform by hand
- compatible with "automation by packages". We were able to
migrate from the old system in small steps.
First steps with Ansible
● all policies, rules, and good practices written down in automation's repo main
directory
● helps with introducing new people into the team or with devops approach
- newbies are able to start committing to repo quickly
- what's in GUIDELINES.md, that's law and changing it requires wider
consensus
- gives examples on how to deal with certain problems in standardized way
● few examples:
- limit the number of tags, each of them should be self-contained
with no cross-dependencies.
- do not include roles/tasks inside other roles,
this creates hard to follow dependencies
- NEVER subset the list of hosts inside the role, do it in site.yml.
Otherwise debugging roles/hosts will become difficult
- think twice before adding a new role and especially new groups. As infrastructure
grows, it becomes hard to manage and/or creates "dead" code/roles
Avoiding regressions
● one of the policies introduced was storing one-off scripts in a
separate directory in our automation repo.
● most of them are Ansible playbooks used just for one particular
task (e.g. Squeeze->Wheezy migration)
● version-control everything!
● this paid off: some of the scripts proved useful enough to be
rewritten as a proper role or a tool
Ugly-hacks reusability
● available on GitHub and Ansible Galaxy:
https://galaxy.ansible.com/list#/roles/940
https://galaxy.ansible.com/list#/roles/941
● “base” role:
- is reused across 8 different production roles we have ATM
- contains basic monitoring, log rotation, packages installation, etc…
- includes PHP setup in modphp/prefork configuration
- PHP disabled functions control
- basic security setup
- does not include any site-specific stuff
● "site" role:
- contains all site specific stuff and dependencies
(vhosts, additional packages, etc...)
- usually very simple
- more than one site role possible, only one base role though
● It is an example of how we make our roles reusable
Apache2 automation
● automatically sets up monitoring based on inventory and host groups
● implements devops approach - if a dev has root on a machine, they also have
access to all monitoring stuff related to this system
● automatic host dependencies based on host groups
● provisioning new hosts is no longer so painful ("auto-discovery")
● all services configuration is stored as YAML files, and used in templates
● role uses DNS data directly from inventory in order to make monitoring
independent of DNS failures
Icinga
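The idea on the slide above, monitoring objects generated from inventory data with addresses taken from the inventory rather than resolved through DNS, can be sketched roughly as follows. This is a minimal illustration, not the actual role: the `INVENTORY` layout, the `generic-host` template name, and both function names are assumptions, and the output uses the classic Nagios/Icinga 1.x object syntax.

```python
# Hypothetical inventory: host name -> address and group membership.
INVENTORY = {
    "web1": {"address": "10.0.0.11", "groups": ["web"]},
    "db1": {"address": "10.0.0.21", "groups": ["db"]},
}


def render_host(name, data):
    """Render one Nagios/Icinga 1.x style host object definition."""
    lines = [
        "define host {",
        "    use         generic-host",  # assumed template name
        f"    host_name   {name}",
        # The address comes straight from the inventory, so checks keep
        # working even when DNS itself is down.
        f"    address     {data['address']}",
        f"    hostgroups  {','.join(sorted(data['groups']))}",
        "}",
    ]
    return "\n".join(lines)


def render_all(inventory):
    """Render every host in a stable (sorted) order."""
    return "\n\n".join(render_host(n, inventory[n]) for n in sorted(inventory))


if __name__ == "__main__":
    print(render_all(INVENTORY))
```

Adding a host to the inventory is then enough to get it monitored on the next run, which is the "auto-discovery" effect the slide describes.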
DNS migration
● at the beginning:
- dozens of authoritative name servers, each of them having
customized configuration, running ~100 zones, all created by hand
- the main reason for that was using DNS for switching between
primary/secondary servers/services
● three phases:
- slurping configuration into Ansible
- normalizing the configuration
- improving the setup
● Python script which uses Ansible API to fetch normalized zone configuration from
each server
- results available in a neat hash, with per-host, per-zone keys!
- normalization using named-checkconf tool
● use slurped configuration to re-generate all configs, this time using only the data
available to Ansible
● "push-button" migration, after all recipes were ready :)
● secure: all zone transfers are signed with individual keys, ACLs are tight
● playbooks use DNS data directly from inventory
● changing/migrating slaves/masters is easy, NS records are auto-generated
● updates to zones automatically bump serial, while still preserving the
YYYYMMDDxx format
● CRM records are auto-generated as well
* see next slide about CRM automation
● DNS entries are always up to date thanks to some custom action modules
- ansible_ssh_host variables are harvested and processed into zones
- only custom entries and zone primary/secondary server names are
now stored in YAML
- new hosts are automatically added to zones, decommissioned
ones - removed
- auto-generation of reverse zones
DNS automation
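Two of the mechanics listed above lend themselves to a short sketch: bumping a zone serial while keeping the YYYYMMDDxx format, and deriving a reverse (PTR) record from an address stored in the inventory. The function names and the sample host are hypothetical, and the real action modules are not shown in the slides, so treat this as one possible approach.

```python
import ipaddress
from datetime import date


def bump_serial(current, today):
    """Return the next zone serial in YYYYMMDDxx form.

    If the current serial predates today, jump to today's first revision
    (xx = 00); otherwise increment the two-digit revision counter.
    """
    base = int(today.strftime("%Y%m%d")) * 100
    if current < base:
        return base
    if current % 100 == 99:
        # A further increment would spill into the date part.
        raise ValueError("revision counter exhausted for this date")
    return current + 1


def ptr_record(fqdn, address):
    """Build a PTR record line for a reverse zone from a host address."""
    reverse_name = ipaddress.ip_address(address).reverse_pointer
    return f"{reverse_name}. IN PTR {fqdn}."


if __name__ == "__main__":
    print(bump_serial(2015030100, date(2015, 3, 2)))
    print(ptr_record("web1.example.com", "10.0.0.11"))
```

Starting the daily counter at 00 is a choice made here for simplicity; some setups start at 01.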
● we have ~130 CRM clusters
● setting them up by hand would be "difficult" at best, impossible at worst
● available on Ansible Galaxy:
- https://galaxy.ansible.com/list#/roles/956
- https://galaxy.ansible.com/list#/roles/979
● follows pattern from apache2_base
- "base" role suitable for manually set up clusters
- "cluster" role provides a service on top of base, with a few reusable snippets
and a possibility for more complex configurations
● automatic membership based on ansible inventory (no multicasts!)
● the most difficult part was providing synchronous handlers
● a few simple configurations are provided, such as a single service with a single VIP
Corosync & Pacemaker
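Membership derived from the inventory instead of multicast could look roughly like the sketch below, which renders an explicit node list from a list of member addresses. It assumes the corosync 2.x `nodelist` syntax (used together with `transport: udpu` in the `totem` section); the slides do not say which corosync version or layout the roles target, and `render_nodelist` is a hypothetical helper.

```python
def render_nodelist(addresses):
    """Render a corosync 2.x style nodelist for unicast membership."""
    lines = ["nodelist {"]
    # Sort for deterministic output, so re-running does not rewrite the file.
    for nodeid, addr in enumerate(sorted(addresses), start=1):
        lines += [
            "    node {",
            f"        ring0_addr: {addr}",
            f"        nodeid: {nodeid}",
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Addresses would come from the inventory group that forms the cluster.
    print(render_nodelist(["10.0.0.2", "10.0.0.1"]))
```

Numbering nodes by sorted position is the simplest option, but adding a host can renumber existing members; a real setup would more likely store a stable node ID per host.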
● initially we had neither the time nor the resources to set up a full-fledged LDAP
● we needed:
- user should be able to log in even during a network outage
- removal/adding users, ssh-keys, custom settings, etc.
all had to be supported
- it had to be reusable/accessible in other roles
(e.g. Icinga/monitoring)
- different privileges for dev, production and other environments
- UID/GID unification
● turned out to be simpler than we thought - users are managed using a few
simple tasks and group_vars data. The rest is handled via variable precedence.
● migration/standardization required some effort though
User management automation
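The layering idea behind "group_vars data plus variable precedence" can be illustrated with a small sketch: a base user list, overridden per environment. Everything here (`effective_users`, the attribute names, the sample users) is hypothetical. Note also that this sketch merges attributes per user, whereas Ansible by default replaces a whole variable with the higher-precedence value, so it shows the concept rather than Ansible's exact resolution.

```python
def effective_users(*layers):
    """Merge user definitions; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        for name, attrs in layer.items():
            merged[name] = {**merged.get(name, {}), **attrs}
    # Users marked absent in any higher layer are dropped from the result.
    return {
        name: attrs
        for name, attrs in merged.items()
        if attrs.get("state", "present") != "absent"
    }


if __name__ == "__main__":
    all_hosts = {"alice": {"uid": 2001, "sudo": False}, "bob": {"uid": 2002}}
    production = {"alice": {"sudo": True}, "bob": {"state": "absent"}}
    print(effective_users(all_hosts, production))
```

Keeping the UID in the lowest layer is what gives the UID/GID unification mentioned above: environments only override privileges, never identities.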
● standard Ansible inventory management becomes a bit cumbersome with hundreds of
hosts:
- each host has to have ansible_ssh_host defined
- adding/removing large number of hosts/groups required editing lots of files
and/or one-off scripts
- IP address management using Google Docs does not scale ;)
● Ansible has well defined dynamic inventory API, with scripts available for AWS,
Cobbler, Rackspace, Docker, and many others.
● we wrote our own, which is based on a YAML file, version controlled by git:
- Python API that makes it easy to manipulate the inventory
- logic and syntax checking of the inventory
● available as open source: https://github.com/brainly/inventory_tool
Inventory management
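For readers unfamiliar with the dynamic inventory interface mentioned above: Ansible runs the script with `--list` and expects JSON that maps groups to hosts, with per-host variables under `_meta.hostvars`. The sketch below shows that contract only; it is not how inventory_tool is implemented, and the `HOSTS` dictionary is a stand-in for data that would be loaded from the version-controlled YAML file.

```python
import json
import sys

# Stand-in for data loaded from a YAML file (hypothetical layout).
HOSTS = {
    "web1": {"address": "10.0.0.11", "groups": ["web", "production"]},
    "db1": {"address": "10.0.0.21", "groups": ["db", "production"]},
}


def build_inventory(hosts):
    """Build the JSON structure Ansible expects from `--list`."""
    inventory = {"_meta": {"hostvars": {}}}
    for name in sorted(hosts):
        data = hosts[name]
        # Every host gets its connection address defined in one place.
        inventory["_meta"]["hostvars"][name] = {
            "ansible_ssh_host": data["address"]
        }
        for group in data["groups"]:
            inventory.setdefault(group, {"hosts": []})["hosts"].append(name)
    return inventory


if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "--list":
        print(json.dumps(build_inventory(HOSTS), indent=2))
    else:
        # With _meta present, per-host lookups need no extra data.
        print(json.dumps({}))
```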
● we are leasing our servers from Hetzner, no direct Layer 2 connectivity
● all tunnel setups are done using Ansible, new server
is automatically added to our network
● firewalls are set up by Ansible as well:
- OPS contribute the base firewall, DEVs can open
the ports of interest for their application
- ferm at its base, for easy rule making and keeping in-kernel firewall in sync
with on-disk rules
- rules are auto-generated based on inventory, adding/removing hosts
automatically reconfigures the FW
Networking
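Generating firewall rules from the inventory might look like the following sketch, which emits ferm-style rule text allowing one host group to reach a port. The inventory layout, the helper names, and the choice of port are illustrative assumptions; the layout of the real rule set is not shown in the slides.

```python
# Hypothetical inventory: host name -> address and group membership.
INVENTORY = {
    "web1": {"address": "10.0.0.11", "groups": ["web"]},
    "web2": {"address": "10.0.0.12", "groups": ["web"]},
    "db1": {"address": "10.0.0.21", "groups": ["db"]},
}


def group_addresses(inventory, group):
    """Collect the addresses of all hosts in a group, sorted for stability."""
    return sorted(
        data["address"] for data in inventory.values() if group in data["groups"]
    )


def allow_rule(port, sources):
    """One ferm-style rule: accept TCP traffic to `port` from `sources`."""
    return f"saddr ({' '.join(sources)}) proto tcp dport {port} ACCEPT;"


def render_input_chain(rules):
    """Wrap rules in a ferm filter-table INPUT chain block."""
    body = "\n".join("        " + rule for rule in rules)
    return "table filter {\n    chain INPUT {\n" + body + "\n    }\n}"


if __name__ == "__main__":
    # Example: let the web group reach a database port on this host.
    rules = [allow_rule(5432, group_addresses(INVENTORY, "web"))]
    print(render_input_chain(rules))
```

Because the source list is recomputed from the inventory on every run, adding or removing a web host changes the rendered rules without anyone editing the firewall by hand.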
● based on Bareos, opensource Bacula fork
● new hosts are automatically set up for backup,
extending storage space is no longer a problem
● authentication using certificates, a PITA without Ansible
Backups
● deployment done by Python script calling Ansible API
● simple tasks implemented using ansible playbooks
● complex logic implemented in Python
Deployments
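The split described above, simple steps in playbooks and complex logic in Python, can be sketched with a small wrapper. The slides say the real script calls the Ansible Python API; this sketch swaps in the `ansible-playbook` command line instead, since that interface is stable and easy to show. The playbook name `deploy.yml`, the `inventory/` path, the `web` group, and the `app_version` variable are all hypothetical.

```python
import subprocess


def playbook_command(playbook, inventory, limit=None, extra_vars=None,
                     skip_tags=None):
    """Build an ansible-playbook argument list."""
    cmd = ["ansible-playbook", "-i", inventory, playbook]
    if limit:
        cmd += ["--limit", limit]
    if skip_tags:
        cmd += ["--skip-tags", ",".join(skip_tags)]
    for key, value in sorted((extra_vars or {}).items()):
        cmd += ["--extra-vars", f"{key}={value}"]
    return cmd


def deploy(version, dry_run=True):
    """Deploy one application version; the Python side owns the logic."""
    cmd = playbook_command(
        "deploy.yml", "inventory/", limit="web",
        extra_vars={"app_version": version},
    )
    if not dry_run:
        # Raises CalledProcessError if the playbook run fails.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    print(" ".join(deploy("1.2.3")))
```

Decisions such as which hosts to limit to or which version to roll back to stay in Python, while the playbook stays a flat list of tasks.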
● Jinja2 template error messages are "difficult" to interpret
● templates sometimes grow to huge complexity
● Jinja2 is designed for speed, but with tradeoffs - some Python operators are
missing and creating custom plugins/filters poses some problems
● multi-inheritance, problems with 2-headed trees
● speed: improved with "pipelining=True", and by containerization in the long run
● some useful functionality requires paid subscription (Ansible Tower)
- RESTful API, useful if you want to push a new application version
to production via e.g. Jenkins
- schedules - currently we need to push the changes ourselves
Not everything is perfect
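For context on the custom filter point above: Ansible picks up extra Jinja2 filters from Python files that define a `FilterModule` class whose `filters()` method returns a name-to-function mapping. The example below is a minimal, hypothetical filter (`sort_ips`) doing something awkward to express in a template alone, sorting IP addresses numerically; it is not taken from the Brainly repository.

```python
import ipaddress


def sort_ips(addresses):
    """Sort IP addresses by numeric value rather than as strings."""
    return sorted(addresses, key=lambda a: int(ipaddress.ip_address(a)))


class FilterModule(object):
    """Entry point Ansible looks for in a filter plugin file."""

    def filters(self):
        # Template usage would be: {{ my_addresses | sort_ips }}
        return {"sort_ips": sort_ips}


if __name__ == "__main__":
    print(sort_ips(["10.0.0.10", "10.0.0.2"]))
```

Moving logic like this out of templates and into a filter is one way to keep templates from growing to the complexity the slide warns about.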
● developers by default have RO access to repo, RW on case-by-case basis
● changes to systems owned by developers are done by developers,
OPS only provide the platform and tools
● all non-trivial changes require a Pull Request and a review from Ops
● encrypt mission critical data with Ansible Vault and push it directly to the repo
- *strong* encryption
- available to Ansible without the need for decryption
(password still required though)
- all security sensitive stuff can be skipped by developers with
"--skip-tags" option to ansible-playbook
Dev,DevOps,Ops
● some of the things we mentioned can be found on our GitHub account
● we are working on open-sourcing more stuff
https://github.com/brainly
Opensource! Opensource! Opensource!
● time needed to deploy new markets dropped considerably
● increased productivity
● better cooperation with developers
● more manpower: Devs are no longer blocked so much, and we can push
tasks to them
● infrastructure as code
● versioning
● code-reuse, less copy-pasting
Conclusions
We are hiring!
http://brainly.co/jobs/
Questions?
Thank you!

Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
 
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
Call Girls Dubai Prolapsed O525547819 Call Girls In Dubai Princes$
 
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
 
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
FULL ENJOY Call Girls In Mayur Vihar Delhi Contact Us 8377087607
 
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
Low Rate Young Call Girls in Sector 63 Mamura Noida ✔️☆9289244007✔️☆ Female E...
 
How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)How is AI changing journalism? (v. April 2024)
How is AI changing journalism? (v. April 2024)
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
 
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
'Future Evolution of the Internet' delivered by Geoff Huston at Everything Op...
 
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine ServiceHot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
 

PLNOG Automation@Brainly

  • 1. Automation at Brainly … or how to enter the world of automation in a “different way”.
  • 2. About Brainly
    World’s largest homework help social network, connecting over 40 million users monthly
    OPS stack:
    ● ~80 servers, heavy usage of LXC containers (~1000)
    ● 99.9% Debian, 1 Ubuntu host :)
    ● Nginx / Apache2, 2k reqs per sec
    ● 200 million page views monthly
    ● 700 Mbps peak traffic
    ● Python is dominant
    DEV stack:
    ● PHP
      - Symfony 2
      - SOA projects
      - 200 reqs per sec on the Russian version
    ● Erlang
      - 55k concurrent users
      - 22k events per sec
    ● Native Apps
      - iOS
      - Android
  • 3. Starting point
    ● Puppet was not feasible for us
      - *lots* of dependencies which make containers bigger/heavier
      - problems with Puppet's declarative language
      - seemed incoherent, lacking integration of orchestration
      - steep learning curve
      - YMMV
    ● "packaging as automation" as an intermediate solution
      - dependency hell, installing one package could result in uninstalling others
      - inflexible, lots of code duplication in debian/rules file
      - LOTS of custom bash and PHP scripts, usually very hard to reuse and not standardized
      - this was a dead end :(
    ● Ansible
      - initially used only for orchestration
      - it required keeping an up-to-date inventory, which later simplified and helped with lots of things
  • 4. First steps with Ansible
    ● we decided to move forward with Ansible and use it for setting up machines as well
    ● first project was nagios monitoring plugins setup
    ● turned out to be ideal for containers and our needs in general
      - very few dependencies to begin with (python2, python-apt), and small footprint
      - "configured" Python modules are transferred directly to the machine, no need for local repositories
      - very light, no compilation on the destination host is needed
      - easy to understand. Tasks/playbooks map directly to the actions an ops/devops would have taken when doing it by hand
      - compatible with "automation by packages". We were able to migrate from the old system in small steps.
  • 5. Avoiding regressions
    ● all policies, rules, and good practices are written down in the main directory of the automation repo
    ● helps with introducing new people into the team or to the devops approach
      - newbies are able to start committing to the repo quickly
      - what's in GUIDELINES.md is law, and changing it requires wider consensus
      - gives examples of how to deal with certain problems in a standardized way
    ● few examples:
      - limit the number of tags, each of them should be self-contained with no cross-dependencies
      - do not include roles/tasks inside other roles, this creates hard to follow dependencies
      - NEVER subset the list of hosts inside the role, do it in site.yml. Otherwise debugging roles/hosts will become difficult
      - think twice before adding a new role and esp. groups. As infrastructure grows, it becomes hard to manage and/or creates "dead" code/roles
  • 6. Ugly-hacks reusability
    ● one of the policies introduced was storing one-off scripts in a separate directory in our automation repo
    ● most of them are Ansible playbooks used just for one particular task (e.g. Squeeze->Wheezy migration)
    ● version-control everything!
    ● turned out to be very useful, some of them were useful enough to be rewritten into a proper role or a tool
  • 7.
  • 8. Apache2 automation
    ● available on GitHub and Ansible Galaxy:
      https://galaxy.ansible.com/list#/roles/940
      https://galaxy.ansible.com/list#/roles/941
    ● "base" role:
      - is reused across 8 different production roles we have ATM
      - contains basic monitoring, log rotation, packages installation, etc…
      - includes PHP setup in modphp/prefork configuration
      - PHP disabled functions control
      - basic security setup
      - does not include any site-specific stuff
    ● "site" role:
      - contains all site specific stuff and dependencies (vhosts, additional packages, etc...)
      - usually very simple
      - more than one site role possible, only one base role though
    ● It is an example of how we make our roles reusable
  • 9. Icinga
    ● automatically sets up monitoring based on inventory and host groups
    ● implements devops approach - if a dev has root on a machine, they also have access to all monitoring stuff related to this system
    ● automatic host dependencies based on host groups
    ● provisioning new hosts is no longer so painful ("auto-discovery")
    ● all services configuration is stored as YAML files, and used in templates
    ● role uses DNS data directly from inventory in order to make monitoring independent of DNS failures
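The idea of writing IP addresses from the inventory straight into the monitoring configuration, so that checks keep working when DNS is down, can be sketched as below. This is a minimal illustration, not the actual Brainly role: it assumes Icinga 1 / Nagios-style object syntax, and the `generic-host` template name and the `render_host` helper are hypothetical.

```python
def render_host(name, address, groups):
    """Render one Icinga 1 / Nagios-style host object.

    The address comes from inventory data, so the check never
    depends on resolving the host name through DNS.
    """
    lines = [
        "define host {",
        "    use         generic-host",  # hypothetical template name
        f"    host_name   {name}",
        f"    address     {address}",
        f"    hostgroups  {','.join(groups)}",
        "}",
    ]
    return "\n".join(lines)


# Hypothetical inventory entry: host name, IP address, inventory groups.
print(render_host("web1", "10.0.0.11", ["web", "production"]))
```

In a real role the same rendering would more likely live in a Jinja2 template fed by inventory variables; plain Python is used here only to keep the sketch self-contained.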
  • 10. DNS migration
    ● at the beginning:
      - dozens of authoritative name servers, each of them having customized configuration, running ~100 zones, all created by hand
      - the main reason for that was using DNS for switching between primary/secondary servers/services
    ● three phases:
      - slurping configuration into Ansible
      - normalizing the configuration
      - improving the setup
    ● Python script which uses Ansible API to fetch normalized zone configuration from each server
      - results available in a neat hash, with per-host, per-zone keys!
      - normalization using named-checkconf tool
    ● use slurped configuration to re-generate all configs, this time using only the data available to Ansible
    ● "push-button" migration, after all recipes were ready :)
  • 11. DNS automation
    ● secure: all zone transfers are signed with individual keys, ACLs are tight
    ● playbooks use DNS data directly from inventory
    ● changing/migrating slaves/masters is easy, NS records are auto-generated
    ● updates to zones automatically bump the serial, while still preserving the YYYYMMDDxx format
    ● CRM records are auto-generated as well
      * see next slide about CRM automation
    ● DNS entries are always up-to-date thanks to some custom action modules
      - ansible_ssh_host variables are harvested and processed into zones
      - only custom entries and zone primary/secondary server names are now stored in YAML
      - new hosts are automatically added to zones, decommissioned ones are removed
      - auto-generation of reverse zones
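One way to bump a zone serial while preserving the YYYYMMDDxx format can be sketched as follows. This is an assumption about the logic, not the code of the custom action modules mentioned above; the `bump_serial` name is hypothetical.

```python
from datetime import date


def bump_serial(current: int, today: date) -> int:
    """Return the next zone serial in YYYYMMDDxx form.

    If the zone was not changed today yet, start at today's date
    with counter 00; otherwise increment the existing serial.
    """
    base = int(today.strftime("%Y%m%d")) * 100
    if current < base:
        return base
    # Already changed today (or the serial is ahead of today's date).
    # Note: after 100 changes in one day the counter spills into the
    # next day's range; the serial still only ever increases.
    return current + 1


print(bump_serial(2024051401, date(2024, 5, 15)))
print(bump_serial(2024051500, date(2024, 5, 15)))
```

The key property is that the serial never decreases, which is what secondaries rely on to detect a changed zone.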
  • 12. Corosync & Pacemaker
    ● we have ~130 CRM clusters
    ● setting them up by hand would be "difficult" at best, impossible at worst
    ● available on Ansible Galaxy:
      - https://galaxy.ansible.com/list#/roles/956
      - https://galaxy.ansible.com/list#/roles/979
    ● follows pattern from apache2_base
      - "base" role suitable for manually set up clusters
      - "cluster" role provides service upon base, with few reusable snippets and a possibility for more complex configurations
    ● automatic membership based on ansible inventory (no multicasts!)
    ● the most difficult part was providing synchronous handlers
    ● few simple configurations are provided, like single service-single vip
  • 13. User management automation
    ● initially we had neither the time nor the resources to set up a full-fledged LDAP
    ● we needed:
      - users should be able to log in even during a network outage
      - removing/adding users, ssh-keys, custom settings, etc. all had to be supported
      - it had to be reusable/accessible in other roles (e.g. Icinga/monitoring)
      - different privileges for dev, production and other environments
      - UID/GID unification
    ● turned out to be simpler than we thought
      - users are managed using a few simple tasks and group_vars data. The rest is handled via variable precedence.
    ● migration/standardization required some effort though
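How variable precedence can give different privileges per environment is sketched below. This is a simplified model, assuming Ansible's default behaviour where a more specific definition replaces a more general one wholesale; the variable names and the `resolve` helper are hypothetical, not taken from the Brainly repo.

```python
def resolve(*layers):
    """Combine variable layers, least specific first.

    A later layer replaces a whole value from an earlier one,
    mimicking 'all' group vars < environment group vars < host vars.
    """
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical data, as it might appear in group_vars files.
group_all = {"login_users": ["alice", "bob", "carol"], "sudo_users": ["alice"]}
group_production = {"login_users": ["alice", "bob"]}  # tighter access on production

print(resolve(group_all, group_production))
```

With this model a production host sees the restricted `login_users` list but still inherits `sudo_users` from the general layer, so only the differences need to be written down per environment.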
  • 14. Inventory management
    ● standard ansible inventory management becomes a bit cumbersome with hundreds of hosts:
      - each host has to have ansible_ssh_host defined
      - adding/removing large number of hosts/groups required editing lots of files and/or one-off scripts
      - ip address management using google docs does not scale ;)
    ● Ansible has a well defined dynamic inventory API, with scripts available for AWS, Cobbler, Rackspace, Docker, and many others.
    ● we wrote our own, which is based on a YAML file, version controlled by git:
      - python API allowing to manipulate the inventory easily
      - logic and syntax checking of the inventory
    ● available as opensource: https://github.com/brainly/inventory_tool
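The dynamic inventory contract is small: a script called with `--list` prints JSON mapping group names to their hosts, plus a `_meta.hostvars` section with per-host variables. The sketch below shows that conversion for a hypothetical in-memory structure; the layout of `INVENTORY` is an assumption for illustration and is not the actual format used by inventory_tool.

```python
import json

# Hypothetical parsed form of a YAML inventory file.
INVENTORY = {
    "hosts": {
        "web1": {"ip": "10.0.0.11", "groups": ["web"]},
        "db1": {"ip": "10.0.0.21", "groups": ["db", "backup"]},
    }
}


def to_ansible_list(inventory):
    """Build the structure a dynamic inventory script prints for --list."""
    result = {"_meta": {"hostvars": {}}}
    for name, data in sorted(inventory["hosts"].items()):
        # Every host gets ansible_ssh_host generated, not typed by hand.
        result["_meta"]["hostvars"][name] = {"ansible_ssh_host": data["ip"]}
        for group in data["groups"]:
            result.setdefault(group, {"hosts": []})["hosts"].append(name)
    return result


print(json.dumps(to_ansible_list(INVENTORY), indent=2, sort_keys=True))
```

Keeping the source of truth in one version-controlled file means that host addresses are defined once and every consumer (playbooks, DNS, monitoring) reads the same data.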
  • 15. Networking
    ● we are leasing our servers from Hetzner, no direct Layer 2 connectivity
    ● all tunnel setups are done using Ansible, a new server is automatically added to our network
    ● firewalls are set up by Ansible as well:
      - OPS contribute the base firewall, DEVs can open the ports of interest for their application
      - ferm at its base, for easy rule making and keeping the in-kernel firewall in sync with on-disk rules
      - rules are auto-generated based on inventory, adding/removing hosts automatically reconfigures the firewall
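Generating firewall rules from inventory data can be sketched as below: given the addresses of the hosts in some group, emit a ferm rule that allows only those sources to reach a port. This is an illustrative assumption about the approach, not the actual role; the `ferm_allow` helper and the example addresses are hypothetical, and the rule relies on ferm's parenthesized list syntax.

```python
def ferm_allow(port, sources, proto="tcp"):
    """Return one ferm rule accepting traffic to `port` from `sources`.

    Sorting keeps the generated file stable between runs, so the
    on-disk rules only change when the inventory really changes.
    """
    addrs = " ".join(sorted(sources))
    return f"saddr ({addrs}) proto {proto} dport {port} ACCEPT;"


# Hypothetical: addresses of all hosts in the "web" inventory group
# that need to reach a database port.
web_hosts = ["10.0.0.21", "10.0.0.11"]
print(ferm_allow(5432, web_hosts))
```

When a host is added to or removed from the group, regenerating the file and reloading ferm is all that is needed to reconfigure the firewall.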
  • 16. Backups
    ● based on Bareos, opensource Bacula fork
    ● new hosts are automatically set up for backup, extending storage space is no longer a problem
    ● authentication using certificates, PITA without ansible
  • 17. Deployments
    ● deployment done by Python script calling Ansible API
    ● simple tasks implemented using ansible playbooks
    ● complex logic implemented in Python
  • 18. Not everything is perfect
    ● Jinja2 template error messages are "difficult" to interpret
    ● templates sometimes grow to huge complexity
    ● Jinja2 is designed for speed, but with tradeoffs
      - some Python operators are missing and creating custom plugins/filters poses some problems
    ● multi-inheritance, problems with 2-headed trees
    ● speed, improved with "pipelining=True", containerization in the long run
    ● some useful functionality requires paid subscription (Ansible Tower)
      - RESTful API, useful if you want to push a new application version to production via e.g. Jenkins
      - schedules - currently we need to push the changes ourselves
  • 19. Dev, DevOps, Ops
    ● developers by default have RO access to repo, RW on case-by-case basis
    ● changes to systems owned by developers are done by developers, OPS only provide the platform and tools
    ● all non-trivial changes require a Pull Request and a review from Ops
    ● encrypt mission critical data with Ansible Vault and push it directly to the repo
      - *strong* encryption
      - available to Ansible without the need for decryption (password still required though)
      - all security sensitive stuff can be skipped by developers with the "--skip-tags" option to ansible-playbook
  • 20.
  • 21. Opensource! Opensource! Opensource!
    ● some of the things we mentioned can be found on our GitHub account
    ● we are working on opensourcing more stuff
    https://github.com/brainly
  • 22. Conclusions
    ● time needed to deploy new markets dropped considerably
    ● increased productivity
    ● better cooperation with developers
    ● more manpower, Devs are no longer blocked so much, we can push tasks to them
    ● infrastructure as code
    ● versioning
    ● code-reuse, less copy-pasting