DevOpsGuys - DevOps Automation - The Good, The Bad and The Ugly


DevOpsGuys - DevOps Automation - The Good, The Bad and The Ugly gives an overview of the strengths and weaknesses of DevOps automation, tips on developing your automation strategy, and a high level overview of automation options across the DevOps toolchain.

DevOpsGuys - DevOps Automation - The Good, The Bad and The Ugly

  1. Automation: The Good, the Bad and the Ugly. Getting your Automation strategy right. www.devopsguys.com | Phone: 0800 368 7378 | e-mail: team@devopsguys.com | 2017
  2. rm -Rvf
  6. The End Result
     “Database data such as projects, issues, snippets, etc. created between January 31st 17:20 UTC and 23:30 UTC has been lost.”
     “It's hard to estimate how much data has been lost exactly, but we estimate we have lost at least 5000 projects, 5000 comments, and roughly 700 users.”
     https://about.gitlab.com/2017/02/10/postmortem-of-database-outage-of-january-31/
  7. The Good, The Bad and The Ugly
     • An automated spam attack plus a
     • An automated user deletion process manually triggered by an employee incorrectly approving the abuse report against a GitLab employee account,
     • Created a replication delay issue exacerbated because automated write-ahead log archiving wasn’t enabled
     • That led to the accidental manual deletion of data
     • Compounded by automated backups failing
     • That no-one noticed because the notification email was automatically blocked by DMARC
     • They plan to fix some of this by automating the backup / restore validation cycle
  8. Automation Strategy: Where to start?
  9. Timeline, 1970 to 2017
     • 1970: IBM JCL
     • 1977: sh
     • 1978: REXX
     • 1981: DOS batch
     • 1989: Bash
     • 1995: Ghost
     • 1997: JUnit
     • 2004: Selenium
     • 2005: Hudson, Puppet
     • 2006: PowerShell, AWS EC2
     • 2008: GitHub
     • 2009: Chef
     • 2010: Azure
     • 2011: Jenkins
     • 2012: Ansible
     • 2013: Docker
     • 2014: Kubernetes
  10. https://xebialabs.com/periodic-table-of-devops-tools/
  12. Start from the Constraint
      https://www.amazon.co.uk/gp/product/0566086654/ref=as_li_tl?ie=UTF8&camp=1634&creative=6738&creativeASIN=0566086654&linkCode=as2&tag=dev09bb-21
  13. Value Stream Mapping
      https://github.com/MSDevOps/VSM
  14. http://www.tocinstitute.org/five-focusing-steps.html
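
"Start from the Constraint" can be made concrete once a value stream map exists. The sketch below is one rough way to do it, under a simplifying assumption: the step with the longest lead time (hands-on time plus waiting time) is treated as the candidate constraint. The `Step` class, the function names and all the numbers are hypothetical and only there for illustration; a real map would use measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One stage of a value stream (names and numbers below are made up)."""
    name: str
    process_hours: float  # hands-on time spent working on an item
    wait_hours: float     # time the item sits idle before this step starts

    @property
    def lead_hours(self) -> float:
        """Total elapsed time an item spends at this step."""
        return self.process_hours + self.wait_hours


def find_constraint(steps: list[Step]) -> Step:
    """Return the step with the longest lead time (first one wins on ties)."""
    if not steps:
        raise ValueError("value stream has no steps")
    return max(steps, key=lambda s: s.lead_hours)


def flow_efficiency(steps: list[Step]) -> float:
    """Share of total lead time that is actual work rather than waiting."""
    total = sum(s.lead_hours for s in steps)
    return sum(s.process_hours for s in steps) / total if total else 0.0


# Hypothetical value stream for illustration only.
stream = [
    Step("code", 8, 2),
    Step("integrate", 1, 4),
    Step("test", 6, 40),
    Step("release", 1, 16),
    Step("deploy", 2, 8),
]
constraint = find_constraint(stream)
print(constraint.name, round(flow_efficiency(stream), 2))
```

With these made-up numbers the candidate constraint is the test stage, which suggests that is where automation effort would pay off first.
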
  15. DevOps Automation 101: A whistle-stop, opinionated overview of some DevOps tools
  16. The DevOps Toolchain: Design & Plan, Code, Integrate, Test, Release, Deploy, Operate
  17. Plan - Requirements
      • Atlassian, VSTS or GitHub Enterprise
      • Issue / work item tracking
      • Sprint / Kanban boards
      • Wiki for requirements and other docs*
      • Source code and CI integrations for feedback loops
      * Confluence has an edge here!
  18. Plan - Communicate
      • Informal communication is very important!
      • The key is to be able to communicate and share information with the minimum of “friction”
      • Act as a point of integration for “ChatOps”
      • We currently use Slack (www.slack.com)
      • Microsoft Teams is getting better (rapidly)
  19. Code – Source Code
      • The de facto DVCS is Git
      • Excellent for distributed teams, remote working etc.
      • GitHub – cloud and on-premise Enterprise version
      • VSTS – Git online or TFS on-premise
  20. Code – Automated Developer Environments
      • We (strongly) recommend Vagrant for virtualised local development environments
      • Faster provisioning of local environments
      • Push out new environment updates and tools
      • Keep teams in sync
      • Combine with Packer (https://www.packer.io/) and your preferred CM tools (e.g. Ansible) for complete environment control
      • Check everything into source control for version management
      • Use Vagrant + VMware Workstation for better performance and compatibility
      • https://www.vagrantup.com/vmware
  22. Code – Database & SQL
      • Your database schemas and static data are also part of your CI process (but often overlooked)
      • They should be treated like code and checked into source control!
      • Our tool(set) of choice is Redgate SQL Source Control
      • SQL Server: http://www.red-gate.com/products/sql-development/sql-source-control/
      • Oracle: http://www.red-gate.com/products/oracle-development/source-control-for-oracle/
  23. https://www.slideshare.net/RedgateSoftware/redgate-dlm-demo-webinar-using-migration-scripts-in-sql-source-control-5-19th-july-2016
  24. Build - Continuous Integration
      • Our CI tool of choice is TeamCity from JetBrains
      • https://www.jetbrains.com/teamcity/
      • Easy to configure / extend
      • Very cost-effective (free for small teams)
      • Support when you need it
      • VSTS has its own build server, or you can use TeamCity, Jenkins etc.
      • Lots of open-source and cloud alternatives: Jenkins, Travis-CI, Wercker etc.
  25. Build – Unit Testing
      • There are many, many unit testing frameworks so it’s hard to say “what’s best”…
      • Some are language-specific, some are ported to multiple languages
      • In a Java world… JUnit is probably the best known, along with TestNG.
      • In a .Net world… NUnit, which is a port of JUnit
  26. Build – Static Code Analysis
      • There are (again) a lot of different tools for different types of static code analysis
      • Some are integrated with build servers, e.g. Sonar (http://www.sonarqube.org/features/)
      • Some are integrated with the IDE (e.g. ReSharper for C#.Net, which is almost a “must have” product: https://www.jetbrains.com/resharper/)
      TOP TIP: InfoSec love this stuff…
  27. Build – Package Automation
      • Part of the build process is creating a releasable package.
      • In a Linux world the de facto standard is an RPM or DEB
      • In a Windows world the de facto standard is a NuGet package
      • http://nuget.codeplex.com/
      • You can even use Chocolatey on Windows like a Linux package manager to install NuGet packages! https://chocolatey.org/
      • Packages should be created automatically as part of your build process.
  28. Test – Test Management
      • Some don’ts to start with…
      • Don’t use Excel spreadsheets to track test execution & status – it rapidly becomes a time-wasting exercise in futility
      • Don’t use HP Quality Centre, it’s just woeful. Full stop.
      • VSTS has test case management (but I haven’t used it personally)
      • Zephyr for Jira is a good option if you’ve gone the Jira route.
  29. Test – Acceptance Testing
      • Many, many acceptance testing frameworks out there…
      • FitNesse is very popular (and cross-platform)
      • We are also (huge) fans of Gherkin (Given-When-Then) syntax and Cucumber-based BDD acceptance testing frameworks
      • Cucumber for Java: http://cukes.info/install-cucumber-jvm.html
      • SpecFlow for .Net: http://www.specflow.org/
  30. Testing – Browser UI Testing
      • For web-driven UIs the widely adopted industry standard is Selenium. http://www.seleniumhq.org/
      • It’s common
      • It’s easy to find people with Selenium skills
      • It’s easy to get Selenium training
      • It works
      • It’s free
      • SauceLabs and VSTS will cloud-host your Selenium testing (as will many others)
  31. Release – Artifacts & Release Mgmt
      • Store your artifacts (packages, binaries, jars/wars etc.) in:
      • Nexus – http://www.sonatype.org/nexus/
      • Artifactory – http://www.jfrog.com/article/devops/
      • ProGet (.Net specific) – http://inedo.com/proget/overview
      • Use Jira or VSTS to manage your release processes
      • #KillTheCAB
        Given Release Package is Ready for Deployment
        When deployed via an Automated Release Pipeline
        Then “ITIL Standard Change” is True
        And No CAB is Required
  32. Here is where it gets blurry… Environment Provisioning, Configuration Management, Application Release Automation
  33. Deploy – Environment Provisioning
      • In this context “Environment Provisioning” means “the ability to instantiate (create) compute resources (IaaS or PaaS), normally in a Cloud environment, and then trigger further configuration management and provisioning activities”
      • HashiCorp Terraform is our weapon of choice. Lately we’ve been using the HashiCorp products:
      • Vagrant – e.g. with custom Rackspace provider - https://github.com/mitchellh/vagrant-rackspace
      • Terraform* - https://www.terraform.io/
      * Note - currently doesn’t have a VMware provider
  34. Deploy – Server Configuration Mgmt
      • Pick one:
      • Chef
      • Puppet
      • Ansible
      • PowerShell DSC (cross-platform!)
      • You can find about 100 DevOps people who know one, or more, of these 4 tools for every 1 person that knows anything about any of the “Enterprise DevOps Tools”
  35. Deploy – Application Release Automation (ARA)
      • Lots of choices in this area but our preferred patterns are:
      • Linux – Ansible triggers YUM / Apt-Get package managers to deploy the RPM / Deb packages
      • Windows – Octopus Deploy or VSTS Release Manager to deploy NuGet packages
  36. Deploy – Docker & Containerisation
      “Docker is an open platform for developers and sysadmins to build, ship, and run distributed applications… Docker enables apps to be quickly assembled from components and eliminates the friction between development, QA, and production environments. As a result, IT can ship faster and run the same app, unchanged, on laptops, data center VMs, and any cloud” – Docker.com
  37. The massively over-simplified version of Ops… Operations: Orchestration, Monitoring, Alerting
  38. Orchestration
      • Traditional apps: RunDeck, Ansible Tower, Azure Automation, Jenkins
      • Containers and beyond: Kubernetes, DCOS (Mesos-based), Docker Swarm, Cloudify
  39. Monitoring = AppDynamics
  40. Alerting
      • Who gets woken up by what notification method: PagerDuty, OpsGenie, VictorOps
      • Notification channels: Slack, mobile app, SMS, email
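
The "who gets woken up by what notification method" question on the Alerting slide boils down to a routing table. Here is a minimal sketch of one, assuming a made-up three-level severity scheme and a made-up policy; it does not call PagerDuty, OpsGenie or VictorOps, and every name in it (`Severity`, `ROUTES`, `notifications`) is hypothetical.

```python
from enum import Enum


class Severity(Enum):
    CRITICAL = 1
    WARNING = 2
    INFO = 3


# Hypothetical policy: which channels fire for each severity.
ROUTES = {
    Severity.CRITICAL: ["sms", "mobile_app", "slack"],
    Severity.WARNING: ["slack", "email"],
    Severity.INFO: ["email"],
}

# Channels that are allowed to wake a person up.
PAGING_CHANNELS = {"sms", "mobile_app"}


def notifications(severity: Severity, on_call: str) -> list[tuple[str, str]]:
    """Return (channel, recipient) pairs for an alert.

    Paging channels target the on-call person; everything else goes to
    the shared "team" destination.
    """
    pairs = []
    for channel in ROUTES[severity]:
        recipient = on_call if channel in PAGING_CHANNELS else "team"
        pairs.append((channel, recipient))
    return pairs


# Example: only a critical alert reaches the on-call engineer directly.
print(notifications(Severity.CRITICAL, "alice"))
print(notifications(Severity.INFO, "alice"))
```

Keeping the policy in one data structure makes it easy to review who can be paged, and for what, without reading any alerting code.
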
  41. Summary
      • Automation is Good, Bad and Ugly
      • Automation is inevitable
      • Start at your Constraint
      • There are lots of choices
      • YMMV: don’t spend months on evaluations. Pick one, trial it, start learning
  42. Thank You. Questions?
  43. About DevOpsGuys
      • Founded 2013
      • 70 staff
      • 30+ clients
      • Headquartered in Cardiff, Wales
      • AppDynamics Partner
      • Team@DevOpsGuys.com
      • Established as thought leaders in DevOps
      • Quoted by Gartner and Forrester in research
      • Founded winops.org
      • Top ranked DevOps blog
      “DevOpsGuys are luminaries in the UK DevOps space.” Gene Kim, Author – “The Phoenix Project”
  44. DevOps in 7 slides in 7 minutes. Just so we’re all on the same page about this DevOps thing… Start the clock!
  45. DevOps - Defined: People, process and the right tools working together to make your product delivery lifecycle faster and more predictable.
  46. The DevOps “CALMS” model
      • Culture
      • Automation
      • Lean
      • Measurement
      • Sharing
  48. Multi-Disciplinary Delivery Teams
  49. This isn’t an easy Transformation…
      From… | Key Success Factor | To…
      Command & Control | Management Style | Autonomous
      Conservative | Attitude to Change | Experimental
      Silo | Organisation Structure | Collaborative
      Project-focussed | Delivery Focus | Product-centric
      Waterfall | Delivery Model | Iterative (Agile)
      Large (Huge) | Batch size | Smallest possible
      Monolithic | Systems Architecture | Loosely coupled
      Proprietary | Technology | Open (Source)
      Manual | Processes | Automated

Editor's Notes

  • At about 11:30pm a tired and somewhat frustrated engineer is dealing with some replication issues on a Postgres database – partly caused by an external spam attack and partly caused by an automated process deleting an account flagged for abuse… that turned out to be a GitLab engineer's account with a lot of associated projects etc.

    In order to try and resolve the problem in getting the replication to initialise properly on the slave, he clears out the data directory on the slave… only to realise his SSH session is currently on the PRIMARY (master), not the SECONDARY (slave). Despite quickly cancelling the command, only 4.5 GB of about 310 GB of data is left.

    2017/01/31 23:00-ish

    YP thinks that perhaps pg_basebackup is being super pedantic about there being an empty data directory, decides to remove the directory. After a second or two he notices he ran it on db1.cluster.gitlab.com, instead of db2.cluster.gitlab.com

    2017/01/31 23:27 YP - terminates the removal, but it’s too late. Of around 310 GB only about 4.5 GB is left - Slack
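
The wrong-host slip described above is the kind of mistake a small guard can catch before anything is deleted. The following is a minimal sketch, not what GitLab actually implemented: `remove_data_dir` and its parameters are hypothetical, and it assumes the operator can state up front which host they believe they are on.

```python
import shutil
import socket
from pathlib import Path
from typing import Optional


def remove_data_dir(path: Path, expected_host: str,
                    current_host: Optional[str] = None) -> None:
    """Delete `path` only if we are on the host the operator expects.

    `current_host` exists so the check can be exercised without touching
    a real machine; when omitted, the local hostname is used.
    """
    host = current_host if current_host is not None else socket.gethostname()
    if host != expected_host:
        # Refuse loudly instead of deleting data on the wrong machine.
        raise RuntimeError(
            f"refusing to delete {path}: running on {host!r}, "
            f"expected {expected_host!r}"
        )
    shutil.rmtree(path)
```

Called with `expected_host="db2.cluster.gitlab.com"` from a session that is really on `db1.cluster.gitlab.com`, the function raises instead of removing anything. It only helps if the expected host is typed deliberately rather than copied from the current prompt.
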
  • And because we live in a modern age… we live tweet and live stream everything…

  • But it’s OK they had automated backups!

    Every 24 hours a backup is generated using pg_dump, this backup is uploaded to Amazon S3. Old backups are automatically removed after some time.

    Every 24 hours we generate an LVM snapshot of the disk storing the production database data. This snapshot is then loaded into the staging environment, allowing us to more safely test changes without impacting our production environment. Direct access to the staging database is restricted, similar to our production database.

    For various servers (e.g. the NFS servers storing Git data) we use Azure disk snapshots. These snapshots are taken once per 24 hours.

    Replication between PostgreSQL hosts, primarily used for failover purposes and not for disaster recovery.

  • pg_dump backups were borked due to a mismatch between Postgres versions…

    Azure snapshots not enabled for DB Servers

    LVM snapshot was 6hrs (the one taken before the maintenance) or 24hrs old (the one from last night)
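
Backups that fail silently are the thread running through these notes. As a rough illustration of automating part of the validation, the sketch below checks that the newest backup is both recent and not suspiciously small. It is a much weaker stand-in for an actual restore test, and the `Backup` class, the thresholds and the function name are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    name: str
    size_bytes: int
    created_at: float  # Unix timestamp


def check_latest_backup(backups: list[Backup], now: float,
                        max_age_hours: float = 25.0,
                        min_size_bytes: int = 1_000_000) -> list[str]:
    """Return problems found with the newest backup; empty means it looks sane."""
    if not backups:
        return ["no backups found"]
    latest = max(backups, key=lambda b: b.created_at)
    problems = []
    age_hours = (now - latest.created_at) / 3600
    if age_hours > max_age_hours:
        problems.append(f"{latest.name} is {age_hours:.1f}h old")
    if latest.size_bytes < min_size_bytes:
        problems.append(f"{latest.name} is only {latest.size_bytes} bytes")
    return problems


# Hypothetical listing: a fresh but nearly empty dump should be flagged.
now = 1_700_000_000.0
listing = [
    Backup("dump-old.sql", 250_000_000, now - 48 * 3600),
    Backup("dump-new.sql", 12, now - 2 * 3600),
]
print(check_latest_backup(listing, now))
```

Whatever runs a check like this should raise the alarm through a channel that is itself monitored, since here the failure email never arrived.
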
  • The Julie Andrews-approved method.

  • This statement from DevOps report is pretty relevant here.
    IT shops who utilize best practices around continuous delivery, deploy code more frequently and with more confidence. And that enables them to be more agile in their software delivery process and makes the company twice as likely to exceed their profitability
  • We were co-founded by 2 experienced technologists, with a track record of delivering results at enterprise scale.