2012 a deployment pipeline for infrastructure a dev ops case study at nbn _ puppet labs

https://puppetlabs.com/blog/adeploymentpipelineforinfrastructure
A Deployment Pipeline for
Infrastructure: A DevOps Case Study
at NBN
December 21, 2012 by Jez Humble, DevOps
Written by Andrew Cunningham (acunning@thoughtworks.com), ThoughtWorks; Andrew Myers
(AndrewMyers@nbnco.com.au), NBN Co; and edited by
Jez Humble (https://twitter.com/jezhumble).
The Problem
The National Broadband Network (NBN Co) is an Australian government-owned
enterprise formed in 2009 to build a broadband network which will provide a fiber-
optic connection to 93 percent of Australian homes and businesses, with fixed
wireless and satellite services to the remainder. After its creation, the NBN quickly
needed to establish a public website in order to disseminate information, and to
create a set of business services to allow partners to begin the process of
requesting access to the new network. The internal development teams used a
combination of bespoke application development, service development, and
configuration of commercial off-the-shelf software to provide these services.
Because the NBN was in such a rapid startup mode, the infrastructure necessary to
support the development of these websites and services was being created as the
development proceeded. In development and test environments, projects could
request new nodes and then manage them as they saw fit. In production-like
environments, there were a limited number of available nodes that needed to be
shared between multiple projects. However, each project focused only on its
own infrastructure requirements, and issues started to appear. For example:
- Two teams sharing infrastructure would make incompatible changes and only
  one team’s application would work.
- Nodes were being set up by hand, and changes that were made in
  development environments to resolve defects were not being applied to later
  environments, leading to regressions.
- No one had a good idea of what was actually on each machine or who owned
  it, leading to anxiety about deployments, configuration changes and the
  ability to audit IT systems effectively.
- Because each node was being created by hand by different individuals, the
  loss of a node required a considerable amount of work to recover.
A group of developers got together to work on resolving these problems. They set
out to accomplish a few goals:
1. Stand up production and production-like test systems quickly from scratch.
2. Capture all the configuration information in version control so that machines
could be provisioned quickly without manual intervention.
3. Have a single source of truth to determine what a node should look like so
changes and their impacts could be analyzed prior to making them.
4. Integrate infrastructure as early as possible. If four teams were going to share
a node in production, make them share nodes in test environments as well.
5. Make infrastructure changes using the same process as application changes
—in particular, applying them in test environments so that their impact can
be understood before they are applied to production.

Initial Implementation
The first implementation that was built started to address some of these goals, but
when we came to use it we discovered several problems that needed to be fixed.
Figure 1 below presents this initial implementation and notes some of these
issues.
Figure 1: Initial implementation of infrastructure configuration management system
The initial setup was as follows:
- Puppet was installed on all new servers being provisioned and set up to run in
  the out-of-the-box configuration: it would poll for configuration changes at
  30-minute intervals.
- Puppet usage across all development teams was inconsistent. Some teams
  had their server setup fully automated, some were partially automated, and
  some were hand-created masterpieces.
- The Puppetmaster daemons were set up manually.
- Each environment had its own repository, and migration of infrastructure
  code between environments was done manually.
This implementation led to the following issues:
- The process to make infrastructure changes ended up being “Check in
  changes, log on to target server, manually trigger Puppet”.
- Since changes were migrated manually between environments, not all
  changes would get migrated. This led to inconsistencies between
  environments and configuration drift.
- Environment-specific code bases led to massive duplication in manifests and
  modules.
- Lack of automated processes for migrating code and inconsistent
  implementation of automation across all servers led to frequent breakages
  and reduced trust in the infrastructure deployment.
Fixing the issues: Implementing a Deployment
Pipeline for Infrastructure
As a result of these problems, a dedicated DevOps team was formed. The purpose
of this team was to help development teams automate their server setup and
configuration, and to create a better and more reliable process for migrating
infrastructure code up to production environments.
By this stage, most teams had implemented a deployment pipeline to move their
code from development to production, using a number of different technologies
(Capistrano, Maven, etc.). The DevOps team decided to implement the same pattern
for the infrastructure code, and chose to standardise on MCollective to roll out
changes to nodes, Puppet for performing the migrations, and Go (the pipeline
management tool from ThoughtWorks Studios) to manage the
deployment pipeline. The aim was to be able to check a change into version control
once, and then be able to promote it through test environments and finally into
production at the click of a button.
The implementation of this pipeline consisted of the following steps, as outlined in
Figure 2 below:
Figure 2: Infrastructure deployment pipeline setup
1. “Unit testing” of infrastructure code
   - RSpec tests against custom Puppet/MCollective code.
   - Local compilation of all node manifests using the Puppet API.
   - Execution of a system-wide Puppet ‘dry-run’ to catch errors and create a
     report detailing expected changes when applying that revision of the
     infrastructure code.
2. Promoting successful changes up to the next environments
   - Instead of packaging up and promoting the entire infrastructure
     codebase, we simply promoted version control revision numbers
     through the deployment pipeline.
   - When a new revision was migrated to the next environment, we used
     MCollective to tell the Puppetmaster to update its repository to the
     specified revision, which was taken from the previous successful run of
     the upstream environment.
3. Update environment using the specified configuration
   - Where before we had Puppet running in daemon mode on a 30-minute
     loop, we now wanted to have Puppet trigger only when the upstream
     pipelines had succeeded, or when manually requested by a user.
   - We implemented an MCollective agent to allow a command to be sent
     to all the machines in an environment to cause them to execute a
     Puppet update.
4. Post-update smoke tests
   - To ensure that the update of the infrastructure code was working as
     expected, we wrote a limited number of smoke tests to quickly ascertain
     whether all the expected applications on the nodes were running.
5. Publish packaged configuration information and update CMDB
   - Provided that the update and smoke tests succeeded on all nodes in the
     environment, the revision in version control would be made available to
     update the next environment.
   - In addition, each node would also write out a report to a CMDB detailing
     its current Puppet manifest.
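
The steps above can be sketched as a small model in which only a version control revision identifier moves between environments. This is an illustrative sketch, not the NBN Co implementation: the `Environment` class, the environment names, and the `update` and `smoke_test` callables are hypothetical stand-ins for the MCollective-triggered Puppet run and the smoke tests described above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Environment:
    """One pipeline stage. All names here are hypothetical."""
    name: str
    # Stand-in for "update the Puppetmaster to this revision and trigger
    # a Puppet run on every node"; returns True on success.
    update: Callable[[str], bool]
    # Stand-in for the post-update smoke tests; returns True on success.
    smoke_test: Callable[[], bool]
    last_good_revision: Optional[str] = None


def promote(revision: str, environments: List[Environment]) -> List[str]:
    """Push one revision through the environments in order.

    Stops at the first environment whose update or smoke tests fail and
    returns the names of the environments that accepted the revision.
    """
    accepted = []
    for env in environments:
        if not (env.update(revision) and env.smoke_test()):
            break
        # Only a revision that passed here is offered to the next stage.
        env.last_good_revision = revision
        accepted.append(env.name)
    return accepted


def always_ok(*_args) -> bool:
    return True


def always_fail(*_args) -> bool:
    return False


test_env = Environment("test", update=always_ok, smoke_test=always_ok)
staging = Environment("staging", update=always_ok, smoke_test=always_fail)
production = Environment("production", update=always_ok, smoke_test=always_ok)

reached = promote("r1234", [test_env, staging, production])
```

In this sketch the staging smoke tests fail, so the revision never reaches production. The sketch promotes automatically at every stage; the article describes promotion into controlled environments as an on-demand action instead.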

This implementation had several important benefits:
- RSpec tests and local compilation could both be run on a developer’s machine
  prior to committing code. This enabled much faster feedback for developers
  so they could quickly catch issues such as syntax errors, cyclical dependencies
  and missing files.
- Code only had to be checked in once, and then could be migrated all the way
  to production with a click of a button. If we required a major production fix,
  we could push it through all stages of the pipeline from development to
  production in about 20 minutes.
- The output from the dry-run in the first phase of the pipeline provided a
  detailed report as to what would be changed on each node. This report was
  used by both developers and the Software Change Control Board to
  understand the impact of the changes that were going to go into production.
- The use of MCollective to trigger Puppet updates meant that we had
  complete control over when a particular revision was pushed into a new
  environment.
- Repeatability and consistency of infrastructure code deployments were much
  improved. Developers had confidence that what was getting pushed to
  production was the same baseline that had been tested extensively in pre-
  production environments. If necessary, we could re-deploy the same revision
  to an environment and have confidence that the results would be the same:
  infrastructure deployment was idempotent.
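
Two of these benefits, the dry-run report and idempotent deployment, can be illustrated with a toy model. This is a minimal sketch under stated assumptions: node state is reduced to a dictionary of resource names and values, and `dry_run` and `apply` are hypothetical functions, not Puppet APIs.

```python
from typing import Dict, List, Tuple

State = Dict[str, str]  # resource name -> value (hypothetical model)


def dry_run(current: State, desired: State) -> List[Tuple[str, str, str]]:
    """Report (resource, old, new) for each change, without applying any."""
    changes = []
    for name in sorted(desired):
        old = current.get(name, "<absent>")
        if old != desired[name]:
            changes.append((name, old, desired[name]))
    return changes


def apply(current: State, desired: State) -> State:
    """Return the node state after enforcing every declared resource."""
    updated = dict(current)
    updated.update(desired)
    return updated


node = {"ntp": "1.0", "motd": "hello"}
revision = {"ntp": "1.1", "nginx": "installed"}

report = dry_run(node, revision)   # what would change; node is untouched
once = apply(node, revision)
twice = apply(once, revision)      # re-deploying the same revision
```

Applying the same revision a second time leaves the state unchanged, and a dry-run against an already updated node reports no changes, which is what makes re-deployment safe.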
There were also some limitations to this model:
- Because Puppet is additive only, there was no ability to revert changes
  as you would in a typical application deployment. If a change needs to
  be backed out, you must explicitly add configuration to reverse it, check
  this configuration in, and promote it to production using the pipeline.
  This meant that if a breaking change did get deployed into production,
  typically a manual fix was applied, with the proper fix checked into
  version control subsequently.
- Certain types of configuration are node-specific and could only be really
  tested when applied to the environment in which the node resided. We
  mitigated this risk as much as possible by using Vagrant to stand up
  virtual clusters on development machines, but these were never exact
  simulacra of the production environment, so bugs would very
  occasionally get through.
- Because the dry run report was the first step in promoting a change, it
  could be hard to dig out this report to determine the impact of a given
  revision to an environment when the promotion happened potentially
  days later. One possible solution would have been to implement the dry
  run report as the last stage in the previous environment’s build steps.
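
The first limitation is easier to see in a small model. The sketch below is an assumption-laden simplification: node state is a set of package names, and `apply_manifest` is a hypothetical function that enforces only what a manifest declares. The `present` and `absent` values mirror the values of Puppet's `ensure` attribute, but nothing else here is Puppet code.

```python
from typing import Dict, Set


def apply_manifest(installed: Set[str], manifest: Dict[str, str]) -> Set[str]:
    """Enforce only what the manifest declares; ignore everything else."""
    result = set(installed)
    for package, ensure in manifest.items():
        if ensure == "present":
            result.add(package)
        elif ensure == "absent":
            result.discard(package)
        else:
            raise ValueError(f"unknown ensure value: {ensure!r}")
    return result


node = apply_manifest(set(), {"nginx": "present"})

# "Reverting" by deleting the declaration leaves the package in place,
# because the node is no longer told anything about it.
after_delete = apply_manifest(node, {})

# Backing the change out requires an explicit declaration.
after_absent = apply_manifest(node, {"nginx": "absent"})
```

Checking out an earlier revision therefore does not undo a change by itself; the reversal has to be written, checked in, and promoted like any other change.
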
Conclusion
Storing configuration information in version control and using a tool like
Puppet to apply infrastructure changes to your environments is a good first
step to getting your infrastructure under control. Unfortunately, it is only the
first step. In the same way that a business would never check in a change to a
codebase and push it directly to production, operations teams need to be
aware that simply adding Puppet as an entry point for changes is not enough
to ensure success. A poorly-designed infrastructure change applied directly to
production via Puppet is just as likely to cause an outage as a manual change
made directly on the node.
The deployment pipeline model for software provides a path to production
for every code change checked in by a developer. It implements a series of
tests and verifications which the change must pass through to ensure that it
is ready to go live. The benefit of this setup is that, provided the code change
passes all the verifications, the team and the business both have confidence
that changes can safely be made to production at any time.
Applying this same pattern to infrastructure changes allowed us to realize the
same benefits. Since we had a single path to production, the only way a
change could be applied to production was to have it applied and tested in
each and every prior environment. In order to verify changes, we ensured
that the infrastructure required for a project to run in production was also
available in testing environments. Project-specific infrastructure changes
could therefore be tested and promoted up to production alongside the
application changes that required them.
The pipeline management tool we used — Go — provided both automated
and manual configuration for promoting changes between environments. For
testing environments, changes were automatically applied after check-in once
the unit tests had passed. For controlled environments, changes that had
made it through test environments could be promoted on demand. Finally,
because our pipeline also produced a report of what it had done, we were
able to detect uncontrolled changes and find out which well-meaning
developer had been monkeying with an environment manually. This helped
us to catch most uncontrolled changes and get them into version control to
ensure they would be applied and promoted using our standard, controlled
process.
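
The article does not say how uncontrolled changes were detected. One plausible approach, sketched here purely as an assumption, is to compare the state the pipeline last applied with the state a node actually reports. The `find_drift` function and the sample settings are hypothetical.

```python
from typing import Dict, List


def find_drift(expected: Dict[str, str], observed: Dict[str, str]) -> List[str]:
    """List differences between pipeline-applied and observed settings."""
    drift = []
    for name in sorted(set(expected) | set(observed)):
        want = expected.get(name)
        have = observed.get(name)
        if want is None:
            drift.append(f"{name}: unmanaged value {have!r}")
        elif have is None:
            drift.append(f"{name}: missing, expected {want!r}")
        elif want != have:
            drift.append(f"{name}: expected {want!r}, found {have!r}")
    return drift


# What the last pipeline run reported applying (hypothetical values).
expected = {"max_connections": "200", "log_level": "info"}
# What the node reports now, after someone edited it by hand.
observed = {"max_connections": "500", "log_level": "info", "debug_port": "9999"}

drift = find_drift(expected, observed)
```

Each entry in `drift` is a candidate for being moved into version control so that it goes through the standard process.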
Ultimately our infrastructure deployment pipeline gave us the ability to push
changes to production on demand with a very high confidence that the
change would work as expected. Reports were produced detailing what
changes would take place on which nodes, and then the changes were
verified using smoke tests. The teams became so confident that it was not
unexpected to see changes being applied to production in the middle of the
day. The combination of MCollective, Puppet, Go and the deployment pipeline
pattern has enabled NBN Co to treat their infrastructure in the same way they
treat any project: one that can deliver specific requirements on time in a
reliable, automated, and repeatable fashion.
Learn More:
- Take the 2012 DevOps survey (http://info.puppetlabs.com/devops-survey-2012.html)
- Read other posts in the DevOps December series (/category/blog/devops-december/)
- Go from ThoughtWorks Studios (http://www.thoughtworks-studios.com/go-agile-release-management)
- Deployment pipelines (http://www.informit.com/articles/article.aspx?p=1621865)
- Continuous Delivery (http://continuousdelivery.com/)
- NBN Co (http://www.nbnco.com.au/)
About the authors:
Andrew Cunningham has just joined Telstra as a Senior Middleware Engineer
after spending 7 years with ThoughtWorks in Canada, USA and Australia. He has
worked in a mixture of QA, development and operation roles for a wide variety of
industries and projects. He has been focusing on DevOps work for the last three
years since discovering that he could play as a systems administrator while still
getting to write code.
Twitter: @oldNoakes (https://twitter.com/oldNoakes)
Personal blog: http://workblog.intothenevernever.com/
Github: http://github.com/oldnoakes

Andrew Myers joined NBN Co in 2011 to play a part in building a high speed
broadband network for all Australians. There he helped grow what was initially a
small DevOps experiment into its current state, where the automation capabilities
he helped design and implement are depended on by multiple teams every day.
Previously he's worked as a software developer, experiencing the "old ways" first
hand, as well as working on the other side of the fence providing technical support
for development and collaboration tools. Andrew's passionate about DevOps
because it allows him to combine software development and system
administration skills and be involved across the whole software development
lifecycle.