2. Who am I?
• Puppet user since 0.22.x
• Architect of MCollective
• Author of Extlookup and Hiera
• Developer at Puppet Labs London
• Blog at http://devco.net
• Tweets at @ripienaar
• Volcane on IRC
R.I.Pienaar | rip@devco.net | http://devco.net | @ripienaar
3. The Problem?
• Puppet needs management just like other software
• Enabling, disabling, ad-hoc runs, custom environments etc.
• The Puppet Master is a finite resource that needs protection
• Orchestrated deploys
4. MCollective Puppet Agent
package { ["mcollective-puppet-agent", "mcollective-puppet-client"]:
  ensure => present,
}
Available on yum.puppetlabs.com and apt.puppetlabs.com
http://srt.ly/mcpuppet
5. Obtaining The Agent Status
6. Obtaining Statuses
$ mco puppet status
* [ ============================================================> ] 11 / 11
node8.example.net: Currently stopped; last completed run 14 minutes 16 seconds ago
....
Summary of Applying:
false = 11
Summary of Daemon Running:
stopped = 11
Summary of Enabled:
enabled = 10
disabled = 1
Summary of Idling:
false = 11
Finished processing 11 / 11 hosts in 72.05 ms
Per node status, followed by an estate-wide summary
7. Obtaining Statuses
$ mco puppet count
Total Puppet nodes: 11
Nodes currently enabled: 10
Nodes currently disabled: 1
Nodes currently doing puppet runs: 5
Nodes currently stopped: 6
Nodes with daemons started: 10
Nodes without daemons started: 1
Daemons started but idling: 6
8. Obtaining Statuses
$ mco rpc puppet last_run_summary
* [ ============================================================> ] 28 / 28
.
.
.
Summary of Config Retrieval Time:
Average: 20.13
Summary of Total Resources:
Average: 435
Summary of Total Time:
Average: 39.33
Finished processing 28 / 28 hosts in 311.23 ms
10. Doing Basic Runs
$ mco puppet runonce
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted
Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2593.85 ms
$ mco puppet count
Total Puppet nodes: 11
Nodes currently enabled: 10
Nodes currently disabled: 1
Nodes currently doing puppet runs: 2
Nodes currently stopped: 9
Nodes with daemons started: 10
Nodes without daemons started: 1
Daemons started but idling: 8
Run with default configured splay and splaylimit.
The reason shown for node9 is the Puppet 3 disable message.
11. Doing Basic Runs
$ mco puppet runonce -f
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted
Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2661.99 ms
Run with no splay, still subject to enable/disable
12. Doing Basic Runs
$ mco puppet runonce --splay --splaylimit 120
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted
Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2661.99 ms
Force splay and set a custom splay limit
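The effect of splay can be pictured as each node waiting a random time before it contacts the master. This is a minimal sketch that assumes a uniform random delay between 0 and the splay limit; `splay_delay` and `plan_runs` are hypothetical names for illustration, not part of the agent.

```python
import random


def splay_delay(splaylimit, rng=random):
    """Pick a start delay in seconds between 0 and splaylimit (inclusive)."""
    return rng.randint(0, splaylimit)


def plan_runs(nodes, splaylimit, seed=None):
    """Return {node: delay}. A splaylimit of 0 means every node starts at once."""
    rng = random.Random(seed)
    return {node: splay_delay(splaylimit, rng) for node in nodes}


nodes = [f"node{i}.example.net" for i in range(11)]
plan = plan_runs(nodes, splaylimit=120, seed=1)
for node, delay in sorted(plan.items(), key=lambda item: item[1]):
    print(f"{node} starts after {delay}s")
```

With a limit of 120 seconds the 11 runs are spread over up to two minutes instead of all hitting the master in the same second.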
13. Tags and Environment
$ mco puppet runonce --tag webserver --tag syslog --environment development
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted
Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2661.99 ms
Selects 2 tags in a specific Puppet Environment
14. Doing noop Runs
$ mco puppet runonce --noop
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted
Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2661.99 ms
Does a noop run; gathers reports and audit information
15. Doing no-noop Runs
$ mco puppet runonce --tag webserver --no-noop
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted
Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2661.99 ms
When puppet.conf has noop=true, does an actual run on demand
16. Choosing a Master
$ mco puppet runonce --server secops.example.net:8134 --tag compliance
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted
Puppet is disabled: 'machine under maintenance'
Finished processing 11 / 11 hosts in 2661.99 ms
Does a single run against a different Puppet Master
18. The Big Red Button
$ mco puppet disable "we f'd up, stop the train!"
* [ ============================================================> ] 11 / 11
node9.example.net Request Aborted
Could not disable Puppet: Already disabled
Summary of Enabled:
disabled = 11
Finished processing 11 / 11 hosts in 90.06 ms
Disables Puppet; does not change the reason on nodes that are already disabled
19. The Big Green Button
$ mco puppet enable -S 'puppet().disable_message=/stop the train/'
* [ ============================================================> ] 10 / 10
Summary of Enabled:
enabled = 10
Finished processing 10 / 10 hosts in 90.06 ms
Enables all Puppet nodes whose disable message matches /stop the train/
20. Operating On Groups Of Hosts
21. Selective Runs
$ mco puppet runonce -W "cluster=a roles::webserver"
* [ ============================================================> ] 5 / 5
Finished processing 5 / 5 hosts in 90.06 ms
Run using a filter: all web servers (Puppet class roles::webserver) with Facter fact cluster=a
22. Selective Runs
$ mco puppet runonce -S "resource('File[/srv/www]').managed=true"
* [ ============================================================> ] 5 / 5
Finished processing 5 / 5 hosts in 90.06 ms
Run using a filter: nodes where we manage /srv/www (any Puppet resource can be used)
23. Selective Runs
$ mco puppet runonce -S "resource().failed_resources>5 and resource().config_version=xyz"
* [ ============================================================> ] 5 / 5
Finished processing 5 / 5 hosts in 90.06 ms
Run using a filter: nodes whose most recent run had config_version xyz and more than 5 resource failures
24. Roll Out A Change Quickly
$ mco puppet runall 7
2013-01-19 20:58:59: Running all nodes with a concurrency of 7
2013-01-19 20:58:59: Discovering enabled Puppet nodes to manage
2013-01-19 20:59:02: Found 11 enabled nodes
2013-01-19 20:59:06: node3.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:07: node1.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:09: node4.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:10: node6.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:12: node0.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:13: node5.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:17: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:21: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:25: node9.example.net schedule status: Puppet is currently applying a catalog, cannot run now
2013-01-19 20:59:29: node8.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:33: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:38: node2.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:41: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:46: middleware.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:50: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:55: node7.example.net schedule status: Started a background Puppet run
Runs all nodes with a maximum concurrency of 7
25. Roll Out A Change Quickly
2013-01-19 20:58:59: Running all nodes with a concurrency of 7
2013-01-19 20:58:59: Discovering enabled Puppet nodes to manage
2013-01-19 20:59:02: Found 11 enabled nodes
Does not attempt to manage disabled nodes
26. Roll Out A Change Quickly
2013-01-19 20:59:02: Found 11 enabled nodes
2013-01-19 20:59:06: node3.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:07: node1.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:09: node4.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:10: node6.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:12: node0.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:13: node5.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:17: Currently 7 nodes applying the catalog; waiting for less than 7
Starts the first 6 quickly, but counts the 1 other run an administrator is doing at the same time toward the limit of 7
27. Roll Out A Change Quickly
2013-01-19 20:59:17: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:21: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:25: node9.example.net schedule status: Puppet is currently applying a catalog, cannot run now
2013-01-19 20:59:29: node8.example.net schedule status: Started a background Puppet run
node9 was already being run by an administrator or the normal schedule, so it is skipped and the next node is tried
28. Roll Out A Change Quickly
2013-01-19 20:59:29: node8.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:33: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:38: node2.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:41: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:46: middleware.example.net schedule status: Started a background Puppet run
2013-01-19 20:59:50: Currently 7 nodes applying the catalog; waiting for less than 7
2013-01-19 20:59:55: node7.example.net schedule status: Started a background Puppet run
Regularly checks the concurrency and starts more nodes as soon as possible.
Average node run time 34.39s, total time 55 seconds
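The behaviour walked through above (hold back at the limit, count runs started by others, skip busy nodes) can be sketched as a small loop. This is an illustration under assumptions, not the plugin's code: `run_all` and the toy `FakeEstate` are hypothetical, and time is modelled as simple ticks.

```python
def run_all(nodes, concurrency, count_applying, start_run, wait):
    """Start a run on every node without exceeding `concurrency` active runs.

    count_applying() -> number of nodes applying a catalog right now
    start_run(node)  -> True if a run was started, False if the node was busy
    wait()           -> pause briefly before checking the estate again
    """
    started, skipped = [], []
    for node in nodes:
        # Runs started by administrators or the normal schedule count too.
        while count_applying() >= concurrency:
            wait()
        if start_run(node):
            started.append(node)
        else:
            skipped.append(node)  # busy node: move on to the next one
    return started, skipped


class FakeEstate:
    """Toy in-memory estate (hypothetical): every run lasts `run_time` ticks."""

    def __init__(self, run_time, already_running=()):
        self.run_time = run_time
        self.clock = 0
        self.finish_at = {node: run_time for node in already_running}
        self.peak = 0  # highest number of simultaneous runs observed

    def count_applying(self):
        active = sum(1 for t in self.finish_at.values() if t > self.clock)
        self.peak = max(self.peak, active)
        return active

    def start_run(self, node):
        if self.finish_at.get(node, 0) > self.clock:
            return False  # already applying a catalog, cannot run now
        self.finish_at[node] = self.clock + self.run_time
        self.count_applying()  # record the new peak
        return True

    def wait(self):
        self.clock += 1


estate = FakeEstate(run_time=5, already_running=["node9.example.net"])
nodes = (["node9.example.net"]
         + [f"node{i}.example.net" for i in range(9)]
         + ["middleware.example.net"])
started, skipped = run_all(nodes, 7, estate.count_applying,
                           estate.start_run, estate.wait)
print(len(started), skipped, estate.peak)
```

In this toy run node9 is skipped because someone else is already running it, six nodes start immediately, and the remaining four wait until the estate drops below 7 active runs.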
29. Roll Out A Change Slowly
$ mco puppet runonce --batch 5 --batch-sleep 300
* [ ============================================================> ] 11 / 11
Finished processing 11 / 11 hosts in 903686.29 ms
Does runonce in batches of 5, with a 5 minute (300 second) sleep per batch. ^C after any batch to stop.
15 minute total run time.
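The 15 minute figure follows from the batch arithmetic. A minimal sketch, assuming a sleep follows every batch including the last (which is what the roughly 903 second total suggests); `batched_runtime` is a hypothetical helper.

```python
import math


def batched_runtime(nodes, batch_size, batch_sleep):
    """Return (number of batches, total seconds spent sleeping)."""
    batches = math.ceil(nodes / batch_size)
    return batches, batches * batch_sleep


batches, seconds = batched_runtime(nodes=11, batch_size=5, batch_sleep=300)
print(batches, seconds / 60)  # 3 batches, 15.0 minutes
```

Eleven nodes in batches of 5 give batches of 5, 5 and 1, so three 300 second sleeps account for almost all of the elapsed time.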
34. Performance Analysis
$ mco find -S "resource().config_retrieval_time > 30"
dev3.example.net
dev4.example.net
dev7.example.net
dev6.example.net
dev8.example.net
dev9.example.net
dev10.example.net
Finds machines with config_retrieval_time over 30 seconds: all the dev servers.
35. Maintenance Windows and Access Control
36. Puppet State As ACL
policy default deny
allow cert=manager enable disable * *
allow cert=sysadmin runonce status * *
allow cert=developer * environment=development *
Only cert=manager can enable and disable the Puppet Agent, which marks maintenance periods
37. Puppet State As ACL
policy default deny
allow cert=manager stop start * *
allow cert=noc stop start puppet().enabled=false
allow cert=developer * environment=development *
NOC can start and stop services only during a maintenance window.
The manager user can always override maintenance windows.
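How such a policy decides a request can be sketched as follows. This is a simplified model under assumptions: rules are checked top-down, the first match wins, and the node condition is a plain Python predicate. The `is_allowed` function and the node dictionary keys are hypothetical, not the authorization plugin's real data model.

```python
def is_allowed(policy, caller, action, node, default="deny"):
    """Return True if the first matching rule allows the request."""
    for verdict, rule_caller, actions, condition in policy:
        if rule_caller not in ("*", caller):
            continue
        if actions != "*" and action not in actions.split():
            continue
        if condition(node):
            return verdict == "allow"
    return default == "allow"  # nothing matched: fall back to the default


def always(node):
    return True


# Mirrors the three rules on the slide; node keys are assumptions.
policy = [
    ("allow", "cert=manager", "stop start", always),
    ("allow", "cert=noc", "stop start",
     lambda node: not node["puppet_enabled"]),
    ("allow", "cert=developer", "*",
     lambda node: node["environment"] == "development"),
]

in_window = {"puppet_enabled": False, "environment": "production"}
normal = {"puppet_enabled": True, "environment": "production"}

print(is_allowed(policy, "cert=noc", "stop", in_window))   # True
print(is_allowed(policy, "cert=noc", "stop", normal))      # False
print(is_allowed(policy, "cert=manager", "stop", normal))  # True
```

Disabling Puppet on a node is what opens the maintenance window for the NOC, so the Puppet state itself acts as the access control switch.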
38. What is MCollective?
• Ruby framework for writing Orchestration systems
• Provides Authentication, Authorization and Auditing
• No direct communication between client and nodes