Presentation by Damon Edwards, co-founder of Rundeck, at All Day DevOps on October 24, 2017.
4. Operations is getting squeezed
“The Operations Squeeze”
DevOps: Go faster! Be flexible! (Improved Quality, Shorter Time-to-Market, Fast Feedback From Users)
Lock it down! (Availability, Auditing, Security, Compliance)
Squeezed in the middle: Ops
More errors
More delays
Less capacity
Less flexibility
18. Let’s look at the principles behind the improvement …
19. Two prevailing models of operations support
“You build it. They run it.”: Development Team, Operations Team, Running Service
“You build it. You run it.”: Integrated Delivery Team (Dev, Ops), Running Service
“two-pizza team”
22. “You build it. They run it.” (aka… the way it always was)
It’s 2am ….
It’s 2pm ….
It’s the NOC…
Talk them through: health checks, reviewing log files, and the process of diagnosing and recovering the system.
Same as you did for dev teams 2 months ago, QA teams last month, Ops during deploy last week, etc.
It’s Ops…
“Will your applications be affected if we take down EU-West?”
“Is it ok if we change these firewall rules?”
“We are getting customer complaints about performance. Are you sure you didn’t change something?”
25. “You build it. They run it.” (aka… the way it always was)
Development Team, Operations Team, Running Service
27. “You build it. You run it.”
Dev Ops
Integrated Delivery Team
Running Service (six of them)
?
Incident!! Incident!! What would happen if… New feature!! New feature!! New API!!
Running Service: Add this to your responsibilities!
“two-pizza teams”? Just change how the business is structured, funded, and operated.
36. Ideally we can find a way to…
Have the labor scaling benefits of “you build it, they run it” without the frequent escalations and the bad handoffs
Have the responsiveness/control of “you build it, you run it” without the scaling limitations
42. Ticket-Driven Request Queues Are Often a Sign of Silos
Team A (Dev), Ticket System (??), Team B (Ops)
Silo Builder, Snowflake Maker
43. Silos + Rapid Tool Evolution = Islands of Automation
Puppet, Chef, Shell Scripts, Data ETL, PowerShell Scripts, Network Management, Monitoring, Ansible, Legacy Datacenter Automation, Container Management, SQL, Tools, New Tools
44. Working in a complex system
Complex System: Service A, Service B, Service B v2, Service C, Service D, Service E, Network, Firewall, API, Data, ESB
49. Empower those closest to the issue
or escalate: 1° → 2° → 3° → 4°
Push the ability to take action toward 1°
50. Improve flow by implementing Operations as a Service
Team A (Dev): Define Procedures, Execute On Demand
Operations as a Service
Team B (Ops): Vet Procedures, Define Policies, Execute On Demand
Ticket System: Actual Exceptions
52. Automated procedures comprise three parts
Define: Definition of the automated procedure
Execute: Execution of the automated procedure
Govern: Governance of the automated procedure
(security, oversight, compliance, etc.)
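A minimal sketch of how the three parts might fit together, in Python. Nothing here is Rundeck's API: `Procedure`, `POLICY`, and `execute` are hypothetical names chosen for illustration, and the role names are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass(frozen=True)
class Procedure:
    """Define: a named operational task (hypothetical type)."""
    name: str
    action: Callable[[], str]


# Govern: which roles may execute which procedures (hypothetical policy data).
POLICY: Dict[str, Set[str]] = {"noc": {"restart"}, "dev": set()}


def execute(procedure: Procedure, role: str) -> str:
    """Execute: run on demand, but only if the policy allows this role."""
    if procedure.name not in POLICY.get(role, set()):
        raise PermissionError(f"{role} may not run {procedure.name}")
    return procedure.action()


# The definition lives apart from the policy and from the act of running it.
restart = Procedure("restart", lambda: "service restarted")
result = execute(restart, "noc")
```

Keeping the three pieces separate is what lets each one be owned by a different team.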
60. Operations as a Service
Define (D), Execute (E), Govern (G)
Team A (Dev): Define Procedures, Execute On Demand
Team B (Ops): Vet Procedures, Define Policies, Execute On Demand
Split definition, execution, and governance, and move each to where it is the most effective use of labor
62. Again: How do we respond quicker, yet stay under control?
Empower those closest to the issue
Improve flow by implementing
Operations as a Service
63. Rundeck: Open Source Platform For Operations as a Service
Scripts, APIs, Tools, Cloud, VMs, Containers
Orchestration & Scheduling of Workflows
Collect and Process Output
Infrastructure details and state from multiple sources: Configuration Management, CMDB, Monitoring, Metrics, Cloud
Authentication and roles: Corp Directory
ITSM: Tickets, work status, approvals
Create workflows ● Define ACL policies ● Execute workflows
Web GUI, API, CLI
65. Step 1: Establish a Secure Ops Hub
Operations as a Service
Engineers get visibility and controlled self-service
Ops Support use for remediation procedures
Security and Ops manage access, configuration, and compliance
Ops Procedures: “Status”, “Firewall Change”, “Restart” (allow / deny)
Secrets, Identity, Audit Logs
Inventory and Health: Infrastructure view, Service health, System metrics
Execute + Monitoring Tools
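One way to picture the allow/deny decisions on the hub is a small rule table. The sketch below is an illustration only: the `Rule` type, the role names, and the choice that an explicit deny overrides an allow (with deny as the default) are assumptions, not a description of any product's policy engine.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    role: str        # hypothetical role name
    procedure: str   # procedure name as shown on the slide
    effect: str      # "allow" or "deny"


def is_allowed(rules: List[Rule], role: str, procedure: str) -> bool:
    """Default deny; an explicit deny beats any allow (assumed semantics)."""
    effects = [r.effect for r in rules
               if r.role == role and r.procedure == procedure]
    return "allow" in effects and "deny" not in effects


RULES = [
    Rule("engineer", "Status", "allow"),
    Rule("engineer", "Restart", "allow"),
    Rule("engineer", "Firewall Change", "deny"),
    Rule("ops-support", "Firewall Change", "allow"),
]

engineer_can_restart = is_allowed(RULES, "engineer", "Restart")
engineer_can_change_firewall = is_allowed(RULES, "engineer", "Firewall Change")
```

Anything without a matching rule is refused, so new procedures stay locked down until someone writes a policy for them.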
66. Step 2: Establish a SDLC for Ops Procedures
Source Code Repo
Change:
if [[ "$state" == "wait" ]]; then
  kill -9 "$PID"  # force-kill: the kind of step code review should flag as RISKY
fi
Product Engineers produce automated procedures and health checks.
Code review: RISKY, FIX
Automated Procedures and Health Checks
67. Step 3: Connect with Enterprise Management Systems
Service Desk: Customers
Ops Support get visibility and audit trail
Service Ticket updated by support tools
Execute
Software Supply Chain: Ops integrate with artifact flow
68. Step 4: Make Compliance Really Happy
Who reviewed it? Who ran it? When? Where? Approval trail?
Who created the procedure?
Who created the policy?
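The questions above map naturally onto an audit record. A minimal sketch, assuming one record per execution; every field name, person, and node name below is hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class AuditRecord:
    procedure: str
    created_by: str                    # who created the procedure
    reviewed_by: str                   # who reviewed it
    policy_by: str                     # who created the policy
    run_by: str                        # who ran it
    ran_at: datetime                   # when
    target: str                        # where (node or environment name)
    approved_by: Optional[str] = None  # approval trail, if any


def runs_missing_approval(log: List[AuditRecord]) -> List[AuditRecord]:
    """Return executions that have no recorded approver."""
    return [record for record in log if record.approved_by is None]


log = [
    AuditRecord("Restart", "alice", "bob", "sam", "carol",
                datetime(2017, 10, 24, 2, 0, tzinfo=timezone.utc),
                "web-01", approved_by="dave"),
    AuditRecord("Firewall Change", "alice", "bob", "sam", "erin",
                datetime(2017, 10, 24, 14, 0, tzinfo=timezone.utc),
                "edge-fw"),
]
unapproved = [record.procedure for record in runs_missing_approval(log)]
```

With records shaped like this, each compliance question becomes a simple filter over the log.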
70. Improve incident response time and reduce escalations
[Figure: Old Model vs. New Model of interrupts from current production reaching the Delivery Team (L2, L3) between Start Deliverables and Finish Deliverables]
Old Model: the NOC sends repeated interrupts; the Delivery Team answers "Too busy" and "We're late!" (X)
New Model: the NOC uses previously delivered Rundeck Jobs; an interrupt that still reaches the Delivery Team gets "This looks important" (✔)
72. Tightens feedback loops
Team A (Dev): Define Procedures, Execute On Demand
Operations as a Service
Team B (Ops): Vet Procedures, Define Policies, Execute On Demand
73. Reduce delays that otherwise hurt the business
[Chart: Revenue per Week over Time; labels: Opportunity Ready, Actual Revenue, COST OF DELAY]
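As a rough worked example, cost of delay can be approximated as the weekly revenue an opportunity would earn multiplied by the weeks it waits. This assumes flat weekly revenue, which the chart does not claim, and the figures are made up for illustration.

```python
def cost_of_delay(revenue_per_week: float, weeks_delayed: float) -> float:
    """Revenue forgone while a ready opportunity waits (flat weekly revenue assumed)."""
    if revenue_per_week < 0 or weeks_delayed < 0:
        raise ValueError("inputs must be non-negative")
    return revenue_per_week * weeks_delayed


# Hypothetical figures: 50,000 per week, held up for 3 weeks.
example = cost_of_delay(revenue_per_week=50_000, weeks_delayed=3)
```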
74. Enables Ops managers to focus on creating value
Old mindset:
Protect capacity
Say “no”
Manager
New mindset:
Scaling OaaS
Get more users
76. Calculating the ROI for Operations as a Service
ROI inside Ops
Decrease in time to respond to incidents
Decrease in errors and rework
Increase in operational support tasks delegated
Increase in team capacity
ROI outside Ops
Decrease in number of escalations
Decrease in time spent waiting and rework loops
Decrease in issues due to problematic handoffs
ROI to Business
Decrease in total cost of operations and support
Decrease in time-to-market, cycle-time, and schedule slippage
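These measures can be turned into numbers with two small helpers. A sketch only: the function names and all input figures are hypothetical, and the ROI formula used is the generic (benefit - cost) / cost.

```python
def percent_decrease(before: float, after: float) -> float:
    """Percentage drop from a baseline, e.g. minutes to respond or escalations per month."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (before - after) * 100 / before


def roi(total_benefit: float, total_cost: float) -> float:
    """Generic return on investment as a fraction of cost."""
    if total_cost <= 0:
        raise ValueError("cost must be positive")
    return (total_benefit - total_cost) / total_cost


# Hypothetical inputs: response time drops from 40 to 4 minutes;
# 150,000 in savings against 100,000 spent on process and tooling.
response_time_drop = percent_decrease(before=40, after=4)
return_on_investment = roi(total_benefit=150_000, total_cost=100_000)
```

The same `percent_decrease` helper applies to each "Decrease in…" line, as long as a baseline was measured before the change.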
80. Back to our story…
Mark Maun, Jody Mulkey, Justin Dean
Sources: https://www.youtube.com/watch?v=_hr4KiB19bQ
http://rundeck.org/stories/mark_maun.html
Ticketmaster’s “Support at the Edge” model
• Automated Ops procedures written/vetted by the delivery teams
• Ops remained in full control of what can run and security policy
• Empowered support teams with self-service ops tasks
• Empowered the NOC team to be “operators” again
• Empowered developers with limited self-service operations
87. Better for the business and a better way to work
90% Reduction in MTTR
50% Reduction in escalations
55% Reduction of overall support costs
88. Recap
Move definition, execution, and governance to where they make the best use of labor
Understand the pressures on Ops
Make explicit investment in process and tooling
Operations as a Service: Reshaping IT Operations to Solve Today’s Challenges
Introduction
DevOps and Digital Transformations are driving an unprecedented increase in the pace and volume of daily change. Who generally finds this to be welcome news? Development and Product teams. Who has reasons to be alarmed at the problems and challenges this might bring? Operations.
Operations organizations in today’s enterprises are finding themselves squeezed between two unrelenting forces. On one side there are the business-driven demands of DevOps and Digital Transformation (“Go faster! Open things up!”). On the other side there are the demands to maximize security and stability (“Don’t be the next hack! Don’t be the next outage! Lock things down!”). And there, in the middle, is an already over-burdened Operations organization doing its best to avoid being squeezed beyond the breaking point.
Operations has reached an inflection point. To deliver what the business demands, Operations must find a way to provide increasing levels of organizational responsiveness and throughput, all while “locking things down” to sufficiently meet today’s risk profiles.
A lot is riding on how Operations responds to this challenge. A failure here is not just a localized IT failure. A failure will undermine a business’s ability to operate. Failing to solve this will turn into a competitive disadvantage for the business.
On the flip side, this challenge also presents a great opportunity. Operations can take this business mandate and use it to reimagine how both planned and unplanned work is handled. This is a chance to improve how Operations both serves the broader business and improves the day-to-day lives of Operations professionals.
[Figure: “The Operations Squeeze”. Go faster! Be flexible! (Improved Quality, Shorter Time-to-Market, Fast Feedback From Users) versus Lock it down! (Availability, Auditing, Security, Compliance), with Ops in the middle]
Operations is a lot more than deployment
Beware of silos
Use the Operations as a Service design pattern