The document discusses 10 lessons learned from the Apollo Moon landings that are still relevant today:
1) Have explicit goals with clear metrics and service level agreements.
2) Ensure all goals are unified rather than conflicting.
3) Break up monolithic systems into smaller, more manageable parts.
4) Take small, agile steps to develop minimum viable products.
5) Perform simulations and pre-mortems to prepare for challenges.
6) Promote effective communication and collaboration.
7) Understand what errors could occur beforehand.
8) Monitor key metrics as "golden signals" to determine system health.
9) Maximize reliability in all aspects of the project.
10) Focus on the difference between a proof of concept (PoC) and a minimum viable product (MVP).
50 Years After: Resiliency Lessons from the Apollo Missions to the Moon - Robert Barron
1. @flyingbarron
Back to the Moon:
Lessons from the Apollo Moon Landings
~ 4,000 out of ~ 400,000 people
IBM Garage for Cloud.
Cloud Service Management & Operations.
Serious about SRE, Chatty about ChatOps.
2. @flyingbarron
Lessons from the Moon Landing
I believe that this nation should commit itself to achieving the goal, before this decade is out, of landing a man on the moon and returning him safely to the Earth
— John F. Kennedy, May 25th, 1961.
3. @flyingbarron
• Man.
• Moon.
• Decade.
• Safe.
Simple, well-defined KPIs and an agreed-upon SLA.
Lesson #1 – Explicit goals (KPIs and SLA)
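As a rough illustration only, the four keywords above can be written down as explicit, checkable criteria. Nothing in this sketch comes from the talk itself: the `MissionOutcome` type, its field names, the `meets_sla` function, and the choice of December 31st, 1969 as the end of the decade are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date

# Assumption: "before this decade is out" is read as the end of 1969.
DEADLINE = date(1969, 12, 31)


@dataclass(frozen=True)
class MissionOutcome:
    """Hypothetical record of one mission, one field per KPI."""
    crewed: bool                # Man
    landed_on_moon: bool        # Moon
    landing_date: date          # Decade
    crew_returned_safely: bool  # Safe


def meets_sla(outcome: MissionOutcome) -> bool:
    """Return True only if every KPI is met; a single miss fails the SLA."""
    return (
        outcome.crewed
        and outcome.landed_on_moon
        and outcome.landing_date <= DEADLINE
        and outcome.crew_returned_safely
    )


# Apollo 11 landed on July 20th, 1969 and the crew returned safely.
apollo_11 = MissionOutcome(True, True, date(1969, 7, 20), True)
print(meets_sla(apollo_11))
```

Because every criterion is a plain boolean or date comparison, there is no room to argue afterwards about whether the goal was reached.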
4. @flyingbarron
When do we get to the moon?
When we land on the moon?
When we step on the moon?
Pilots take no special joy in walking: pilots like flying.
Pilots generally take pride in a good landing, not in getting out of the vehicle.
— Neil Armstrong
We succeed when we get back to Earth safely!
Lesson #2 – Unified goals
9. @flyingbarron
• 1958 – 1963: Mercury project
• First Man in space
• First American in space
• First American to orbit Earth
Lesson #4 – Small steps, agile development of MVPs
10. @flyingbarron
• 1961 – 1966: Gemini project
Lesson #4 – Small steps, agile development of MVPs
12. @flyingbarron
“That’s one small step for [a] man, one giant leap for mankind”
was nearly
“We didn’t land on the Moon because the computer rebooted in the middle”
17. @flyingbarron
Monitor as much as possible, but only abort on your Golden Signals!
Lesson #8 – Observability and Golden Signals
For Apollo, the golden signals were speed, direction, angle and stability of flight.
Modern microservices usually use latency, traffic, errors, and saturation.
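One way to picture "monitor everything, abort only on the golden signals" is the minimal sketch below. It is an assumption-based example, not the speaker's implementation: the metric names, the threshold values in `GOLDEN_LIMITS`, and the `evaluate` function are all hypothetical, and real limits depend entirely on the service.

```python
# Hypothetical upper limits for the four golden signals.
GOLDEN_LIMITS = {
    "latency_ms": 500.0,
    "traffic_rps": 10_000.0,
    "error_rate": 0.05,
    "saturation": 0.90,
}


def evaluate(metrics: dict[str, float]) -> tuple[list[str], list[str]]:
    """Split metrics into golden-signal breaches and observe-only metrics.

    Returns (breaches, observed): breaches are golden signals above their
    limit; observed are all other metrics, which are recorded but never
    trigger an abort on their own.
    """
    breaches = [
        name
        for name, limit in GOLDEN_LIMITS.items()
        if metrics.get(name, 0.0) > limit
    ]
    observed = [name for name in metrics if name not in GOLDEN_LIMITS]
    return sorted(breaches), sorted(observed)


sample = {
    "latency_ms": 120.0,
    "traffic_rps": 800.0,
    "error_rate": 0.12,       # above the 0.05 limit
    "saturation": 0.40,
    "gc_pause_ms": 900.0,     # noisy, but not a golden signal
    "cache_hit_ratio": 0.31,  # also observe-only
}
breaches, observed = evaluate(sample)
print("abort" if breaches else "continue", breaches, observed)
```

In this sample only `error_rate` triggers an abort; the other two metrics look alarming but are kept for diagnosis rather than for the go/no-go decision.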
23. @flyingbarron
Lesson #9 – Maximize reliability
At 110 meters tall and weighing 2,970,000 kg, the Saturn V rocket had one and only one job:
Launch an 8-meter, 15,000 kg Apollo spacecraft to the Moon.
Only 13 were ever launched.
24. @flyingbarron
• Guidance & Control
• Telemetry
An independent IBM brain for the Saturn V rocket
Lesson #9 – Maximize reliability
30. @flyingbarron
Why did the Americans beat the Soviets when the Soviets had an early lead in the space race?
Lesson #10 – PoC vs MVP
USA: Mercury, Gemini
Soviet: Vostok, Voskhod
31. @flyingbarron
• Lesson #1 – Explicit goals (KPIs and SLA)
• Lesson #2 – Unified goals
• Lesson #3 – Split the monolith
• Lesson #4 – Agile development of MVPs
• Lesson #5 – Simulations & Pre-Mortems (and documentation)
• Lesson #6 – Communication & Collaboration
• Lesson #7 – Know your errors
• Lesson #8 – Golden Signals
• Lesson #9 – Maximize reliability
• Lesson #10 – PoC vs MVP
Lessons learned
32. @flyingbarron
• The more things change, the more they stay the same
• Resilience is not just tools & technology, it’s also humans, processes, practice/practise and training
• A good plan is mandatory, even if you don’t stick to it exactly
Lessons of Lessons
33. @flyingbarron
“If I had to single out the piece of equipment that, more than any other, has allowed us to go from earth-orbit Mercury flights to Apollo lunar trips in just over seven years, it would be the high-speed computer”
— Christopher C. Kraft,
Director of Flight Operations
NASA, Houston.
“The thing that makes a good [SRE, DevOps Engineer, Ops] is a natural curiosity about how things work, even if you’re not responsible for them.”
— John Aaron
EECOM, Steely Eyed Missile Man & SRE role model.
NASA, Houston.
34. @flyingbarron
• My blog - https://ibm.biz/apollo-lessons
• Go, Flight!: The Unsung Heroes of Mission Control. Rick Houston & Milt Heflin, 2015
• Apollo. Charles Murray & Catherine Bly Cox, 2004
• https://apolloinrealtime.org/11/
• Apollo Study Report, Volume 2. IBM, 1963
• Saturn V Launch Vehicle Digital Computer, Volume One: General Description and Theory. IBM, 1964
• Computers in Spaceflight. NASA, 1988
• https://www.nasa.gov/specials/apollo50th/index.html
• https://newsroom.ibm.com/apollo
Further reading
35. @flyingbarron
Back to the Moon:
Questions and Happy Hour