Principles Of Chaos Engineering - Chaos Engineering Hamburg

Nils Meder

This is Marvin's talk at the first meetup of "Chaos Engineering Hamburg".

Chaos Engineering
Hamburg
Marvin Hoffmann | Nils Meder
15.12.2015

Agenda
1. AWS Basics and Intro
2. Evolution of Chaos Testing
3. Tooling
4. Chaos Engineering

AWS Basics
[Diagram: the regions US East (N. Virginia) and Europe West (Ireland), with their AZs and instances]

Chaos? - What do we mean?

“A way to improve availability is
to install proven hardware and
software, and then leave it alone”
Jim Gray
Why Do Computers Stop and What Can Be Done About It?

Be reliable!
• Systems need to be reliable
• Nuclear weapons arsenal, heart rate monitoring, World of Warcraft servers, streaming business
• Third-party dependencies (software and hardware)

DynamoDB Outage US-East
• “… there was a brief network disruption that impacted a
portion of DynamoDB’s storage servers.”
• 2:19am until 7:10am PDT
• “There are several other AWS services that use
DynamoDB that experienced problems during the event.”
• SQS, EC2 auto scaling, CloudWatch

It’s not always the infrastructure
• Deployments themselves may cause issues
• Unpredicted behaviour after a change has been rolled out
• Issues during rollback
• Change in client / user behaviour

Evolution of Chaos Testing

Do the simplest thing first
• Prepare for your machines to die
• “Cattle, not pets” (Adrian Cockcroft)
• Resilience through redundancy
• Stateless machines
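
As a minimal sketch of "resilience through redundancy" with stateless machines: a caller can simply skip a dead replica and ask the next one, because any replica can serve any request. The names `call_with_failover`, `dead_replica` and `healthy_replica` are hypothetical and only serve the illustration.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when every redundant replica failed to answer."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each replica in order and return the first successful answer.

    This relies on the replicas being stateless: any of them can serve
    any request, so a dead machine is simply skipped.
    """
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:  # treated as "this machine is gone"
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replicas failed")


def dead_replica(request: str) -> str:
    # Hypothetical replica whose machine has been terminated.
    raise ConnectionError("instance terminated")


def healthy_replica(request: str) -> str:
    # Hypothetical replica that simply echoes the request.
    return f"ok: {request}"
```

Calling `call_with_failover([dead_replica, healthy_replica], "ping")` returns the healthy replica's answer; only when every replica is gone does the caller see an error.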

Deal with infrastructure issues
• Latency between instances
• Packet loss
• Ports blocked
• or even outages of an entire AZ
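
Latency and packet loss can be rehearsed in code before touching real infrastructure. The following is a sketch under assumptions, not how any particular tool does it: `with_faults` is a hypothetical wrapper that delays every call and drops some of them, and the `sleep` parameter exists only so the delay can be replaced in tests.

```python
import random
import time
from typing import Callable, Optional


def with_faults(
    call: Callable[[], str],
    rng: random.Random,
    drop_probability: float = 0.0,
    extra_latency_s: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Callable[[], Optional[str]]:
    """Wrap `call` so that every request is delayed and some are lost."""

    def faulty_call() -> Optional[str]:
        sleep(extra_latency_s)               # injected latency
        if rng.random() < drop_probability:  # injected packet loss
            return None                      # the caller gets no answer
        return call()

    return faulty_call
```

A caller that cannot cope with `None` or with the extra delay has a resilience gap that would also show up when the real network misbehaves.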

Think big!
• Remember that DynamoDB failure?
• Outage of an entire AWS region!
• You’ll need more than one region in the first place
• Re-routing of all traffic from one region to another
• Any region needs to be able to scale to take the load of two regions
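
The capacity requirement in the last bullet is simple arithmetic. As a sketch, assuming traffic is spread evenly over the surviving regions (the function names are hypothetical):

```python
def required_capacity_per_region(total_load: float, regions: int, failed_regions: int = 1) -> float:
    """Load each surviving region must absorb when `failed_regions` are down.

    Assumes traffic is spread evenly across the surviving regions.
    """
    survivors = regions - failed_regions
    if survivors < 1:
        raise ValueError("at least one region must survive")
    return total_load / survivors


def headroom_factor(regions: int, failed_regions: int = 1) -> float:
    """How many times its normal share a surviving region must be able to serve."""
    survivors = regions - failed_regions
    if survivors < 1:
        raise ValueError("at least one region must survive")
    return regions / survivors
```

With two regions, `headroom_factor(2)` is 2.0, which is the "load of two regions" from the slide; with three regions the factor drops to 1.5.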

Tooling (meet the Monkeys)

Chaos Monkey
Kills random instances in your account

Chaos Gorilla
Kills a random AZ in your account

Chaos Kong
Kills an entire AWS region in your account
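
The three tools differ mainly in blast radius: one instance, one AZ, one region. The sketch below only illustrates that idea on a made-up inventory. It makes no AWS calls, the instance ids are invented, and the real Simian Army tools work differently in detail.

```python
import random
from typing import Dict, List

# Hypothetical inventory: region -> AZ -> instance ids (not real AWS data).
Inventory = Dict[str, Dict[str, List[str]]]

INVENTORY: Inventory = {
    "us-east-1": {
        "us-east-1a": ["i-001", "i-002"],
        "us-east-1b": ["i-003"],
    },
    "eu-west-1": {
        "eu-west-1a": ["i-004"],
        "eu-west-1b": ["i-005", "i-006"],
    },
}


def instances_in_region(inventory: Inventory, region: str) -> List[str]:
    """All instance ids of one region, across its AZs."""
    return [i for az in inventory[region].values() for i in az]


def monkey_targets(inventory: Inventory, rng: random.Random) -> List[str]:
    """Blast radius of Chaos Monkey: one random instance."""
    everything = [i for region in inventory for i in instances_in_region(inventory, region)]
    return [rng.choice(everything)]


def gorilla_targets(inventory: Inventory, rng: random.Random) -> List[str]:
    """Blast radius of Chaos Gorilla: every instance in one random AZ."""
    zones = [(region, az) for region, azs in inventory.items() for az in azs]
    region, az = rng.choice(zones)
    return list(inventory[region][az])


def kong_targets(inventory: Inventory, rng: random.Random) -> List[str]:
    """Blast radius of Chaos Kong: every instance in one random region."""
    region = rng.choice(list(inventory))
    return instances_in_region(inventory, region)
```

Passing a seeded `random.Random` keeps a rehearsal reproducible, which helps when a run needs to be replayed after it uncovered a problem.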

What’s in it?
• A compilation of scripts
• Scripts mess with your AWS account
• Thus, they are very AWS-specific
• If not on AWS, get inspired and build your toolset around these ideas
• Not a comprehensive toolset

Simian Army
• Latency Monkey
• Conformity Monkey
• Security Monkey
• Doctor Monkey
• 10-18 Monkey

Chaos Engineering
• Systematic approach to Chaos Testing
• Started by Netflix
• Netflix talks about it a lot to attract talent
• Many other companies are doing similar things in that field
• Netflix wants to grow a community around it

“Experiment on a distributed system
in order to build confidence in the
system’s capability to withstand
turbulent conditions in production.”
Netflix

Know your system
• Operational insight
• What is “normal”? What does a failure look like?
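
One way to make "normal" concrete is to derive a tolerance band from historical samples of a single metric. This is a sketch under assumptions: the three-sigma default and the function names are illustrative choices, not something the talk prescribes.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def steady_state_band(samples: Sequence[float], tolerance_sigmas: float = 3.0) -> Tuple[float, float]:
    """Derive a "normal" range from historical samples of one metric."""
    centre = mean(samples)
    spread = stdev(samples)  # sample standard deviation, needs at least 2 samples
    return centre - tolerance_sigmas * spread, centre + tolerance_sigmas * spread


def looks_normal(value: float, band: Tuple[float, float]) -> bool:
    """True if the observed value lies inside the steady-state band."""
    low, high = band
    return low <= value <= high
```

For samples such as `[100, 102, 98, 101, 99]` requests per second, a reading of 101 falls inside the band while a reading of 60 does not, which is one possible answer to "what does a failure look like?".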

Four Principles of
Chaos Engineering
1. Build a hypothesis around steady-state behaviour

The “Happy Path”
• Trace through code where nothing bad happens
• Testing usually happens first on the happy path
• Bad things usually happen off the happy path
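
To illustrate the difference, here is a small sketch with one test on the happy path and one off it. The function `greeting` and its `load_name` dependency are hypothetical.

```python
from typing import Callable


def greeting(user_id: str, load_name: Callable[[str], str]) -> str:
    """Build a greeting; fall back to a generic one if the lookup times out."""
    try:
        return f"Hello, {load_name(user_id)}!"
    except TimeoutError:
        return "Hello!"


def test_happy_path() -> None:
    # The dependency answers as expected.
    assert greeting("42", lambda _id: "Ada") == "Hello, Ada!"


def test_off_the_happy_path() -> None:
    # The dependency fails; the caller should degrade, not crash.
    def slow_backend(_id: str) -> str:
        raise TimeoutError("name service did not answer")

    assert greeting("42", slow_backend) == "Hello!"
```

The second test is the kind that tends to be written last, if at all, although it covers the situation that causes outages.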

Four Principles of
Chaos Engineering
1. Build a hypothesis around steady-state behaviour
2. Vary real-world events

Laboratory
• “Works on my machine” (or “works in the staging environment”)

Four Principles of
Chaos Engineering
1. Build a hypothesis around steady-state behaviour
2. Vary real-world events
3. Run experiments in production

Chaos Engineering Culture
• http://principlesofchaos.com
• More resources:
• https://github.com/Netflix/SimianArmy
• https://github.com/Netflix/atlas
• https://www.youtube.com/watch?v=vq4QZ4_YDok

Four Principles of
Chaos Engineering
1. Build a hypothesis around steady-state behaviour
2. Vary real-world events
3. Run experiments in production
4. Automate experiments to run continuously
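
The four principles can be read as a loop. The sketch below is one possible shape for it, with hypothetical names throughout (`Experiment`, `run_experiment`, `run_all`). Principle 3 is not visible in the code: it depends on `measure` and `inject` pointing at the production system. A scheduler that calls `run_all` repeatedly is assumed and not shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Experiment:
    name: str
    inject: Callable[[], None]    # 2. vary a real-world event
    rollback: Callable[[], None]  # undo the injected fault


def run_experiment(
    experiment: Experiment,
    measure: Callable[[], float],
    is_steady: Callable[[float], bool],
) -> bool:
    """Return True if the steady-state hypothesis held while the fault was active."""
    if not is_steady(measure()):
        raise RuntimeError("system is not in steady state; do not experiment now")
    experiment.inject()
    try:
        # 1. hypothesis: the metric stays in its normal range despite the fault
        return is_steady(measure())
    finally:
        experiment.rollback()  # always restore the system


def run_all(
    experiments: Sequence[Experiment],
    measure: Callable[[], float],
    is_steady: Callable[[float], bool],
) -> List[str]:
    """4. Automate: run every experiment and name those that broke the hypothesis."""
    return [e.name for e in experiments if not run_experiment(e, measure, is_steady)]
```

The check before `inject` is a deliberate choice: if the system is already outside its steady state, an experiment would only add noise and risk.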
