SlideShare uma empresa Scribd logo
1 de 16
Baixar para ler offline
Theo Schlossnagle, Founder @Circonus, @postwait on Twitter
Operational Software
Ignorance
Buys
Pain
https://www.flickr.com/photos/hoyvinmayvin/4906678960
Tenet #1
Designing for the unknown
should never burden today
Compromise when forced.
Design with tomorrow’s expected volume,
but today’s functional requirements.
https://www.flickr.com/photos/alexander/20090860
Tenet #2
Never be left curious as to
what your software is doing
Observability is the only fast-path to success.
https://www.flickr.com/photos/jox1989/4764186425
Tenet #3
Understand what your
software was doing
This is one of the trickiest tenets.
You can’t log every instruction performed, every packet
sent & received, every context switch, every I/O.
Log things that are infrequent,
measure things that are frequent.
What’s frequent? I said this was tricky.
https://www.flickr.com/photos/guest_family/6175062186
Tenet #4
Understand internal failures
The only thing worse than a malfunction in production,
is one without sufficient actionable data.
Core dump, stack trace analysis, etc.
These can wake people up.
https://www.flickr.com/photos/skipthefiller/2451481428
Tenet #5
Operator remediation of an
external failure is a bug
A system failing, a disk failing, a network partition, etc.
Once the external failure is remediated, no human being
should be involved in returning the (internal) system to
correct operation.
https://www.flickr.com/photos/timzim/2308099322
Tenet #6 & #7
Tight coupling

reduces resiliency
Loose coupling

reduces debug-ability
Tenet #8
Avoid difficult problems

if possible
One does not “just add” replication, consensus and fault
tolerance into systems.
Never solve a harder problem than you are presented.
The best solution to a problem is to remove the problem.
https://www.flickr.com/photos/shakestercody/2124972276
A tail of two systems
Queueing
Storage
https://www.flickr.com/photos/skynoir/8300610952
From RabbitMQ to Fq
RabbitMQ: Oh, the outages we’ve had.
violates #1, #2, #3, #5, #8
Fq:
(#1) build only semantics we need
(#2/#3) DTrace probes
(#4) cores dumps and backtrace.io
(#5) handled by simplicity of design and use
(#6/#7) push decoupling out, gain debug-ability
(#8) punt clustering downstream to clients
https://www.flickr.com/photos/fotologic/2165675515
ns level deep routing
From Postgres to Snowth
Postgres 8… hacked in a column store using arrays
Tenet #3: build pg_statsd
Tenet #5: … build Snowth
Snowth: ring topology time-series store
(#1/#8) only commutative operations,
(#2) DTrace probes
(#3) Circonus itself (statsd, histograms, etc.)
(#4) backtrace.io
(#7) implemented Zipkin (Dapper-like) tracing
quartile bands
“mvalue” best-guess modes

Mais conteúdo relacionado

Mais procurados

Chaos Engineering: Why the World Needs More Resilient Systems
Chaos Engineering: Why the World Needs More Resilient SystemsChaos Engineering: Why the World Needs More Resilient Systems
Chaos Engineering: Why the World Needs More Resilient SystemsC4Media
 
AI-Powered DevOps: Injecting Speed & Quality Across Verizon’s Cloud Pipelines
AI-Powered DevOps: Injecting Speed & Quality Across Verizon’s Cloud PipelinesAI-Powered DevOps: Injecting Speed & Quality Across Verizon’s Cloud Pipelines
AI-Powered DevOps: Injecting Speed & Quality Across Verizon’s Cloud PipelinesDynatrace
 
Chaos Engineering 101: A Field Guide
Chaos Engineering 101: A Field GuideChaos Engineering 101: A Field Guide
Chaos Engineering 101: A Field Guidematthewbrahms
 
30 days or less: New Features to Production
30 days or less: New Features to Production30 days or less: New Features to Production
30 days or less: New Features to ProductionKarthik Gaekwad
 
Chaos Engineering – why we should all practice breaking things on purpose by ...
Chaos Engineering – why we should all practice breaking things on purpose by ...Chaos Engineering – why we should all practice breaking things on purpose by ...
Chaos Engineering – why we should all practice breaking things on purpose by ...Alex Cachia
 
Casino In The Clouds
Casino In The CloudsCasino In The Clouds
Casino In The Cloudsgojkoadzic
 
Gaining visibility into your Openshift application container platform with Dy...
Gaining visibility into your Openshift application container platform with Dy...Gaining visibility into your Openshift application container platform with Dy...
Gaining visibility into your Openshift application container platform with Dy...Dynatrace
 
An Introduction to Chaos Engineering
An Introduction to Chaos EngineeringAn Introduction to Chaos Engineering
An Introduction to Chaos EngineeringGremlin
 
Scaling a Start-up DevOps team to 10x while scaling the system 50x
Scaling a Start-up DevOps team to 10x while scaling the system 50x Scaling a Start-up DevOps team to 10x while scaling the system 50x
Scaling a Start-up DevOps team to 10x while scaling the system 50x Stefan Zier
 
A Coherent Discussion About Performance
A Coherent Discussion About PerformanceA Coherent Discussion About Performance
A Coherent Discussion About PerformanceTheo Schlossnagle
 
The Rise of DevSecOps - Fabian Lim - DevSecOpsSg
The Rise of DevSecOps - Fabian Lim - DevSecOpsSgThe Rise of DevSecOps - Fabian Lim - DevSecOpsSg
The Rise of DevSecOps - Fabian Lim - DevSecOpsSgDevSecOpsSg
 
Ops Happen: Improve Security Without Getting in the Way
Ops Happen: Improve Security Without Getting in the WayOps Happen: Improve Security Without Getting in the Way
Ops Happen: Improve Security Without Getting in the WaySeniorStoryteller
 
Enabling Lean IT with AWS by Carlos Condé at the Lean IT Summit 2014
Enabling Lean IT with AWS by Carlos Condé at the Lean IT Summit 2014Enabling Lean IT with AWS by Carlos Condé at the Lean IT Summit 2014
Enabling Lean IT with AWS by Carlos Condé at the Lean IT Summit 2014Institut Lean France
 
Pragmatic Security and Rugged DevOps - SXSW 2015
Pragmatic Security and Rugged DevOps - SXSW 2015Pragmatic Security and Rugged DevOps - SXSW 2015
Pragmatic Security and Rugged DevOps - SXSW 2015James Wickett
 
AppSec is Eating Security
AppSec is Eating SecurityAppSec is Eating Security
AppSec is Eating SecurityAlex Stamos
 
4 Node.js Gotchas: What your ops team needs to know
4 Node.js Gotchas: What your ops team needs to know4 Node.js Gotchas: What your ops team needs to know
4 Node.js Gotchas: What your ops team needs to knowDynatrace
 
Security as Code: A DevSecOps Approach
Security as Code: A DevSecOps ApproachSecurity as Code: A DevSecOps Approach
Security as Code: A DevSecOps ApproachVMware Tanzu
 
Tales from a radically polyglot team
Tales from a radically polyglot teamTales from a radically polyglot team
Tales from a radically polyglot teamThoughtworks
 
Chaos Driven Development
Chaos Driven DevelopmentChaos Driven Development
Chaos Driven DevelopmentBruce Wong
 

Mais procurados (20)

Chaos Engineering: Why the World Needs More Resilient Systems
Chaos Engineering: Why the World Needs More Resilient SystemsChaos Engineering: Why the World Needs More Resilient Systems
Chaos Engineering: Why the World Needs More Resilient Systems
 
AI-Powered DevOps: Injecting Speed & Quality Across Verizon’s Cloud Pipelines
AI-Powered DevOps: Injecting Speed & Quality Across Verizon’s Cloud PipelinesAI-Powered DevOps: Injecting Speed & Quality Across Verizon’s Cloud Pipelines
AI-Powered DevOps: Injecting Speed & Quality Across Verizon’s Cloud Pipelines
 
Chaos Engineering 101: A Field Guide
Chaos Engineering 101: A Field GuideChaos Engineering 101: A Field Guide
Chaos Engineering 101: A Field Guide
 
Chaos engineering intro
Chaos engineering introChaos engineering intro
Chaos engineering intro
 
30 days or less: New Features to Production
30 days or less: New Features to Production30 days or less: New Features to Production
30 days or less: New Features to Production
 
Chaos Engineering – why we should all practice breaking things on purpose by ...
Chaos Engineering – why we should all practice breaking things on purpose by ...Chaos Engineering – why we should all practice breaking things on purpose by ...
Chaos Engineering – why we should all practice breaking things on purpose by ...
 
Casino In The Clouds
Casino In The CloudsCasino In The Clouds
Casino In The Clouds
 
Gaining visibility into your Openshift application container platform with Dy...
Gaining visibility into your Openshift application container platform with Dy...Gaining visibility into your Openshift application container platform with Dy...
Gaining visibility into your Openshift application container platform with Dy...
 
An Introduction to Chaos Engineering
An Introduction to Chaos EngineeringAn Introduction to Chaos Engineering
An Introduction to Chaos Engineering
 
Scaling a Start-up DevOps team to 10x while scaling the system 50x
Scaling a Start-up DevOps team to 10x while scaling the system 50x Scaling a Start-up DevOps team to 10x while scaling the system 50x
Scaling a Start-up DevOps team to 10x while scaling the system 50x
 
A Coherent Discussion About Performance
A Coherent Discussion About PerformanceA Coherent Discussion About Performance
A Coherent Discussion About Performance
 
The Rise of DevSecOps - Fabian Lim - DevSecOpsSg
The Rise of DevSecOps - Fabian Lim - DevSecOpsSgThe Rise of DevSecOps - Fabian Lim - DevSecOpsSg
The Rise of DevSecOps - Fabian Lim - DevSecOpsSg
 
Ops Happen: Improve Security Without Getting in the Way
Ops Happen: Improve Security Without Getting in the WayOps Happen: Improve Security Without Getting in the Way
Ops Happen: Improve Security Without Getting in the Way
 
Enabling Lean IT with AWS by Carlos Condé at the Lean IT Summit 2014
Enabling Lean IT with AWS by Carlos Condé at the Lean IT Summit 2014Enabling Lean IT with AWS by Carlos Condé at the Lean IT Summit 2014
Enabling Lean IT with AWS by Carlos Condé at the Lean IT Summit 2014
 
Pragmatic Security and Rugged DevOps - SXSW 2015
Pragmatic Security and Rugged DevOps - SXSW 2015Pragmatic Security and Rugged DevOps - SXSW 2015
Pragmatic Security and Rugged DevOps - SXSW 2015
 
AppSec is Eating Security
AppSec is Eating SecurityAppSec is Eating Security
AppSec is Eating Security
 
4 Node.js Gotchas: What your ops team needs to know
4 Node.js Gotchas: What your ops team needs to know4 Node.js Gotchas: What your ops team needs to know
4 Node.js Gotchas: What your ops team needs to know
 
Security as Code: A DevSecOps Approach
Security as Code: A DevSecOps ApproachSecurity as Code: A DevSecOps Approach
Security as Code: A DevSecOps Approach
 
Tales from a radically polyglot team
Tales from a radically polyglot teamTales from a radically polyglot team
Tales from a radically polyglot team
 
Chaos Driven Development
Chaos Driven DevelopmentChaos Driven Development
Chaos Driven Development
 

Destaque

Pack iTS case study analysis
Pack iTS case study analysisPack iTS case study analysis
Pack iTS case study analysisSumit Singh
 
Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...
Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...
Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...Avere Systems
 
Optical network architecture
Optical network architectureOptical network architecture
Optical network architectureSiddharth Singh
 
Starbucks organisational culture
Starbucks organisational cultureStarbucks organisational culture
Starbucks organisational cultureJoel Sebastian
 
Google Analytics vs. Omniture Comparative Guide
Google Analytics vs. Omniture Comparative GuideGoogle Analytics vs. Omniture Comparative Guide
Google Analytics vs. Omniture Comparative GuideJimmy Jay
 
Common terminologies of obstetrics
Common terminologies of obstetricsCommon terminologies of obstetrics
Common terminologies of obstetricsZeeshan Khan
 
The Payments Value Chain
The Payments Value ChainThe Payments Value Chain
The Payments Value ChainPYMNTS.com
 
Porter's five force analysis on computer industry
Porter's five force analysis on computer industryPorter's five force analysis on computer industry
Porter's five force analysis on computer industryRajath Menon
 
Overview Of Oil & Gas Accounting
Overview Of Oil & Gas AccountingOverview Of Oil & Gas Accounting
Overview Of Oil & Gas Accountinghori
 
Pop up! a manual of paper mechanisms - duncan birmingham (tarquin books) [pop...
Pop up! a manual of paper mechanisms - duncan birmingham (tarquin books) [pop...Pop up! a manual of paper mechanisms - duncan birmingham (tarquin books) [pop...
Pop up! a manual of paper mechanisms - duncan birmingham (tarquin books) [pop...eme2525
 
Mobile Network Capacity Issues
Mobile Network Capacity IssuesMobile Network Capacity Issues
Mobile Network Capacity IssuesPhilip Corsano
 
Elements of a fairytale
Elements of a fairytaleElements of a fairytale
Elements of a fairytaleamandakuhl
 
Grid computing Seminar PPT
Grid computing Seminar PPTGrid computing Seminar PPT
Grid computing Seminar PPTUpender Upr
 
Project Management Office Roles Functions And Benefits
Project Management Office Roles Functions And BenefitsProject Management Office Roles Functions And Benefits
Project Management Office Roles Functions And BenefitsMaria Erland, PMP
 
Allergy and Hypersensitivity
Allergy and HypersensitivityAllergy and Hypersensitivity
Allergy and HypersensitivityMedicineAndHealth
 
Organization development and change
Organization development and changeOrganization development and change
Organization development and changesomanishalaka
 
Slides cloud computing
Slides cloud computingSlides cloud computing
Slides cloud computingHaslina
 
Compensation management
Compensation managementCompensation management
Compensation management805984
 

Destaque (20)

Pack iTS case study analysis
Pack iTS case study analysisPack iTS case study analysis
Pack iTS case study analysis
 
Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...
Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...
Optimizing the Upstreaming Workflow: Flexibly Scale Storage for Seismic Proce...
 
Optical network architecture
Optical network architectureOptical network architecture
Optical network architecture
 
Os security issues
Os security issuesOs security issues
Os security issues
 
Starbucks organisational culture
Starbucks organisational cultureStarbucks organisational culture
Starbucks organisational culture
 
Google Analytics vs. Omniture Comparative Guide
Google Analytics vs. Omniture Comparative GuideGoogle Analytics vs. Omniture Comparative Guide
Google Analytics vs. Omniture Comparative Guide
 
Common terminologies of obstetrics
Common terminologies of obstetricsCommon terminologies of obstetrics
Common terminologies of obstetrics
 
The Payments Value Chain
The Payments Value ChainThe Payments Value Chain
The Payments Value Chain
 
Porter's five force analysis on computer industry
Porter's five force analysis on computer industryPorter's five force analysis on computer industry
Porter's five force analysis on computer industry
 
ADVANTAGES OF CAVITY WALLS
ADVANTAGES OF CAVITY WALLSADVANTAGES OF CAVITY WALLS
ADVANTAGES OF CAVITY WALLS
 
Overview Of Oil & Gas Accounting
Overview Of Oil & Gas AccountingOverview Of Oil & Gas Accounting
Overview Of Oil & Gas Accounting
 
Pop up! a manual of paper mechanisms - duncan birmingham (tarquin books) [pop...
Pop up! a manual of paper mechanisms - duncan birmingham (tarquin books) [pop...Pop up! a manual of paper mechanisms - duncan birmingham (tarquin books) [pop...
Pop up! a manual of paper mechanisms - duncan birmingham (tarquin books) [pop...
 
Mobile Network Capacity Issues
Mobile Network Capacity IssuesMobile Network Capacity Issues
Mobile Network Capacity Issues
 
Elements of a fairytale
Elements of a fairytaleElements of a fairytale
Elements of a fairytale
 
Grid computing Seminar PPT
Grid computing Seminar PPTGrid computing Seminar PPT
Grid computing Seminar PPT
 
Project Management Office Roles Functions And Benefits
Project Management Office Roles Functions And BenefitsProject Management Office Roles Functions And Benefits
Project Management Office Roles Functions And Benefits
 
Allergy and Hypersensitivity
Allergy and HypersensitivityAllergy and Hypersensitivity
Allergy and Hypersensitivity
 
Organization development and change
Organization development and changeOrganization development and change
Organization development and change
 
Slides cloud computing
Slides cloud computingSlides cloud computing
Slides cloud computing
 
Compensation management
Compensation managementCompensation management
Compensation management
 

Semelhante a Operational Software Design

10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at FlickrJohn Allspaw
 
Hands On, Duchess 10/17/2012
Hands On, Duchess 10/17/2012Hands On, Duchess 10/17/2012
Hands On, Duchess 10/17/2012slandelle
 
2019-12-11-OWASP-IoT-Top-10---Introduction-and-Root-Causes.pdf
2019-12-11-OWASP-IoT-Top-10---Introduction-and-Root-Causes.pdf2019-12-11-OWASP-IoT-Top-10---Introduction-and-Root-Causes.pdf
2019-12-11-OWASP-IoT-Top-10---Introduction-and-Root-Causes.pdfdino715195
 
Dev and Ops Collaboration and Awareness at Etsy and Flickr
Dev and Ops Collaboration and Awareness at Etsy and FlickrDev and Ops Collaboration and Awareness at Etsy and Flickr
Dev and Ops Collaboration and Awareness at Etsy and FlickrJohn Allspaw
 
Living system or build factory - Chris Maxwell
Living system or build factory  - Chris MaxwellLiving system or build factory  - Chris Maxwell
Living system or build factory - Chris MaxwellDevopsdays
 
Gatling - JUGL, 2012-09-13
Gatling  - JUGL, 2012-09-13Gatling  - JUGL, 2012-09-13
Gatling - JUGL, 2012-09-13Nicolas Rémond
 
Ways to minimise performance risks in continuous delivery
Ways to minimise performance risks in continuous deliveryWays to minimise performance risks in continuous delivery
Ways to minimise performance risks in continuous deliverya32an
 
Easy recovery621 user guide en
Easy recovery621 user guide enEasy recovery621 user guide en
Easy recovery621 user guide enjmav1502
 
Comment j'ai mis ma suite de tests au régime en 5 minutes par jour
Comment j'ai mis ma suite de tests au régime en 5 minutes par jourComment j'ai mis ma suite de tests au régime en 5 minutes par jour
Comment j'ai mis ma suite de tests au régime en 5 minutes par jourCARA_Lyon
 
Let's make this test suite run faster! SoftShake 2010
Let's make this test suite run faster! SoftShake 2010Let's make this test suite run faster! SoftShake 2010
Let's make this test suite run faster! SoftShake 2010David Gageot
 
Growing pains - PosKeyErrors and other malaises
Growing pains - PosKeyErrors and other malaisesGrowing pains - PosKeyErrors and other malaises
Growing pains - PosKeyErrors and other malaisesPhilip Bauer
 
[Latest] Samsung Galaxy S3 Clone (SP8810 S930) How to root merun na! check it
[Latest] Samsung Galaxy S3 Clone (SP8810 S930) How to root merun na! check it[Latest] Samsung Galaxy S3 Clone (SP8810 S930) How to root merun na! check it
[Latest] Samsung Galaxy S3 Clone (SP8810 S930) How to root merun na! check itChristian Earl Magpayo
 
NCET Tech Bite - Cloud Storage and Data Backup - June 2015
NCET Tech Bite - Cloud Storage and Data Backup - June 2015NCET Tech Bite - Cloud Storage and Data Backup - June 2015
NCET Tech Bite - Cloud Storage and Data Backup - June 2015Archersan
 
BSides London 2015 - Proprietary network protocols - risky business on the wire.
BSides London 2015 - Proprietary network protocols - risky business on the wire.BSides London 2015 - Proprietary network protocols - risky business on the wire.
BSides London 2015 - Proprietary network protocols - risky business on the wire.Jakub Kałużny
 
Crash dump analysis - experience sharing
Crash dump analysis - experience sharingCrash dump analysis - experience sharing
Crash dump analysis - experience sharingJames Hsieh
 
What you wanted to know about MySQL, but could not find using inernal instrum...
What you wanted to know about MySQL, but could not find using inernal instrum...What you wanted to know about MySQL, but could not find using inernal instrum...
What you wanted to know about MySQL, but could not find using inernal instrum...Sveta Smirnova
 

Semelhante a Operational Software Design (20)

10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
 
Hands On, Duchess 10/17/2012
Hands On, Duchess 10/17/2012Hands On, Duchess 10/17/2012
Hands On, Duchess 10/17/2012
 
Netgear router error codes
Netgear router error codesNetgear router error codes
Netgear router error codes
 
2019-12-11-OWASP-IoT-Top-10---Introduction-and-Root-Causes.pdf
2019-12-11-OWASP-IoT-Top-10---Introduction-and-Root-Causes.pdf2019-12-11-OWASP-IoT-Top-10---Introduction-and-Root-Causes.pdf
2019-12-11-OWASP-IoT-Top-10---Introduction-and-Root-Causes.pdf
 
Dev and Ops Collaboration and Awareness at Etsy and Flickr
Dev and Ops Collaboration and Awareness at Etsy and FlickrDev and Ops Collaboration and Awareness at Etsy and Flickr
Dev and Ops Collaboration and Awareness at Etsy and Flickr
 
Living system or build factory - Chris Maxwell
Living system or build factory  - Chris MaxwellLiving system or build factory  - Chris Maxwell
Living system or build factory - Chris Maxwell
 
Gatling - JUGL, 2012-09-13
Gatling  - JUGL, 2012-09-13Gatling  - JUGL, 2012-09-13
Gatling - JUGL, 2012-09-13
 
Ways to minimise performance risks in continuous delivery
Ways to minimise performance risks in continuous deliveryWays to minimise performance risks in continuous delivery
Ways to minimise performance risks in continuous delivery
 
Easy recovery621 user guide en
Easy recovery621 user guide enEasy recovery621 user guide en
Easy recovery621 user guide en
 
Comment j'ai mis ma suite de tests au régime en 5 minutes par jour
Comment j'ai mis ma suite de tests au régime en 5 minutes par jourComment j'ai mis ma suite de tests au régime en 5 minutes par jour
Comment j'ai mis ma suite de tests au régime en 5 minutes par jour
 
Silos are for farmers
Silos are for farmersSilos are for farmers
Silos are for farmers
 
Let's make this test suite run faster! SoftShake 2010
Let's make this test suite run faster! SoftShake 2010Let's make this test suite run faster! SoftShake 2010
Let's make this test suite run faster! SoftShake 2010
 
Growing pains - PosKeyErrors and other malaises
Growing pains - PosKeyErrors and other malaisesGrowing pains - PosKeyErrors and other malaises
Growing pains - PosKeyErrors and other malaises
 
[Latest] Samsung Galaxy S3 Clone (SP8810 S930) How to root merun na! check it
[Latest] Samsung Galaxy S3 Clone (SP8810 S930) How to root merun na! check it[Latest] Samsung Galaxy S3 Clone (SP8810 S930) How to root merun na! check it
[Latest] Samsung Galaxy S3 Clone (SP8810 S930) How to root merun na! check it
 
NCET Tech Bite - Cloud Storage and Data Backup - June 2015
NCET Tech Bite - Cloud Storage and Data Backup - June 2015NCET Tech Bite - Cloud Storage and Data Backup - June 2015
NCET Tech Bite - Cloud Storage and Data Backup - June 2015
 
NCET Tech
NCET Tech NCET Tech
NCET Tech
 
BSides London 2015 - Proprietary network protocols - risky business on the wire.
BSides London 2015 - Proprietary network protocols - risky business on the wire.BSides London 2015 - Proprietary network protocols - risky business on the wire.
BSides London 2015 - Proprietary network protocols - risky business on the wire.
 
DAC
DACDAC
DAC
 
Crash dump analysis - experience sharing
Crash dump analysis - experience sharingCrash dump analysis - experience sharing
Crash dump analysis - experience sharing
 
What you wanted to know about MySQL, but could not find using inernal instrum...
What you wanted to know about MySQL, but could not find using inernal instrum...What you wanted to know about MySQL, but could not find using inernal instrum...
What you wanted to know about MySQL, but could not find using inernal instrum...
 

Mais de Theo Schlossnagle

Mais de Theo Schlossnagle (20)

Adding Simplicity to Complexity
Adding Simplicity to ComplexityAdding Simplicity to Complexity
Adding Simplicity to Complexity
 
Put Some SRE in Your Shipped Software
Put Some SRE in Your Shipped SoftwarePut Some SRE in Your Shipped Software
Put Some SRE in Your Shipped Software
 
Monitoring 101
Monitoring 101Monitoring 101
Monitoring 101
 
Distributed Systems - Like It Or Not
Distributed Systems - Like It Or NotDistributed Systems - Like It Or Not
Distributed Systems - Like It Or Not
 
Craftsmanship
CraftsmanshipCraftsmanship
Craftsmanship
 
Commandments of scale
Commandments of scaleCommandments of scale
Commandments of scale
 
Adaptive availability
Adaptive availabilityAdaptive availability
Adaptive availability
 
Project reality
Project realityProject reality
Project reality
 
The math behind big systems analysis.
The math behind big systems analysis.The math behind big systems analysis.
The math behind big systems analysis.
 
Understanding Slowness
Understanding SlownessUnderstanding Slowness
Understanding Slowness
 
OmniOS Motivation and Design ~ LISA 2012
OmniOS Motivation and Design ~ LISA 2012OmniOS Motivation and Design ~ LISA 2012
OmniOS Motivation and Design ~ LISA 2012
 
Monitoring and observability
Monitoring and observabilityMonitoring and observability
Monitoring and observability
 
Omnios and unix
Omnios and unixOmnios and unix
Omnios and unix
 
Monitoring and observability
Monitoring and observabilityMonitoring and observability
Monitoring and observability
 
Xtreme Deployment
Xtreme DeploymentXtreme Deployment
Xtreme Deployment
 
Atldevops
AtldevopsAtldevops
Atldevops
 
It's all about telemetry
It's all about telemetryIt's all about telemetry
It's all about telemetry
 
Monitoring is easy, why are we so bad at it presentation
Monitoring is easy, why are we so bad at it  presentationMonitoring is easy, why are we so bad at it  presentation
Monitoring is easy, why are we so bad at it presentation
 
Social improvements in monitoring
Social improvements in monitoringSocial improvements in monitoring
Social improvements in monitoring
 
What's in a number?
What's in a number?What's in a number?
What's in a number?
 

Último

Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 

Último (20)

Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 

Operational Software Design

  • 1. Theo Schlossnagle, Founder @Circonus, @postwait on Twitter Operational Software Ignorance Buys Pain https://www.flickr.com/photos/hoyvinmayvin/4906678960
  • 2. Tenet #1 Designing for the unknown should never burden today Compromise when forced. Design with tomorrow’s expected volume, but today’s functional requirements. https://www.flickr.com/photos/alexander/20090860
  • 3. Tenet #2 Never be left curious as to what your software is doing Observability is the only fast-path to success. https://www.flickr.com/photos/jox1989/4764186425
  • 4. Tenet #3 Understand what your software was doing This is one of the trickiest tenets. You can’t log every instruction performed, every packet sent & received, every context switch, every I/O. Log things that are infrequent, measure things that are frequent. What’s frequent? I said this was tricky. https://www.flickr.com/photos/guest_family/6175062186
  • 5. Tenet #4 Understand internal failures The only thing worse than a malfunction in production, is one without sufficient actionable data. Core dump, stack trace analysis, etc. These can wake people up. https://www.flickr.com/photos/skipthefiller/2451481428
  • 6. Tenet #5 Operator remediation of an external failure is a bug A system failing, a disk failing, a network partition, etc. Once the external failure is remediated, no human being should be involved in returning the (internal) system to correct operation. https://www.flickr.com/photos/timzim/2308099322
  • 7. Tenet #6 & #7 Tight coupling
 reduces resiliency Loose coupling
 reduces debug-ability
  • 8. Tenet #8 Avoid difficult problems
 if possible One does not “just add” replication, consensus and fault tolerance into systems. Never solve a harder problem than you are presented. The best solution to a problem is to remove the problem. https://www.flickr.com/photos/shakestercody/2124972276
  • 9. A tail of two systems Queueing Storage https://www.flickr.com/photos/skynoir/8300610952
  • 10. From RabbitMQ to Fq RabbitMQ: Oh, the outages we’ve had. violates #1, #2, #3, #5, #8 Fq: (#1) build only semantics we need (#2/#3) DTrace probes (#4) cores dumps and backtrace.io (#5) handled by simplicity of design and use (#6/#7) push decoupling out, gain debug-ability (#8) punt clustering downstream to clients https://www.flickr.com/photos/fotologic/2165675515
  • 11. ns level deep routing
  • 12. From Postgres to Snowth Postgres 8… hacked in a column store using arrays Tenet #3: build pg_statsd Tenet #5: … build Snowth Snowth: ring topology time-series store (#1/#8) only commutative operations, (#2) DTrace probes (#3) Circonus itself (statsd, histograms, etc.) (#4) backtrace.io (#7) implemented Zipkin (Dapper-like) tracing
  • 13.
  • 14.