SRE Demystified - 03 - Choosing SLIs and SLOs

According to Google, SRE is what you get when you treat operations as if it’s a software problem. In this video, I briefly explain the SLIs typically associated with a system: availability, latency, and quality. Watch on YouTube: https://youtu.be/EgpCw15fIK8

SRE Demystified
Choosing SLIs - A closer look
ganesh@ganeshniyer.com
ganesh.vigneswara@gmail.com
http://ganeshniyer.com
Dr Ganesh Neelakanta Iyer

SRE
https://miro.medium.com/max/1026/1*kGnaCS8Mc_tabc-FyMj6yQ.png

Recap
https://pbs.twimg.com/media/EKT0DBJVAAApOew.jpg
https://img.deusm.com/informationweek/2014/11/1317662/ubm0197thanksgivingitsupport2_final.png

How do SLOs help?
• Product: If reliability is a feature, when do you prioritise it versus other features?
• Development: How do you balance the risk to reliability from changing a system with the requirement to build new, cool features for that system?
• Operations: What is the right level of reliability for the system you support?
https://www.usenix.org/sites/default/files/conference/protected-files/srecon18emea_slides_fong-jones.pdf

SLI Equation
1. SLIs fall between 0% and 100%
0% means nothing works, 100% means nothing is broken
2. SLIs have a consistent format
Consistency allows common tooling to be built around SLIs
Alerting logic, error budget calculations, and SLO analysis and reporting tools can all be written to expect the same inputs: good events, valid events, and SLO threshold.
https://www.usenix.org/sites/default/files/conference/protected-files/srecon18emea_slides_fong-jones.pdf
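
As a minimal sketch of how those three inputs fit together, the SLI can be taken as good events divided by valid events, expressed as a percentage, and then compared with the SLO threshold. The function names, the handling of zero valid events, and the sample numbers below are illustrative assumptions, not something stated on the slide.

```python
def sli_percent(good_events: int, valid_events: int) -> float:
    """Return the SLI as a percentage: good events / valid events * 100."""
    if good_events < 0 or valid_events < 0:
        raise ValueError("event counts must be non-negative")
    if good_events > valid_events:
        raise ValueError("good events cannot exceed valid events")
    if valid_events == 0:
        # Assumption: with no valid events, nothing has failed, so report 100%.
        return 100.0
    return 100.0 * good_events / valid_events


def meets_slo(good_events: int, valid_events: int, slo_threshold: float) -> bool:
    """True when the measured SLI is at or above the SLO threshold (in percent)."""
    return sli_percent(good_events, valid_events) >= slo_threshold


def error_budget_remaining(good_events: int, valid_events: int,
                           slo_threshold: float) -> float:
    """Fraction of the error budget left: 1.0 is untouched, 0 or less is spent."""
    allowed_bad = valid_events * (100.0 - slo_threshold) / 100.0
    actual_bad = valid_events - good_events
    if allowed_bad == 0:
        return 1.0 if actual_bad == 0 else 0.0
    return 1.0 - actual_bad / allowed_bad


# Hypothetical numbers: 10,000 valid requests, 9,990 of them good.
print(sli_percent(9990, 10000))      # 99.9
print(meets_slo(9990, 10000, 99.0))  # True
```

Because every SLI reduces to the same two counts plus a threshold, the same three functions could serve availability, latency, and quality indicators alike.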

https://www.usenix.org/sites/default/files/conference/protected-files/srecon18emea_slides_fong-jones.pdf

Availability SLI
• The suggested specification for a request/response Availability SLI is:
The proportion of valid requests served successfully
• Turning this specification into an implementation requires making two choices
• Which of the requests this system serves are valid for the SLI
• What makes a response successful?
• Sample success/failure indicators include HTTP/RPC response code
• Percentage of HTTP GET requests for /profile/{user} or /profile/{user}/avatar that have 2XX, 3XX or 4XX (excl. 429) status measured at the load balancer
• The availability of a virtual machine could be defined as the proportion of minutes that it was booted and accessible via SSH
https://miro.medium.com/max/2496/1*4_Isk3nxCga6jFL-I9VhzA.png
https://www.usenix.org/sites/default/files/conference/protected-files/srecon18emea_slides_fong-jones.pdf
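
The two choices for the availability SLI can be sketched in code: one predicate decides which requests are valid, another decides which responses count as successful. This is only an illustration of the /profile example above; the `Request` record, the regular expression, and the sample log are assumptions, and a real implementation would read these fields from load balancer logs or metrics.

```python
import re
from typing import Iterable, NamedTuple


class Request(NamedTuple):
    """Hypothetical load balancer log record."""
    method: str
    path: str
    status: int


# Choice 1: valid events are GET requests for /profile/{user}
# or /profile/{user}/avatar.
VALID_PATH = re.compile(r"^/profile/[^/]+(/avatar)?$")


def is_valid(req: Request) -> bool:
    return req.method == "GET" and VALID_PATH.match(req.path) is not None


# Choice 2: a response is successful if its status is 2XX, 3XX or 4XX,
# excluding 429.
def is_successful(req: Request) -> bool:
    return 200 <= req.status < 500 and req.status != 429


def availability_sli(requests: Iterable[Request]) -> float:
    """Percentage of valid requests that were served successfully."""
    valid = [r for r in requests if is_valid(r)]
    if not valid:
        return 100.0  # Assumption: no valid requests means nothing failed.
    good = sum(1 for r in valid if is_successful(r))
    return 100.0 * good / len(valid)


sample_log = [
    Request("GET", "/profile/alice", 200),         # valid, good
    Request("GET", "/profile/alice/avatar", 304),  # valid, good
    Request("GET", "/profile/bob", 404),           # valid, good (4XX)
    Request("GET", "/profile/bob", 429),           # valid, bad (excluded 429)
    Request("GET", "/profile/carol", 503),         # valid, bad (5XX)
    Request("POST", "/profile/carol", 500),        # not valid: not a GET
    Request("GET", "/healthz", 200),               # not valid: other path
]
print(availability_sli(sample_log))  # 60.0
```

Note that the failing POST does not lower the SLI at all, which is why the choice of valid requests matters as much as the definition of success.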

Latency
• The suggested specification for a request/response Latency SLI is:
The proportion of valid requests served faster than a threshold
• Turning this specification into an implementation requires making two choices
• Which of the requests this system serves are valid for the SLI
• What threshold differentiates between requests that are fast and requests that are slow
• Percentage of HTTP GET requests for /profile/{user} that send their entire response within Xms measured at the load balancer
https://miro.medium.com/max/730/1*FfS0Jg6Nq5yW7Skndxvmkg.png
https://www.usenix.org/sites/default/files/conference/protected-files/srecon18emea_slides_fong-jones.pdf
https://www.usenix.org/sites/default/files/conference/protected-files/srecon18emea_slides_fong-jones1.pdf
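
A latency SLI of this shape can be sketched as a simple count of fast requests over valid requests. The function name, the sample latencies, and the two thresholds are hypothetical, and the input is assumed to have been filtered down to valid requests already.

```python
from typing import Iterable


def latency_sli(latencies_ms: Iterable[float], threshold_ms: float) -> float:
    """Percentage of valid requests whose full response was sent within the threshold.

    Assumption: `latencies_ms` holds one latency per valid request, measured
    at the load balancer.
    """
    latencies = list(latencies_ms)
    if not latencies:
        return 100.0  # Assumption: no valid requests means nothing was slow.
    # "Within X ms" is read here as less than or equal to the threshold.
    fast = sum(1 for ms in latencies if ms <= threshold_ms)
    return 100.0 * fast / len(latencies)


# Hypothetical response times in milliseconds for five valid requests.
samples = [120, 80, 450, 300, 95]
print(latency_sli(samples, threshold_ms=300))  # 80.0
print(latency_sli(samples, threshold_ms=100))  # 40.0
```

The same traffic yields very different SLI values depending on the threshold, so the threshold is best chosen from what users actually perceive as slow.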

Quality
• The suggested specification for a request/response Quality SLI is:
The proportion of valid requests served without degrading quality
• Turning this specification into an implementation requires making two choices
• Which of the requests this system serves are valid for the SLI
• How to determine whether the response was served with degraded quality
https://blog.readytomanage.com/wp-content/uploads/2012/07/quality-total-quality-cartoon.jpg
https://www.usenix.org/sites/default/files/conference/protected-files/srecon18emea_slides_fong-jones.pdf
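
One possible way to make the second choice concrete is to have the service mark each response it serves in a degraded mode, then count the unmarked ones. The slide does not say how degradation is detected, so the `degraded` flag, the `Response` record, and the sample data below are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Response:
    """Hypothetical per-response record emitted by the service."""
    valid: bool     # counts towards the SLI
    degraded: bool  # served with reduced quality, e.g. from a fallback


def quality_sli(responses: Iterable[Response]) -> float:
    """Percentage of valid requests served without degrading quality."""
    valid = [r for r in responses if r.valid]
    if not valid:
        return 100.0  # Assumption: no valid requests means nothing was degraded.
    undegraded = sum(1 for r in valid if not r.degraded)
    return 100.0 * undegraded / len(valid)


sample_responses = [
    Response(valid=True, degraded=False),
    Response(valid=True, degraded=False),
    Response(valid=True, degraded=True),    # served in degraded mode
    Response(valid=True, degraded=False),
    Response(valid=False, degraded=True),   # ignored: not valid for the SLI
]
print(quality_sli(sample_responses))  # 75.0
```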

Other SLIs
• Freshness
• Coverage
• Correctness
• Throughput
https://www.usenix.org/sites/default/files/conference/protected-files/srecon18emea_slides_fong-jones.pdf

Dr Ganesh Neelakanta Iyer
ganesh@ganeshniyer.com
ganesh.vigneswara@gmail.com


