SlideShare uma empresa Scribd logo
1 de 9
Baixar para ler offline
SRE Demystified
NALSD
ganesh@ganeshniyer.com
ganesh.vigneswara@gmail.com,
http://ganeshniyer.com
Dr Ganesh Neelakanta Iyer
Designing distributed systems
• There are many ways to design distributed systems.
• One way involves growing systems organically
• Components are rewritten or redesigned as the system
handles more requests
• Another method starts with a proof of concept.
• Once the system adds value to the business, a second
version is designed from the ground up
2
Non-abstract large system design (NALSD)
• NALSD describes an iterative process for designing,
assessing, and evaluating distributed systems, such as Borg
cluster management for distributed computing and
the Google distributed file system
• NALSD describes a skill critical to SRE: the ability to
assess, design, and evaluate large systems
• Practically, NALSD combines elements of capacity planning,
component isolation, and graceful system degradation that
are crucial to highly available production systems
3
https://research.google/pubs/pub51/
https://research.google/pubs/pub43438/
Why Non-abstract?
• All systems will eventually have to run on real computers in real
datacenters using real networks
• The people designing distributed systems need to develop and
continuously exercise the muscle of turning a whiteboard design into
concrete estimates of resources at multiple steps in the process
• This extra bit of work up front typically leads to fewer last-minute
system design changes to account for some unforeseen physical
constraint
• The value of this exercise is in combining many imperfect-but-
reasonable results into a better understanding of the design
4
https://landing.google.com/sre/workbook/chapters/non-abstract-design/
Design process
• Iterative approach to design systems that meet our goals
• Each iteration defines a potential design and examines its
strengths and weaknesses
• This analysis either feeds into the next iteration or
indicates when the design is good enough to recommend
5
Two-Phases
1. Basic design
• Is it possible?
• Is the design even possible? If we didn’t have to worry
about enough RAM, CPU, network bandwidth, and so
on, what would we design to satisfy the requirements?
• Can we do better?
• For any such design, we ask, “Can we do better?” For
example, can we make the system meaningfully faster,
smaller, more efficient? If the design solves the problem
in O(N) time, can we solve it more quickly—say, O(ln(N))
6
Two-Phases
1. Scale-up the basic design
• Is it feasible?
• Is it possible to scale this design, given constraints on
money, hardware, and so on? If necessary, what
distributed design would satisfy the requirements?
• Is it resilient?
• Can the design fail gracefully? What happens when this
component fails? How does the system work when an
entire datacenter fails?
• Can we do better?
7
References
• https://landing.google.com/sre/workbook/chapters/non-abstract-
design/
• https://www.usenix.org/sites/default/files/conference/protected-files/
srecon18americas_slides_virji.pdf
• https://cloud.google.com/blog/products/management-tools/sre-
principles-and-flashcards-to-design-nalsd
• https://www.youtube.com/watch?v=modXC5IWTJI
8
Dr Ganesh Neelakanta Iyer
ganesh@ganeshniyer.com
ganesh.vigneswara@gmail.com

Mais conteúdo relacionado

Mais procurados

Streaming sql and druid
Streaming sql and druid Streaming sql and druid
Streaming sql and druid arupmalakar
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotXiang Fu
 
Docker 基礎介紹與實戰
Docker 基礎介紹與實戰Docker 基礎介紹與實戰
Docker 基礎介紹與實戰Bo-Yi Wu
 
Apache Tez – Present and Future
Apache Tez – Present and FutureApache Tez – Present and Future
Apache Tez – Present and FutureDataWorks Summit
 
Spark and Object Stores —What You Need to Know: Spark Summit East talk by Ste...
Spark and Object Stores —What You Need to Know: Spark Summit East talk by Ste...Spark and Object Stores —What You Need to Know: Spark Summit East talk by Ste...
Spark and Object Stores —What You Need to Know: Spark Summit East talk by Ste...Spark Summit
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance AnalysisBrendan Gregg
 
Monitoring with prometheus
Monitoring with prometheusMonitoring with prometheus
Monitoring with prometheusKasper Nissen
 
How Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceHow Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceBrendan Gregg
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with PrometheusShiao-An Yuan
 
Enforcing Bespoke Policies in Kubernetes
Enforcing Bespoke Policies in KubernetesEnforcing Bespoke Policies in Kubernetes
Enforcing Bespoke Policies in KubernetesTorin Sandall
 
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...Databricks
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to PrometheusJulien Pivotto
 
JupyterHub: Learning at Scale
JupyterHub: Learning at ScaleJupyterHub: Learning at Scale
JupyterHub: Learning at ScaleCarol Willing
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkFlink Forward
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaCloudera, Inc.
 
Monitoring Flink with Prometheus
Monitoring Flink with PrometheusMonitoring Flink with Prometheus
Monitoring Flink with PrometheusMaximilian Bode
 
Kafka Lag Monitoring For Human Beings (Elad Leev, AppsFlyer) Kafka Summit 2020
Kafka Lag Monitoring For Human Beings (Elad Leev, AppsFlyer) Kafka Summit 2020Kafka Lag Monitoring For Human Beings (Elad Leev, AppsFlyer) Kafka Summit 2020
Kafka Lag Monitoring For Human Beings (Elad Leev, AppsFlyer) Kafka Summit 2020HostedbyConfluent
 
Apache Pulsar First Overview
Apache PulsarFirst OverviewApache PulsarFirst Overview
Apache Pulsar First OverviewRicardo Paiva
 
Common issues with Apache Kafka® Producer
Common issues with Apache Kafka® ProducerCommon issues with Apache Kafka® Producer
Common issues with Apache Kafka® Producerconfluent
 

Mais procurados (20)

Streaming sql and druid
Streaming sql and druid Streaming sql and druid
Streaming sql and druid
 
Real-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache PinotReal-time Analytics with Trino and Apache Pinot
Real-time Analytics with Trino and Apache Pinot
 
Docker 基礎介紹與實戰
Docker 基礎介紹與實戰Docker 基礎介紹與實戰
Docker 基礎介紹與實戰
 
Apache Tez – Present and Future
Apache Tez – Present and FutureApache Tez – Present and Future
Apache Tez – Present and Future
 
Spark and Object Stores —What You Need to Know: Spark Summit East talk by Ste...
Spark and Object Stores —What You Need to Know: Spark Summit East talk by Ste...Spark and Object Stores —What You Need to Know: Spark Summit East talk by Ste...
Spark and Object Stores —What You Need to Know: Spark Summit East talk by Ste...
 
Container Performance Analysis
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
 
Monitoring with prometheus
Monitoring with prometheusMonitoring with prometheus
Monitoring with prometheus
 
How Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceHow Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for Performance
 
Monitoring with Prometheus
Monitoring with PrometheusMonitoring with Prometheus
Monitoring with Prometheus
 
Enforcing Bespoke Policies in Kubernetes
Enforcing Bespoke Policies in KubernetesEnforcing Bespoke Policies in Kubernetes
Enforcing Bespoke Policies in Kubernetes
 
HAProxy
HAProxy HAProxy
HAProxy
 
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
The Top Five Mistakes Made When Writing Streaming Applications with Mark Grov...
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to Prometheus
 
JupyterHub: Learning at Scale
JupyterHub: Learning at ScaleJupyterHub: Learning at Scale
JupyterHub: Learning at Scale
 
Evening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in FlinkEvening out the uneven: dealing with skew in Flink
Evening out the uneven: dealing with skew in Flink
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
Monitoring Flink with Prometheus
Monitoring Flink with PrometheusMonitoring Flink with Prometheus
Monitoring Flink with Prometheus
 
Kafka Lag Monitoring For Human Beings (Elad Leev, AppsFlyer) Kafka Summit 2020
Kafka Lag Monitoring For Human Beings (Elad Leev, AppsFlyer) Kafka Summit 2020Kafka Lag Monitoring For Human Beings (Elad Leev, AppsFlyer) Kafka Summit 2020
Kafka Lag Monitoring For Human Beings (Elad Leev, AppsFlyer) Kafka Summit 2020
 
Apache Pulsar First Overview
Apache PulsarFirst OverviewApache PulsarFirst Overview
Apache Pulsar First Overview
 
Common issues with Apache Kafka® Producer
Common issues with Apache Kafka® ProducerCommon issues with Apache Kafka® Producer
Common issues with Apache Kafka® Producer
 

Semelhante a SRE Demystified - 16 - NALSD - Non-Abstract Large System Design

Software System Engineering - Chapter 15
Software System Engineering - Chapter 15Software System Engineering - Chapter 15
Software System Engineering - Chapter 15Fadhil Ismail
 
1605162990-week56.pptx
1605162990-week56.pptx1605162990-week56.pptx
1605162990-week56.pptxSumbalIlyas1
 
Scrum an extension pattern language for hyperproductive software development
Scrum an extension pattern language  for hyperproductive software developmentScrum an extension pattern language  for hyperproductive software development
Scrum an extension pattern language for hyperproductive software developmentShiraz316
 
Game development (Game Architecture)
Game development (Game Architecture)Game development (Game Architecture)
Game development (Game Architecture)Rajkumar Pawar
 
Wbs, estimation and scheduling
Wbs, estimation and schedulingWbs, estimation and scheduling
Wbs, estimation and schedulingSulman Ahmed
 
Spacewalker: Rapid UI Design Exploration Using Lightweight Markup Enhancement...
Spacewalker: Rapid UI Design Exploration Using Lightweight Markup Enhancement...Spacewalker: Rapid UI Design Exploration Using Lightweight Markup Enhancement...
Spacewalker: Rapid UI Design Exploration Using Lightweight Markup Enhancement...ivaderivader
 
Nimble framework
Nimble frameworkNimble framework
Nimble frameworktusjain
 
Agile practices and benefits
Agile practices and benefitsAgile practices and benefits
Agile practices and benefitsRichard Stone
 
Nimble Framework - Software architecture and design in agile era - PSQT Template
Nimble Framework - Software architecture and design in agile era - PSQT TemplateNimble Framework - Software architecture and design in agile era - PSQT Template
Nimble Framework - Software architecture and design in agile era - PSQT Templatetjain
 
Code Management Workshop
Code Management WorkshopCode Management Workshop
Code Management WorkshopSameh El-Ashry
 
Agile_SDLC_Node.js@Paypal_ppt
Agile_SDLC_Node.js@Paypal_pptAgile_SDLC_Node.js@Paypal_ppt
Agile_SDLC_Node.js@Paypal_pptHitesh Kumar
 
Lecture 5 -6(CSC205).pptx jsksnxbbxjxksnsnz
Lecture 5 -6(CSC205).pptx jsksnxbbxjxksnsnzLecture 5 -6(CSC205).pptx jsksnxbbxjxksnsnz
Lecture 5 -6(CSC205).pptx jsksnxbbxjxksnsnzAhmadSajjad34
 
A Pattern-Language-for-software-Development
A Pattern-Language-for-software-DevelopmentA Pattern-Language-for-software-Development
A Pattern-Language-for-software-DevelopmentShiraz316
 

Semelhante a SRE Demystified - 16 - NALSD - Non-Abstract Large System Design (20)

Ch12
Ch12Ch12
Ch12
 
Software System Engineering - Chapter 15
Software System Engineering - Chapter 15Software System Engineering - Chapter 15
Software System Engineering - Chapter 15
 
1605162990-week56.pptx
1605162990-week56.pptx1605162990-week56.pptx
1605162990-week56.pptx
 
Ch 9-design-engineering
Ch 9-design-engineeringCh 9-design-engineering
Ch 9-design-engineering
 
Scrum an extension pattern language for hyperproductive software development
Scrum an extension pattern language  for hyperproductive software developmentScrum an extension pattern language  for hyperproductive software development
Scrum an extension pattern language for hyperproductive software development
 
Game development (Game Architecture)
Game development (Game Architecture)Game development (Game Architecture)
Game development (Game Architecture)
 
Software Design Concepts
Software Design ConceptsSoftware Design Concepts
Software Design Concepts
 
Wbs
WbsWbs
Wbs
 
Wbs, estimation and scheduling
Wbs, estimation and schedulingWbs, estimation and scheduling
Wbs, estimation and scheduling
 
Spacewalker: Rapid UI Design Exploration Using Lightweight Markup Enhancement...
Spacewalker: Rapid UI Design Exploration Using Lightweight Markup Enhancement...Spacewalker: Rapid UI Design Exploration Using Lightweight Markup Enhancement...
Spacewalker: Rapid UI Design Exploration Using Lightweight Markup Enhancement...
 
Nimble framework
Nimble frameworkNimble framework
Nimble framework
 
Agile practices and benefits
Agile practices and benefitsAgile practices and benefits
Agile practices and benefits
 
RRC RUP
RRC RUPRRC RUP
RRC RUP
 
Nimble Framework - Software architecture and design in agile era - PSQT Template
Nimble Framework - Software architecture and design in agile era - PSQT TemplateNimble Framework - Software architecture and design in agile era - PSQT Template
Nimble Framework - Software architecture and design in agile era - PSQT Template
 
Code Management Workshop
Code Management WorkshopCode Management Workshop
Code Management Workshop
 
Chap05
Chap05Chap05
Chap05
 
Chapter-1.ppt
Chapter-1.pptChapter-1.ppt
Chapter-1.ppt
 
Agile_SDLC_Node.js@Paypal_ppt
Agile_SDLC_Node.js@Paypal_pptAgile_SDLC_Node.js@Paypal_ppt
Agile_SDLC_Node.js@Paypal_ppt
 
Lecture 5 -6(CSC205).pptx jsksnxbbxjxksnsnz
Lecture 5 -6(CSC205).pptx jsksnxbbxjxksnsnzLecture 5 -6(CSC205).pptx jsksnxbbxjxksnsnz
Lecture 5 -6(CSC205).pptx jsksnxbbxjxksnsnz
 
A Pattern-Language-for-software-Development
A Pattern-Language-for-software-DevelopmentA Pattern-Language-for-software-Development
A Pattern-Language-for-software-Development
 

Mais de Dr Ganesh Iyer

SRE Demystified - 14 - SRE Practices overview
SRE Demystified - 14 - SRE Practices overviewSRE Demystified - 14 - SRE Practices overview
SRE Demystified - 14 - SRE Practices overviewDr Ganesh Iyer
 
SRE Demystified - 13 - Docs that matter -2
SRE Demystified - 13 - Docs that matter -2SRE Demystified - 13 - Docs that matter -2
SRE Demystified - 13 - Docs that matter -2Dr Ganesh Iyer
 
SRE Demystified - 12 - Docs that matter -1
SRE Demystified - 12 - Docs that matter -1 SRE Demystified - 12 - Docs that matter -1
SRE Demystified - 12 - Docs that matter -1 Dr Ganesh Iyer
 
SRE Demystified - 01 - SLO SLI and SLA
SRE Demystified - 01 - SLO SLI and SLASRE Demystified - 01 - SLO SLI and SLA
SRE Demystified - 01 - SLO SLI and SLADr Ganesh Iyer
 
SRE Demystified - 11 - Release management-2
SRE Demystified - 11 - Release management-2SRE Demystified - 11 - Release management-2
SRE Demystified - 11 - Release management-2Dr Ganesh Iyer
 
SRE Demystified - 10 - Release management-1
SRE Demystified - 10 - Release management-1SRE Demystified - 10 - Release management-1
SRE Demystified - 10 - Release management-1Dr Ganesh Iyer
 
SRE Demystified - 09 - Simplicity
SRE Demystified - 09 - SimplicitySRE Demystified - 09 - Simplicity
SRE Demystified - 09 - SimplicityDr Ganesh Iyer
 
SRE Demystified - 07 - Practical Alerting
SRE Demystified - 07 - Practical AlertingSRE Demystified - 07 - Practical Alerting
SRE Demystified - 07 - Practical AlertingDr Ganesh Iyer
 
SRE Demystified - 06 - Distributed Monitoring
SRE Demystified - 06 - Distributed MonitoringSRE Demystified - 06 - Distributed Monitoring
SRE Demystified - 06 - Distributed MonitoringDr Ganesh Iyer
 
SRE Demystified - 05 - Toil Elimination
SRE Demystified - 05 - Toil EliminationSRE Demystified - 05 - Toil Elimination
SRE Demystified - 05 - Toil EliminationDr Ganesh Iyer
 
SRE Demystified - 04 - Engagement Model
SRE Demystified - 04 - Engagement ModelSRE Demystified - 04 - Engagement Model
SRE Demystified - 04 - Engagement ModelDr Ganesh Iyer
 
SRE Demystified - 03 - Choosing SLIs and SLOs
SRE Demystified - 03 - Choosing SLIs and SLOsSRE Demystified - 03 - Choosing SLIs and SLOs
SRE Demystified - 03 - Choosing SLIs and SLOsDr Ganesh Iyer
 
Machine Learning for Statisticians - Introduction
Machine Learning for Statisticians - IntroductionMachine Learning for Statisticians - Introduction
Machine Learning for Statisticians - IntroductionDr Ganesh Iyer
 
Making Decisions - A Game Theoretic approach
Making Decisions - A Game Theoretic approachMaking Decisions - A Game Theoretic approach
Making Decisions - A Game Theoretic approachDr Ganesh Iyer
 
Game Theory and Engineering Applications
Game Theory and Engineering ApplicationsGame Theory and Engineering Applications
Game Theory and Engineering ApplicationsDr Ganesh Iyer
 
Machine Learning and its Applications
Machine Learning and its ApplicationsMachine Learning and its Applications
Machine Learning and its ApplicationsDr Ganesh Iyer
 
How to become a successful entrepreneur
How to become a successful entrepreneurHow to become a successful entrepreneur
How to become a successful entrepreneurDr Ganesh Iyer
 
Dockers and kubernetes
Dockers and kubernetesDockers and kubernetes
Dockers and kubernetesDr Ganesh Iyer
 
Containerization Principles Overview for app development and deployment
Containerization Principles Overview for app development and deploymentContainerization Principles Overview for app development and deployment
Containerization Principles Overview for app development and deploymentDr Ganesh Iyer
 

Mais de Dr Ganesh Iyer (20)

SRE Demystified - 14 - SRE Practices overview
SRE Demystified - 14 - SRE Practices overviewSRE Demystified - 14 - SRE Practices overview
SRE Demystified - 14 - SRE Practices overview
 
SRE Demystified - 13 - Docs that matter -2
SRE Demystified - 13 - Docs that matter -2SRE Demystified - 13 - Docs that matter -2
SRE Demystified - 13 - Docs that matter -2
 
SRE Demystified - 12 - Docs that matter -1
SRE Demystified - 12 - Docs that matter -1 SRE Demystified - 12 - Docs that matter -1
SRE Demystified - 12 - Docs that matter -1
 
SRE Demystified - 01 - SLO SLI and SLA
SRE Demystified - 01 - SLO SLI and SLASRE Demystified - 01 - SLO SLI and SLA
SRE Demystified - 01 - SLO SLI and SLA
 
SRE Demystified - 11 - Release management-2
SRE Demystified - 11 - Release management-2SRE Demystified - 11 - Release management-2
SRE Demystified - 11 - Release management-2
 
SRE Demystified - 10 - Release management-1
SRE Demystified - 10 - Release management-1SRE Demystified - 10 - Release management-1
SRE Demystified - 10 - Release management-1
 
SRE Demystified - 09 - Simplicity
SRE Demystified - 09 - SimplicitySRE Demystified - 09 - Simplicity
SRE Demystified - 09 - Simplicity
 
SRE Demystified - 07 - Practical Alerting
SRE Demystified - 07 - Practical AlertingSRE Demystified - 07 - Practical Alerting
SRE Demystified - 07 - Practical Alerting
 
SRE Demystified - 06 - Distributed Monitoring
SRE Demystified - 06 - Distributed MonitoringSRE Demystified - 06 - Distributed Monitoring
SRE Demystified - 06 - Distributed Monitoring
 
SRE Demystified - 05 - Toil Elimination
SRE Demystified - 05 - Toil EliminationSRE Demystified - 05 - Toil Elimination
SRE Demystified - 05 - Toil Elimination
 
SRE Demystified - 04 - Engagement Model
SRE Demystified - 04 - Engagement ModelSRE Demystified - 04 - Engagement Model
SRE Demystified - 04 - Engagement Model
 
SRE Demystified - 03 - Choosing SLIs and SLOs
SRE Demystified - 03 - Choosing SLIs and SLOsSRE Demystified - 03 - Choosing SLIs and SLOs
SRE Demystified - 03 - Choosing SLIs and SLOs
 
Machine Learning for Statisticians - Introduction
Machine Learning for Statisticians - IntroductionMachine Learning for Statisticians - Introduction
Machine Learning for Statisticians - Introduction
 
Making Decisions - A Game Theoretic approach
Making Decisions - A Game Theoretic approachMaking Decisions - A Game Theoretic approach
Making Decisions - A Game Theoretic approach
 
Cloud and Industry4.0
Cloud and Industry4.0Cloud and Industry4.0
Cloud and Industry4.0
 
Game Theory and Engineering Applications
Game Theory and Engineering ApplicationsGame Theory and Engineering Applications
Game Theory and Engineering Applications
 
Machine Learning and its Applications
Machine Learning and its ApplicationsMachine Learning and its Applications
Machine Learning and its Applications
 
How to become a successful entrepreneur
How to become a successful entrepreneurHow to become a successful entrepreneur
How to become a successful entrepreneur
 
Dockers and kubernetes
Dockers and kubernetesDockers and kubernetes
Dockers and kubernetes
 
Containerization Principles Overview for app development and deployment
Containerization Principles Overview for app development and deploymentContainerization Principles Overview for app development and deployment
Containerization Principles Overview for app development and deployment
 

Último

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 

Último (20)

Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 

SRE Demystified - 16 - NALSD - Non-Abstract Large System Design

  • 2. Designing distributed systems • There are many ways to design distributed systems. • One way involves growing systems organically • Components are rewritten or redesigned as the system handles more requests • Another method starts with a proof of concept. • Once the system adds value to the business, a second version is designed from the ground up 2
  • 3. Non-abstract large system design (NALSD) • NALSD describes an iterative process for designing, assessing, and evaluating distributed systems, such as Borg cluster management for distributed computing and the Google distributed file system • NALSD describes a skill critical to SRE: the ability to assess, design, and evaluate large systems • Practically, NALSD combines elements of capacity planning, component isolation, and graceful system degradation that are crucial to highly available production systems 3 https://research.google/pubs/pub51/ https://research.google/pubs/pub43438/
  • 4. Why Non-abstract? • All systems will eventually have to run on real computers in real datacenters using real networks • The people designing distributed systems need to develop and continuously exercise the muscle of turning a whiteboard design into concrete estimates of resources at multiple steps in the process • This extra bit of work up front typically leads to fewer last-minute system design changes to account for some unforeseen physical constraint • The value of this exercise is in combining many imperfect-but- reasonable results into a better understanding of the design 4 https://landing.google.com/sre/workbook/chapters/non-abstract-design/
  • 5. Design process • Iterative approach to design systems that meet our goals • Each iteration defines a potential design and examines its strengths and weaknesses • This analysis either feeds into the next iteration or indicates when the design is good enough to recommend 5
  • 6. Two-Phases 1. Basic design • Is it possible? • Is the design even possible? If we didn’t have to worry about enough RAM, CPU, network bandwidth, and so on, what would we design to satisfy the requirements? • Can we do better? • For any such design, we ask, “Can we do better?” For example, can we make the system meaningfully faster, smaller, more efficient? If the design solves the problem in O(N) time, can we solve it more quickly—say, O(ln(N)) 6
  • 7. Two-Phases 1. Scale-up the basic design • Is it feasible? • Is it possible to scale this design, given constraints on money, hardware, and so on? If necessary, what distributed design would satisfy the requirements? • Is it resilient? • Can the design fail gracefully? What happens when this component fails? How does the system work when an entire datacenter fails? • Can we do better? 7
  • 8. References • https://landing.google.com/sre/workbook/chapters/non-abstract- design/ • https://www.usenix.org/sites/default/files/conference/protected-files/ srecon18americas_slides_virji.pdf • https://cloud.google.com/blog/products/management-tools/sre- principles-and-flashcards-to-design-nalsd • https://www.youtube.com/watch?v=modXC5IWTJI 8
  • 9. Dr Ganesh Neelakanta Iyer ganesh@ganeshniyer.com ganesh.vigneswara@gmail.com