Scaling DevOps of Microservices at Uber (Code Conf 2018)


Scaling DevOps at Uber - How Uber scales its microservices-based platform, including how it builds, deploys, runs, tests, and monitors services.


Scaling DevOps at Uber
Kiran Bondalapati, Uber
Igniting opportunity by setting the world in motion

10+ billion trips
15M+ trips per day
6 continents, 65 countries and 600+ cities
75M active monthly users
3M+ active drivers
16,000+ employees worldwide
3000+ developers worldwide

Bits + Atoms
0101010101010101010101
1010101010101011100111
0001110010101010101001

12/31/15 - 1 Billion Trips
6/18/16 - 2 Billion Trips
5/20/17 - 5 Billion Trips
6/10/18 - 10 Billion Trips
Business

1000s of Microservices
1000s of builds per day
10000+ deployments per day
100K+ service containers per cluster
~1M batch containers per day
DevOps

CODE
BUILD
DEPLOY
TEST
RUN
MONITOR
DevOps

Pre-history: PHP (outsourced)
Marketplace: Node.js, moving to Go
Core Services: Python, moving to Go, Java
Maps: Python and Java
Data: Python and Java
Metrics: Go
Code
20000+ repos
Multiple languages and frameworks
Multiple communication protocols

4000+ builds per day
Build times affect developer productivity
Build sizes affect deployments
Build
Build without Docker
Optimize layer generation
Distributed cache for intermediate layers
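
One way to picture a cache for intermediate layers is a content-addressed key per build step: if the parent layer, the instruction, and the inputs are unchanged, the layer can be reused by any build host. The sketch below is only an illustration under that assumption; `layer_key`, `LayerCache`, and `build_image` are hypothetical names, and the in-memory dictionary stands in for a shared cache service that the slides do not describe.

```python
import hashlib


def layer_key(parent_key: str, instruction: str, input_digests: list) -> str:
    """Derive a cache key from the parent layer, the build step, and its inputs."""
    h = hashlib.sha256()
    h.update(parent_key.encode())
    h.update(b"\0")
    h.update(instruction.encode())
    for digest in sorted(input_digests):
        h.update(b"\0")
        h.update(digest.encode())
    return h.hexdigest()


class LayerCache:
    """In-memory stand-in (assumption) for a cache shared by all build hosts."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_build(self, key, build_fn):
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        layer = build_fn()
        self._store[key] = layer
        return layer


def build_image(cache, steps):
    """steps is a list of (instruction, input_digests); returns the layer keys."""
    parent = ""
    keys = []
    for instruction, inputs in steps:
        key = layer_key(parent, instruction, inputs)
        # The build function here is a placeholder for the real, expensive step.
        cache.get_or_build(key, lambda: "layer for " + instruction)
        keys.append(key)
        parent = key  # a changed step invalidates every later layer
    return keys
```

Because each key includes its parent key, changing only the last step leaves the earlier layers cacheable, which is why step ordering matters when optimizing layer generation.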

100s of services pulling 1000s of images from the registry
Deploy
Vertical Scaling
Horizontal Scaling
P2P Distribution - Scales with Load
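
A rough way to see why P2P distribution scales with load: with a registry alone, serving capacity is fixed, while with peers every host that already holds an image adds serving capacity. The toy model below makes strong simplifying assumptions (discrete rounds, a fixed number of registry slots per round, each host serving one peer per round); the function names and the numbers are illustrative and not taken from the talk.

```python
def rounds_registry_only(hosts: int, registry_slots: int) -> int:
    """Rounds needed when only the registry serves (registry_slots must be > 0)."""
    return -(-hosts // registry_slots)  # ceiling division


def rounds_p2p(hosts: int, registry_slots: int) -> int:
    """Rounds needed when every host that has the image also serves one peer."""
    have = 0
    rounds = 0
    while have < hosts:
        # The registry serves registry_slots hosts and each holder serves one more.
        have = min(hosts, have + registry_slots + have)
        rounds += 1
    return rounds


print(rounds_registry_only(1000, 10))  # 100
print(rounds_p2p(1000, 10))            # 7
```

In this model the registry-only time grows linearly with the number of hosts, while the P2P time grows roughly with its logarithm.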

Reproduce Halloween and New Year
Systemic issues are hard to catch in unit tests
Cascading failures are common in real life
Test
Hailstorm load testing framework
uDestroy random failure injection framework
Regular failure and failover drills
no testee … no workee
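
Random failure injection can be sketched as a wrapper that makes a dependency call fail with some probability, so that the fallback path is exercised regularly. This is a generic illustration, not uDestroy's design or API, which the slides do not show; `InjectedFailure`, `with_failure_injection`, and `call_with_fallback` are hypothetical names.

```python
import random


class InjectedFailure(Exception):
    """Raised instead of calling the real dependency."""


def with_failure_injection(fn, failure_rate, rng):
    """Wrap fn so that roughly failure_rate of the calls raise InjectedFailure."""
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFailure("injected failure in " + fn.__name__)
        return fn(*args, **kwargs)
    return wrapper


def call_with_fallback(primary, fallback, *args):
    """Call primary and use fallback when a failure was injected."""
    try:
        return primary(*args)
    except InjectedFailure:
        return fallback(*args)


def live_price(city):
    return "live price for " + city


def cached_price(city):
    return "cached price for " + city


flaky_price = with_failure_injection(live_price, 0.2, random.Random(7))
results = [call_with_fallback(flaky_price, cached_price, "SF") for _ in range(10)]
```

Passing the random generator in explicitly keeps a drill reproducible when a seed is fixed.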

Containers are sized for peak load
Dynamic utilization affects cluster efficiency
Typical auto-scaling does not help
Run
Combine responsive and revocable tasks
Oversubscribe resources
Rate limiting of revocation
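
The three points above can be read together as follows: responsive (latency-sensitive) containers are sized for peak but usually use less, so revocable (batch) tasks are admitted into the slack, and when responsive usage rises the revocable tasks are evicted, but only a few per interval. The sketch below is a minimal model under those assumptions; the `Host` class, its fields, and the numbers are hypothetical and not taken from Uber's scheduler.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    capacity: int                  # total CPU units on the host
    max_revocations_per_tick: int  # rate limit on evictions
    revocable: list = field(default_factory=list)  # CPU of each batch task

    def slack(self, responsive_usage):
        """Capacity left after current responsive usage and batch tasks."""
        return self.capacity - responsive_usage - sum(self.revocable)

    def admit_revocable(self, cpu, responsive_usage):
        """Oversubscribe: admit against measured usage, not the peak reservation."""
        if cpu <= self.slack(responsive_usage):
            self.revocable.append(cpu)
            return True
        return False

    def tick(self, responsive_usage):
        """Evict batch tasks while overcommitted, up to the per-tick limit."""
        revoked = []
        while (self.slack(responsive_usage) < 0 and self.revocable
               and len(revoked) < self.max_revocations_per_tick):
            revoked.append(self.revocable.pop())
        return revoked
```

The per-tick limit trades a short period of overcommit for avoiding a mass eviction of batch work on a brief spike.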

M3 metrics platform
~5B time series
~10M metrics/sec
Changing services, metrics, infrastructure, ...
Monitoring
Rule-based alert generators
Git-based review and update
Measure on-call quality
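
A rule-based generator can be sketched as templates expanded per service, so that a new service gets its alerts without hand-written configuration, and the templates themselves can live in a repository and go through review. The example below is an assumption-laden illustration and not M3's API; `Rule`, `generate_rules`, `evaluate`, and the metric names are hypothetical.

```python
import operator
from dataclasses import dataclass

OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}


@dataclass(frozen=True)
class Rule:
    name: str
    metric: str
    op: str
    threshold: float
    severity: str


def generate_rules(services, templates):
    """Expand every template for every service."""
    return [
        Rule(
            name=svc + "." + t["name"],
            metric=svc + "." + t["metric"],
            op=t["op"],
            threshold=t["threshold"],
            severity=t["severity"],
        )
        for svc in services
        for t in templates
    ]


def evaluate(rules, samples):
    """Return the rules whose metric is present and breaches its threshold."""
    return [
        r for r in rules
        if r.metric in samples and OPS[r.op](samples[r.metric], r.threshold)
    ]


templates = [{"name": "high_error_rate", "metric": "error_rate",
              "op": ">", "threshold": 0.05, "severity": "critical"}]
rules = generate_rules(["trips", "maps"], templates)
firing = evaluate(rules, {"trips.error_rate": 0.10, "maps.error_rate": 0.01})
print([r.name for r in firing])  # ['trips.high_error_rate']
```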

Hardware and software tend to have faults
100M+ alerts per month across Uber stack
Many faults are transient/temporary
Remediate
Smart alert prioritization
Automate manual tasks - reboot, restart, ...
SLA-aware remediation
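
One possible reading of these three points, as a sketch: handle the most severe alerts first, drop alerts that stopped firing on a re-check (transient faults), and restart automatically only when doing so keeps the service above a minimum replica count; otherwise escalate to a person. The `Alert` fields, the action names, and the replica-floor model of an SLA are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    service: str
    severity: int       # higher is more urgent
    still_firing: bool  # result of a re-check after a short wait


def plan_remediation(alerts, in_service, min_in_service):
    """Return (service, action) pairs in priority order.

    in_service maps a service to its replicas currently serving traffic;
    min_in_service maps a service to the floor its SLA requires (assumption).
    """
    remaining = dict(in_service)  # do not mutate the caller's view
    plan = []
    for alert in sorted(alerts, key=lambda a: -a.severity):
        if not alert.still_firing:
            plan.append((alert.service, "ignore"))  # transient fault
        elif remaining[alert.service] - 1 >= min_in_service[alert.service]:
            remaining[alert.service] -= 1  # replica is out while it restarts
            plan.append((alert.service, "restart"))
        else:
            plan.append((alert.service, "escalate"))  # restart would breach SLA
    return plan
```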

Auto-
Provision
Deploy
Config
Scale
Update
Detect
Recovery
Remedy

Standards-based innovation
Layercake architecture
Avoid cyclic dependencies
Avoid cascading failures while designing
Incremental deployments - code and config
Test often … including in production
Add guardrails to automation
Design for understandability
Learnings
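
The advice to avoid cyclic dependencies can be enforced mechanically if the service call graph is available as data. The sketch below runs a depth-first search over a hypothetical map of service name to the services it calls and returns one cycle if any exists; the service names are made up, and the recursive form is meant for small graphs.

```python
def find_cycle(deps):
    """deps maps a service to the services it calls; returns a cycle or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in deps.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:  # back edge to a service on the current path
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in deps:
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


layered = {"api": ["trips"], "trips": ["pricing"], "pricing": ["maps"], "maps": []}
print(find_cycle(layered))  # None

cyclic = dict(layered, maps=["trips"])
print(find_cycle(cyclic))   # ['trips', 'pricing', 'maps', 'trips']
```

A check like this could run as a guardrail before a new dependency is merged.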

Larger systems
Bigger impact of change
Scale
Larger teams
Less each person knows
Our understanding of systems breaks more often than actual systems do

Proprietary and confidential © 2018 Uber Technologies, Inc. All rights reserved. No part of this
document may be reproduced or utilized in any form or by any means, electronic or mechanical,
including photocopying, recording, or by any information storage or retrieval systems, without
permission in writing from Uber. This document is intended only for the use of the individual or entity
to whom it is addressed and contains information that is privileged, confidential or otherwise exempt
from disclosure under applicable law. All recipients of this document are notified that the information
contained herein includes proprietary and confidential information of Uber, and recipient may not
make use of, disseminate, or in any way disclose this document or any of the enclosed information
to any person other than employees of addressee to the extent necessary for consultations with
authorized personnel of Uber.
We are hiring!
www.uber.com/careers/



Microservices (figures): March 2016 and March 2018
