SlideShare uma empresa Scribd logo
1 de 25
Super-sizing YouTube
    with Python

       Mike Solomon
     mike@youtube.com
Welcome
this is about scaling a web application

there are a lot of things left out - mostly
mistakes and implementation details

this may generate more questions than it
answers

my goal is to give you ideas for solving your own
problems
Architecture
this is the core of scalability

systems change over time, so will your
architecture

impossible to predict the optimal approach

   start simple

   aim for local maxima

python enables flexibility
YouTube's Early Days
web boxes do everything

servlets, images, thumbnails, search

shoehorn everything into Apache, MySQL

very simple

  this survives longer than you'd think
hw load balancer




                                      httpd
                                 mod_python
                    db objects        search   thumbnails
                          biz logic
                          servlets
                         templates



                           Early Web Stack
db master



                           circa January ‘06

      db replicas
Early Key Factors in
     Engineering
really small team

  we     python

  logical separation in code

  discipline and honor - not linguistically
  enforced (don’t waste time writing code to
  restrict people)*

grown by systematically removing bottlenecks

  easy to know when something is a `win`
Running Without Tripping

user demand can grow 50% in a day

removing one bottleneck can immediately reveal
another (usually more heinous)

replace and migrate components as they become
problems

  good (python) components make this easy

  obviously, pick your battles
Good Components
     (Hypothetical)
minimize dependencies*

accept some latency

localize failures - don’t let them spread

  you are only down if it looks like you are

applies to both systems and software
Balance Machine
        Resources
more efficient resource utilization via specialized
deployment

balance based on CPU, RAM, network and disk
usage patterns

overlay orthogonal loads

  disjoint tasks running on the same physical
  hardware
Migratory Patterns of the
     Norwegian Blue
  move from mod_python to mod_fastcgi

  move thumbnails to their own machines

  make search to a remote service running on
  separate machines

  run transcoder processes on video servers

  do more with the same hardware
Serenity Now



Can you spot where we turned on
transcoding processes?
SQL Shenanigans
if you have a relational database, it will be
abused

   difficult to track the true source

series of object proxies for DB-API enable
logging

encode a portion of call stack as a query
comment* (more about this later)
Object Caching
take pressure off of relational db

can save additional resources if your objects
require significant computation to set up

memcached makes a good home for this

need good client to make this into a truly useful
service ‡

  pools and better failure handling
Software Optimization
fast vs fast enough

strive for machine efficiency - don't obsess

be scientific - collect data and understand it

can yield some surprising results

don't assume code optimization techniques from
another language are relevant

just like carpentry, measure twice cut once
Python Optimization
pure python HMAC was 40% of web cpu

  write a few lines of C

threaded comments fiasco

  overly complex algorithm to compute the
  display object tree

  simplify query, simplify algorithm
Python Optimization
psyco - specializing compiler for Python

  'hot' functions are psyco-ized

  there is a 'context switch' penalty so you
  need to experiment to see if it helps

previous threaded comments algorithm

  -closure +psyco = 400% boost
Reasonable Efficiency
pruned all the obvious leaf services

dynamic web requests are one `service`

web service is easy to scale, so it stresses out
other resources - probably a DB

DB’s are hard(er) to scale

  tricks of escalating cleverness‡

  eventually, no cards left to play
Scaling MySQL
pretty much have to go horizontal

choose your partition plan carefully

understand your data access patterns

  what queries do you run most often?

  do you have joins?

  do you need transactional consistency? why?

does an 'entity' emerge?
Partition By Entity
entities are 'transactional'

allow joins across properties of an entity

entities are migratory

cross entity is more complicated

   weaken guarantees to make it easier

   minimize activity by design
EMD, a TLA not an ORM!
 connection and transaction management

 lookup service

 query factory

 minimalist table abstraction

   ORM can be (is?) evil

 make common behaviors simple, while leaving
 some transparency to the actual database
Seismic Retrofit
apply this fundamental change to a large and
growing site

make it relatively painless with python

  multiple inheritance

  decorators

  AST plugins for validation and testing
Resulting API
all the scale-aware code nicely opaque to
application developers

base use cases are painless
User.select_by_username(db_context, username)

Video.select_by_id(db_context, video_id)

Video.select_by_user_id(db_context, user_id)
Bulk Entity Migration
hijack mysql replication to partition on the fly
while the live site is running

all DML gets tagged with an entity id

read master binlog and selectively replay it into
a set of new mini-masters

update lookup service to point to new resources
Recurring Themes
the elegance of simplicity

take reliable open software and customize it

`pythonic veneer`

DIY - filing a ticket for a bugfix doesn’t give me
a warm feeling - take matters into your own
hands*
Questions?

Mais conteúdo relacionado

Mais procurados

Server’s variations bsw2015
Server’s variations bsw2015Server’s variations bsw2015
Server’s variations bsw2015Laurent Cerveau
 
Modern Cloud Fundamentals: Misconceptions and Industry Trends
Modern Cloud Fundamentals: Misconceptions and Industry TrendsModern Cloud Fundamentals: Misconceptions and Industry Trends
Modern Cloud Fundamentals: Misconceptions and Industry TrendsChristopher Bennage
 
Presentation on Large Scale Data Management
Presentation on Large Scale Data ManagementPresentation on Large Scale Data Management
Presentation on Large Scale Data ManagementChris Bunch
 
Using a gateway to leverage on premises data in Power BI
Using a gateway to leverage on premises data in Power BIUsing a gateway to leverage on premises data in Power BI
Using a gateway to leverage on premises data in Power BIAdam Saxton
 
2016 Mastering SAP Tech - 2 Speed IT and lessons from an Agile Waterfall eCom...
2016 Mastering SAP Tech - 2 Speed IT and lessons from an Agile Waterfall eCom...2016 Mastering SAP Tech - 2 Speed IT and lessons from an Agile Waterfall eCom...
2016 Mastering SAP Tech - 2 Speed IT and lessons from an Agile Waterfall eCom...Eneko Jon Bilbao
 
From Obvious to Ingenius: Incrementally Scaling Web Apps on PostgreSQL
From Obvious to Ingenius: Incrementally Scaling Web Apps on PostgreSQLFrom Obvious to Ingenius: Incrementally Scaling Web Apps on PostgreSQL
From Obvious to Ingenius: Incrementally Scaling Web Apps on PostgreSQLKonstantin Gredeskoul
 
Tips & Tricks On Architecting Windows Azure For Costs
Tips & Tricks On Architecting Windows Azure For CostsTips & Tricks On Architecting Windows Azure For Costs
Tips & Tricks On Architecting Windows Azure For CostsNuno Godinho
 
Deep Learning Applications (dadada2017)
Deep Learning Applications (dadada2017)Deep Learning Applications (dadada2017)
Deep Learning Applications (dadada2017)Abhishek Thakur
 
Deep Learning Frameworks Using Spark on YARN by Vartika Singh
Deep Learning Frameworks Using Spark on YARN by Vartika SinghDeep Learning Frameworks Using Spark on YARN by Vartika Singh
Deep Learning Frameworks Using Spark on YARN by Vartika SinghData Con LA
 
Java Enterprise Performance - Unburdended Applications
Java Enterprise Performance - Unburdended ApplicationsJava Enterprise Performance - Unburdended Applications
Java Enterprise Performance - Unburdended ApplicationsLucas Jellema
 
Aws 12 Month Free Tier for Web Designers and Developers
Aws 12 Month Free Tier for Web Designers and DevelopersAws 12 Month Free Tier for Web Designers and Developers
Aws 12 Month Free Tier for Web Designers and DevelopersDylan Burris
 
How to Build High Performance : WordPress
How to Build High Performance : WordPressHow to Build High Performance : WordPress
How to Build High Performance : WordPressDylan Burris
 
Metaflow: The ML Infrastructure at Netflix
Metaflow: The ML Infrastructure at NetflixMetaflow: The ML Infrastructure at Netflix
Metaflow: The ML Infrastructure at NetflixBill Liu
 
What's New in H2O Driverless AI? - Arno Candel - H2O AI World London 2018
What's New in H2O Driverless AI? - Arno Candel - H2O AI World London 2018What's New in H2O Driverless AI? - Arno Candel - H2O AI World London 2018
What's New in H2O Driverless AI? - Arno Candel - H2O AI World London 2018Sri Ambati
 
Stat 5.4 Pre Sales Demo Master
Stat 5.4 Pre Sales Demo MasterStat 5.4 Pre Sales Demo Master
Stat 5.4 Pre Sales Demo Masterreachtimsq
 
Blockchain for the DBA and Data Professional
Blockchain for the DBA and Data ProfessionalBlockchain for the DBA and Data Professional
Blockchain for the DBA and Data ProfessionalKaren Lopez
 
System Update 2010 CrossRef Workshops
System Update 2010 CrossRef WorkshopsSystem Update 2010 CrossRef Workshops
System Update 2010 CrossRef WorkshopsCrossref
 
Re invent 2018 meetup presentation
Re invent 2018 meetup presentationRe invent 2018 meetup presentation
Re invent 2018 meetup presentationEliran Yamin
 
Flink Forward SF 2017: Trevor Grant - Introduction to Online Machine Learning...
Flink Forward SF 2017: Trevor Grant - Introduction to Online Machine Learning...Flink Forward SF 2017: Trevor Grant - Introduction to Online Machine Learning...
Flink Forward SF 2017: Trevor Grant - Introduction to Online Machine Learning...Flink Forward
 

Mais procurados (20)

Server’s variations bsw2015
Server’s variations bsw2015Server’s variations bsw2015
Server’s variations bsw2015
 
Modern Cloud Fundamentals: Misconceptions and Industry Trends
Modern Cloud Fundamentals: Misconceptions and Industry TrendsModern Cloud Fundamentals: Misconceptions and Industry Trends
Modern Cloud Fundamentals: Misconceptions and Industry Trends
 
Presentation on Large Scale Data Management
Presentation on Large Scale Data ManagementPresentation on Large Scale Data Management
Presentation on Large Scale Data Management
 
Using a gateway to leverage on premises data in Power BI
Using a gateway to leverage on premises data in Power BIUsing a gateway to leverage on premises data in Power BI
Using a gateway to leverage on premises data in Power BI
 
2016 Mastering SAP Tech - 2 Speed IT and lessons from an Agile Waterfall eCom...
2016 Mastering SAP Tech - 2 Speed IT and lessons from an Agile Waterfall eCom...2016 Mastering SAP Tech - 2 Speed IT and lessons from an Agile Waterfall eCom...
2016 Mastering SAP Tech - 2 Speed IT and lessons from an Agile Waterfall eCom...
 
From Obvious to Ingenius: Incrementally Scaling Web Apps on PostgreSQL
From Obvious to Ingenius: Incrementally Scaling Web Apps on PostgreSQLFrom Obvious to Ingenius: Incrementally Scaling Web Apps on PostgreSQL
From Obvious to Ingenius: Incrementally Scaling Web Apps on PostgreSQL
 
Tips & Tricks On Architecting Windows Azure For Costs
Tips & Tricks On Architecting Windows Azure For CostsTips & Tricks On Architecting Windows Azure For Costs
Tips & Tricks On Architecting Windows Azure For Costs
 
Deep Learning Applications (dadada2017)
Deep Learning Applications (dadada2017)Deep Learning Applications (dadada2017)
Deep Learning Applications (dadada2017)
 
Deep Learning Frameworks Using Spark on YARN by Vartika Singh
Deep Learning Frameworks Using Spark on YARN by Vartika SinghDeep Learning Frameworks Using Spark on YARN by Vartika Singh
Deep Learning Frameworks Using Spark on YARN by Vartika Singh
 
Java Enterprise Performance - Unburdended Applications
Java Enterprise Performance - Unburdended ApplicationsJava Enterprise Performance - Unburdended Applications
Java Enterprise Performance - Unburdended Applications
 
Aws 12 Month Free Tier for Web Designers and Developers
Aws 12 Month Free Tier for Web Designers and DevelopersAws 12 Month Free Tier for Web Designers and Developers
Aws 12 Month Free Tier for Web Designers and Developers
 
How to Build High Performance : WordPress
How to Build High Performance : WordPressHow to Build High Performance : WordPress
How to Build High Performance : WordPress
 
Trends in DNN compression
Trends in DNN compressionTrends in DNN compression
Trends in DNN compression
 
Metaflow: The ML Infrastructure at Netflix
Metaflow: The ML Infrastructure at NetflixMetaflow: The ML Infrastructure at Netflix
Metaflow: The ML Infrastructure at Netflix
 
What's New in H2O Driverless AI? - Arno Candel - H2O AI World London 2018
What's New in H2O Driverless AI? - Arno Candel - H2O AI World London 2018What's New in H2O Driverless AI? - Arno Candel - H2O AI World London 2018
What's New in H2O Driverless AI? - Arno Candel - H2O AI World London 2018
 
Stat 5.4 Pre Sales Demo Master
Stat 5.4 Pre Sales Demo MasterStat 5.4 Pre Sales Demo Master
Stat 5.4 Pre Sales Demo Master
 
Blockchain for the DBA and Data Professional
Blockchain for the DBA and Data ProfessionalBlockchain for the DBA and Data Professional
Blockchain for the DBA and Data Professional
 
System Update 2010 CrossRef Workshops
System Update 2010 CrossRef WorkshopsSystem Update 2010 CrossRef Workshops
System Update 2010 CrossRef Workshops
 
Re invent 2018 meetup presentation
Re invent 2018 meetup presentationRe invent 2018 meetup presentation
Re invent 2018 meetup presentation
 
Flink Forward SF 2017: Trevor Grant - Introduction to Online Machine Learning...
Flink Forward SF 2017: Trevor Grant - Introduction to Online Machine Learning...Flink Forward SF 2017: Trevor Grant - Introduction to Online Machine Learning...
Flink Forward SF 2017: Trevor Grant - Introduction to Online Machine Learning...
 

Semelhante a Os Solomon

Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010Yahoo Developer Network
 
The Evolution of a Scrappy Startup to a Successful Web Service
The Evolution of a Scrappy Startup to a Successful Web ServiceThe Evolution of a Scrappy Startup to a Successful Web Service
The Evolution of a Scrappy Startup to a Successful Web ServicePoornima Vijayashanker
 
PHP – Faster And Cheaper. Scale Vertically with IBM i
PHP – Faster And Cheaper. Scale Vertically with IBM iPHP – Faster And Cheaper. Scale Vertically with IBM i
PHP – Faster And Cheaper. Scale Vertically with IBM iSam Hennessy
 
UnConference for Georgia Southern Computer Science March 31, 2015
UnConference for Georgia Southern Computer Science March 31, 2015UnConference for Georgia Southern Computer Science March 31, 2015
UnConference for Georgia Southern Computer Science March 31, 2015Christopher Curtin
 
Architecting a Large Software Project - Lessons Learned
Architecting a Large Software Project - Lessons LearnedArchitecting a Large Software Project - Lessons Learned
Architecting a Large Software Project - Lessons LearnedJoão Pedro Martins
 
Intro to Cloud Architecture
Intro to Cloud ArchitectureIntro to Cloud Architecture
Intro to Cloud Architecturewlscaudill
 
Handling Data in Mega Scale Systems
Handling Data in Mega Scale SystemsHandling Data in Mega Scale Systems
Handling Data in Mega Scale SystemsDirecti Group
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Precisely
 
Modern Application Stacks
Modern Application StacksModern Application Stacks
Modern Application Stackschartjes
 
AWS case study: real estate portal
AWS case study: real estate portalAWS case study: real estate portal
AWS case study: real estate portalAndreas Chatzakis
 
Class 7: Introduction to web technology entrepreneurship
Class 7: Introduction to web technology entrepreneurshipClass 7: Introduction to web technology entrepreneurship
Class 7: Introduction to web technology entrepreneurshipallanchao
 
Web Speed And Scalability
Web Speed And ScalabilityWeb Speed And Scalability
Web Speed And ScalabilityJason Ragsdale
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...smallerror
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...xlight
 
Fixing twitter
Fixing twitterFixing twitter
Fixing twitterRoger Xia
 
Gavin M
Gavin MGavin M
Gavin MOntico
 
Software Architecture and Architectors: useless VS valuable
Software Architecture and Architectors: useless VS valuableSoftware Architecture and Architectors: useless VS valuable
Software Architecture and Architectors: useless VS valuableComsysto Reply GmbH
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudyJohn Adams
 

Semelhante a Os Solomon (20)

Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
Data Applications and Infrastructure at LinkedIn__HadoopSummit2010
 
The Evolution of a Scrappy Startup to a Successful Web Service
The Evolution of a Scrappy Startup to a Successful Web ServiceThe Evolution of a Scrappy Startup to a Successful Web Service
The Evolution of a Scrappy Startup to a Successful Web Service
 
PHP – Faster And Cheaper. Scale Vertically with IBM i
PHP – Faster And Cheaper. Scale Vertically with IBM iPHP – Faster And Cheaper. Scale Vertically with IBM i
PHP – Faster And Cheaper. Scale Vertically with IBM i
 
UnConference for Georgia Southern Computer Science March 31, 2015
UnConference for Georgia Southern Computer Science March 31, 2015UnConference for Georgia Southern Computer Science March 31, 2015
UnConference for Georgia Southern Computer Science March 31, 2015
 
Architecting a Large Software Project - Lessons Learned
Architecting a Large Software Project - Lessons LearnedArchitecting a Large Software Project - Lessons Learned
Architecting a Large Software Project - Lessons Learned
 
Intro to Cloud Architecture
Intro to Cloud ArchitectureIntro to Cloud Architecture
Intro to Cloud Architecture
 
Handling Data in Mega Scale Systems
Handling Data in Mega Scale SystemsHandling Data in Mega Scale Systems
Handling Data in Mega Scale Systems
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
 
Modern Application Stacks
Modern Application StacksModern Application Stacks
Modern Application Stacks
 
AWS case study: real estate portal
AWS case study: real estate portalAWS case study: real estate portal
AWS case study: real estate portal
 
Class 7: Introduction to web technology entrepreneurship
Class 7: Introduction to web technology entrepreneurshipClass 7: Introduction to web technology entrepreneurship
Class 7: Introduction to web technology entrepreneurship
 
Web Speed And Scalability
Web Speed And ScalabilityWeb Speed And Scalability
Web Speed And Scalability
 
Db trends final
Db trends   finalDb trends   final
Db trends final
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
 
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...Fixing Twitter  Improving The Performance And Scalability Of The Worlds Most ...
Fixing Twitter Improving The Performance And Scalability Of The Worlds Most ...
 
Fixing twitter
Fixing twitterFixing twitter
Fixing twitter
 
Fixing_Twitter
Fixing_TwitterFixing_Twitter
Fixing_Twitter
 
Gavin M
Gavin MGavin M
Gavin M
 
Software Architecture and Architectors: useless VS valuable
Software Architecture and Architectors: useless VS valuableSoftware Architecture and Architectors: useless VS valuable
Software Architecture and Architectors: useless VS valuable
 
John adams talk cloudy
John adams   talk cloudyJohn adams   talk cloudy
John adams talk cloudy
 

Mais de oscon2007

J Ruby Whirlwind Tour
J Ruby Whirlwind TourJ Ruby Whirlwind Tour
J Ruby Whirlwind Touroscon2007
 
Solr Presentation5
Solr Presentation5Solr Presentation5
Solr Presentation5oscon2007
 
Os Fitzpatrick Sussman Wiifm
Os Fitzpatrick Sussman WiifmOs Fitzpatrick Sussman Wiifm
Os Fitzpatrick Sussman Wiifmoscon2007
 
Performance Whack A Mole
Performance Whack A MolePerformance Whack A Mole
Performance Whack A Moleoscon2007
 
Os Lanphier Brashears
Os Lanphier BrashearsOs Lanphier Brashears
Os Lanphier Brashearsoscon2007
 
Os Fitzpatrick Sussman Swp
Os Fitzpatrick Sussman SwpOs Fitzpatrick Sussman Swp
Os Fitzpatrick Sussman Swposcon2007
 
Os Berlin Dispelling Myths
Os Berlin Dispelling MythsOs Berlin Dispelling Myths
Os Berlin Dispelling Mythsoscon2007
 
Os Keysholistic
Os KeysholisticOs Keysholistic
Os Keysholisticoscon2007
 
Os Jonphillips
Os JonphillipsOs Jonphillips
Os Jonphillipsoscon2007
 
Os Urnerupdated
Os UrnerupdatedOs Urnerupdated
Os Urnerupdatedoscon2007
 

Mais de oscon2007 (20)

J Ruby Whirlwind Tour
J Ruby Whirlwind TourJ Ruby Whirlwind Tour
J Ruby Whirlwind Tour
 
Solr Presentation5
Solr Presentation5Solr Presentation5
Solr Presentation5
 
Os Borger
Os BorgerOs Borger
Os Borger
 
Os Harkins
Os HarkinsOs Harkins
Os Harkins
 
Os Fitzpatrick Sussman Wiifm
Os Fitzpatrick Sussman WiifmOs Fitzpatrick Sussman Wiifm
Os Fitzpatrick Sussman Wiifm
 
Os Bunce
Os BunceOs Bunce
Os Bunce
 
Yuicss R7
Yuicss R7Yuicss R7
Yuicss R7
 
Performance Whack A Mole
Performance Whack A MolePerformance Whack A Mole
Performance Whack A Mole
 
Os Fogel
Os FogelOs Fogel
Os Fogel
 
Os Lanphier Brashears
Os Lanphier BrashearsOs Lanphier Brashears
Os Lanphier Brashears
 
Os Tucker
Os TuckerOs Tucker
Os Tucker
 
Os Fitzpatrick Sussman Swp
Os Fitzpatrick Sussman SwpOs Fitzpatrick Sussman Swp
Os Fitzpatrick Sussman Swp
 
Os Furlong
Os FurlongOs Furlong
Os Furlong
 
Os Berlin Dispelling Myths
Os Berlin Dispelling MythsOs Berlin Dispelling Myths
Os Berlin Dispelling Myths
 
Os Kimsal
Os KimsalOs Kimsal
Os Kimsal
 
Os Pruett
Os PruettOs Pruett
Os Pruett
 
Os Alrubaie
Os AlrubaieOs Alrubaie
Os Alrubaie
 
Os Keysholistic
Os KeysholisticOs Keysholistic
Os Keysholistic
 
Os Jonphillips
Os JonphillipsOs Jonphillips
Os Jonphillips
 
Os Urnerupdated
Os UrnerupdatedOs Urnerupdated
Os Urnerupdated
 

Último

Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1DianaGray10
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXTarek Kalaji
 

Último (20)

Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1UiPath Platform: The Backend Engine Powering Your Automation - Session 1
UiPath Platform: The Backend Engine Powering Your Automation - Session 1
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
20150722 - AGV
20150722 - AGV20150722 - AGV
20150722 - AGV
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
VoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBXVoIP Service and Marketing using Odoo and Asterisk PBX
VoIP Service and Marketing using Odoo and Asterisk PBX
 

Os Solomon

  • 1. Super-sizing YouTube with Python Mike Solomon mike@youtube.com
  • 2. Welcome this is about scaling a web application there are a lot of things left out - mostly mistakes and implementation details this may generate more questions than it answers my goal is to give you ideas for solving your own problems
  • 3. Architecture this is the core of scalability systems change over time, so will your architecture impossible to predict the optimal approach start simple aim for local maxima python enables flexibility
  • 4. YouTube's Early Days web boxes do everything servlets, images, thumbnails, search shoehorn everything into Apache, MySQL very simple this survives longer than you'd think
  • 5. hw load balancer httpd mod_python db objects search thumbnails biz logic servlets templates Early Web Stack db master circa January ‘06 db replicas
  • 6. Early Key Factors in Engineering really small team we python logical separation in code discipline and honor - not linguistically enforced (don’t waste time writing code to restrict people)* grown by systematically removing bottlenecks easy to know when something is a `win`
  • 7. Running Without Tripping user demand can grow 50% in a day removing one bottleneck can immediately reveal another (usually more heinous) replace and migrate components as they become problems good (python) components make this easy obviously, pick your battles
  • 8. Good Components (Hypothetical) minimize dependencies* accept some latency localize failures - don’t let them spread you are only down if it looks like you are applies to both systems and software
  • 9. Balance Machine Resources more efficient resource utilization via specialized deployment balance based on CPU, RAM, network and disk usage patterns overlay orthogonal loads disjoint tasks running on the same physical hardware
  • 10. Migratory Patterns of the Norwegian Blue move from mod_python to mod_fastcgi move thumbnails to their own machines make search to a remote service running on separate machines run transcoder processes on video servers do more with the same hardware
  • 11. Serenity Now Can you spot where we turned on transcoding processes?
  • 12. SQL Shenanigans if you have a relational database, it will be abused difficult to track the true source series of object proxies for DB-API enable logging encode a portion of call stack as a query comment* (more about this later)
  • 13. Object Caching take pressure off of relational db can save additional resources if your objects require significant computation to set up memcached makes a good home for this need good client to make this into a truly useful service ‡ pools and better failure handling
  • 14. Software Optimization fast vs fast enough strive for machine efficiency - don't obsess be scientific - collect data and understand it can yield some surprising results don't assume code optimization techniques from another language are relevant just like carpentry, measure twice cut once
  • 15. Python Optimization pure python HMAC was 40% of web cpu write a few lines of C threaded comments fiasco overly complex algorithm to compute the display object tree simplify query, simplify algorithm
  • 16. Python Optimization psyco - specializing compiler for Python 'hot' functions are psyco-ized there is a 'context switch' penalty so you need to experiment to see if it helps previous threaded comments algorithm -closure +psyco = 400% boost
  • 17. Reasonable Efficiency pruned all the obvious leaf services dynamic web requests are one `service` web service is easy to scale, so it stresses out other resources - probably a DB DB’s are hard(er) to scale tricks of escalating cleverness‡ eventually, no cards left to play
  • 18. Scaling MySQL pretty much have to go horizontal choose your partition plan carefully understand your data access patterns what queries do you run most often? do you have joins? do you need transactional consistency? why? does an 'entity' emerge?
  • 19. Partition By Entity entities are 'transactional' allow joins across properties of an entity entities are migratory cross entity is more complicated weaken guarantees to make it easier minimize activity by design
  • 20. EMD, a TLA not an ORM! connection and transaction management lookup service query factory minimalist table abstraction ORM can be (is?) evil make common behaviors simple, while leaving some transparency to the actual database
  • 21. Seismic Retrofit apply this fundamental change to a large and growing site make it relatively painless with python multiple inheritance decorators AST plugins for validation and testing
  • 22. Resulting API all the scale-aware code nicely opaque to application developers base use cases are painless User.select_by_username(db_context, username) Video.select_by_id(db_context, video_id) Video.select_by_user_id(db_context, user_id)
  • 23. Bulk Entity Migration hijack mysql replication to partition on the fly while the live site is running all DML gets tagged with an entity id read master binlog and selectively replay it into a set of new mini-masters update lookup service to point to new resources
  • 24. Recurring Themes the elegance of simplicity take reliable open software and customize it `pythonic veneer` DIY - filing a ticket for a bugfix doesn’t give me a warm feeling - take matters into your own hands*