Large platform
architecture in (mostly)
  perl - an illustrated
          tour
         Tomas (t0m) Doran
         São Paulo.pm perl workshop 2010
         YAPC::EU Pisa 2010
This talk

• Is mostly a ramble
• About what I do for a living
• Good bits
• and bad bits (probably mostly bad bits)
• And when I say ‘illustrated’, I’m not very
  good at diagrams, sorry...
Making money from
  independent music

• IMPOSSIBLE
• No, no it isn’t. But we’re very lucky to have
  people who know the music industry
• A startup would tank
• Last.fm guys “keep losing less money”
The state51 conspiracy
Consolidated Independent
 Media Service Provider
 • Several (largely profitable) businesses based
   on the same technology platform
 • East London (Brick Lane), a warehouse.
 • > 60% of UK independent content goes
   through us somewhere
Being S3 on the cheap
• WAV files are big. Videos are bigger.
• Transcodes aren’t small, especially when
  you have 15 of them.
• My music collection is several hundred
  terabytes
• Need to be able to serve this stuff fast and
  concurrently.
MogileFS

• Is free.
• Runs on cheap hardware
• Cheaper than S3.
• Not so awesome if you aren’t LiveJournal
Data center design
• 8 amp racks. Seriously, WTF!?!?!
• Electricity is more expensive than servers,
  ergo rolling hardware upgrades trivially pay
  for themselves.
• Transit is really, really expensive.
• Worth buying fiber to other locations to
  peer if you need lots of bandwidth.
Platform overview
[Diagram: three identical front-end hosts. Each has:
   <VIP>
   Varnish (ESI)
   Apache, serving FCGI apps
   nginx + mogile + custom, with FCGI file auth

 Database: MySQL object store master, replicating to a
 MySQL object store slave.

 Storage: many storage nodes. Also running on them:
 Encoding (bare metal), Encoding (VMWare), Encoding SOAP service,
 Memcached, Mogile Tracker.]
Web architecture
• App servers apache, apps FastCGI, port 81
• Varnish + ESI, caching, port 80
• 1 varnish per host, talks to all the apaches
• 1 VIP per host
• Host fail: VIP transfer
• Apache/app fail (or overload), varnish
  rebalances/retries.
Web architecture (cont)

• Varnish doesn’t cache media, just provides
  failover.
• nginx sends the hit to FastCGI app.
• Returns X-Accel-Redirect.
• nginx talks to MogileFS, handles delivery.
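The hand-off described above can be sketched as a tiny WSGI app. This is a minimal sketch, not the platform's actual code: the `/protected/` internal location, the `X-Demo-Token` header and the `is_authorised` check are all hypothetical, and nginx is assumed to be configured separately to map that internal location onto the storage backend.

```python
# Hypothetical internal nginx location; the real mapping lives in nginx config.
INTERNAL_PREFIX = "/protected/"


def is_authorised(environ):
    """Placeholder check; a real app would validate a session or signed URL."""
    return environ.get("HTTP_X_DEMO_TOKEN") == "letmein"


def media_app(environ, start_response):
    """Authorise the request, then hand delivery of the bytes back to nginx."""
    if not is_authorised(environ):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"forbidden\n"]
    key = environ.get("PATH_INFO", "/").lstrip("/")
    start_response("200 OK", [
        ("Content-Type", "application/octet-stream"),
        # nginx intercepts this header and serves the internal location
        # itself, so the app process never touches the file contents.
        ("X-Accel-Redirect", INTERNAL_PREFIX + key),
    ])
    return [b""]
```

The app only decides yes or no; its response body stays empty, which is why the app processes can stay small.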
Storage architecture
• Lots of boxes with lots of disk.
• Many additional roles to storage. (Mogile
  tracker, memcache node, metal encoding,
  VMWare, SOAP Service)
• Not all the boxes do all the roles.
• All the roles can safely fall over and die.
• Which is good, as they do. Or the box falls
  over. Or a, then b.
WAV files

• WAV is a container format.
• Loosely defined.
• You can stuff XML documents in WAV files
• Some encoders (oh hai flac) very picky.
• ‘dirty’ and ‘clean’ WAV files.
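To make "container format" concrete: a WAV file is a RIFF file, that is, a 12-byte header followed by a sequence of tagged chunks. The sketch below lists the top-level chunks and flags anything beyond `fmt ` and `data`. Treating extra chunks as what makes a file "dirty" is an assumption made for illustration; the talk does not define the terms.

```python
import io
import struct


def list_riff_chunks(stream):
    """Return [(chunk_id, size), ...] for the top-level chunks of a WAV stream."""
    header = stream.read(12)
    if len(header) < 12:
        raise ValueError("too short to be a RIFF file")
    riff, _total, form = struct.unpack("<4sI4s", header)
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    while True:
        head = stream.read(8)
        if len(head) < 8:
            break
        chunk_id, size = struct.unpack("<4sI", head)
        chunks.append((chunk_id.decode("ascii", "replace"), size))
        # Chunk payloads are padded to an even number of bytes.
        stream.seek(size + (size & 1), io.SEEK_CUR)
    return chunks


def extra_chunks(chunks, expected=("fmt ", "data")):
    """Chunk ids a picky encoder might choke on (assumed definition of 'dirty')."""
    return [cid for cid, _size in chunks if cid not in expected]
```

Open a file in binary mode and pass it in, e.g. `list_riff_chunks(open(path, "rb"))`; an embedded XML document would show up as an extra chunk in the list.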
Transcoding everything


• Lots of different formats
• WMA - GNARGGH%$@*&!!
Win32

• We’re running ActiveState for hysterical
  raisins.
• No XS modules
• Thin as possible
Encoding
[Diagram: HTTP Nodes (x3), an Encoding Service and an Uploading Service
 around an Encoder box (Win32 & Unix). Inside the box: Downloader,
 Local Disk, Encoder (mp3), Encoder (wma), Uploader.
 Link labels: GET & PUT, SOAP, media.]
Snakes On A Plane

• SOAP actually works ok here, as we
  control both ends.
• Old version of SOAP::Lite
• Wouldn’t recommend interoperating
Logging
• Used to be terribly hard to debug
• Push logs into syslog
• Aggregate in splunk - time correlated from
  encoding machines, web service machines,
  etc.
• Much easier to work out what happened.
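A minimal sketch of the "push logs into syslog" step, using only the standard library. The `job=` field and the logger names are assumptions: the point is that every line carries an identifier the aggregator can correlate across machines. `/dev/log` is the usual local syslog socket on Linux; pass a `(host, port)` tuple instead to log over UDP.

```python
import logging
import logging.handlers

# Assumed line layout: program name, then a job id to correlate on.
LOG_FORMAT = "%(name)s: job=%(job_id)s %(message)s"


def syslog_handler(address="/dev/log"):
    """Build a handler that ships records to syslog in the shared format."""
    handler = logging.handlers.SysLogHandler(address=address)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    return handler


def job_logger(logger, job_id):
    """Wrap a logger so every record is stamped with the job id."""
    return logging.LoggerAdapter(logger, {"job_id": job_id})
```

Usage would look like `log = logging.getLogger("encoder")`, `log.addHandler(syslog_handler())`, then `job_logger(log, "abc123").info("wma encode finished")`. Records should always go through the adapter, since the format expects `job_id` to be present.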
Hardware is shit

• When you have several hundred TB, the undetected
  bit error rate of magnetic media is actually
  significant.
• See also networks, memory, etc.
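One common defence against silent corruption (not necessarily what this platform does) is to record a checksum when a file is ingested and re-verify it periodically. A minimal sketch, where the manifest structure is an assumption:

```python
import hashlib


def sha256_of(path, block_size=1 << 20):
    """Hash a file in blocks so large media files never sit fully in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def find_corrupt(manifest):
    """manifest maps path -> checksum recorded at ingest.

    Returns the paths whose current content no longer matches.
    """
    return [path for path, expected in sorted(manifest.items())
            if sha256_of(path) != expected]
```

With replicated storage, a mismatch on one copy can then be repaired from a copy that still verifies.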
Things will always fail

• If you need reliability, you have to design it
  in from the start.
• Not only will you have (a lot of) hardware
  failures, all the software will break in
  unexpected ways. Let’s not talk about
  networks..
• Maybe you don’t need this..
Queueing

• We have work queues of different types of
  media (e.g. mp3/wma/aac etc)
• In the database.
• Don’t do this.
MySQL sucks

• 1 type of JOIN
• No query rewriting
• Not enough stats for the planner to be
  sane
This can hurt
• File Transform table:
 • Master (File)
 • Result (File)
 • Status (pending/complete/failed/running)
 • TransformStep (from/to)
• Leads to bad join order, massive fail
MySQL sucks

       FAIL
How to fail
• SELECT all file transforms that lead to wma
  (millions).
• JOIN all files, ever (millions). Filter to find
  those in state ‘pending’
• All pending looks like a bad bet - cardinality
  of ‘all wmas’ looks better than cardinality of
  ‘all pending’.
• JOIN in the wrong order, nested loop,
  screwed..
Queueing
• Did I mention queues in the DB suck?
• Even if you’re not screwing it up.
• Get a Message Queue (or at least an async
  job server)
• If your problem is simple - Gearman.
  Harder or you need interop - RabbitMQ.
Mutable state

• Mutable state is the enemy
• Too many things rw.
• No idea how an object got to this state
Anemic domain model
  Object-oriented programming (OOP) is a
 programming paradigm that uses "objects" –
 data structures consisting of data fields and
      methods together with their
 interactions – to design applications and
computer programs. Programming techniques
     may include features such as data
abstraction, encapsulation, modularity,
       polymorphism, and inheritance.
Anemic domain model
• Superset of too much mutable state
• Able to create invalid objects
• Able to make previously valid objects
  invalid
• Violation of the encapsulation and
  information hiding principles.
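By way of contrast, here is a sketch of an object that guards its own state. The status names come from the File Transform slide; the allowed transitions (including retrying a failed transform) and the class itself are assumptions made for illustration.

```python
class InvalidTransition(Exception):
    """Raised when a caller asks for a state change the model forbids."""


class Transform:
    # Assumed transition table; "failed -> pending" models a retry.
    _ALLOWED = {
        "pending": {"running"},
        "running": {"complete", "failed"},
        "failed": {"pending"},
        "complete": set(),
    }

    def __init__(self, master, target_format):
        # Refuse to construct an invalid object in the first place.
        if not master or not target_format:
            raise ValueError("a transform needs a master file and a target format")
        self._master = master
        self._target_format = target_format
        self._status = "pending"
        self._history = ["pending"]

    @property
    def status(self):
        return self._status  # read-only: there is no setter

    @property
    def history(self):
        return tuple(self._history)  # how the object got to this state

    def _move(self, new_status):
        if new_status not in self._ALLOWED[self._status]:
            raise InvalidTransition("{} -> {}".format(self._status, new_status))
        self._status = new_status
        self._history.append(new_status)

    def start(self):
        self._move("running")

    def complete(self):
        self._move("complete")

    def fail(self):
        self._move("failed")

    def retry(self):
        self._move("pending")
```

Callers can no longer write `transform.status = "complete"` from a script, and the history answers the "how did it get here" question from the mutable state slide.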
scripts

• Lots of our business logic was in scripts
  that manipulated objects
• You need people to run scripts (in screen
  sessions)
• Ewwww, ewwwww.
Jobs
• Moved to a job based approach
• Jobs started by file creation, or changing
  state of something in a web app
• Jobs sent via message queuing.
• Results go via message queueing
• Jobs trigger other jobs
Jobs Example
• Validate XLS file supplied with order.
• Valid files trigger another job to create
  objects for each thing in the XLS
• This then triggers another job to create
  transforms, which are then done...
• ... etc ...
• Can’t do this workflow in a web request.
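The chain above can be sketched with an in-memory queue standing in for the real message broker. Everything here is hypothetical (job names, payload shape, the validation rule); it only shows the shape of "jobs trigger other jobs".

```python
from collections import deque


class JobQueue:
    """In-process stand-in for a message queue; a real one would be a broker."""

    def __init__(self):
        self._pending = deque()
        self._handlers = {}
        self.log = []  # job types in the order they were run

    def register(self, job_type, handler):
        self._handlers[job_type] = handler

    def publish(self, job_type, payload):
        self._pending.append((job_type, payload))

    def run(self):
        while self._pending:
            job_type, payload = self._pending.popleft()
            self.log.append(job_type)
            self._handlers[job_type](self, payload)


def validate_order(queue, payload):
    # Assumed rule: every row must have a title. Only valid files go further.
    rows = payload["rows"]
    if rows and all(row.get("title") for row in rows):
        queue.publish("create_objects", {"rows": rows})


def create_objects(queue, payload):
    # One follow-up job per thing in the spreadsheet.
    for row in payload["rows"]:
        queue.publish("create_transform", {"title": row["title"]})


def create_transform(queue, payload):
    pass  # a real worker would schedule the encodes here


def build_queue():
    queue = JobQueue()
    queue.register("validate_order", validate_order)
    queue.register("create_objects", create_objects)
    queue.register("create_transform", create_transform)
    return queue
```

Each handler does one small step and publishes the next, so no single step has to fit inside a web request.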
Jobs Future

• More automation of things people run
  scripts for.
• Automatic job regeneration (you will lose
  messages).
Lava flow

• Old (possibly unclean/invalid) data
• Old (unused/unmaintained) code
• “What harm does it do”
Relational integrity

• Seems to be a pipe dream more often than
  not in the real world.
• Why?
• It’s not hard
Data consistency


• This should theoretically be the same thing
  as relational integrity.
• In practice...
Mumble View Crap

• Too much logic in templates
• Copy & paste
• Business objects viewed as unchangeable
• Deleted 3000 lines from 2 simple
  workflows. This fixed a dozen bugs.
Tangram
• No LEFT JOIN
• Displaying a product list becomes an x n
  problem.
• OUCH
• Keep stupid - put the entire DB hot in
  memcache!
Don’t do web design

• You are a programmer
• Make people pay for a design/CSS/HTML
  person
• Work with them
• Be happy
Love your sysadmins
• Help them out.
• Build packages, or local::libs or something
• Keep everything in revision control
• Allow things to be sensibly configured.
• DOCUMENT THE POSSIBLE SETTINGS
• Use systems management - Puppet?
Love your logs

• Active feedback
• Aggregate in splunk
• Actively prune useless stuff
• Actively add useful stuff after a production
  incident
ESI

• Is really awesome
• Makes the pain go away
• PURGE requests
• Keep everything hot all the time
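A small sketch of both halves: a page that is assembled from ESI fragments, and a PURGE request to evict one of them. The fragment URLs are hypothetical, and PURGE only works if the cache's own configuration accepts that method, which is assumed here rather than shown.

```python
import html
import http.client


def esi_include(src):
    """Return an ESI include tag for a fragment URL."""
    return '<esi:include src="{}"/>'.format(html.escape(src, quote=True))


def product_page(product_id):
    """Page skeleton; the cache fills in each fragment independently."""
    return "\n".join([
        "<html><body>",
        esi_include("/fragment/header"),
        esi_include("/fragment/product/{}".format(product_id)),
        "</body></html>",
    ])


def purge(host, path, port=80):
    """Ask the cache to drop one URL; returns the HTTP status code."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("PURGE", path)
        return conn.getresponse().status
    finally:
        conn.close()
```

When a product changes, only its own fragment needs purging; the rest of the page stays cached.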
memcache everything

• Keep the entire database hot in memcache
• We mostly ask trivial questions, so just
  cache those paths.
• 30 GB of RAM isn’t actually much (3
  boxes..)
memcache
• IS A CACHE
• Use sequential port numbers and CNAMEs
• E.g. cache0:11210, cache1:11211,
  cache2:11212 etc..
• Run several per machine
• Allows you to scale capacity and rebalance
  without entire cache flush.
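One reading of the naming scheme above, sketched in code: clients hash keys over a fixed list of logical nodes, and DNS decides which machine each logical name currently lives on. The hashing function, the box names and the dictionaries standing in for DNS are all assumptions.

```python
import zlib

# Logical nodes never change, so the key -> node mapping never changes.
LOGICAL_NODES = ["cache0:11210", "cache1:11211", "cache2:11212"]


def node_for(key, nodes=LOGICAL_NODES):
    """Pick a logical node for a key with a stable hash."""
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


# Stand-ins for DNS: which machine each CNAME points at, before and
# after moving cache1 onto a new box.
PLACEMENT_BEFORE = {"cache0": "box-a", "cache1": "box-a", "cache2": "box-b"}
PLACEMENT_AFTER = {"cache0": "box-a", "cache1": "box-c", "cache2": "box-b"}


def physical_host(node, placement):
    """Resolve a logical node to the machine currently hosting it."""
    name, _port = node.split(":")
    return placement[name]
```

Moving `cache1` to `box-c` cold-starts only the keys that hash to `cache1`; everything on `cache0` and `cache2` stays warm. Because each instance has its own port, several can share a machine without clashing.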
Don’t push bytes

• X-Sendfile and X-Accel-Redirect
• I already talked about file delivery like this
• Using 100 MB of RAM to proxy web
  requests does not scale.
Test everything

• Redundant systems need testing
• You’ll still die unexpectedly in production
• If you can manage it, make responsibility for
  deployment SEP (somebody else’s problem).
• Thanks for listening
• Questions?

Mais conteúdo relacionado

Semelhante a Large platform architecture in (mostly) perl - an illustrated tour

The Hitchhikers Guide to client Side Persistent Storage
The Hitchhikers Guide to client Side Persistent StorageThe Hitchhikers Guide to client Side Persistent Storage
The Hitchhikers Guide to client Side Persistent StorageJens Arps
 
Networking Israeli Day 2013 - Hecatonchire: Transparent Memory Aggregation
Networking Israeli Day 2013 - Hecatonchire: Transparent Memory AggregationNetworking Israeli Day 2013 - Hecatonchire: Transparent Memory Aggregation
Networking Israeli Day 2013 - Hecatonchire: Transparent Memory Aggregationaidanshribman
 
EC2とVarnishで画像配信
EC2とVarnishで画像配信EC2とVarnishで画像配信
EC2とVarnishで画像配信Issei Naruta
 
PHP in the Cloud
PHP in the CloudPHP in the Cloud
PHP in the CloudAcquia
 
Oracle+golden+gate+introduction
Oracle+golden+gate+introductionOracle+golden+gate+introduction
Oracle+golden+gate+introductionxiakaicd
 
Magento Imagine 2013: Fabrizio Branca - Learning To Fly: How Angry Birds Reac...
Magento Imagine 2013: Fabrizio Branca - Learning To Fly: How Angry Birds Reac...Magento Imagine 2013: Fabrizio Branca - Learning To Fly: How Angry Birds Reac...
Magento Imagine 2013: Fabrizio Branca - Learning To Fly: How Angry Birds Reac...AOE
 
Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewXavier Llorà
 
Asyncifying WebAssembly for the modern Web
Asyncifying WebAssembly for the modern WebAsyncifying WebAssembly for the modern Web
Asyncifying WebAssembly for the modern WebIngvar Stepanyan
 
Offline first, the painless way
Offline first, the painless wayOffline first, the painless way
Offline first, the painless wayMarcel Kalveram
 
Bangalore cloudstack user group
Bangalore cloudstack user groupBangalore cloudstack user group
Bangalore cloudstack user groupShapeBlue
 
On Failure and Resilience
On Failure and ResilienceOn Failure and Resilience
On Failure and ResilienceMike Brittain
 
Building Content Applications with JCR and OSGi
Building Content Applications with JCR and OSGiBuilding Content Applications with JCR and OSGi
Building Content Applications with JCR and OSGiCédric Hüsler
 
Managing large and distributed Eclipse server applications.
Managing large and distributed Eclipse server applications.Managing large and distributed Eclipse server applications.
Managing large and distributed Eclipse server applications.Gunnar Wagenknecht
 
Sharded By Business Line: Migrating to a Core Database using MongoDB and Solr
Sharded By Business Line: Migrating to a Core Database using MongoDB and SolrSharded By Business Line: Migrating to a Core Database using MongoDB and Solr
Sharded By Business Line: Migrating to a Core Database using MongoDB and SolrMongoDB
 
Mongo la search platform - january 2013
Mongo la   search platform - january 2013Mongo la   search platform - january 2013
Mongo la search platform - january 2013MongoDB
 
tofu - COOKPAD's image system
tofu - COOKPAD's image systemtofu - COOKPAD's image system
tofu - COOKPAD's image systemIssei Naruta
 
料理を楽しくする画像配信システム
料理を楽しくする画像配信システム料理を楽しくする画像配信システム
料理を楽しくする画像配信システムIssei Naruta
 

Semelhante a Large platform architecture in (mostly) perl - an illustrated tour (20)

The Hitchhikers Guide to client Side Persistent Storage
The Hitchhikers Guide to client Side Persistent StorageThe Hitchhikers Guide to client Side Persistent Storage
The Hitchhikers Guide to client Side Persistent Storage
 
Networking Israeli Day 2013 - Hecatonchire: Transparent Memory Aggregation
Networking Israeli Day 2013 - Hecatonchire: Transparent Memory AggregationNetworking Israeli Day 2013 - Hecatonchire: Transparent Memory Aggregation
Networking Israeli Day 2013 - Hecatonchire: Transparent Memory Aggregation
 
EC2とVarnishで画像配信
EC2とVarnishで画像配信EC2とVarnishで画像配信
EC2とVarnishで画像配信
 
PHP in the Cloud
PHP in the CloudPHP in the Cloud
PHP in the Cloud
 
Polyglot OSGi
Polyglot OSGiPolyglot OSGi
Polyglot OSGi
 
Oracle+golden+gate+introduction
Oracle+golden+gate+introductionOracle+golden+gate+introduction
Oracle+golden+gate+introduction
 
Magento Imagine 2013: Fabrizio Branca - Learning To Fly: How Angry Birds Reac...
Magento Imagine 2013: Fabrizio Branca - Learning To Fly: How Angry Birds Reac...Magento Imagine 2013: Fabrizio Branca - Learning To Fly: How Angry Birds Reac...
Magento Imagine 2013: Fabrizio Branca - Learning To Fly: How Angry Birds Reac...
 
Meandre 2.0 Alpha Preview
Meandre 2.0 Alpha PreviewMeandre 2.0 Alpha Preview
Meandre 2.0 Alpha Preview
 
Asyncifying WebAssembly for the modern Web
Asyncifying WebAssembly for the modern WebAsyncifying WebAssembly for the modern Web
Asyncifying WebAssembly for the modern Web
 
Offline first, the painless way
Offline first, the painless wayOffline first, the painless way
Offline first, the painless way
 
Bangalore cloudstack user group
Bangalore cloudstack user groupBangalore cloudstack user group
Bangalore cloudstack user group
 
On Failure and Resilience
On Failure and ResilienceOn Failure and Resilience
On Failure and Resilience
 
Building Content Applications with JCR and OSGi
Building Content Applications with JCR and OSGiBuilding Content Applications with JCR and OSGi
Building Content Applications with JCR and OSGi
 
Managing large and distributed Eclipse server applications.
Managing large and distributed Eclipse server applications.Managing large and distributed Eclipse server applications.
Managing large and distributed Eclipse server applications.
 
Tales from the OSGi trenches
Tales from the OSGi trenchesTales from the OSGi trenches
Tales from the OSGi trenches
 
OSGi for mere mortals
OSGi for mere mortalsOSGi for mere mortals
OSGi for mere mortals
 
Sharded By Business Line: Migrating to a Core Database using MongoDB and Solr
Sharded By Business Line: Migrating to a Core Database using MongoDB and SolrSharded By Business Line: Migrating to a Core Database using MongoDB and Solr
Sharded By Business Line: Migrating to a Core Database using MongoDB and Solr
 
Mongo la search platform - january 2013
Mongo la   search platform - january 2013Mongo la   search platform - january 2013
Mongo la search platform - january 2013
 
tofu - COOKPAD's image system
tofu - COOKPAD's image systemtofu - COOKPAD's image system
tofu - COOKPAD's image system
 
料理を楽しくする画像配信システム
料理を楽しくする画像配信システム料理を楽しくする画像配信システム
料理を楽しくする画像配信システム
 

Mais de Tomas Doran

Empowering developers to deploy their own data stores
Empowering developers to deploy their own data storesEmpowering developers to deploy their own data stores
Empowering developers to deploy their own data storesTomas Doran
 
Dockersh and a brief intro to the docker internals
Dockersh and a brief intro to the docker internalsDockersh and a brief intro to the docker internals
Dockersh and a brief intro to the docker internalsTomas Doran
 
Sensu and Sensibility - Puppetconf 2014
Sensu and Sensibility - Puppetconf 2014Sensu and Sensibility - Puppetconf 2014
Sensu and Sensibility - Puppetconf 2014Tomas Doran
 
Steamlining your puppet development workflow
Steamlining your puppet development workflowSteamlining your puppet development workflow
Steamlining your puppet development workflowTomas Doran
 
Building a smarter application stack - service discovery and wiring for Docker
Building a smarter application stack - service discovery and wiring for DockerBuilding a smarter application stack - service discovery and wiring for Docker
Building a smarter application stack - service discovery and wiring for DockerTomas Doran
 
Chasing AMI - Building Amazon machine images with Puppet, Packer and Jenkins
Chasing AMI - Building Amazon machine images with Puppet, Packer and JenkinsChasing AMI - Building Amazon machine images with Puppet, Packer and Jenkins
Chasing AMI - Building Amazon machine images with Puppet, Packer and JenkinsTomas Doran
 
Deploying puppet code at light speed
Deploying puppet code at light speedDeploying puppet code at light speed
Deploying puppet code at light speedTomas Doran
 
Thinking through puppet code layout
Thinking through puppet code layoutThinking through puppet code layout
Thinking through puppet code layoutTomas Doran
 
Docker puppetcamp london 2013
Docker puppetcamp london 2013Docker puppetcamp london 2013
Docker puppetcamp london 2013Tomas Doran
 
"The worst code I ever wrote"
"The worst code I ever wrote""The worst code I ever wrote"
"The worst code I ever wrote"Tomas Doran
 
Test driven infrastructure development (2 - puppetconf 2013 edition)
Test driven infrastructure development (2 - puppetconf 2013 edition)Test driven infrastructure development (2 - puppetconf 2013 edition)
Test driven infrastructure development (2 - puppetconf 2013 edition)Tomas Doran
 
Test driven infrastructure development
Test driven infrastructure developmentTest driven infrastructure development
Test driven infrastructure developmentTomas Doran
 
London devops - orc
London devops - orcLondon devops - orc
London devops - orcTomas Doran
 
London devops logging
London devops loggingLondon devops logging
London devops loggingTomas Doran
 
Message:Passing - lpw 2012
Message:Passing - lpw 2012Message:Passing - lpw 2012
Message:Passing - lpw 2012Tomas Doran
 
Webapp security testing
Webapp security testingWebapp security testing
Webapp security testingTomas Doran
 
Webapp security testing
Webapp security testingWebapp security testing
Webapp security testingTomas Doran
 
Dates aghhhh!!?!?!?!
Dates aghhhh!!?!?!?!Dates aghhhh!!?!?!?!
Dates aghhhh!!?!?!?!Tomas Doran
 
Messaging, interoperability and log aggregation - a new framework
Messaging, interoperability and log aggregation - a new frameworkMessaging, interoperability and log aggregation - a new framework
Messaging, interoperability and log aggregation - a new frameworkTomas Doran
 

Mais de Tomas Doran (20)

Empowering developers to deploy their own data stores
Empowering developers to deploy their own data storesEmpowering developers to deploy their own data stores
Empowering developers to deploy their own data stores
 
Dockersh and a brief intro to the docker internals
Dockersh and a brief intro to the docker internalsDockersh and a brief intro to the docker internals
Dockersh and a brief intro to the docker internals
 
Sensu and Sensibility - Puppetconf 2014
Sensu and Sensibility - Puppetconf 2014Sensu and Sensibility - Puppetconf 2014
Sensu and Sensibility - Puppetconf 2014
 
Steamlining your puppet development workflow
Steamlining your puppet development workflowSteamlining your puppet development workflow
Steamlining your puppet development workflow
 
Building a smarter application stack - service discovery and wiring for Docker
Building a smarter application stack - service discovery and wiring for DockerBuilding a smarter application stack - service discovery and wiring for Docker
Building a smarter application stack - service discovery and wiring for Docker
 
Chasing AMI - Building Amazon machine images with Puppet, Packer and Jenkins
Chasing AMI - Building Amazon machine images with Puppet, Packer and JenkinsChasing AMI - Building Amazon machine images with Puppet, Packer and Jenkins
Chasing AMI - Building Amazon machine images with Puppet, Packer and Jenkins
 
Deploying puppet code at light speed
Deploying puppet code at light speedDeploying puppet code at light speed
Deploying puppet code at light speed
 
Thinking through puppet code layout
Thinking through puppet code layoutThinking through puppet code layout
Thinking through puppet code layout
 
Docker puppetcamp london 2013
Docker puppetcamp london 2013Docker puppetcamp london 2013
Docker puppetcamp london 2013
 
"The worst code I ever wrote"
"The worst code I ever wrote""The worst code I ever wrote"
"The worst code I ever wrote"
 
Test driven infrastructure development (2 - puppetconf 2013 edition)
Test driven infrastructure development (2 - puppetconf 2013 edition)Test driven infrastructure development (2 - puppetconf 2013 edition)
Test driven infrastructure development (2 - puppetconf 2013 edition)
 
Test driven infrastructure development
Test driven infrastructure developmentTest driven infrastructure development
Test driven infrastructure development
 
London devops - orc
London devops - orcLondon devops - orc
London devops - orc
 
London devops logging
London devops loggingLondon devops logging
London devops logging
 
Message:Passing - lpw 2012
Message:Passing - lpw 2012Message:Passing - lpw 2012
Message:Passing - lpw 2012
 
Webapp security testing
Webapp security testingWebapp security testing
Webapp security testing
 
Webapp security testing
Webapp security testingWebapp security testing
Webapp security testing
 
Dates aghhhh!!?!?!?!
Dates aghhhh!!?!?!?!Dates aghhhh!!?!?!?!
Dates aghhhh!!?!?!?!
 
Messaging, interoperability and log aggregation - a new framework
Messaging, interoperability and log aggregation - a new frameworkMessaging, interoperability and log aggregation - a new framework
Messaging, interoperability and log aggregation - a new framework
 
Zero mq logs
Zero mq logsZero mq logs
Zero mq logs
 

Large platform architecture in (mostly) perl - an illustrated tour

  • 1. Large platform architecture in (mostly) perl - an illustrated tour Tomas (t0m) Doran São Paulo.pm perl workshop 2010 YAPC::EU Pisa 2010
  • 2. This talk • Is mostly a ramble • About what I do for a living • Good bits • and bad bits (probably mostly bad bits) • And when I say ‘illustrated’, I’m not very good at diagrams, sorry...
  • 3. Making money from independent music • IMPOSSIBLE • No, no it isn’t. But we’re very lucky to have people who know the music industry • A startup would tank • Last.fm guys “keep losing less money”
  • 4. The state51 conspiracy Consolidated Independent Media Service Provider • Several (largely profitable) businesses based on the same technology platform • East London (Brick Lane), a warehouse. • > 60% of UK independent content goes through us somewhere
  • 5. Being S3 on the cheap • WAV files are big.Videos are bigger. • Transcodes aren’t small, especially when you have 15 of them. • My music collection is several hundred terrabytes • Need to be able to serve this stuff fast and concurrently.
  • 6. MogileFS • Is free. • Runs on cheap hardware • Cheaper then S3. • Not so awesome if you aren’t Livejournal
  • 7. Data center design • 8 amp racks. Seriously, WTF!?!?! • Electricity is more expensive than servers, ergo rolling hardware upgrades trivially pay for themselves. • Transit is really, really expensive. • Worth buying fiber to other locations to peer if you need lots of bandwidth.
  • 8. Platform overview <VIP> <VIP> <VIP> Varnish Varnish Varnish ESI: ESI: ESI: nginx nginx nginx Apache + mogile Apache + mogile Apache + mogile + custom + custom + custom FCGI FCGI le FCGI FCGI le FCGI FCGI le apps auth apps auth apps auth Also: Encoding (bare metal) Encoding (VMWare) Encoding SOAP service Memcached Mogile Tracker Storage Replication StorageStorage StorageStorageStorage MySQL MySQL StorageStorageStorage StorageStorageStorage Object store Object store StorageStorageStorage Master Slave StorageStorageStorage StorageStorageStorage StorageStorageStorage StorageStorageStorage StorageStorageStorage StorageStorageStorage StorageStorage Storage
  • 9. Web architecture • App servers apache, apps FastCGI, port 81 • Varnish + ESI, caching, port 80 • 1 varnish per host, talks to all the apaches • 1 VIP per host • Host fail:VIP transfer • Apache/app fail (or overload), varnish rebalances/retries.
  • 10. Web architecture (cont) • Varnish doesn’t cache media, just provides failover. • nginx sends the hit to FastCGI app. • Returns X-Accel-Redirect. • nginx talks to MogileFS, handles delivery.
  • 11. <VIP> <VIP> <VIP> Varnish Varnish Varnish ESI: ESI: ESI: nginx nginx nginx Apache + mogile Apache + mogile Apache + mogile + custom + custom + custom FCGI FCGI le FCGI FCGI le FCGI FCGI le apps auth apps auth apps auth Also: Encoding (bare metal) Encoding (VMWare) Encoding SOAP service Memcached Mogile Tracker Storage Replication StorageStorage StorageStorageStorage MySQL MySQL StorageStorageStorage StorageStorageStorage Object store Object store StorageStorageStorage Master Slave StorageStorageStorage StorageStorageStorage StorageStorageStorage StorageStorageStorage StorageStorageStorage StorageStorageStorage StorageStorage Storage
  • 12. Storage architecture • Lots of boxes with lots of disk. • Many additional roles to storage. (Mogile tracker, memcache node, metal encoding, VMWare, SOAP Service) • Not all the boxes do all the roles. • All the roles can safely fall over and die. • Which is good, as they do. Or the box falls over. Or a, then b.
  • 13. [Platform overview diagram from slide 8, repeated.]
  • 14. WAV files • WAV is a container format. • Loosely defined. • You can stuff XML documents in WAV files. • Some encoders (oh hai flac) are very picky. • ‘dirty’ and ‘clean’ WAV files.
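To make "container format" concrete, here is a minimal sketch that walks the top-level RIFF chunks of a WAV file. The definition of "clean" used here (only `fmt ` and `data` chunks) is an assumption for illustration; the talk does not say how state51 defines it. `make_chunk` and `make_wav` are helpers for building test input, and the `iXML` chunk id in the usage note is just one example of an XML-carrying chunk.

```python
import struct


def list_riff_chunks(data):
    """Return [(chunk_id, size), ...] for the top-level chunks of a WAV file."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        chunks.append((chunk_id.decode("ascii", "replace"), size))
        # Chunk payloads are padded to an even number of bytes.
        pos += 8 + size + (size & 1)
    return chunks


def is_clean(data, allowed=("fmt ", "data")):
    """Assumed meaning of 'clean': nothing but the chunks an encoder needs."""
    return all(chunk_id in allowed for chunk_id, _ in list_riff_chunks(data))


def make_chunk(chunk_id, payload):
    """Build one chunk: 4-byte id, little-endian size, payload, pad byte."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


def make_wav(chunks):
    """Build a minimal RIFF/WAVE byte string from (id, payload) pairs."""
    body = b"WAVE" + b"".join(make_chunk(cid, p) for cid, p in chunks)
    return b"RIFF" + struct.pack("<I", len(body)) + body
```

For example, `make_wav([(b"fmt ", fmt), (b"iXML", b"<x/>"), (b"data", audio)])` is a perfectly legal file that a picky encoder may still refuse, which is why a "cleaning" pass that rewrites only the needed chunks can be worth having.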
  • 15. Transcoding everything • Lots of different formats • WMA - GNARGGH%$@*&!!
  • 16. Win32 • We’re running ActiveState for hysterical raisins. • No XS modules • Thin as possible
  • 17. Encoding [Diagram; labels only: HTTP Nodes (x3); Encoding Service; Uploading Service; GET & PUT media; SOAP; Downloader, Encoder, Uploader; Local Disk; Encoder (mp3); Encoder (wma); Win32 and Unix.]
  • 18. Snakes On A Plane • SOAP actually works ok here, as we control both ends. • Old version of SOAP::Lite • Wouldn’t recommend interoperating
  • 19. Logging • Used to be terribly hard to debug • Push logs into syslog • Aggregate in splunk - time correlated from encoding machines, web service machines, etc. • Much easier to work out what happened.
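Time-correlating logs across machines is much easier when every line carries an identifier for the piece of work. The sketch below shows one way to do that with Python's standard `logging` module; the `job_id` field and the line format are assumptions, not the talk's actual setup. It writes to any stream so it can be tried locally; in production the `StreamHandler` could be swapped for `logging.handlers.SysLogHandler`.

```python
import io
import logging


def make_logger(name, stream):
    """Logger whose lines carry a job id, so an aggregator can join on it."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(name)s[%(process)d]: job=%(job_id)s %(message)s")
    )
    logger.handlers = [handler]
    return logger


buffer = io.StringIO()
log = make_logger("encoder", buffer)
# The same job id would be logged by the web service and the encoder.
log.info("transcode started", extra={"job_id": "j-17"})
```

Searching the aggregated logs for `job=j-17` then shows the whole path of one job across services.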
  • 20. Hardware is shit • When you have several hundred TB, the undetected bit error rate of magnetic media is actually significant. • See also networks, memory, etc.
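One common defence against silent corruption is to store a digest next to each blob and verify it on every read. The sketch below illustrates the idea with an in-memory stand-in for a storage node; `CheckedStore` is hypothetical, and the talk does not say which mechanism state51 actually uses.

```python
import hashlib


def digest(data):
    """SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


class CheckedStore:
    """In-memory stand-in for a storage node; keeps a digest beside each blob."""

    def __init__(self):
        self._blobs = {}
        self._digests = {}

    def put(self, key, data):
        self._blobs[key] = bytes(data)
        self._digests[key] = digest(data)

    def get(self, key):
        data = self._blobs[key]
        if digest(data) != self._digests[key]:
            # The caller should fall back to another replica and repair this one.
            raise IOError(f"checksum mismatch for {key!r}")
        return data
```

With replicated storage, a failed check is recoverable: read a different copy and re-replicate.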
  • 21. Things will always fail • If you need reliability, you have to design it in from the start. • Not only will you have (a lot of) hardware failures, all the software will break in unexpected ways. Let’s not talk about networks... • Maybe you don’t need this...
  • 22. Queueing • We have work queues of different types of media (e.g. mp3/wma/aac etc) • In the database. • Don’t do this.
  • 23. MySQL sucks • 1 type of JOIN • No query rewriting • Not enough stats for the planner to be sane
  • 24. This can hurt • File Transform table: • Master (File) • Result (File) • Status (pending/complete/failed/running) • TransformStep (from/to) • Leads to bad join order, massive fail
  • 25. MySQL sucks FAIL
  • 26. How to fail • SELECT all file transforms that lead to wma (millions). • JOIN all files, ever (millions). Filter to find those in state ‘pending’ • All pending looks like a bad bet - cardinality of ‘all wmas’ looks better than cardinality of ‘all pending’. • JOIN in the wrong order, nested loop, screwed..
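The failure on slide 26 comes down to arithmetic. The sketch below uses a deliberately simplified model, not MySQL's real cost model: the planner is assumed to know only how many distinct values a column has and to assume rows are spread evenly across them. All row counts are hypothetical.

```python
TOTAL_TRANSFORMS = 6_000_000  # hypothetical table size
DISTINCT_STATUSES = 4         # pending / complete / failed / running
DISTINCT_TARGETS = 15         # hypothetical: one per output format


def uniform_estimate(total_rows, distinct_values):
    """Row estimate from a planner that assumes an even spread of values."""
    return total_rows // distinct_values


est_pending = uniform_estimate(TOTAL_TRANSFORMS, DISTINCT_STATUSES)
est_wma = uniform_estimate(TOTAL_TRANSFORMS, DISTINCT_TARGETS)
planner_drives_from = "wma" if est_wma < est_pending else "pending"

# What the data really looks like (hypothetical): almost nothing is pending.
actual_pending = 200
actual_wma = 400_000

# A nested loop does roughly one lookup in the other table per driving row.
probes_chosen = actual_wma if planner_drives_from == "wma" else actual_pending
probes_best = min(actual_wma, actual_pending)
```

Under these assumptions the planner estimates 1,500,000 pending rows against 400,000 wma rows, drives the join from wma, and does about 2,000 times more lookups than starting from the 200 rows that are really pending.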
  • 27. Queueing • Did I mention queues in the DB suck? • Even if you’re not screwing it up. • Get a Message Queue (or at least an async job server) • If your problem is simple - Gearman. Harder or you need interop - RabbitMQ.
  • 28. Mutable state • Mutable state is the enemy • Too many things rw. • No idea how an object got to this state
  • 29. Anemic domain model Object-oriented programming (OOP) is a programming paradigm that uses "objects" – data structures consisting of data fields and methods together with their interactions – to design applications and computer programs. Programming techniques may include features such as data abstraction, encapsulation, modularity, polymorphism, and inheritance.
  • 30. Anemic domain model • Superset of too much mutable state • Able to create invalid objects • Able to make previously valid objects invalid • Violation of the encapsulation and information hiding principles.
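The opposite of an anemic model is an object that refuses to become invalid. The sketch below reuses the status values from slide 24; the class name, fields and the table of legal transitions are assumptions for illustration, since the talk does not spell them out.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    RUNNING = "running"
    COMPLETE = "complete"
    FAILED = "failed"


# Assumed lifecycle: which states may follow which.
ALLOWED = {
    Status.PENDING: {Status.RUNNING},
    Status.RUNNING: {Status.COMPLETE, Status.FAILED},
    Status.FAILED: {Status.PENDING},
    Status.COMPLETE: set(),
}


@dataclass(frozen=True)
class FileTransform:
    master: str
    target_format: str
    status: Status = Status.PENDING

    def __post_init__(self):
        # An invalid object can never be constructed.
        if not self.master or not self.target_format:
            raise ValueError("master and target_format are required")

    def move_to(self, new_status):
        """Return a new object in new_status, or refuse the transition."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"illegal transition {self.status.value} -> {new_status.value}"
            )
        return FileTransform(self.master, self.target_format, new_status)
```

Because instances are frozen and every change goes through `move_to`, there is exactly one place to look when asking how an object reached its state.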
  • 31. scripts • Lots of our business logic was in scripts that manipulated objects • You need people to run scripts (in screen sessions) • Ewwww, ewwwww.
  • 32. Jobs • Moved to a job-based approach. • Jobs started by file creation, or changing state of something in a web app. • Jobs sent via message queueing. • Results go via message queueing. • Jobs trigger other jobs.
  • 33. Jobs Example • Validate XLS file supplied with order. • Valid files trigger another job to create objects for each thing in the XLS • This then triggers another job to create transforms, which are then done... • ... etc ... • Can’t do this workflow in a web request.
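The chain on slide 33 can be sketched with an in-process queue standing in for the message broker. Job names, payload shapes and the `("mp3", "wma")` format list are hypothetical; a real system would publish each follow-up job to the message queue instead of appending to a `deque`.

```python
from collections import deque

HANDLERS = {}


def handles(job_type):
    """Register a function as the handler for one job type."""
    def register(fn):
        HANDLERS[job_type] = fn
        return fn
    return register


@handles("validate_order_sheet")
def validate_order_sheet(payload):
    rows = payload["rows"]
    if not rows or any("title" not in row for row in rows):
        return [("order_rejected", {"order": payload["order"]})]
    return [("create_objects", payload)]


@handles("create_objects")
def create_objects(payload):
    # One follow-up job per row of the spreadsheet.
    return [
        ("create_transforms", {"order": payload["order"], "title": row["title"]})
        for row in payload["rows"]
    ]


@handles("create_transforms")
def create_transforms(payload):
    return [
        ("encode", {"title": payload["title"], "format": fmt})
        for fmt in ("mp3", "wma")
    ]


def run(first_job):
    """Process jobs until the queue drains; return the job types seen."""
    queue = deque([first_job])
    seen = []
    while queue:
        job_type, payload = queue.popleft()
        seen.append(job_type)
        handler = HANDLERS.get(job_type)
        if handler:
            # Each handler returns the jobs it triggers.
            queue.extend(handler(payload))
    return seen
```

Each handler is small and returns only the jobs it triggers, so the workflow runs outside any web request and can be resumed from any step.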
  • 34. Jobs Future • More automation of things people run scripts for. • Automatic job regeneration (you will lose messages).
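If messages can be lost, something has to notice work that never finished and send it again. A minimal sketch of that check, assuming each job leaves a record with a status and an enqueue time (field names are hypothetical) and that jobs are safe to run twice:

```python
def stale_job_ids(records, now, timeout_seconds):
    """IDs of jobs still pending after the timeout; these get re-sent."""
    return [
        record["id"]
        for record in records
        if record["status"] == "pending"
        and now - record["queued_at"] > timeout_seconds
    ]


records = [
    {"id": 1, "status": "pending", "queued_at": 0},     # lost message
    {"id": 2, "status": "complete", "queued_at": 0},    # finished
    {"id": 3, "status": "pending", "queued_at": 3500},  # still within timeout
]
to_resend = stale_job_ids(records, now=3600, timeout_seconds=600)
```

Re-sending is only safe when handlers are idempotent, which is one more argument for the immutable, validated objects discussed earlier.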
  • 35. Lava flow • Old (possibly unclean/invalid) data • Old (unused/unmaintained) code • “What harm does it do”
  • 36. Relational integrity • Seems to be a pipe dream more often than not in the real world. • Why? • It’s not hard
  • 37. Data consistency • This should theoretically be the same thing as relational integrity. • In practice...
  • 38. Mumble View Crap • Too much logic in templates • Copy & paste • Business objects viewed as unchangeable • Deleted 3000 lines from 2 simple workflows. This fixed a dozen bugs.
  • 39. Tangram • No LEFT JOIN • Displaying a product list becomes an x n problem. • OUCH • Keep stupid - put the entire DB hot in memcache!
  • 40. Don’t do web design • You are a programmer • Make people pay for a design/CSS/HTML person • Work with them • Be happy
  • 41. Love your sysadmins • Help them out. • Build packages, or local::libs or something • Keep everything in revision control • Allow things to be sensibly configured. • DOCUMENT THE POSSIBLE SETTINGS • Use systems management - Puppet?
  • 42. Love your logs • Active feedback • Aggregate in splunk • Actively prune useless stuff • Actively add useful stuff after a production incident
  • 43. ESI • Is really awesome • Make the pain go away • PURGE requests • Keep everything hot all the time
  • 44. memcache everything • Keep the entire database hot in memcache • We mostly ask trivial questions, so just cache those paths. • 30 GB of RAM isn’t actually much (3 boxes...)
  • 45. memcache • IS A CACHE • Use sequential port numbers and CNAMEs • E.g. cache0:11210, cache1:11211, cache2:11212 etc. • Run several per machine • Allows you to scale capacity and rebalance without an entire cache flush.
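Why named buckets help: clients hash keys onto a fixed list of logical cache names, and DNS decides which machine each name lives on. The sketch below models DNS with a dictionary; `cache3:11213` extends the slide's naming pattern, and the machine names and the CRC32-modulo hashing are assumptions, not necessarily what any particular memcached client does.

```python
import zlib

# Logical buckets: fixed for the lifetime of the cache.
BUCKETS = ["cache0:11210", "cache1:11211", "cache2:11212", "cache3:11213"]

# Stand-in for DNS: which machine each CNAME currently points at (hypothetical).
BEFORE = {"cache0": "boxA", "cache1": "boxA", "cache2": "boxB", "cache3": "boxB"}
# After adding boxC, only cache3 is repointed.
AFTER = {"cache0": "boxA", "cache1": "boxA", "cache2": "boxB", "cache3": "boxC"}


def bucket_for(key, buckets=BUCKETS):
    """Map a key to a logical bucket; independent of physical machines."""
    return buckets[zlib.crc32(key.encode("utf-8")) % len(buckets)]


def machine_for(key, cname_targets):
    """Resolve the key's bucket name to the machine currently serving it."""
    host, _port = bucket_for(key).split(":")
    return cname_targets[host]


keys = [f"product:{i}" for i in range(1000)]
moved = {k for k in keys if machine_for(k, BEFORE) != machine_for(k, AFTER)}
in_cache3 = {k for k in keys if bucket_for(k) == "cache3:11213"}
```

Only the keys in the repointed bucket go cold; everything else keeps hitting the same warm instance because the key-to-bucket mapping never changed.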
  • 46. Don’t push bytes • X-Sendfile and X-Accel-Redirect • I already talked about file delivery like this • Using 100 MB of RAM to proxy web requests does not scale.
  • 47. Test everything • Redundant systems need testing • You’ll still die unexpectedly in production • If you can manage it, make responsibility for deployment SEP.
  • 48. • Thanks for listening • Questions?

Editor's Notes