How To Make Life Suck
       Less!
    (when building scalable systems)

       Bradford Stephens
    c: www.DrawnToScaleHQ.com
       b: www.roadtofailure.com
           t: @lusciouspear
About Me

• Founder, Drawn to Scale. Lead Engineer,
  Visible Technologies
• CS Degree, University of North Florida
• Former careers in politics, music, finance,
  consulting
Drawn to Scale

• Building the “Big Data” platform: ingestion,
  processing, storage, search
• Products coming: Big Log, Big Search
  (faceted), Big Message...
Topics

• Overview
• Operations
• Engineering
• Process
Everything Changes
      with Big Data

• Bar is set higher: a previously niche field,
  few standard stacks (like LAMP)
• You need to have better engineering for
  minimum success
Scalability Matters

• “Web-Scale” data is unstructured and
  exponentially interconnected
• Social Media: Catalyst
• All data is important
• Data Size != Business Size
The Traditional DB
• Excels at highly structured, normalizable
  data
• Non-Linear Scale Cost
• More data = fewer features
• Optimized for single-node
• 90% of utility is 5% of capability
Ergo, Distributed

• Optimize for the problems, no Swiss-Army
  knife
• Shared-nothing, commodity boxes
• Linear scale cost
The State of Things

• Order changed from 20 years ago:
• Cust. Experience is paramount
• Engineers are precious
• Fast I/O is expensive
• Storage is cheap
Recovery-Oriented
      Computing

1. Seamlessly Partitioned
2. Synchronously Redundant
3. Heavily Monitored
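
A minimal sketch of the first two points, assuming a simple hash-based placement scheme. The names `replicas_for` and `write`, the in-memory `stores` dict, and the default of three copies are illustrative assumptions, not something from the talk; monitoring is left out.

```python
import hashlib


def replicas_for(key, nodes, copies=3):
    """Pick `copies` distinct nodes for a key (hypothetical placement scheme)."""
    if copies > len(nodes):
        raise ValueError("not enough nodes for the requested redundancy")
    # Hash the key to a starting node, then take the next nodes in ring order.
    start = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


def write(key, value, stores, nodes, copies=3):
    """Synchronous redundancy: return only after every replica holds the value."""
    targets = replicas_for(key, nodes, copies)
    for node in targets:
        stores[node][key] = value  # stand-in for a blocking remote write
    return targets
```

The caller never chooses a node: partitioning stays invisible to it, and a write is acknowledged only once all copies exist.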
Operations

Moving the Box:Sysadmin ratio from 2:1 to
             200:1 to 2000:1


   (yes devs, you’ll care about this too)
Ops vs. Eng

• Engineers build, Ops manages
• Fixing problems: devs code+automate, ops
  hire
• Want something fixed? Call devs at 2 AM.
Config is Important

• Configuration is not 2nd-class anymore
• Needs to be tackled by Engineers
• New frameworks = months of
  configuration and experimentation
• Chef is a good start, but...
Production = Test

• Surprise! You don’t have a Test environment
  any more.
• Test Cost => Prod Cost
• Anything that’s not your data center is an
  approximation. Switches, cable, power,
  boxes, etc...
You’re Always Testing

• Constantly simulate failures and brownouts
  of boxes, racks, switches...
• “Canary in the Coal Mine”: run a box and
  rack at 175% current load.
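
One way to plan such a drill, as a rough sketch. The `topology` mapping of rack name to boxes and both function names are assumptions made for illustration; only the 175% figure comes from the slide.

```python
import random

CANARY_FACTOR = 1.75  # "175% of current load", from the slide


def canary_load(current_load):
    """Load to push at the canary box or rack, given today's normal load."""
    return current_load * CANARY_FACTOR


def pick_failure_drill(topology, rng):
    """Pick one box to kill and one rack to brown out (hypothetical drill)."""
    rack = rng.choice(sorted(topology))
    all_boxes = sorted(box for boxes in topology.values() for box in boxes)
    return {"kill_box": rng.choice(all_boxes), "brownout_rack": rack}
```

Passing in a seeded `random.Random` makes a drill repeatable when a failure needs a second look.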
Deployment


• Deploy gradually: 1 box, 2 boxes, 1 rack...
• Code granularly, backwards-compatible
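
A gradual rollout could look roughly like this. The wave sizes are one reading of "1 box, 2 boxes, 1 rack", and `install` and `healthy` are hypothetical callbacks supplied by the caller.

```python
def rollout_waves(racks):
    """Yield growing batches: 1 box, 2 boxes, rest of the first rack, then rack by rack."""
    if not racks:
        return
    first, *rest = racks
    for wave in (first[:1], first[1:3], first[3:], *rest):
        if wave:  # skip empty batches when a rack is small
            yield wave


def deploy(racks, install, healthy):
    """Install wave by wave; stop at the first wave that is not fully healthy."""
    done = []
    for wave in rollout_waves(racks):
        for box in wave:
            install(box)
        done.extend(wave)
        if not all(healthy(box) for box in wave):
            return done, False  # caller decides how to roll back `done`
    return done, True
```

Stopping early only helps if old and new versions can run side by side, which is why the backwards-compatibility bullet matters.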
Built to Fail
• “It’s working” isn’t binary
• Acting weird? Shoot it.
• Multi-system failure is common: be
  topology aware
• Avoid false negatives: something’s wrong, you
  don’t know it, and you lose customer data
• This is empowering!
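
A sketch of a non-binary health check. The three states and every threshold below are made-up placeholders to be tuned per system, not values from the talk.

```python
from enum import Enum


class Health(Enum):
    OK = "ok"
    DEGRADED = "degraded"  # alive, but "acting weird"
    DEAD = "dead"


def classify(latency_ms, error_rate, heartbeat_age_s):
    """Map raw signals to a health state (thresholds are illustrative)."""
    if heartbeat_age_s > 30:
        return Health.DEAD
    if latency_ms > 500 or error_rate > 0.01:
        return Health.DEGRADED
    return Health.OK


def should_shoot(health):
    """'Acting weird? Shoot it': evict degraded nodes, not only dead ones."""
    return health is not Health.OK
```

Treating DEGRADED as grounds for removal trades a little capacity for fewer silent failures.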
Engineering


This is Systems Software, not Applications
                 Software
This is Hard :(
• Engineering at scale is very different from
  writing a 3-tier webapp
• Care about garbage collection, election
  algorithms, data structures, access patterns,
  etc...
• CS knowledge is required, not a luxury
• DBA/RDBMS skills pretty useless
• CAP is law
Not Everything’s a Table

• Structure your data according to how it
  needs to be used
• Unstructured massive files, graphs, KV-stores
• The more your problem narrows, the
  easier it is to scale
Big Data is BIG

• Imagine your test passes taking hours
• What works at 1.5 TB may fail at 10 MB or
  2 TB
• Many tests, simple code
• Soft Delete Only
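
"Soft Delete Only" might be sketched like this, assuming a toy in-memory store. `SoftDeleteStore` and its `deleted_at` field are hypothetical names, not part of any product mentioned in the deck.

```python
import time


class SoftDeleteStore:
    """Toy key-value store where delete only marks a row and never erases it."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = {"value": value, "deleted_at": None}

    def delete(self, key, now=None):
        # Record when the row was deleted instead of removing the data.
        self._rows[key]["deleted_at"] = time.time() if now is None else now

    def get(self, key):
        row = self._rows.get(key)
        if row is None or row["deleted_at"] is not None:
            return None
        return row["value"]

    def undelete(self, key):
        self._rows[key]["deleted_at"] = None
```

With storage cheap, a bad delete becomes a flag to flip back rather than a restore from backup.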
“No, I won’t give you a
        repro”

• Often impossible to repro a bug on
  demand in a cluster
• Either fix your logging or your bug
• Log everything (we have a product for this!)
Avoiding Impedance
       Mismatch

• High vs. Low Latency vs. Throughput
• A lot of data eventually, or a little now
• MapReduce vs. Sharding/Indexing
Simple Workflow

[Diagram: workflow across three tiers]
• Tiers: Hadoop; Hadoop + HBase; Lucene + Solr + Katta
• Steps: Collect; Semantic Analysis; Unstructured Analysis; Structured Analysis; Store in HBase; Indexing; Store in Hadoop; Pull Indexes; Load/Replicate Shards; Search
Biz + Process


The softer side of distributed computing
Hiring


• Plan for more engineers, less ops
• Be aware of “context switch cost” when
  training RDBMS-folks
It’s Not Just Coding
• Be aware of research cost
• Much more time spent experimenting, not
  coding
• Coding all this from scratch is horrific
• Nailing together 10+ OSS projects is a pain
• Open source anything not “Secret sauce”
Solve your Core
         Problem

• “Making your own electricity doesn’t create
  better tasting beer”
• Plan to use an end-to-end platform in the
  future (hint: ours!)
In Summary

• Plan for everything to fail
• Test constantly in production
• Systems Software requires Computer
  Science
• Don’t build it if you don’t have to
Thanks!

• Y’all
• Road to Failure Readers
• James Hamilton, Amazon/MS
• Bradford Cross, Flightcaster
• Ryan Rawson, HBase/Stumbleupon
Useful Resources

• www.roadtofailure.com
• www.highscalability.com
• perspectives.mvdirona.com

Make Life Suck Less (Building Scalable Systems)

  • 1. How To Make Life Suck Less! (when building scalable systems) Bradford Stephens c: www.DrawnToScaleHQ.com b: www.roadtofailure.com t: @lusciouspear
  • 3. About Me • Founder, Drawn to Scale. Lead Engineer, Visible Technologies
  • 4. About Me • Founder, Drawn to Scale. Lead Engineer, Visible Technologies • CS Degree, University of North FL
  • 5. About Me • Founder, Drawn to Scale. Lead Engineer, Visible Technologies • CS Degree, University of North FL • Former careers in politics, music, finance, consulting
  • 6. Drawn to Scale • Building the “Big Data” platform: ingestion, processing, storage, search • Products coming: Big Log, Big Search (faceted), Big Message...
  • 7. Topics • Overview • Operations • Engineering • Process
  • 8. Everything Changes with Big Data • Bar is set higher: a previously niche field, few standard stacks (like LAMP) • You need to have better engineering for minimum success
  • 9. Scalability Matters • “Web-Scale” data is unstructured and exponentially interconnected • Social Media: Catalyst • All data is important • Data Size != Business Size
  • 10. The Traditional DB • Excels with highly structured, normalizable data • Non-Linear Scale Cost • More data = fewer features • Optimized for single-node • 90% of utility is 5% of capability
  • 11. Ergo, Distributed • Optimize for the problems, no Swiss-Army knife • Shared-nothing, commodity boxes • Linear scale cost
  • 12. The State of Things • Order changed from 20 years ago: • Cust. Experience is paramount • Engineers are precious • Fast I/O is expensive • Storage is cheap
  • 13. Recovery-Oriented Computing 1. Seamlessly Partitioned 2. Synchronously Redundant 3. Heavily Monitored
  • 14. Operations Moving the Sysadmin:Box ratio from 2:1 to 200:1 to 2000:1 (yes devs, you’ll care about this too)
  • 15. Ops vs. Eng • Engineers build, Ops manages • Fixing problems: devs code+automate, ops hire • Want something fixed? Call devs at 2 AM.
  • 16. Config is Important • Configuration is not 2nd-class anymore • Needs to be tackled by Engineers • New frameworks = months of configuration and experimentation • Chef is a good start, but...
  • 17. Production = Test • Surprise! You don’t have a Test environment any more. • Test Cost => Prod Cost • Anything that’s not your data center is an approximation. Switches, cable, power, boxes, etc...
  • 18. You’re Always Testing • Constantly simulate failures and brownouts of boxes, racks, switches... • “Canary in the Coal Mine”: run a box and rack at 175% current load.
  • 19. Deployment • Deploy gradually: 1 box, 2 boxes, 1 rack... • Code granularly, backwards-compatible
  • 20. Built to Fail • “It’s working” isn’t binary • Acting weird? Shoot it. • Multi-system failure is common: be topology aware • Avoid false negative: something’s wrong and you don’t know it, lose customer data • This is empowering!
  • 21. Engineering This is Systems Software, not Applications Software
  • 22. This is Hard :( • Engineering at scale is very different than writing a 3-tier webapp • Care about garbage collection, election algorithms, data structures, access patterns, etc... • CS knowledge is required, not a luxury • DBA/RDBMS skills pretty useless • CAP is law
  • 23. Not Everything’s a Table • Structure your data according to how it needs to be used • Unstructured massive files, graphs, KV- stores • The more your problem narrows, the easier it is to scale
  • 24. Big Data is BIG • Imagine your test passes taking hours • What works at 1.5 TB may fail at 10 MB or 2 TB • Many tests, simple code • Soft Delete Only
  • 25. “No, I won’t give you a repro” • Often impossible to repro a bug on demand in a cluster • Either fix your logging or your bug • Log everything (we have a product for this!)
  • 26. Avoiding Impedance Mismatch • High vs. Low Latency vs. Throughput • A lot of data eventually, or a little now • MapReduce vs. Sharding/Indexing
  • 27. Simple Workflow [diagram; labels: Collect, Semantic Analysis, Unstructured Analysis, Structured Analysis, Store in HBase, Store in Hadoop, Indexing, Load/Replicate Indexes, Pull Shards, Search; components named: Hadoop, HBase, Lucene, Solr, Katta]
  • 28. Biz + Process The softer side of distributed computing
  • 29. Hiring • Plan for more engineers, less ops • Be aware of “context switch cost” when training RDBMS-folks
  • 30. It’s Not Just Coding • Be aware of research cost • Much more time spent experimenting, not coding • Coding all this from scratch is horrific • Nailing together 10+ OSS projects is a pain • Open source anything not “Secret sauce”
  • 31. Solve your Core Problem • “Making your own electricity doesn’t create better tasting beer” • Plan to use an end-to-end platform in the future (hint: ours!)
  • 32. In Summary • Plan for everything to fail • Test constantly in production • Systems Software requires Computer Science • Don’t build it if you don’t have to
  • 33. Thanks! • Y’all • Road to Failure Readers • James Hamilton, Amazon/MS • Bradford Cross, Flightcaster • Ryan Rawson, HBase/Stumbleupon
  • 34. Useful Resources • www.roadtofailure.com • www.highscalability.com • perspectives.mvdirona.com
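
The gradual deployment on slide 19 ("1 box, 2 boxes, 1 rack...") could be sketched roughly as below. This is a minimal illustration, not the speaker's tooling: `rollout_batches`, `deploy`, `push`, and `healthy` are hypothetical names, and reading "1 rack" as "the rest of the first rack" is an assumption made for the example.

```python
from typing import Callable, Dict, List


def rollout_batches(racks: Dict[str, List[str]]) -> List[List[str]]:
    """Split hosts into batches: 1 box, 2 boxes, rest of the first rack,
    then one whole rack per batch (hypothetical staging policy)."""
    rack_names = sorted(racks)
    if not rack_names:
        return []
    first = list(racks[rack_names[0]])
    batches: List[List[str]] = []
    # 1 box, then 2 boxes, taken from the first rack.
    for size in (1, 2):
        if first:
            batches.append(first[:size])
            first = first[size:]
    # Whatever is left of the first rack completes the "1 rack" stage.
    if first:
        batches.append(first)
    # Remaining racks are deployed one rack at a time.
    for name in rack_names[1:]:
        if racks[name]:
            batches.append(list(racks[name]))
    return batches


def deploy(
    racks: Dict[str, List[str]],
    push: Callable[[str], object],
    healthy: Callable[[str], bool],
) -> List[str]:
    """Push to each batch in order and stop after the first batch that
    contains an unhealthy host. Returns the hosts that were pushed to."""
    done: List[str] = []
    for batch in rollout_batches(racks):
        for host in batch:
            push(host)
        done.extend(batch)
        if not all(healthy(host) for host in batch):
            break  # halt the rollout; later batches stay on the old code
    return done
```

Stopping at the first unhealthy batch only helps if the new code is backwards-compatible with the old, as the same slide asks, since a halted rollout leaves both versions running side by side.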