Performance Architecture for Cloud
March 7, 2011
Adrian Cockcroft
@adrianco #netflixcloud #ccevent
http://www.linkedin.com/in/adriancockcroft
acockcroft@netflix.com
  
Who, Why, What
Netflix in the Cloud
Cloud Performance Challenges
Performance Architecture and Tools
  
Netflix.com is now ~100% Cloud
See http://techblog.netflix.com
Detailed SlideShare presentation: Netflix on Cloud
http://slideshare.net/adrianco
We have 25 minutes - not half a day to discuss everything!
  
A Nice Problem To Have…
http://techblog.netflix.com/2011/02/redesigning-netflix-api.html
37x Growth Jan 2010-Jan 2011
  
Data Center
We stopped building our own datacenters
Capacity growth is accelerating, unpredictable
Product launch spikes - iPhone, Wii, PS3, XBox
  
We want to use clouds, we don’t have time to build them
Public cloud for agility and scale
AWS because they are big enough to allocate thousands of instances per hour for us
  
Netflix EC2 Instances per Account
(summer 2010, production is up ~3x now…)
[Chart: series Content Encoding, Test and Production, Log Analysis; scale labels “Many Thousands” and “Several Months”]
  
AWS Performance?
Mostly good, better than expected overall
•  The Good
   –  Large EC2 Instance types (esp. the m2 range)
   –  Internal disk performance
   –  Network performance within and between Availability Zones
   –  Robustness and scalability of S3, SQS
•  The Bad
   –  Elastic Load Balancer has too many limitations
   –  SimpleDB needs memcached front end, too many limitations at Terabyte scale
•  The Ugly
   –  EBS performance is slow and inconsistent, we avoid it
  
Learnings
•  Datacenter oriented tools don’t work
   –  Ephemeral instances
   –  High rate of change
   –  Need too much hand-holding and manual setup
•  Cloud Tools Don’t Scale for Enterprise
   –  Too many tools are “Startup” oriented
   –  Built our own tools for 1000’s of instances
   –  Drove vendors to be dynamic, scale, add APIs
•  “fork-lifted” apps are fragile
   –  Too many datacenter oriented assumptions
   –  We re-wrote our code base!
   –  (We re-write it continuously anyway)
  
Cloud Performance Challenges
Model Driven Architecture
Capacity Planning & Metrics
  
Model Driven Architecture
•  Datacenter Practices
   –  Lots of unique hand-tweaked systems
   –  Hard to enforce patterns
•  Model Driven Cloud Architecture
   –  Perforce/Ivy/Hudson based builds for everything
   –  Every production instance is a pre-baked AMI
   –  Every application is managed by an Autoscaler
No exceptions, every change is a new AMI
  
Model Driven Implications
•  Automated “Least Privilege” Security
   –  Tightly specified security groups
   –  Fine grain IAM keys to access AWS resources
   –  Performance tools security and integration
•  Model Driven Performance Monitoring
   –  Hundreds of instances appear in a few minutes…
   –  Tools have to “garbage collect” dead instances
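
The "garbage collect" requirement above can be sketched as a heartbeat registry: instances register themselves by reporting in, and the monitoring side drops any instance it has not heard from within a time-to-live. This is only an illustration under those assumptions, not the tooling described in the talk; the class name `InstanceRegistry`, the TTL value and the instance IDs are hypothetical.

```python
import time


class InstanceRegistry:
    """Tracks instances by last heartbeat and expires the silent ones."""

    def __init__(self, ttl_seconds, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self.clock = clock      # injectable so the example can use a fake clock
        self.last_seen = {}     # instance id -> timestamp of last heartbeat

    def heartbeat(self, instance_id):
        """Record that an instance reported in; unknown instances self-register."""
        self.last_seen[instance_id] = self.clock()

    def collect_garbage(self):
        """Remove and return the instances silent for longer than the TTL."""
        now = self.clock()
        dead = [instance_id for instance_id, seen in self.last_seen.items()
                if now - seen > self.ttl_seconds]
        for instance_id in dead:
            del self.last_seen[instance_id]
        return sorted(dead)


if __name__ == "__main__":
    fake_now = [0.0]  # hypothetical clock, in seconds
    registry = InstanceRegistry(ttl_seconds=300, clock=lambda: fake_now[0])
    registry.heartbeat("i-aaa")
    registry.heartbeat("i-bbb")
    fake_now[0] = 200.0
    registry.heartbeat("i-bbb")          # i-aaa has gone quiet
    fake_now[0] = 400.0
    print(registry.collect_garbage())    # instances dropped from monitoring
```

Because registration happens on the first heartbeat, hundreds of new instances need no manual setup, and terminated ones disappear without anyone editing a host list.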
  	
  
Capacity Planning & Metrics
  
What is Capacity Planning?
•  We care about
   –  CPU, Memory, Network and Disk resources consumed
   –  Application response times
•  We need to know
   –  how much of each resource we are using now
   –  how much will we use in the future
   –  how much headroom we have to handle higher loads
•  We want to understand
   –  how headroom varies
   –  how it relates to response times and throughput
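
As a rough illustration of the headroom question, the sketch below assumes headroom means the fraction of a resource's capacity still unused at the observed peak. The function names and the sample figures are hypothetical, and the linear-scaling assumption in the second function often breaks down near saturation.

```python
def headroom(capacity, samples):
    """Fraction of capacity still unused at the highest observed usage."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if not samples:
        raise ValueError("need at least one usage sample")
    return 1.0 - max(samples) / capacity


def load_growth_supported(capacity, samples):
    """Multiple of the current peak load the resource could carry,
    assuming usage scales linearly with load."""
    peak = max(samples)
    if peak <= 0:
        raise ValueError("peak usage must be positive")
    return capacity / peak


if __name__ == "__main__":
    # Hypothetical network usage samples in Mbit/s on a 1000 Mbit/s link.
    usage = [220.0, 310.0, 400.0, 280.0]
    print(f"headroom: {headroom(1000.0, usage):.0%}")
    print(f"supports {load_growth_supported(1000.0, usage):.1f}x peak load")
```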
  
Capacity Planning in Clouds
(a few things have changed…)
•  Capacity is expensive
•  Capacity takes time to buy and provision
•  Capacity only increases, can’t be shrunk easily
•  Capacity comes in big chunks, paid up front
•  Planning errors can cause big problems
•  Systems are clearly defined assets
•  Systems can be instrumented in detail
•  Depreciate assets over 3 years (reservations!)
  
OK, so just give me the data!
Throughput – not hard
Response Time – mean+2xSD? %iles?
Utilization….
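
The response time question (mean+2xSD or percentiles?) can be made concrete with a small comparison. This is a sketch with hypothetical latency samples and a nearest-rank percentile definition chosen for simplicity; it is not taken from the talk.

```python
import math
import statistics


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def mean_plus_2sd(values):
    """Mean plus two population standard deviations."""
    return statistics.mean(values) + 2 * statistics.pstdev(values)


if __name__ == "__main__":
    # Hypothetical response times in ms: mostly fast, with a slow tail.
    latencies = [10] * 90 + [50] * 8 + [1000] * 2
    print("mean+2SD:", round(mean_plus_2sd(latencies), 1))
    print("p50:", percentile(latencies, 50))
    print("p95:", percentile(latencies, 95))
    print("p99:", percentile(latencies, 99))
```

With this skewed sample, mean+2SD comes out near 310 ms, a value no request actually took, while p95 is 50 ms and p99 is 1000 ms. Percentiles describe what users experienced; mean+2SD only does so when the distribution is close to normal, which response times rarely are.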
  
Utilization
“Utilization is virtually useless as a metric”
CMG 2006 Paper by Adrian Cockcroft
Virtualization is a DOS attack on Capacity Planning…
  
What would you say if you were asked:
Q: That system is slow, how busy is it?
A: I have no idea…
A: The graph in this tool looks about 50%
A: But the graph in this other tool is 65%
A: Amazon CloudWatch says 82%
A: Linux says us sy ni id wa st :-(
A: Why do you want to know?
A: I’m sorry, you don’t understand your question….
  
What's the problem with Utilization?
•  CPU Capacity
   –  Varying capacity due to multi-tenancy
   –  Non-identical servers or CPUs (check /proc/cpuinfo)
   –  Non-linear capacity due to hyperthreading etc.
•  Measurement Errors
   –  Monitoring tools that ignore “stolen time” (all of them)
   –  Mechanisms with built in bias (clock tick counting)
   –  Platform and release specific changes in metrics
Every tool shows a different value for the same metric!
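
One way to see why tools disagree is to compute "CPU busy" from the same Linux /proc/stat counters under different conventions for stolen time. The sketch below parses the aggregate `cpu` line, whose first eight columns are user, nice, system, idle, iowait, irq, softirq and steal. The counter values in the example are made up, and the three formulas are illustrative conventions, not those of any particular monitoring product.

```python
# Column order of the aggregate "cpu" line in Linux /proc/stat (first eight fields).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")
WORK = ("user", "nice", "system", "irq", "softirq")


def parse_cpu_line(line):
    """Turn a /proc/stat aggregate cpu line into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    if len(values) < len(FIELDS):
        raise ValueError("line has no steal column")
    return dict(zip(FIELDS, values))


def utilization_views(before, after):
    """Three plausible 'CPU busy' fractions for the same sampling interval."""
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("no ticks elapsed between samples")
    work = sum(delta[name] for name in WORK)
    granted = total - delta["steal"]  # time the hypervisor actually gave us
    return {
        "work_over_wall_clock": work / total,
        "work_over_time_granted": work / granted if granted else 0.0,
        "one_minus_idle": 1.0 - delta["idle"] / total,
    }


if __name__ == "__main__":
    # Hypothetical samples taken a few seconds apart on a busy shared host.
    before = parse_cpu_line("cpu  1000 0 500 8000 100 0 0 400 0 0")
    after = parse_cpu_line("cpu  1400 0 600 8200 100 0 0 700 0 0")
    for name, value in utilization_views(before, after).items():
        print(f"{name}: {value:.0%}")
```

For these sample counters the same interval reads as 50%, about 71%, or 80% busy depending on how steal is treated, which is the kind of spread the slide complains about.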
  
Performance Tools Architecture
  
Monitoring Issues
•  Problem
   –  Too many tools, each with a good reason to exist
   –  Hard to get an integrated view of a problem
   –  Too much manual work building dashboards
   –  Tools are not discoverable, views are not filtered
•  Solution
   –  Get vendors to add deep linking URLs and APIs
   –  Integration “portal” ties everything together
   –  Underlying dependency database
   –  Dynamic portal generation, relevant data, all tools
  
Data Sources
External Testing
   •  External URL availability and latency alerts and reports – Keynote
   •  Stress testing - SOASTA
Request Trace Logging
   •  Netflix REST calls – Chukwa to DataOven with GUID transaction identifier
   •  Generic HTTP – AppDynamics service tier aggregation, end to end tracking
Application logging
   •  Tracers and counters – log4j, tracer central, Chukwa to DataOven
   •  Trackid and Audit/Debug logging – DataOven, AppDynamics GUID cross reference
JMX Metrics
   •  Application specific real time – Nimsoft, AppDynamics, Epic
   •  Service and SLA percentiles – Nimsoft, AppDynamics, Epic, logged to DataOven
Tomcat and Apache logs
   •  Stdout logs – S3 – DataOven, Nimsoft alerting
   •  Standard format Access and Error logs – S3 – DataOven, Nimsoft Alerting
JVM
   •  Garbage Collection – Nimsoft, AppDynamics
   •  Memory usage, call stacks, resource/call - AppDynamics
Linux
   •  system CPU/Net/RAM/Disk metrics – AppDynamics, Epic, Nimsoft Alerting
   •  SNMP metrics – Epic, Network flows - Fastip
AWS
   •  Load balancer traffic – Amazon Cloudwatch, SimpleDB usage stats
   •  System configuration - CPU count/speed and RAM size, overall usage - AWS
  
Integrated Dashboards
  
Dashboards Architecture
•  Integrated Dashboard View
   –  Single web page containing content from many tools
   –  Filtered to highlight most “interesting” data
•  Relevance Controller
   –  Drill in, add and remove content interactively
   –  Given an application, alert or problem area, dynamically build a dashboard relevant to your role and needs
•  Dependency and Incident Model
   –  Model Driven - Interrogates tools and AWS APIs
   –  Document store to capture dependency tree and states
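
The idea of building a dashboard from a dependency model can be sketched as a walk over dependency documents: start from the application of interest, follow its dependencies, and collect the tool views attached to each node. Everything here is hypothetical (the document shape, the service names, the view identifiers); the slides do not describe the actual schema.

```python
from collections import deque

# Hypothetical dependency documents, as they might sit in a document store.
DOCUMENTS = {
    "api": {"depends_on": ["catalog", "simpledb"],
            "views": ["tracing/api", "elb/api"]},
    "catalog": {"depends_on": ["simpledb"],
                "views": ["tracing/catalog"]},
    "simpledb": {"depends_on": [],
                 "views": ["aws/simpledb-usage"]},
    "encoding": {"depends_on": [],
                 "views": ["tracing/encoding"]},
}


def relevant_views(root, documents):
    """Walk the dependency tree breadth-first from root and return the views
    of every reachable node, visiting each node once."""
    seen = {root}
    queue = deque([root])
    views = []
    while queue:
        name = queue.popleft()
        doc = documents.get(name, {})
        views.extend(doc.get("views", []))
        for dependency in doc.get("depends_on", []):
            if dependency not in seen:
                seen.add(dependency)
                queue.append(dependency)
    return views


if __name__ == "__main__":
    # Views for the "api" application; the unrelated "encoding" node is left out.
    for view in relevant_views("api", DOCUMENTS):
        print(view)
```

Filtering by reachability is what keeps the generated page relevant: views for services the chosen application never calls are not included.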
  
Dashboard Prototype
(not everything is integrated yet)
  
AppDynamics
How to look deep inside your cloud applications
•  Automatic Monitoring
   –  Base AMI bakes in all monitoring tools
   –  Outbound calls only – no discovery/polling issues
   –  Inactive instances removed after a few days
•  Incident Alarms (deviation from baseline)
   –  Business Transaction latency and error rate
   –  Alarm thresholds discover their own baseline
   –  Email contains URL to Incident Workbench UI
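
An alarm threshold that discovers its own baseline can be illustrated with a rolling window of recent samples. The sketch below assumes a simple mean plus k standard deviations rule; the slides do not say how AppDynamics computes its baselines, and the class name, window size and sample latencies are hypothetical.

```python
import statistics
from collections import deque


class BaselineAlarm:
    """Flags samples that deviate from a rolling baseline of recent samples."""

    def __init__(self, window=20, min_samples=5, sigmas=3.0):
        self.history = deque(maxlen=window)  # recent non-incident samples
        self.min_samples = min_samples       # samples needed before alarming
        self.sigmas = sigmas

    def observe(self, value):
        """Return True if value breaches the learned threshold."""
        breached = False
        if len(self.history) >= self.min_samples:
            mean = statistics.mean(self.history)
            spread = statistics.pstdev(self.history)
            breached = value > mean + self.sigmas * spread
        if not breached:
            # Incidents are kept out of the baseline so they do not raise it.
            self.history.append(value)
        return breached


if __name__ == "__main__":
    alarm = BaselineAlarm()
    # Hypothetical business transaction latencies in ms.
    for latency in [100, 102, 98, 101, 99, 100, 103, 250, 101]:
        if alarm.observe(latency):
            print(f"incident: latency {latency} ms deviates from baseline")
```

No threshold is configured by hand here: the alarm stays quiet until it has seen enough samples, then tracks whatever "normal" is for that transaction. A perfectly flat history would make any increase an incident, so a real implementation would likely add a minimum spread.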
  
Using AppDynamics
(simple example from early 2010)

Switch to Snapshot View
Pick a slow call graph

Interactions for this Snapshot
Click to view call graph

Point Finger and Assess Impact
(an async S3 write was slow, no big deal)
  
Summary
•  Performance of AWS Systems isn’t an issue
•  Broken datacenter tools and metrics is the issue!
•  Integrating too many different tools
   –  They are not designed to be integrated
   –  Did I mention that I hate flash based user interfaces?
   –  We have “persuaded” vendors to add APIs
•  If you can’t see deep inside your app, you’re :-(
  
Questions? Job Applications?
@adrianco #netflixcloud #ccevent
  
                                            	
  

Mais conteúdo relacionado

Mais procurados

SV Forum Platform Architecture SIG - Netflix Open Source Platform
SV Forum Platform Architecture SIG - Netflix Open Source PlatformSV Forum Platform Architecture SIG - Netflix Open Source Platform
SV Forum Platform Architecture SIG - Netflix Open Source PlatformAdrian Cockcroft
 
Netflix Velocity Conference 2011
Netflix Velocity Conference 2011Netflix Velocity Conference 2011
Netflix Velocity Conference 2011Adrian Cockcroft
 
Gluecon 2013 - NetflixOSS Cloud Native Tutorial Introduction
Gluecon 2013 - NetflixOSS Cloud Native Tutorial IntroductionGluecon 2013 - NetflixOSS Cloud Native Tutorial Introduction
Gluecon 2013 - NetflixOSS Cloud Native Tutorial IntroductionAdrian Cockcroft
 
Architectures for High Availability - QConSF
Architectures for High Availability - QConSFArchitectures for High Availability - QConSF
Architectures for High Availability - QConSFAdrian Cockcroft
 
Cloud Architecture Tutorial - Why and What (1of 3)
Cloud Architecture Tutorial - Why and What (1of 3) Cloud Architecture Tutorial - Why and What (1of 3)
Cloud Architecture Tutorial - Why and What (1of 3) Adrian Cockcroft
 
Netflix Global Cloud Architecture
Netflix Global Cloud ArchitectureNetflix Global Cloud Architecture
Netflix Global Cloud ArchitectureAdrian Cockcroft
 
Cloud Architecture best practices
Cloud Architecture best practicesCloud Architecture best practices
Cloud Architecture best practicesOmid Vahdaty
 
AWS Re:Invent - High Availability Architecture at Netflix
AWS Re:Invent - High Availability Architecture at NetflixAWS Re:Invent - High Availability Architecture at Netflix
AWS Re:Invent - High Availability Architecture at NetflixAdrian Cockcroft
 
Building Cost-Aware Cloud Architectures - Jinesh Varia (AWS) and Adrian Cockc...
Building Cost-Aware Cloud Architectures - Jinesh Varia (AWS) and Adrian Cockc...Building Cost-Aware Cloud Architectures - Jinesh Varia (AWS) and Adrian Cockc...
Building Cost-Aware Cloud Architectures - Jinesh Varia (AWS) and Adrian Cockc...Amazon Web Services
 
AWS Innovation at Scale – Rodney Haywood
AWS Innovation at Scale – Rodney HaywoodAWS Innovation at Scale – Rodney Haywood
AWS Innovation at Scale – Rodney HaywoodAmazon Web Services
 
Hadoop and HBase on Amazon Web Services
Hadoop and HBase on Amazon Web Services Hadoop and HBase on Amazon Web Services
Hadoop and HBase on Amazon Web Services Amazon Web Services
 
Intuit CTOF 2011 - Netflix for Mobile in the Cloud
Intuit CTOF 2011 - Netflix for Mobile in the CloudIntuit CTOF 2011 - Netflix for Mobile in the Cloud
Intuit CTOF 2011 - Netflix for Mobile in the CloudSid Anand
 
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017 Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017 Amazon Web Services
 
Yow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesYow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesAdrian Cockcroft
 
Cmg06 utilization is useless
Cmg06 utilization is uselessCmg06 utilization is useless
Cmg06 utilization is uselessAdrian Cockcroft
 
AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...
AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...
AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...Amazon Web Services
 
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)Amazon Web Services
 

Mais procurados (20)

SV Forum Platform Architecture SIG - Netflix Open Source Platform
SV Forum Platform Architecture SIG - Netflix Open Source PlatformSV Forum Platform Architecture SIG - Netflix Open Source Platform
SV Forum Platform Architecture SIG - Netflix Open Source Platform
 
Netflix Velocity Conference 2011
Netflix Velocity Conference 2011Netflix Velocity Conference 2011
Netflix Velocity Conference 2011
 
Gluecon 2013 - NetflixOSS Cloud Native Tutorial Introduction
Gluecon 2013 - NetflixOSS Cloud Native Tutorial IntroductionGluecon 2013 - NetflixOSS Cloud Native Tutorial Introduction
Gluecon 2013 - NetflixOSS Cloud Native Tutorial Introduction
 
Architectures for High Availability - QConSF
Architectures for High Availability - QConSFArchitectures for High Availability - QConSF
Architectures for High Availability - QConSF
 
NetflixOSS Meetup
NetflixOSS MeetupNetflixOSS Meetup
NetflixOSS Meetup
 
Cloud Architecture Tutorial - Why and What (1of 3)
Cloud Architecture Tutorial - Why and What (1of 3) Cloud Architecture Tutorial - Why and What (1of 3)
Cloud Architecture Tutorial - Why and What (1of 3)
 
Netflix Global Cloud Architecture
Netflix Global Cloud ArchitectureNetflix Global Cloud Architecture
Netflix Global Cloud Architecture
 
Cloud Architecture best practices
Cloud Architecture best practicesCloud Architecture best practices
Cloud Architecture best practices
 
Global Netflix Platform
Global Netflix PlatformGlobal Netflix Platform
Global Netflix Platform
 
AWS Re:Invent - High Availability Architecture at Netflix
AWS Re:Invent - High Availability Architecture at NetflixAWS Re:Invent - High Availability Architecture at Netflix
AWS Re:Invent - High Availability Architecture at Netflix
 
Building Cost-Aware Cloud Architectures - Jinesh Varia (AWS) and Adrian Cockc...
Building Cost-Aware Cloud Architectures - Jinesh Varia (AWS) and Adrian Cockc...Building Cost-Aware Cloud Architectures - Jinesh Varia (AWS) and Adrian Cockc...
Building Cost-Aware Cloud Architectures - Jinesh Varia (AWS) and Adrian Cockc...
 
AWS Innovation at Scale – Rodney Haywood
AWS Innovation at Scale – Rodney HaywoodAWS Innovation at Scale – Rodney Haywood
AWS Innovation at Scale – Rodney Haywood
 
Hadoop and HBase on Amazon Web Services
Hadoop and HBase on Amazon Web Services Hadoop and HBase on Amazon Web Services
Hadoop and HBase on Amazon Web Services
 
Intuit CTOF 2011 - Netflix for Mobile in the Cloud
Intuit CTOF 2011 - Netflix for Mobile in the CloudIntuit CTOF 2011 - Netflix for Mobile in the Cloud
Intuit CTOF 2011 - Netflix for Mobile in the Cloud
 
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017 Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
Accelerate your Business with SAP on AWS - AWS Summit Cape Town 2017
 
Yow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesYow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with Notes
 
Cmg06 utilization is useless
Cmg06 utilization is uselessCmg06 utilization is useless
Cmg06 utilization is useless
 
AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...
AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...
AWS re:Invent 2016: Building HPC Clusters as Code in the (Almost) Infinite Cl...
 
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
AWS re:Invent 2016: High Performance Computing on AWS (CMP207)
 
Create cloud service on AWS
Create cloud service on AWSCreate cloud service on AWS
Create cloud service on AWS
 

Destaque

Cassandra Performance and Scalability on AWS
Cassandra Performance and Scalability on AWSCassandra Performance and Scalability on AWS
Cassandra Performance and Scalability on AWSAdrian Cockcroft
 
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIXCassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIXVinay Kumar Chella
 
Millicomputing Usenix 2008
Millicomputing Usenix 2008Millicomputing Usenix 2008
Millicomputing Usenix 2008Adrian Cockcroft
 
Honest performance testing with NDBench
Honest performance testing with NDBenchHonest performance testing with NDBench
Honest performance testing with NDBenchVinay Kumar Chella
 
Tools and Platforms for OpenFlow/SDN
Tools and Platforms for OpenFlow/SDNTools and Platforms for OpenFlow/SDN
Tools and Platforms for OpenFlow/SDNUmesh Krishnaswamy
 
Cassandra Operations at Netflix
Cassandra Operations at NetflixCassandra Operations at Netflix
Cassandra Operations at Netflixgreggulrich
 
Netflix Architecture Tutorial at Gluecon
Netflix Architecture Tutorial at GlueconNetflix Architecture Tutorial at Gluecon
Netflix Architecture Tutorial at GlueconAdrian Cockcroft
 
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source EffortsCassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source EffortsAcunu
 
Migrating Netflix from Datacenter Oracle to Global Cassandra
Migrating Netflix from Datacenter Oracle to Global CassandraMigrating Netflix from Datacenter Oracle to Global Cassandra
Migrating Netflix from Datacenter Oracle to Global CassandraAdrian Cockcroft
 

Destaque (12)

Cassandra Performance and Scalability on AWS
Cassandra Performance and Scalability on AWSCassandra Performance and Scalability on AWS
Cassandra Performance and Scalability on AWS
 
NoSQL for Netflix
NoSQL for NetflixNoSQL for Netflix
NoSQL for Netflix
 
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIXCassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
 
Millicomputing Usenix 2008
Millicomputing Usenix 2008Millicomputing Usenix 2008
Millicomputing Usenix 2008
 
Real world repairs
Real world repairsReal world repairs
Real world repairs
 
Honest performance testing with NDBench
Honest performance testing with NDBenchHonest performance testing with NDBench
Honest performance testing with NDBench
 
Tools and Platforms for OpenFlow/SDN
Tools and Platforms for OpenFlow/SDNTools and Platforms for OpenFlow/SDN
Tools and Platforms for OpenFlow/SDN
 
Cassandra Operations at Netflix
Cassandra Operations at NetflixCassandra Operations at Netflix
Cassandra Operations at Netflix
 
Netflix Architecture Tutorial at Gluecon
Netflix Architecture Tutorial at GlueconNetflix Architecture Tutorial at Gluecon
Netflix Architecture Tutorial at Gluecon
 
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source EffortsCassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
 
Migrating to Public Cloud
Migrating to Public CloudMigrating to Public Cloud
Migrating to Public Cloud
 
Migrating Netflix from Datacenter Oracle to Global Cassandra
Migrating Netflix from Datacenter Oracle to Global CassandraMigrating Netflix from Datacenter Oracle to Global Cassandra
Migrating Netflix from Datacenter Oracle to Global Cassandra
 

Semelhante a Performance architecture for cloud connect

Managing application & instance state on AWS
Managing application & instance state on AWSManaging application & instance state on AWS
Managing application & instance state on AWSDavid Mat
 
T1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on awsT1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on awsAmazon Web Services
 
NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013aspyker
 
Estimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformEstimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformDATAVERSITY
 
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...Adrian Cockcroft
 
Netflix in the Cloud at SV Forum
Netflix in the Cloud at SV ForumNetflix in the Cloud at SV Forum
Netflix in the Cloud at SV ForumAdrian Cockcroft
 
Improving Availability & Lowering Costs with Auto Scaling & Amazon EC2 (CPN20...
Improving Availability & Lowering Costs with Auto Scaling & Amazon EC2 (CPN20...Improving Availability & Lowering Costs with Auto Scaling & Amazon EC2 (CPN20...
Improving Availability & Lowering Costs with Auto Scaling & Amazon EC2 (CPN20...Amazon Web Services
 
Netflix web-adrian-qcon
Netflix web-adrian-qconNetflix web-adrian-qcon
Netflix web-adrian-qconYiwei Ma
 
Building a Just-in-Time Application Stack for Analysts
Building a Just-in-Time Application Stack for AnalystsBuilding a Just-in-Time Application Stack for Analysts
Building a Just-in-Time Application Stack for AnalystsAvere Systems
 
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...
Infrastructure at Scale: Apache Kafka, Twitter Storm & Elastic Search (ARC303...Amazon Web Services
 
Java Agile ALM: OTAP and DevOps in the Cloud
Java Agile ALM: OTAP and DevOps in the CloudJava Agile ALM: OTAP and DevOps in the Cloud
Java Agile ALM: OTAP and DevOps in the CloudMongoDB
 
Patterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesPatterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesJosef Adersberger
 
Patterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesPatterns and Pains of Migrating Legacy Applications to Kubernetes
Patterns and Pains of Migrating Legacy Applications to KubernetesQAware GmbH
 
Webinar: SQL for Machine Data?
Webinar: SQL for Machine Data?Webinar: SQL for Machine Data?
Webinar: SQL for Machine Data?Crate.io
 
Bootstrapping - Session 1 - Your First Week with Amazon EC2
Bootstrapping - Session 1 - Your First Week with Amazon EC2Bootstrapping - Session 1 - Your First Week with Amazon EC2
Bootstrapping - Session 1 - Your First Week with Amazon EC2Amazon Web Services
 
Cloud Services Powered by IBM SoftLayer and NetflixOSS
Cloud Services Powered by IBM SoftLayer and NetflixOSSCloud Services Powered by IBM SoftLayer and NetflixOSS
Cloud Services Powered by IBM SoftLayer and NetflixOSSaspyker
 
(ARC309) Getting to Microservices: Cloud Architecture Patterns
(ARC309) Getting to Microservices: Cloud Architecture Patterns(ARC309) Getting to Microservices: Cloud Architecture Patterns
(ARC309) Getting to Microservices: Cloud Architecture PatternsAmazon Web Services
 
Meetup #3: Migrate a fast scale system to AWS
Meetup #3: Migrate a fast scale system to AWSMeetup #3: Migrate a fast scale system to AWS
Meetup #3: Migrate a fast scale system to AWSAWS Vietnam Community
 
From AWS to Series A in 5 Easy Pieces
From AWS to Series A in 5 Easy PiecesFrom AWS to Series A in 5 Easy Pieces
From AWS to Series A in 5 Easy PiecesAmazon Web Services
 

Semelhante a Performance architecture for cloud connect (20)

Managing application & instance state on AWS
Managing application & instance state on AWSManaging application & instance state on AWS
Managing application & instance state on AWS
 
T1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on awsT1 – Architecting highly available applications on aws
T1 – Architecting highly available applications on aws
 
NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013NetflixOSS for Triangle Devops Oct 2013
NetflixOSS for Triangle Devops Oct 2013
 
Estimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics PlatformEstimating the Total Costs of Your Cloud Analytics Platform
Estimating the Total Costs of Your Cloud Analytics Platform
 
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
Performance architecture for cloud connect

  • 1. Performance Architecture for Cloud
      March 7, 2011
      Adrian Cockcroft
      @adrianco #netflixcloud #ccevent
      http://www.linkedin.com/in/adriancockcroft
      acockcroft@netflix.com
  • 2. Who, Why, What
      Netflix in the Cloud
      Cloud Performance Challenges
      Performance Architecture and Tools
  • 3. Netflix.com is now ~100% Cloud
      See http://techblog.netflix.com
      Detailed SlideShare presentation: Netflix on Cloud
      http://slideshare.net/adrianco
      We have 25 minutes - not half a day to discuss everything!
  • 4. A Nice Problem To Have…
      http://techblog.netflix.com/2011/02/redesigning-netflix-api.html
      37x Growth Jan 2010-Jan 2011
  • 5. Data Center
      We stopped building our own datacenters
      Capacity growth is accelerating, unpredictable
      Product launch spikes - iPhone, Wii, PS3, XBox
  • 6. We want to use clouds, we don’t have time to build them
      Public cloud for agility and scale
      AWS because they are big enough to allocate thousands of instances per hour for us
  • 7. Netflix EC2 Instances per Account
      (summer 2010, production is up ~3x now…)
      “Many Thousands”
      Content Encoding
      Test and Production
      Log Analysis
      “Several Months”
  • 8. AWS Performance?
      Mostly good, better than expected overall
      • The Good
        – Large EC2 Instance types (esp. the m2 range)
        – Internal disk performance
        – Network performance within and between Availability Zones
        – Robustness and scalability of S3, SQS
      • The Bad
        – Elastic Load Balancer has too many limitations
        – SimpleDB needs memcached front end, too many limitations at Terabyte scale
      • The Ugly
        – EBS performance is slow and inconsistent, we avoid it
  • 9. Learnings
      • Datacenter oriented tools don’t work
        – Ephemeral instances
        – High rate of change
        – Need too much hand-holding and manual setup
      • Cloud Tools Don’t Scale for Enterprise
        – Too many tools are “Startup” oriented
        – Built our own tools for 1000’s of instances
        – Drove vendors to be dynamic, scale, add APIs
      • “fork-lifted” apps are fragile
        – Too many datacenter oriented assumptions
        – We re-wrote our code base!
        – (We re-write it continuously anyway)
  • 10. Cloud Performance Challenges
      Model Driven Architecture
      Capacity Planning & Metrics
  • 11. Model Driven Architecture
      • Datacenter Practices
        – Lots of unique hand-tweaked systems
        – Hard to enforce patterns
      • Model Driven Cloud Architecture
        – Perforce/Ivy/Hudson based builds for everything
        – Every production instance is a pre-baked AMI
        – Every application is managed by an Autoscaler
      No exceptions, every change is a new AMI
  • 12. Model Driven Implications
      • Automated “Least Privilege” Security
        – Tightly specified security groups
        – Fine grain IAM keys to access AWS resources
        – Performance tools security and integration
      • Model Driven Performance Monitoring
        – Hundreds of instances appear in a few minutes…
        – Tools have to “garbage collect” dead instances
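The "garbage collect" point on slide 12 can be illustrated with a minimal sketch. It assumes a monitoring tool keeps a last-heard-from timestamp per instance and drops any instance that has been silent for too long. The function name `prune_dead_instances`, the instance IDs, and the ten-minute cutoff are all hypothetical and do not come from the talk.

```python
from datetime import datetime, timedelta


def prune_dead_instances(last_seen, now, max_silence):
    """Keep only the instances heard from within max_silence of now.

    last_seen maps a (hypothetical) instance ID to the time of its
    most recent report.
    """
    return {
        instance_id: seen_at
        for instance_id, seen_at in last_seen.items()
        if now - seen_at <= max_silence
    }


# Illustrative data: one instance reported a minute ago, one went quiet.
now = datetime(2011, 3, 7, 12, 0, 0)
last_seen = {
    "i-aaaa": now - timedelta(minutes=1),
    "i-bbbb": now - timedelta(hours=3),
}
live = prune_dead_instances(last_seen, now, timedelta(minutes=10))
print(sorted(live))
```

With autoscaled, ephemeral instances, a tool that never prunes would accumulate entries for machines that no longer exist, which is the failure mode the slide warns about.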
  • 13. Capacity Planning & Metrics
  • 14. What is Capacity Planning?
      • We care about
        – CPU, Memory, Network and Disk resources consumed
        – Application response times
      • We need to know
        – how much of each resource we are using now
        – how much will we use in the future
        – how much headroom we have to handle higher loads
      • We want to understand
        – how headroom varies
        – how it relates to response times and throughput
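The three "need to know" questions on slide 14 (usage now, usage in the future, headroom) can be made concrete with a small sketch. It assumes a single resource measured in one unit and a constant compound growth rate; both are simplifying assumptions, and the function names and numbers are illustrative only.

```python
def headroom(capacity, current_usage):
    """Fraction of capacity still unused, clamped at zero."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return max(0.0, 1.0 - current_usage / capacity)


def periods_until_full(capacity, current_usage, growth_per_period):
    """Count periods of compound growth until usage exceeds capacity."""
    if current_usage <= 0 or growth_per_period <= 0:
        raise ValueError("usage and growth must be positive")
    periods = 0
    usage = current_usage
    while usage <= capacity:
        usage *= 1.0 + growth_per_period
        periods += 1
    return periods


# Illustrative numbers: 1000 requests/s of capacity, 400 in use,
# growing 25% per month.
spare = headroom(1000, 400)
months = periods_until_full(1000, 400, 0.25)
print(spare, months)
```

This says nothing about how headroom relates to response time, which is the harder question the slide raises; it only shows the bookkeeping for the first three.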
  • 15. Capacity Planning in Clouds
      (a few things have changed…)
      • Capacity is expensive
      • Capacity takes time to buy and provision
      • Capacity only increases, can’t be shrunk easily
      • Capacity comes in big chunks, paid up front
      • Planning errors can cause big problems
      • Systems are clearly defined assets
      • Systems can be instrumented in detail
      • Depreciate assets over 3 years (reservations!)
  • 16. OK, so just give me the data!
      Throughput – not hard
      Response Time – mean+2xSD? %iles?
      Utilization….
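Slide 16 asks whether response time should be summarized as mean plus two standard deviations or as percentiles. The sketch below computes both for the same samples so the two can be compared. The latency data is invented for illustration, and the nearest-rank percentile definition is one common choice among several, not something the talk specifies.

```python
import statistics


def mean_plus_2sd(samples):
    """Mean plus two population standard deviations."""
    return statistics.mean(samples) + 2 * statistics.pstdev(samples)


def nearest_rank_percentile(samples, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Invented latencies in ms: mostly fast, with a slow tail.
latencies = [10] * 95 + [500] * 5
summary = mean_plus_2sd(latencies)
p95 = nearest_rank_percentile(latencies, 95)
p99 = nearest_rank_percentile(latencies, 99)
print(summary, p95, p99)
```

On skewed data like this, the mean+2SD figure lands between the fast and slow groups and matches no request that actually occurred, whereas each percentile is a real observed latency.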
  • 17. Utilization
      “Utilization is virtually useless as a metric”
      CMG 2006 Paper by Adrian Cockcroft
      Virtualization is a DOS attack on Capacity Planning…
  • 18. What would you say if you were asked:
      Q: That system is slow, how busy is it?
      A: I have no idea…
      A: The graph in this tool looks about 50%
      A: But the graph in this other tool is 65%
      A: Amazon CloudWatch says 82%
      A: Linux says us sy ni id wa st ☹
      A: Why do you want to know?
      A: I’m sorry, you don’t understand your question….
  • 19. What's the problem with Utilization?
      • CPU Capacity
        – Varying capacity due to multi-tenancy
        – Non-identical servers or CPUs (check /proc/cpuinfo)
        – Non-linear capacity due to hyperthreading etc.
      • Measurement Errors
        – Monitoring tools that ignore “stolen time” (all of them)
        – Mechanisms with built in bias (clock tick counting)
        – Platform and release specific changes in metrics
      Every tool shows a different value for the same metric!
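The "stolen time" point on slide 19 can be shown with a short sketch based on the aggregate `cpu` line of Linux `/proc/stat`, whose first eight counters are user, nice, system, idle, iowait, irq, softirq and steal. The two sample lines below are invented, and the "naive" calculation is only an assumption about how a tool that ignores steal might behave, not a description of any specific product.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line from /proc/stat into named counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def utilization(before, after):
    """Compare busy fractions with and without accounting for steal."""
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    busy = total - delta["idle"] - delta["iowait"] - delta["steal"]
    return {
        # Assumed behaviour of a tool that does not know about steal.
        "naive": busy / (total - delta["steal"]),
        # Busy time as a share of all elapsed ticks.
        "of_wall_clock": busy / total,
        # Share of ticks taken by the hypervisor for other tenants.
        "stolen": delta["steal"] / total,
    }


# Invented samples taken some interval apart.
first = parse_cpu_line("cpu 1000 0 500 8000 100 0 0 400")
second = parse_cpu_line("cpu 1400 0 700 8200 100 0 0 600")
result = utilization(first, second)
print(result)
```

Two tools reading the same counters can therefore report different "utilization" for the same interval, which is one way the disagreement on slide 18 can arise.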
  • 21. Monitoring Issues
      • Problem
        – Too many tools, each with a good reason to exist
        – Hard to get an integrated view of a problem
        – Too much manual work building dashboards
        – Tools are not discoverable, views are not filtered
      • Solution
        – Get vendors to add deep linking URLs and APIs
        – Integration “portal” ties everything together
        – Underlying dependency database
        – Dynamic portal generation, relevant data, all tools
  • 22. Data Sources
      • External Testing
        – External URL availability and latency alerts and reports – Keynote
        – Stress testing – SOASTA
      • Request Trace Logging
        – Netflix REST calls – Chukwa to DataOven with GUID transaction identifier
        – Generic HTTP – AppDynamics service tier aggregation, end to end tracking
      • Application logging
        – Tracers and counters – log4j, tracer central, Chukwa to DataOven
        – Trackid and Audit/Debug logging – DataOven, AppDynamics GUID cross reference
      • JMX Metrics
        – Application specific real time – Nimsoft, AppDynamics, Epic
        – Service and SLA percentiles – Nimsoft, AppDynamics, Epic, logged to DataOven
      • Tomcat and Apache logs
        – Stdout logs – S3 – DataOven, Nimsoft alerting
        – Standard format Access and Error logs – S3 – DataOven, Nimsoft alerting
      • JVM
        – Garbage Collection – Nimsoft, AppDynamics
        – Memory usage, call stacks, resource/call – AppDynamics
      • Linux
        – System CPU/Net/RAM/Disk metrics – AppDynamics, Epic, Nimsoft alerting
        – SNMP metrics – Epic, Network flows – Fastip
      • AWS
        – Load balancer traffic – Amazon CloudWatch, SimpleDB usage stats
        – System configuration – CPU count/speed and RAM size, overall usage – AWS
  • 24. Dashboards Architecture
      • Integrated Dashboard View
        – Single web page containing content from many tools
        – Filtered to highlight most “interesting” data
      • Relevance Controller
        – Drill in, add and remove content interactively
        – Given an application, alert or problem area, dynamically build a dashboard relevant to your role and needs
      • Dependency and Incident Model
        – Model Driven - Interrogates tools and AWS APIs
        – Document store to capture dependency tree and states
  • 25. Dashboard Prototype
      (not everything is integrated yet)
  • 26. AppDynamics
      How to look deep inside your cloud applications
      • Automatic Monitoring
        – Base AMI bakes in all monitoring tools
        – Outbound calls only – no discovery/polling issues
        – Inactive instances removed after a few days
      • Incident Alarms (deviation from baseline)
        – Business Transaction latency and error rate
        – Alarm thresholds discover their own baseline
        – Email contains URL to Incident Workbench UI
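The idea on slide 26 of alarm thresholds that "discover their own baseline" can be sketched generically as a rolling window with a deviation test. This is not AppDynamics' algorithm, which the talk does not describe. The class name `BaselineAlarm`, the window size, and the three-standard-deviation rule are all assumptions chosen for illustration.

```python
import statistics
from collections import deque


class BaselineAlarm:
    """Flag values that sit far above a rolling baseline (hypothetical)."""

    def __init__(self, window=20, k=3.0, min_samples=5):
        self.history = deque(maxlen=window)
        self.k = k
        self.min_samples = min_samples

    def observe(self, value):
        """Return True if value deviates from the baseline, then record it."""
        alarm = False
        if len(self.history) >= self.min_samples:
            mean = statistics.mean(self.history)
            spread = statistics.pstdev(self.history)
            # A tiny floor stops a perfectly flat history from alarming
            # on negligible changes.
            alarm = value > mean + self.k * max(spread, 1e-9)
        self.history.append(value)
        return alarm


# Invented latencies in ms: steady around 100, then a spike.
detector = BaselineAlarm()
alarms = [detector.observe(v) for v in [100, 102, 98, 101, 99, 100, 250]]
print(alarms)
```

No threshold is configured by hand here: the detector stays silent until it has seen enough samples, then judges each new value against what it has learned so far.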
  • 27. Using AppDynamics
      (simple example from early 2010)
  • 28. Switch to Snapshot View
      Pick a slow call graph
  • 29. Interactions for this Snapshot
      Click to view call graph
  • 30. Point Finger and Assess Impact
      (an async S3 write was slow, no big deal)
  • 31. Summary
      • Performance of AWS Systems isn’t an issue
      • Broken datacenter tools and metrics is the issue!
      • Integrating too many different tools
        – They are not designed to be integrated
        – Did I mention that I hate flash based user interfaces?
        – We have “persuaded” vendors to add APIs
      • If you can’t see deep inside your app, you’re ☹
      Questions? Job Applications?
      @adrianco #netflixcloud #ccevent