Utilization is Virtually Useless as a Metric!
CMG 2006 - Reno NV
Adrian Cockcroft - Netflix Inc.
With minor updates 2010
(At the time: Distinguished Engineer, eBay Research Labs, eBay Inc.)
  
Agenda
•  Headroom
•  Utilization
•  Response Time
•  The Many Ways In Which Utilization Metrics Are Broken
•  An Alternative
•  eBay.com Architecture, Scale and Rate of Change
•  Response times for an eBay.com SOA service pool
•  Conclusions
  
Headroom
•  Headroom is available usable resources
    –  Total Capacity minus Peak Utilization and Margin
    –  Applies to CPU, RAM, Net, Disk and OS
[Figure: total capacity divided into Utilization, Headroom and Margin]
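As a rough illustration of the definition on this slide, headroom can be computed as total capacity minus peak utilization minus margin. The function name, the choice of unit (hypothetical requests per second) and the sample numbers are assumptions made for this sketch, not values from the talk.

```python
def headroom(total_capacity: float, peak_utilization: float, margin: float) -> float:
    """Headroom = total capacity - peak utilization - margin (all in the same unit)."""
    if min(total_capacity, peak_utilization, margin) < 0:
        raise ValueError("capacity, utilization and margin must be non-negative")
    return total_capacity - peak_utilization - margin


# Hypothetical numbers: 1000 req/s capacity, 550 req/s peak load, 200 req/s margin.
spare = headroom(total_capacity=1000.0, peak_utilization=550.0, margin=200.0)
print(f"headroom: {spare:.0f} req/s")
```

A negative result would mean the peak load is already eating into the margin.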

Utilization
•  Utilization is the proportion of busy time
•  Always defined over a time interval
[Figure: Utilization]
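A small sketch of why the time interval matters: the same busy time gives very different utilization figures depending on the interval it is averaged over. The per-second busy times below are made up for illustration.

```python
def utilization(busy_seconds: float, interval_seconds: float) -> float:
    """Utilization = busy time / length of the measurement interval."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return busy_seconds / interval_seconds


# Hypothetical busy time in each second of a 10 second window: idle except a 2 second burst.
busy_per_second = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]

whole_window = utilization(sum(busy_per_second), len(busy_per_second))
peak_second = max(utilization(b, 1.0) for b in busy_per_second)

print(f"10 s average: {whole_window:.0%}, busiest 1 s interval: {peak_second:.0%}")
```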

Response Time
•  Service time occurs while using a resource
•  Queue time waits for access to a resource
•  Response Time = Queue time + Service time
•  Assumptions
    –  Steady state averages
    –  Random arrivals
    –  Constant service time
    –  M servers processing the same queue
•  Approximations
    –  Queue length = Throughput x Response Time (Little's Law)
    –  Response Time = Service Time / (Headroom + Margin)
    –  Response Time = Service Time / (1 - Utilization^M)
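The approximations on this slide can be sketched in a few lines. The function names and the sample numbers (10 ms service time, 50% utilization, 100 requests per second) are assumptions for illustration; the formulas are the ones listed above, with utilization expressed as a fraction between 0 and 1.

```python
def response_time(service_time: float, utilization: float, servers: int = 1) -> float:
    """Approximation from the slide: R = S / (1 - U**M), with 0 <= U < 1."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    if servers < 1:
        raise ValueError("servers must be at least 1")
    return service_time / (1.0 - utilization ** servers)


def queue_length(throughput: float, resp: float) -> float:
    """Little's Law: queue length = throughput x response time."""
    return throughput * resp


# Hypothetical 10 ms service time at 50% utilization.
r1 = response_time(0.010, 0.5, servers=1)  # one server
r4 = response_time(0.010, 0.5, servers=4)  # four servers sharing one queue

print(f"1 server: {r1 * 1000:.1f} ms, 4 servers: {r4 * 1000:.1f} ms")
print(f"queue length at 100 req/s with 1 server: {queue_length(100.0, r1):.1f}")
```

With one server, 1 - U is the unused fraction of the machine, which is where the "Headroom + Margin" form of the formula comes from.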
  
Response Time Curves
Systems with many servers (e.g. CPUs) can run at higher utilization levels, but degrade more rapidly when they finally run out of capacity. Headroom margin should be set according to a response time target.
R = S / (1 - (U%)^m)
[Figure: response time curves, with the headroom margin marked]
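One way to set the headroom margin from a response time target is to solve the slide's formula for U: from R = S / (1 - U^m) it follows that U = (1 - S/R)^(1/m). The sketch below assumes that approximation holds; the function name and the target of twice the service time are illustrative choices, not values from the talk.

```python
def max_utilization(service_time: float, target_response_time: float, servers: int) -> float:
    """Highest utilization that still meets the target, from R = S / (1 - U**m)."""
    if target_response_time <= service_time:
        raise ValueError("target must exceed the service time")
    if servers < 1:
        raise ValueError("servers must be at least 1")
    return (1.0 - service_time / target_response_time) ** (1.0 / servers)


# Hypothetical target: response time no worse than twice the service time.
for m in (1, 2, 8, 32):
    u = max_utilization(service_time=1.0, target_response_time=2.0, servers=m)
    print(f"m={m:2d}: run up to {u:.1%} utilization, headroom margin {1 - u:.1%}")
```

The margin shrinks as m grows, which matches the observation that many-server systems can run hotter but have less room left when they degrade.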

So what's the problem with Utilization?
•  Unsafe assumptions! Complex adaptive systems have replaced simple ones
•  Random arrivals?
    –  Bursty traffic with long tail arrival rate distribution
•  Constant service time?
    –  Variable clock rate CPUs, inverse load dependent service time
    –  Complex transactions, request and response dependent
•  M servers processing the same queue?
    –  Virtual servers with varying non-integral concurrency
    –  Non-identical servers or CPUs, Hyperthreading, Multicore, NUMA
•  Measurement Errors?
    –  Measurement mechanisms with built in bias, e.g. sampling from the scheduler clock
    –  Platform specific and release specific systemic changes in the accounting of interrupt time
  
Storage Utilization
•  Storage virtualization broke utilization metrics a long time ago
•  Host server measures busy time on a "disk"
    –  Simple disk, "single server" response time gets high near 100% utilization
    –  Cached RAID LUN, one I/O stream can report 100% utilization, but full capacity supports many threads of I/O since there are many disks and RAM buffering
•  New metric - "Capability Utilization"
    –  Adjusted to report proportion of actual capacity for current workload mix
    –  Measured by tools such as Ortera Atlas (http://www.ortera.com)
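The gap between host-measured busy time and real capacity can be sketched with a toy model. This is not Ortera's formula, which the slides do not give: `concurrency_utilization` is an assumed stand-in that compares mean outstanding I/Os against a hypothetical number of I/Os the LUN can overlap (8 here). The I/O timings are invented.

```python
def busy_utilization(io_intervals, window):
    """Host-side metric: fraction of the window with at least one I/O outstanding."""
    busy, current_end = 0.0, float("-inf")
    for start, end in sorted(io_intervals):
        start = max(start, current_end)  # skip time already counted as busy
        if end > start:
            busy += end - start
            current_end = end
    return busy / window


def concurrency_utilization(io_intervals, window, max_concurrency):
    """Assumed alternative: mean outstanding I/Os as a share of what the LUN can overlap."""
    total_io_time = sum(end - start for start, end in io_intervals)
    return total_io_time / (window * max_concurrency)


# One stream issuing back-to-back 10 ms I/Os for one second (times in seconds).
single_stream = [(i * 0.01, (i + 1) * 0.01) for i in range(100)]

print(f"busy-time utilization: {busy_utilization(single_stream, 1.0):.0%}")
print(f"share of an 8-way LUN: {concurrency_utilization(single_stream, 1.0, 8):.1%}")
```

The single stream keeps the "disk" busy all the time, yet under this model it uses only a fraction of what the LUN could serve in parallel.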
  
Threaded CPU Pipelines
•  CPU microarchitecture optimizations
    –  Extra register sets working with the existing arithmetic and floating point units
    –  When the CPU stalls on a memory read, it switches registers/threads
    –  Operating system sees multiple schedulable entities (CPUs)
•  Intel Hyperthreading
    –  Each CPU core has an extra thread to use spare cycles
    –  Typical benefit is 20%, so total capacity is 1.2 CPUs
    –  Second thread much slower when first thread is busy
    –  Hyperthreading aware optimizations in recent operating systems
•  Sun CoolThreads
    –  "Niagara" SPARC CPU has eight cores, one shared floating point unit
    –  Each CPU core has four threads, but each core is a very simple design
    –  Behaves like 32 slow CPUs for integer, snail like uniprocessor for FP
    –  Overall throughput is very high, performance per watt is exceptional
•  Hyperformix have performance modeling of Hyperthreads and Niagara
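The Hyperthreading numbers above imply that the utilization the operating system reports understates how much of the core is really used. The sketch below assumes a deliberately simple model for one core with two hardware threads: the busier thread runs at full speed and the other adds only the 20% benefit quoted on the slide. The function names and the model itself are assumptions for illustration.

```python
from typing import Sequence

HT_BENEFIT = 0.2  # second thread adds about 20% capacity (figure from the slide)


def reported_utilization(thread_busy: Sequence[float]) -> float:
    """What the OS reports: mean busy fraction across the two logical CPUs."""
    return sum(thread_busy) / len(thread_busy)


def capacity_utilization(thread_busy: Sequence[float], benefit: float = HT_BENEFIT) -> float:
    """Assumed model: busier thread counts fully, the other contributes only `benefit`."""
    first, second = sorted(thread_busy, reverse=True)
    return (first + benefit * second) / (1.0 + benefit)


for busy in ([0.5, 0.0], [1.0, 0.0], [1.0, 1.0]):
    print(
        busy,
        f"reported {reported_utilization(busy):.0%},",
        f"share of real capacity {capacity_utilization(busy):.0%}",
    )
```

Under this model a core with one thread fully busy is reported as half used while most of its 1.2 CPUs of capacity is already gone.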
  
Variable Clock Rate CPUs
•  Laptop and other low power devices do this all the time
    –  Watch CPU usage of a video playback application and toggle mains/battery power...
•  Next Generation Server CPU Power Optimization - AMD PowerNow!™
    –  AMD Opteron x64 server CPU detects overall utilization and reduces clock rate
    –  Actual speeds vary, but for example could reduce from 2.6GHz to 1.2GHz
    –  Speed varies per socket, so pairs of CPU cores vary together
    –  Changes are not currently understood or reported by operating system metrics
    –  Speed changes can occur every few milliseconds
•  Possible scenario:
    –  You estimate 20% utilization at 2.6GHz and see 45% reported in practice (at 1.2GHz)
    –  Load doubles, reported utilization drops to 40% (at 2.6GHz)
    –  Actual mapping of utilization to clock rate is unknown at this point
•  Older Opterons, and "low power" versions used in blades do not vary clock rate
•  Disaster scenario - you get a capacity surge and the datacenter power and cooling can't cope with all the systems at the high clock rate!
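The "possible scenario" can be checked with simple arithmetic, assuming service time scales inversely with clock rate (real workloads that stall on memory will not scale this cleanly). The function name is an assumption; the clock speeds and percentages are the example figures from the slide. Pure scaling gives roughly 43% at 1.2GHz, close to the 45% quoted.

```python
def reported_utilization(work_ghz: float, clock_ghz: float) -> float:
    """Busy fraction the OS sees, assuming work takes 1/clock as long to execute."""
    return work_ghz / clock_ghz


FULL_GHZ, SLOW_GHZ = 2.6, 1.2   # example speeds from the slide
work = 0.20 * FULL_GHZ           # load estimated as 20% of a 2.6GHz CPU

low_load = reported_utilization(work, SLOW_GHZ)        # CPU has clocked down
high_load = reported_utilization(2 * work, FULL_GHZ)   # load doubles, CPU clocks back up

print(f"at 1.2GHz: {low_load:.0%} reported")
print(f"load doubled, at 2.6GHz: {high_load:.0%} reported")
```

Twice the work shows up as a lower percentage, so the reported figure cannot be compared across intervals unless the clock rate is recorded as well.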
  
Virtual Machine Monitors
•  VMware, Xen, and good old mainframe LPARs etc.
    –  Non-integral and non-constant fractions of a machine
    –  Naive operating systems and applications that don't expect this behavior
    –  However, lots of recent tools development from vendors (BMC, Teamquest etc.)
•  Average CPU count must be reported for each measurement interval
•  VMM overhead varies, application scaling characteristics may be affected
  
What's My Headroom? How to plot it?
•  Measure and report absolute CPU power if you can get it...
•  Plot shows headroom in blue, margin in red, total power tracking, day/night workload variation, plotted as mean + two standard deviations.
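A minimal sketch of the bookkeeping behind such a plot, assuming each measurement interval records the reported utilization together with the average CPU count and clock rate. The unit (GHz-equivalents), the 20% margin, the use of the population standard deviation and all sample values are assumptions for illustration.

```python
from statistics import mean, pstdev

# Hypothetical samples: (reported utilization, average CPU count, clock in GHz) per interval.
samples = [
    (0.40, 4.0, 2.6),
    (0.45, 2.5, 2.6),  # the VMM gave this guest fewer CPUs
    (0.45, 4.0, 1.2),  # the CPUs had clocked down
    (0.30, 4.0, 2.6),
]


def absolute_load(util: float, cpus: float, ghz: float) -> float:
    """Used CPU power in GHz-equivalents rather than as a percentage."""
    return util * cpus * ghz


loads = [absolute_load(*s) for s in samples]
planning_level = mean(loads) + 2 * pstdev(loads)  # mean + two standard deviations

total_capacity = 4.0 * 2.6            # all CPUs at full clock rate
margin = 0.20 * total_capacity        # assumed 20% safety margin
headroom = total_capacity - margin - planning_level

print(f"load per interval: {[round(x, 2) for x in loads]}")
print(f"planning level {planning_level:.2f}, headroom {headroom:.2f} GHz-equivalents")
```

Two intervals that both report 45% turn out to carry quite different absolute loads once CPU count and clock rate are taken into account.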
  
Cockcroft Headroom Plot
•  Scatter plot of disk response time (ms) vs. Throughput (KB)
•  Histograms on axes
•  Throughput time series plot
•  Shows distributions and shape of response time
•  Fits throughput weighted inverse Gaussian curve
•  Coded using "R" statistics package
•  Blogged development at http://perfcap.blogspot.com
  
Thread Limited Response Time
•  Thread-limited responses
•  Mixture of fast and slow requests
•  Oscillating behaviors
•  Distributions are long tail
•  Workload behaves a bit like ad hoc queries to a DSS perhaps?
•  Measurements are of a single SOA service pool
•  Response is in milliseconds
•  Throughput is executions/s
  

Exec                        Resp
Min.   :    1.00            Min.   :    0.0
1st Qu.:    2.00            1st Qu.: 150.0
Median :    8.00            Median : 361.0
Mean   :   64.68            Mean   : 533.5
3rd Qu.:   45.00            3rd Qu.: 771.9
Max.   :10795.00            Max.   :19205.0
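The table above looks like output from R's summary(). A comparable six-number summary can be sketched in Python with the standard library; the "inclusive" quantile method uses the same interpolation as R's default. The response times below are hypothetical and are not the eBay measurements.

```python
from statistics import mean, quantiles


def summary(values):
    """Six-number summary in the style of R's summary(), using inclusive quartiles."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "Min.": min(values),
        "1st Qu.": q1,
        "Median": med,
        "Mean": mean(values),
        "3rd Qu.": q3,
        "Max.": max(values),
    }


# Hypothetical long tail response times in milliseconds.
resp_ms = [5.0, 120.0, 300.0, 800.0, 9000.0]
s = summary(resp_ms)

for name, value in s.items():
    print(f"{name:8s}: {value:9.1f}")
```

A mean far above the median, as in both this toy data and the table, is a quick sign of a long tail distribution.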

Conclusion
•  Check your assumptions...
•  Record and plot absolute capacity for each measurement interval
•  Plot response time as a function of throughput, not just utilization
•  SOA response characteristics are complicated and not well understood...
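One simple way to look at response time as a function of throughput, without any plotting library, is to group samples into throughput buckets and take the median response in each. The function name, the bucket width and the sample pairs are assumptions for illustration; the talk itself used R for its plots.

```python
from collections import defaultdict
from statistics import median


def response_by_throughput(samples, bucket_width):
    """Median response time per throughput bucket, keyed by the bucket's lower bound."""
    buckets = defaultdict(list)
    for throughput, response_ms in samples:
        low = int(throughput // bucket_width) * bucket_width
        buckets[low].append(response_ms)
    return {low: median(vals) for low, vals in sorted(buckets.items())}


# Hypothetical measurements: one (executions/s, response ms) pair per interval.
samples = [(2, 150.0), (8, 160.0), (9, 170.0), (45, 360.0), (48, 400.0), (110, 900.0)]
curve = response_by_throughput(samples, bucket_width=10)

for low, resp in curve.items():
    print(f"{low:4d}-{low + 10:<4d} exec/s: median {resp:.0f} ms")
```

The median is used here because long tail distributions make the mean a poor summary of each bucket.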
  

Questions?
(Now acockcroft@netflix.com)
http://perfcap.blogspot.com
  

Mais conteúdo relacionado

Mais procurados

(ISM301) Engineering Netflix Global Operations In The Cloud
(ISM301) Engineering Netflix Global Operations In The Cloud(ISM301) Engineering Netflix Global Operations In The Cloud
(ISM301) Engineering Netflix Global Operations In The CloudAmazon Web Services
 
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...Mohamed Sayed
 
Applications in the Cloud
Applications in the CloudApplications in the Cloud
Applications in the CloudEberhard Wolff
 
Netflix Global Cloud Architecture
Netflix Global Cloud ArchitectureNetflix Global Cloud Architecture
Netflix Global Cloud ArchitectureAdrian Cockcroft
 
Managing Performance in the Cloud
Managing Performance in the CloudManaging Performance in the Cloud
Managing Performance in the CloudDevOpsGroup
 
SV Forum Platform Architecture SIG - Netflix Open Source Platform
SV Forum Platform Architecture SIG - Netflix Open Source PlatformSV Forum Platform Architecture SIG - Netflix Open Source Platform
SV Forum Platform Architecture SIG - Netflix Open Source PlatformAdrian Cockcroft
 
analytic engine - a common big data computation service on the aws
analytic engine - a common big data computation service on the awsanalytic engine - a common big data computation service on the aws
analytic engine - a common big data computation service on the awsScott Miao
 
Cloud Architecture best practices
Cloud Architecture best practicesCloud Architecture best practices
Cloud Architecture best practicesOmid Vahdaty
 
Zero Downtime JEE Architectures
Zero Downtime JEE ArchitecturesZero Downtime JEE Architectures
Zero Downtime JEE ArchitecturesAlexander Penev
 
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...Adrian Cockcroft
 
Architectures for High Availability - QConSF
Architectures for High Availability - QConSFArchitectures for High Availability - QConSF
Architectures for High Availability - QConSFAdrian Cockcroft
 
High Availability Infrastructure for Cloud Computing
High Availability Infrastructure for Cloud ComputingHigh Availability Infrastructure for Cloud Computing
High Availability Infrastructure for Cloud ComputingBob Rhubart
 
(BIZ305) Case Study: Migrating Oracle E-Business Suite to AWS | AWS re:Invent...
(BIZ305) Case Study: Migrating Oracle E-Business Suite to AWS | AWS re:Invent...(BIZ305) Case Study: Migrating Oracle E-Business Suite to AWS | AWS re:Invent...
(BIZ305) Case Study: Migrating Oracle E-Business Suite to AWS | AWS re:Invent...Amazon Web Services
 
Top 10 Application Problems
Top 10 Application ProblemsTop 10 Application Problems
Top 10 Application ProblemsAppDynamics
 
Your Guide to Streaming - The Engineer's Perspective
Your Guide to Streaming - The Engineer's PerspectiveYour Guide to Streaming - The Engineer's Perspective
Your Guide to Streaming - The Engineer's PerspectiveIlya Ganelin
 

Mais procurados (20)

(ISM301) Engineering Netflix Global Operations In The Cloud
(ISM301) Engineering Netflix Global Operations In The Cloud(ISM301) Engineering Netflix Global Operations In The Cloud
(ISM301) Engineering Netflix Global Operations In The Cloud
 
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...
FOSS4G In The Cloud: Using Open Source to build Cloud based Spatial Infrastru...
 
Applications in the Cloud
Applications in the CloudApplications in the Cloud
Applications in the Cloud
 
Svc 202-netflix-open-source
Svc 202-netflix-open-sourceSvc 202-netflix-open-source
Svc 202-netflix-open-source
 
Netflix Global Cloud Architecture
Netflix Global Cloud ArchitectureNetflix Global Cloud Architecture
Netflix Global Cloud Architecture
 
Google Compute and MapR
Google Compute and MapRGoogle Compute and MapR
Google Compute and MapR
 
Managing Performance in the Cloud
Managing Performance in the CloudManaging Performance in the Cloud
Managing Performance in the Cloud
 
Netflix in the Cloud
Netflix in the CloudNetflix in the Cloud
Netflix in the Cloud
 
SV Forum Platform Architecture SIG - Netflix Open Source Platform
SV Forum Platform Architecture SIG - Netflix Open Source PlatformSV Forum Platform Architecture SIG - Netflix Open Source Platform
SV Forum Platform Architecture SIG - Netflix Open Source Platform
 
analytic engine - a common big data computation service on the aws
analytic engine - a common big data computation service on the awsanalytic engine - a common big data computation service on the aws
analytic engine - a common big data computation service on the aws
 
Cloud Architecture best practices
Cloud Architecture best practicesCloud Architecture best practices
Cloud Architecture best practices
 
Zero Downtime JEE Architectures
Zero Downtime JEE ArchitecturesZero Downtime JEE Architectures
Zero Downtime JEE Architectures
 
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...
Global Netflix - HPTS Workshop - Scaling Cassandra benchmark to over 1M write...
 
Architectures for High Availability - QConSF
Architectures for High Availability - QConSFArchitectures for High Availability - QConSF
Architectures for High Availability - QConSF
 
High Availability Infrastructure for Cloud Computing
High Availability Infrastructure for Cloud ComputingHigh Availability Infrastructure for Cloud Computing
High Availability Infrastructure for Cloud Computing
 
HPC on AWS
HPC on AWSHPC on AWS
HPC on AWS
 
(BIZ305) Case Study: Migrating Oracle E-Business Suite to AWS | AWS re:Invent...
(BIZ305) Case Study: Migrating Oracle E-Business Suite to AWS | AWS re:Invent...(BIZ305) Case Study: Migrating Oracle E-Business Suite to AWS | AWS re:Invent...
(BIZ305) Case Study: Migrating Oracle E-Business Suite to AWS | AWS re:Invent...
 
Top 10 Application Problems
Top 10 Application ProblemsTop 10 Application Problems
Top 10 Application Problems
 
Your Guide to Streaming - The Engineer's Perspective
Your Guide to Streaming - The Engineer's PerspectiveYour Guide to Streaming - The Engineer's Perspective
Your Guide to Streaming - The Engineer's Perspective
 
Global Netflix Platform
Global Netflix PlatformGlobal Netflix Platform
Global Netflix Platform
 

Semelhante a Cmg06 utilization is useless

Azug - successfully breeding rabits
Azug - successfully breeding rabitsAzug - successfully breeding rabits
Azug - successfully breeding rabitsYves Goeleven
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity PlanningMongoDB
 
Divide and conquer in the cloud
Divide and conquer in the cloudDivide and conquer in the cloud
Divide and conquer in the cloudJustin Swanhart
 
Storm presentation
Storm presentationStorm presentation
Storm presentationShyam Raj
 
Right-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual MachineRight-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual Machineheraflux
 
Real-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and StormReal-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and StormJohn Georgiadis
 
Tuning kafka pipelines
Tuning kafka pipelinesTuning kafka pipelines
Tuning kafka pipelinesSumant Tambe
 
Stream Computing (The Engineer's Perspective)
Stream Computing (The Engineer's Perspective)Stream Computing (The Engineer's Perspective)
Stream Computing (The Engineer's Perspective)Ilya Ganelin
 
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and HadoopEventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and HadoopAyon Sinha
 
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Amazon Web Services
 
Multithreading computer architecture
 Multithreading computer architecture  Multithreading computer architecture
Multithreading computer architecture Haris456
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?Deepak Shankar
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?Deepak Shankar
 
How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?Deepak Shankar
 
PlovDev 2016: Application Performance in Virtualized Environments by Todor T...
PlovDev 2016: Application Performance in Virtualized Environments by Todor T...PlovDev 2016: Application Performance in Virtualized Environments by Todor T...
PlovDev 2016: Application Performance in Virtualized Environments by Todor T...PlovDev Conference
 
Towards "write once - run whenever possible" with Safety Critical Java af Ben...
Towards "write once - run whenever possible" with Safety Critical Java af Ben...Towards "write once - run whenever possible" with Safety Critical Java af Ben...
Towards "write once - run whenever possible" with Safety Critical Java af Ben...InfinIT - Innovationsnetværket for it
 

Semelhante a Cmg06 utilization is useless (20)

Azug - successfully breeding rabits
Azug - successfully breeding rabitsAzug - successfully breeding rabits
Azug - successfully breeding rabits
 
CPU Caches
CPU CachesCPU Caches
CPU Caches
 
Capacity Planning
Capacity PlanningCapacity Planning
Capacity Planning
 
Divide and conquer in the cloud
Divide and conquer in the cloudDivide and conquer in the cloud
Divide and conquer in the cloud
 
Storm presentation
Storm presentationStorm presentation
Storm presentation
 
Right-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual MachineRight-Sizing your SQL Server Virtual Machine
Right-Sizing your SQL Server Virtual Machine
 
Real-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and StormReal-Time Analytics with Kafka, Cassandra and Storm
Real-Time Analytics with Kafka, Cassandra and Storm
 
Tuning kafka pipelines
Tuning kafka pipelinesTuning kafka pipelines
Tuning kafka pipelines
 
Stream Computing (The Engineer's Perspective)
Stream Computing (The Engineer's Perspective)Stream Computing (The Engineer's Perspective)
Stream Computing (The Engineer's Perspective)
 
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and HadoopEventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
Eventual Consistency @WalmartLabs with Kafka, Avro, SolrCloud and Hadoop
 
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
Your Linux AMI: Optimization and Performance (CPN302) | AWS re:Invent 2013
 
Multithreading computer architecture
 Multithreading computer architecture  Multithreading computer architecture
Multithreading computer architecture
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?
 
How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?How to create innovative architecture using VisualSim?
How to create innovative architecture using VisualSim?
 
How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?How to create innovative architecture using ViualSim?
How to create innovative architecture using ViualSim?
 
PlovDev 2016: Application Performance in Virtualized Environments by Todor T...
PlovDev 2016: Application Performance in Virtualized Environments by Todor T...PlovDev 2016: Application Performance in Virtualized Environments by Todor T...
PlovDev 2016: Application Performance in Virtualized Environments by Todor T...
 
Lecture4
Lecture4Lecture4
Lecture4
 
Towards "write once - run whenever possible" with Safety Critical Java af Ben...
Towards "write once - run whenever possible" with Safety Critical Java af Ben...Towards "write once - run whenever possible" with Safety Critical Java af Ben...
Towards "write once - run whenever possible" with Safety Critical Java af Ben...
 
Parallel processing
Parallel processingParallel processing
Parallel processing
 
Cpu Caches
Cpu CachesCpu Caches
Cpu Caches
 

Mais de Adrian Cockcroft

Yow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesYow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesAdrian Cockcroft
 
Flowcon (added to for CMG) Keynote talk on how Speed Wins and how Netflix is ...
Flowcon (added to for CMG) Keynote talk on how Speed Wins and how Netflix is ...Flowcon (added to for CMG) Keynote talk on how Speed Wins and how Netflix is ...
Flowcon (added to for CMG) Keynote talk on how Speed Wins and how Netflix is ...Adrian Cockcroft
 
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...Adrian Cockcroft
 
Bottleneck analysis - Devopsdays Silicon Valley 2013
Bottleneck analysis - Devopsdays Silicon Valley 2013Bottleneck analysis - Devopsdays Silicon Valley 2013
Bottleneck analysis - Devopsdays Silicon Valley 2013Adrian Cockcroft
 
Netflix Global Applications - NoSQL Search Roadshow
Netflix Global Applications - NoSQL Search RoadshowNetflix Global Applications - NoSQL Search Roadshow
Netflix Global Applications - NoSQL Search RoadshowAdrian Cockcroft
 
Gluecon 2013 - NetflixOSS Cloud Native Tutorial Introduction
Gluecon 2013 - NetflixOSS Cloud Native Tutorial IntroductionGluecon 2013 - NetflixOSS Cloud Native Tutorial Introduction
Gluecon 2013 - NetflixOSS Cloud Native Tutorial IntroductionAdrian Cockcroft
 
AWS Re:Invent - High Availability Architecture at Netflix
AWS Re:Invent - High Availability Architecture at NetflixAWS Re:Invent - High Availability Architecture at Netflix
AWS Re:Invent - High Availability Architecture at NetflixAdrian Cockcroft
 
Cassandra Performance and Scalability on AWS
Cassandra Performance and Scalability on AWSCassandra Performance and Scalability on AWS
Cassandra Performance and Scalability on AWSAdrian Cockcroft
 
Netflix in the Cloud at SV Forum
Netflix in the Cloud at SV ForumNetflix in the Cloud at SV Forum
Netflix in the Cloud at SV ForumAdrian Cockcroft
 
Cloud Architecture Tutorial - Why and What (1of 3)
Cloud Architecture Tutorial - Why and What (1of 3) Cloud Architecture Tutorial - Why and What (1of 3)
Cloud Architecture Tutorial - Why and What (1of 3) Adrian Cockcroft
 
Cloud Architecture Tutorial - Running in the Cloud (3of3)
Cloud Architecture Tutorial - Running in the Cloud (3of3)Cloud Architecture Tutorial - Running in the Cloud (3of3)
Cloud Architecture Tutorial - Running in the Cloud (3of3)Adrian Cockcroft
 
Migrating Netflix from Datacenter Oracle to Global Cassandra
Migrating Netflix from Datacenter Oracle to Global CassandraMigrating Netflix from Datacenter Oracle to Global Cassandra
Migrating Netflix from Datacenter Oracle to Global CassandraAdrian Cockcroft
 
Netflix Velocity Conference 2011
Netflix Velocity Conference 2011Netflix Velocity Conference 2011
Netflix Velocity Conference 2011Adrian Cockcroft
 

Mais de Adrian Cockcroft (20)

Yow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with NotesYow Conference Dec 2013 Netflix Workshop Slides with Notes
Yow Conference Dec 2013 Netflix Workshop Slides with Notes
 
Flowcon (added to for CMG) Keynote talk on how Speed Wins and how Netflix is ...
Flowcon (added to for CMG) Keynote talk on how Speed Wins and how Netflix is ...Flowcon (added to for CMG) Keynote talk on how Speed Wins and how Netflix is ...
Flowcon (added to for CMG) Keynote talk on how Speed Wins and how Netflix is ...
 
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
CMG2013 Workshop: Netflix Cloud Native, Capacity, Performance and Cost Optimi...
 
Bottleneck analysis - Devopsdays Silicon Valley 2013
Bottleneck analysis - Devopsdays Silicon Valley 2013Bottleneck analysis - Devopsdays Silicon Valley 2013
Bottleneck analysis - Devopsdays Silicon Valley 2013
 
Netflix Global Applications - NoSQL Search Roadshow
Netflix Global Applications - NoSQL Search RoadshowNetflix Global Applications - NoSQL Search Roadshow
Netflix Global Applications - NoSQL Search Roadshow
 
Gluecon 2013 - NetflixOSS Cloud Native Tutorial Introduction
Gluecon 2013 - NetflixOSS Cloud Native Tutorial IntroductionGluecon 2013 - NetflixOSS Cloud Native Tutorial Introduction
Gluecon 2013 - NetflixOSS Cloud Native Tutorial Introduction
 
Gluecon keynote
Gluecon keynoteGluecon keynote
Gluecon keynote
 
Dystopia as a Service
Dystopia as a ServiceDystopia as a Service
Dystopia as a Service
 
Netflix and Open Source
Netflix and Open SourceNetflix and Open Source
Netflix and Open Source
 
NetflixOSS Meetup
NetflixOSS MeetupNetflixOSS Meetup
NetflixOSS Meetup
 
AWS Re:Invent - High Availability Architecture at Netflix
AWS Re:Invent - High Availability Architecture at NetflixAWS Re:Invent - High Availability Architecture at Netflix
AWS Re:Invent - High Availability Architecture at Netflix
 
Cassandra Performance and Scalability on AWS
Cassandra Performance and Scalability on AWSCassandra Performance and Scalability on AWS
Cassandra Performance and Scalability on AWS
 
Netflix in the Cloud at SV Forum
Netflix in the Cloud at SV ForumNetflix in the Cloud at SV Forum
Netflix in the Cloud at SV Forum
 
Cloud Architecture Tutorial - Why and What (1of 3)
Cloud Architecture Tutorial - Why and What (1of 3) Cloud Architecture Tutorial - Why and What (1of 3)
Cloud Architecture Tutorial - Why and What (1of 3)
 
Cloud Architecture Tutorial - Running in the Cloud (3of3)
Cloud Architecture Tutorial - Running in the Cloud (3of3)Cloud Architecture Tutorial - Running in the Cloud (3of3)
Cloud Architecture Tutorial - Running in the Cloud (3of3)
 
Migrating Netflix from Datacenter Oracle to Global Cassandra
Migrating Netflix from Datacenter Oracle to Global CassandraMigrating Netflix from Datacenter Oracle to Global Cassandra
Migrating Netflix from Datacenter Oracle to Global Cassandra
 
Netflix Velocity Conference 2011
Netflix Velocity Conference 2011Netflix Velocity Conference 2011
Netflix Velocity Conference 2011
 
Migrating to Public Cloud
Migrating to Public CloudMigrating to Public Cloud
Migrating to Public Cloud
 
Netflix in the cloud 2011
Netflix in the cloud 2011Netflix in the cloud 2011
Netflix in the cloud 2011
 
NoSQL for Netflix
NoSQL for NetflixNoSQL for Netflix
NoSQL for Netflix
 


Cmg06 utilization is useless

  • 1. Utilization is Virtually Useless as a Metric!
       CMG 2006 - Reno NV
       Adrian Cockcroft – Netflix Inc.
       With minor updates 2010
       (At the time: Distinguished Engineer, eBay Research Labs, eBay Inc.)
  • 2. Agenda
      • Headroom
      • Utilization
      • Response Time
      • The Many Ways In Which Utilization Metrics Are Broken
      • An Alternative
      • eBay.com Architecture, Scale and Rate of Change
      • Response times for an eBay.com SOA service pool
      • Conclusions
  • 3. Headroom
      • Headroom is available usable resources
        – Total Capacity minus Peak Utilization and Margin
        – Applies to CPU, RAM, Net, Disk and OS
      [Figure: stacked bar showing Utilization, Headroom and Margin]
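The headroom definition on this slide can be sketched as simple arithmetic. This is a minimal illustration, assuming all three quantities are expressed in the same unit (here a fraction of one machine); the function name and the sample numbers are hypothetical and do not come from the talk.

```python
def headroom(total_capacity: float, peak_utilization: float, margin: float) -> float:
    """Headroom = total capacity minus peak utilization and margin.

    All arguments share one unit (assumed here: fraction of one machine).
    A negative result means the peak already eats into the margin.
    """
    return total_capacity - peak_utilization - margin


if __name__ == "__main__":
    # Hypothetical numbers: 100% capacity, 55% peak, 20% held back as margin.
    print(f"headroom = {headroom(1.0, 0.55, 0.20):.2f}")
```

The same calculation applies per resource (CPU, RAM, network, disk), each with its own capacity and margin.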
  • 4. Utilization
      • Utilization is the proportion of busy time
      • Always defined over a time interval
      [Figure: busy time over an interval, labelled Utilization]
  • 5. Response Time
      • Service time occurs while using a resource
      • Queue time waits for access to a resource
      • Response Time = Queue time + Service time
      • Assumptions
        – Steady state averages
        – Random arrivals
        – Constant service time
        – M servers processing the same queue
      • Approximations
        – Queue length = Throughput x Response Time (Little's Law)
        – Response Time = Service Time / (Headroom + Margin)
        – Response Time = Service Time / (1 - Utilization^M)
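The two approximations on this slide can be tried out numerically. The sketch below is illustrative only: the function names and the sample values are assumptions, and the formula is the slide's approximation, valid only under its stated assumptions (steady state, random arrivals, constant service time, M servers on one queue).

```python
def response_time(service_time: float, utilization: float, servers: int = 1) -> float:
    """Approximate response time R = S / (1 - U^M).

    utilization is a fraction in [0, 1); servers is M from the slide.
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    if servers < 1:
        raise ValueError("servers must be >= 1")
    return service_time / (1.0 - utilization ** servers)


def queue_length(throughput: float, resp_time: float) -> float:
    """Little's Law: mean number in the system = throughput x response time."""
    return throughput * resp_time


if __name__ == "__main__":
    s = 0.010  # hypothetical 10 ms service time
    r = response_time(s, utilization=0.5, servers=1)
    print(f"response time: {r * 1000:.1f} ms, queue time: {(r - s) * 1000:.1f} ms")
    print(f"mean requests in system at 50/s: {queue_length(50.0, r):.2f}")
```

Queue time falls out as response time minus service time, matching the slide's definition.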
  • 6. Response Time Curves
      Systems with many servers (e.g. CPUs) can run at higher utilization levels, but degrade more rapidly when they finally run out of capacity. Headroom margin should be set according to a response time target.
      [Figure: response time curves R = S / (1 - (U%)^m), with the headroom margin marked]
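One way to read "set the margin according to a response time target" is to invert the curve: solving R = S / (1 - U^m) for U gives U = (1 - S/R)^(1/m). The sketch below does this for a few server counts. It is an illustration under the slide's simplifying assumptions; the function name and the target value are hypothetical.

```python
def max_utilization(service_time: float, target_response: float, servers: int) -> float:
    """Highest utilization that still meets the response time target.

    Inverts R = S / (1 - U^m) to U = (1 - S/R)^(1/m).
    """
    if target_response <= service_time:
        raise ValueError("target must exceed the service time")
    if servers < 1:
        raise ValueError("servers must be >= 1")
    return (1.0 - service_time / target_response) ** (1.0 / servers)


if __name__ == "__main__":
    # Hypothetical target: response time no worse than twice the service time.
    for m in (1, 2, 4, 16):
        u = max_utilization(1.0, 2.0, m)
        print(f"m={m:2d}: max utilization {u:.1%}, margin {1.0 - u:.1%}")
```

The margin shrinks as the server count grows, which is the slide's point: many-server systems run hotter but have less warning before they saturate.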
  • 7. So what's the problem with Utilization?
      • Unsafe assumptions! Complex adaptive systems have replaced simple ones
      • Random arrivals?
        – Bursty traffic with long tail arrival rate distribution
      • Constant service time?
        – Variable clock rate CPUs, inverse load dependent service time
        – Complex transactions, request and response dependent
      • M servers processing the same queue?
        – Virtual servers with varying non-integral concurrency
        – Non-identical servers or CPUs, Hyperthreading, Multicore, NUMA
      • Measurement Errors?
        – Measurement mechanisms with built in bias, e.g. sampling from the scheduler clock
        – Platform specific and release specific systemic changes in the accounting of interrupt time
  • 8. Storage Utilization
      • Storage virtualization broke utilization metrics a long time ago
      • Host server measures busy time on a "disk"
        – Simple disk, "single server" response time gets high near 100% utilization
        – Cached RAID LUN, one I/O stream can report 100% utilization, but full capacity supports many threads of I/O since there are many disks and RAM buffering
      • New metric - "Capability Utilization"
        – Adjusted to report proportion of actual capacity for current workload mix
        – Measured by tools such as Ortera Atlas (http://www.ortera.com)
  • 9. Threaded CPU Pipelines
      • CPU microarchitecture optimizations
        – Extra register sets working with the existing arithmetic and floating point units
        – When the CPU stalls on a memory read, it switches registers/threads
        – Operating system sees multiple schedulable entities (CPUs)
      • Intel Hyperthreading
        – Each CPU core has an extra thread to use spare cycles
        – Typical benefit is 20%, so total capacity is 1.2 CPUs
        – Second thread much slower when first thread is busy
        – Hyperthreading aware optimizations in recent operating systems
      • Sun CoolThreads
        – "Niagara" SPARC CPU has eight cores, one shared floating point unit
        – Each CPU core has four threads, but each core is a very simple design
        – Behaves like 32 slow CPUs for integer, snail like uniprocessor for FP
        – Overall throughput is very high, performance per watt is exceptional
      • Hyperformix have performance modeling of Hyperthreads and Niagara
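The Hyperthreading figures on this slide suggest why reported utilization misleads: the operating system sees two logical CPUs, but the second adds only about 20% capacity. The sketch below uses a deliberately crude model that is an assumption of this example, not something stated in the talk: the first hardware thread is filled before the second, the first delivers 1.0 CPU of work and the second delivers only the benefit fraction. The function name is hypothetical.

```python
def capacity_used(reported_utilization: float, ht_benefit: float = 0.20) -> float:
    """Fraction of real core capacity consumed, for one 2-thread core.

    Assumed model: busy time fills the first thread before the second;
    the second thread contributes only ht_benefit of a CPU when busy.
    """
    if not 0.0 <= reported_utilization <= 1.0:
        raise ValueError("reported_utilization must be in [0, 1]")
    busy_threads = 2.0 * reported_utilization       # 0..2 logical CPUs busy
    first = min(busy_threads, 1.0)                  # work done by first thread
    second = max(busy_threads - 1.0, 0.0)           # busy share of second thread
    return (first + ht_benefit * second) / (1.0 + ht_benefit)


if __name__ == "__main__":
    for reported in (0.25, 0.50, 0.75, 1.00):
        print(f"reported {reported:.0%} -> capacity used {capacity_used(reported):.0%}")
```

Under this model a core that reports 50% busy has already used most of its real capacity, so the remaining "50%" is far less headroom than it appears.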
  • 10. Variable Clock Rate CPUs
      • Laptop and other low power devices do this all the time
        – Watch CPU usage of a video playback application and toggle mains/battery power….
      • Next Generation Server CPU Power Optimization - AMD PowerNow!™
        – AMD Opteron x64 server CPU detects overall utilization and reduces clock rate
        – Actual speeds vary, but for example could reduce from 2.6GHz to 1.2GHz
        – Speed varies per socket, so pairs of CPU cores vary together
        – Changes are not currently understood or reported by operating system metrics
        – Speed changes can occur every few milliseconds
      • Possible scenario:
        – You estimate 20% utilization at 2.6GHz and see 45% reported in practice (at 1.2GHz)
        – Load doubles, reported utilization drops to 40% (at 2.6GHz)
        – Actual mapping of utilization to clock rate is unknown at this point
      • Older Opterons, and "low power" versions used in blades do not vary clock rate
      • Disaster scenario - you get a capacity surge and the datacenter power and cooling can't cope with all the systems at the high clock rate!
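The "possible scenario" on this slide becomes clearer when reported utilization is converted to absolute work. The sketch below multiplies reported utilization by the clock rate in effect, using the slide's own numbers. It assumes work scales linearly with clock rate, which ignores memory-bound behavior, and the function name is hypothetical.

```python
def absolute_cpu_ghz(reported_utilization: float, clock_ghz: float) -> float:
    """Absolute CPU work in 'busy GHz' = reported utilization x clock rate.

    Assumes throughput scales linearly with clock rate (a simplification).
    """
    return reported_utilization * clock_ghz


if __name__ == "__main__":
    before = absolute_cpu_ghz(0.45, 1.2)  # 45% reported at the low clock rate
    after = absolute_cpu_ghz(0.40, 2.6)   # 40% reported at the high clock rate
    print(f"before: {before:.2f} busy GHz, after: {after:.2f} busy GHz")
    print(f"load ratio: {after / before:.2f}x while reported utilization fell")
```

Reported utilization drops from 45% to 40% even though the absolute work roughly doubles, which is why the talk recommends recording absolute capacity for each measurement interval.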
  • 11. Virtual Machine Monitors
      • VMware, Xen, and good old mainframe LPARs etc.
        – Non-integral and non-constant fractions of a machine
        – Naive operating systems and applications that don't expect this behavior
        – However, lots of recent tools development from vendors (BMC, Teamquest etc.)
      • Average CPU count must be reported for each measurement interval
      • VMM overhead varies, application scaling characteristics may be affected
  • 12. What's My Headroom? How to plot it?
      • Measure and report absolute CPU power if you can get it…
      • Plot shows headroom in blue, margin in red, total power tracking, day/night workload variation, plotted as mean + two standard deviations.
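The "mean + two standard deviations" level used in the plot can be computed per measurement interval as sketched below. This is an illustration only: the function names, the units (absolute CPU power in GHz) and the sample values are assumptions, and the sample standard deviation is used because the slide does not say which form was plotted.

```python
from statistics import mean, stdev


def plotted_level(samples: list[float]) -> float:
    """Mean plus two sample standard deviations for one interval."""
    return mean(samples) + 2.0 * stdev(samples)


def interval_headroom(total_power: float, margin: float, samples: list[float]) -> float:
    """Headroom left after the plotted usage level and the margin."""
    return total_power - margin - plotted_level(samples)


if __name__ == "__main__":
    # Hypothetical absolute CPU power samples (GHz) within one interval.
    used = [1.0, 2.0, 3.0]
    print(f"plotted level: {plotted_level(used):.2f} GHz")
    print(f"headroom: {interval_headroom(8.0, 1.6, used):.2f} GHz")
```

Using the mean plus two standard deviations rather than the mean alone makes the plotted usage reflect bursts within the interval, not just the average.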
  • 13. Cockcroft Headroom Plot
      • Scatter plot of disk response time (ms) vs. Throughput (KB)
      • Histograms on axes
      • Throughput time series plot
      • Shows distributions and shape of response time
      • Fits throughput weighted inverse gaussian curve
      • Coded using "R" statistics package
      • Blogged development at http://perfcap.blogspot.com
  • 14. Thread Limited Response Time
      • Thread-limited responses
      • Mixture of fast and slow requests
      • Oscillating behaviors
      • Distributions are long tail
      • Workload behaves a bit like adhoc queries to a DSS perhaps?
      • Measurements are of a single SOA service pool
      • Response is in milliseconds
      • Throughput is executions/s

            Exec                Resp
       Min.   :    1.00    Min.   :    0.0
       1st Qu.:    2.00    1st Qu.:  150.0
       Median :    8.00    Median :  361.0
       Mean   :   64.68    Mean   :  533.5
       3rd Qu.:   45.00    3rd Qu.:  771.9
       Max.   :10795.00    Max.   :19205.0
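A six-number summary like the one on this slide can be reproduced for any set of measurements, and a mean well above the median is a quick sign of a long tail. The sketch below is in Python rather than the R used in the talk; the function name and the sample data are hypothetical, and the "inclusive" quantile method is chosen on the assumption that it matches R's default quantile type.

```python
from statistics import mean, quantiles


def summarize(values: list[float]) -> dict[str, float]:
    """Min, quartiles, mean and max, in the style of R's summary()."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "mean": mean(values),
        "q3": q3,
        "max": max(values),
    }


if __name__ == "__main__":
    # Hypothetical response times (ms): mostly fast, one very slow request.
    resp = [1.0, 1.0, 2.0, 2.0, 100.0]
    stats = summarize(resp)
    for name, value in stats.items():
        print(f"{name:>6}: {value:8.2f}")
    print("long tail suspected:", stats["mean"] > 2 * stats["median"])
```

In the slide's own table the Exec mean (64.68) is about eight times the median (8.00), the same pattern this check flags.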
  • 15. Conclusion
      • Check your assumptions…
      • Record and plot absolute capacity for each measurement interval
      • Plot response time as a function of throughput, not just utilization
      • SOA response characteristics are complicated and not well understood….
      Questions?
      (Now acockcroft@netflix.com)
      http://perfcap.blogspot.com