Building Cost-Aware Cloud Architectures - Jinesh Varia (AWS) and Adrian Cockcroft (Netflix)

5 ways you can build cost-awareness into your cloud architectures and maximize your savings: business-driven auto scaling; mixing and matching reserved and on-demand instances; iterating and optimizing fungible resources; following the customer (running auto-scaling web servers) during the day and following the money (running Hadoop and transcoding jobs) at night; and soaking up your reservations.


  1. Cost-Aware Cloud Architectures
      • Jinesh Varia (@jinman), Technology Evangelist
      • Adrian Cockcroft (@adrianco), Director, Architecture
  2. Return on Agility (Agile ROI) = More Revenue
  3. Cloud Economics – Agile ROI
      • Get a faster Return by Speeding up Investment
      • Rapid innovation by speeding up the OODA loop (Observe, Orient, Decide, Act)
      • Try, fail, try again, succeed
  4. « Want to increase innovation? Lower the cost of failure » Joi Ito
  5. Experiment Often & Adapt Quickly
      • Cost of failure falls dramatically
      • Return on (small incremental) investments is high
      • More risk taking, more innovation
      • More iteration, faster innovation
  6. Accelerate building a new line of business: Market Replay (2007)
  7. Go Global in Minutes
  8. Netflix Examples
      • Brazilian Proxy Experiment
        • No employees in Brazil, no “meetings with IT”
        • Deployed instances into two zones in AWS Brazil
        • Experimented with network proxy optimization
        • Decided that gain wasn’t enough, shut everything down
      • European Launch using AWS Ireland
        • No employees in Ireland, no provisioning delay, everything worked
        • No need to do detailed capacity planning
        • Over-provisioned on day 1, shrunk to fit after a few days
        • Capacity grows as needed for additional country launches
  9. Product Launch Agility - Rightsized [Chart labels: Demand, Cloud, Datacenter]
  10. Product Launch - Under-estimated [Chart labels: Demand, Cloud, Datacenter]
  11. Product Launch Agility – Over-estimated [Chart labels: $, Demand, Cloud, Datacenter]
  12. Key Takeaways on Cost-Aware Architectures…
      #1 Business Agility by Rapid Experimentation = Increased Revenue
  13. When you turn off your cloud resources, you actually stop paying for them
  14. Architectures that follow the money
      • How your architecture scales ∝ Customer Traffic
  15. [Architecture diagram; components as labeled]
      • www.MyWebSite.com (dynamic data) and media.MyWebSite.com (static data)
      • Amazon Route 53 (DNS)
      • Elastic Load Balancer
      • Auto Scaling group: Web Tier (Amazon EC2)
      • Auto Scaling group: App Tier
      • Amazon RDS (shown twice)
      • Amazon CloudFront and Amazon S3
      • Availability Zone #1, Availability Zone #2
  16. Hourly CPU Load: optimize by the time of day [Chart: load (0 to 14) over the 24 hours in a day; annotation: 25% Savings]
  17. Web Servers, Weekly CPU Load: optimize during a year [Chart: load over the weeks in a year (1 to 49); annotation: 50% Savings]
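The savings annotated on the two load charts come from running fewer servers when load is low instead of keeping a fleet sized for the peak at all times. A minimal sketch of that arithmetic follows, assuming cost is proportional to instance-hours and that capacity can track load exactly (both simplifications). The function name and the sample load series are hypothetical and are not data from the slides.

```python
def savings_from_following_load(load):
    """Fraction of instance-hours saved by tracking load instead of
    provisioning for the peak in every interval.

    `load` is a sequence of instances needed per interval (hypothetical data).
    Assumes cost is proportional to instance-hours.
    """
    if not load:
        raise ValueError("load must not be empty")
    peak_provisioned = max(load) * len(load)  # static fleet sized for the peak
    if peak_provisioned == 0:
        return 0.0
    return 1 - sum(load) / peak_provisioned


# Made-up day: 12 quiet hours needing 6 instances, 12 busy hours needing 12.
hourly_load = [6] * 12 + [12] * 12
print(f"{savings_from_following_load(hourly_load):.0%}")
```

A flat load series yields zero savings under this model; the benefit grows with the gap between average and peak load.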
  18. Architectures that follow the money
      • How your architecture scales ∝ Customer Traffic
      • How your architecture scales ∝ How you make money
  19. Mastering the Trade-offs
      • How many $/customer are you willing to spend for 50% better latency to customers that will increase conversion to paid customers (or more signups) by 10%?
      • How many $/customer are you willing to spend for 100 more renders per minute (10% reduction in wait time for customers) resulting in 50% more reach (viral awareness)?
      • How many $/job are you willing to spend for 30% faster results for your analytics job?
  20. Netflix’s use of Custom Metrics [Diagram labels: Business SLAs; Requests; User Timeout; Latency; Resp time; Concurrent Users; Your App; PUT; 2 weeks; Alarm; Amazon CloudWatch; Instance Custom Metrics via Servo]
      • “Increase, Decrease, Shrink, Expand your Instances”
  21. [Chart: Instances vs. Business Throughput]
  22. Move to Load-Based Scaling
      • 50%+ Cost Saving
      • Scale up/down by 70%+
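Load-based scaling sizes the fleet from a business throughput metric rather than from a fixed schedule. The sketch below is one hedged way to express such a rule. `desired_instances`, its parameters, and the default target utilization are illustrative assumptions, not Netflix's or Amazon CloudWatch's actual logic.

```python
import math


def desired_instances(requests_per_sec, per_instance_rps,
                      target_utilization=0.75, min_size=2, max_size=100):
    """Size a fleet so each instance runs near `target_utilization`.

    All parameters are hypothetical: `per_instance_rps` is the throughput
    one instance can sustain at 100% utilization.
    """
    if per_instance_rps <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("per_instance_rps and target_utilization must be positive")
    needed = math.ceil(requests_per_sec / (per_instance_rps * target_utilization))
    # Clamp to the group's configured bounds.
    return max(min_size, min(max_size, needed))


print(desired_instances(1000, 100))  # busy period
print(desired_instances(300, 100))   # quiet period
```

Keeping a minimum size guards availability when traffic drops to zero, and the maximum caps spend if the metric misbehaves.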
  23. Key Takeaways on Cost-Aware Architectures…
      #1 Business Agility by Rapid Experimentation = Increased Revenue
      #2 Business-driven Auto Scaling Architectures = Savings
  24. When Comparing TCO…
  25. Maintaining on-premise infrastructure for peak demand is expensive [Chart: Cost and Demand over Time (Q1, Q2, Q3, Q4, Q1); axis values 200k, 300k, 600k; series: Capacity of resources, Actual demand; annotations: wasted capacity, lost customers, ordered hardware]
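The chart's two failure modes, wasted capacity and lost customers, are the positive and negative gaps between provisioned capacity and actual demand. A small sketch of that bookkeeping follows, using made-up quarterly figures (the slide's axis values are not used as data here) and a hypothetical function name.

```python
def capacity_gaps(capacity, demand):
    """Return (wasted, unmet) totals for paired capacity/demand samples.

    wasted: capacity paid for but not used
    unmet:  demand that exceeded capacity (the "lost customers" region)
    """
    if len(capacity) != len(demand):
        raise ValueError("capacity and demand must have the same length")
    wasted = sum(max(c - d, 0) for c, d in zip(capacity, demand))
    unmet = sum(max(d - c, 0) for c, d in zip(capacity, demand))
    return wasted, unmet


# Hypothetical quarters: capacity is bought in steps, demand moves freely.
capacity = [200, 200, 300, 600]
demand = [120, 260, 280, 350]
print(capacity_gaps(capacity, demand))
```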
  26. When Comparing TCO… Make sure that you are taking all the cost factors into consideration
      • Place
      • Power
      • Pipes
      • People
      • Patterns
  27. Save more when you reserve: Billing Options
      • On-Demand Instances: pay as you go; zero commitment
      • Reserved Instances: one-time low upfront fee + discounted hourly costs; up to 71% savings over On-Demand
      • Spot Instances: requested bid price and pay as you go; price changes every hour based on unused EC2 capacity
      • Dedicated Instances: Standard and Reserved; Multi-Tenant Single Customer; ideal for compliance and regulatory workloads
  28. Business-aligned Architectures = Savings
      • Free Offering
        • Optimize for reducing cost
        • Acceptable delay limits
        • Implementation: use Spot Instances first; use On-Demand Instances if Spot is not available in 15 min
      • Premium Offering
        • Optimized for faster response times
        • No delays
        • Implementation: Paid Subscriptions ∝ RIs; use On-Demand Instances during weekends (high traffic); bid higher in Spot if On-Demand is not available
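The Free Offering implementation on this slide is effectively a fallback rule: try Spot first, and switch to On-Demand only after Spot has been unavailable for 15 minutes. A minimal sketch of that decision follows. The function and its string labels are hypothetical, and nothing here calls a real AWS API.

```python
SPOT_WAIT_LIMIT_MIN = 15  # from the slide: fall back if Spot is not available in 15 min


def free_tier_purchase_option(spot_available, minutes_waited):
    """Pick how to buy capacity for the cost-optimized (free) offering.

    Returns "spot", "on-demand", or "wait" (hypothetical labels).
    """
    if spot_available:
        return "spot"
    if minutes_waited >= SPOT_WAIT_LIMIT_MIN:
        return "on-demand"
    return "wait"  # some delay is acceptable for the free offering


print(free_tier_purchase_option(spot_available=False, minutes_waited=5))
print(free_tier_purchase_option(spot_available=False, minutes_waited=15))
```

The Premium Offering reverses the priority: reserved and On-Demand capacity come first because delays are not acceptable, with a higher Spot bid as the last resort.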
  29. Save more when you reserve
      • On-Demand Instances: pay as you go; zero commitment
      • Reserved Instances: one-time low upfront fee + discounted hourly costs; up to 71% savings over On-Demand; 1-year and 3-year terms
        • Light Utilization RI
        • Medium Utilization RI
        • Heavy Utilization RI
  30. Break-even point (1-year and 3-year terms)
      • Light Utilization RI: utilization (uptime) 10% - 40% (>3.5 < 5.5 months/year); ideal for Disaster Recovery (Lowest Upfront); savings over On-Demand: 56% +
      • Medium Utilization RI: utilization (uptime) 40% - 75% (>5.5 < 7 months/year); ideal for Standard Reserved Capacity; savings over On-Demand: 66%
      • Heavy Utilization RI: utilization (uptime) >75% (>7 months/year); ideal for Baseline Servers (Lowest Total Cost); savings over On-Demand: 71%
  31. Mix and Match Reserved Types and On-Demand [Chart: instances (0 to 12) over days of month (1 to 28); layers: Heavy Utilization Reserved Instances (base), Light RI, On-Demand (top)]
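Mixing reservation types amounts to asking, for each instance "slot" in the stack, what share of the month that slot is actually running, and then applying the break-even bands from the previous slide (Light 10% to 40%, Medium 40% to 75%, Heavy above 75%). The sketch below assumes that slots under 10% utilization stay On-Demand and resolves the band edges arbitrarily. The function names and the demand series are made up for illustration.

```python
from collections import Counter


def purchase_option(utilization):
    """Map a slot's utilization (0..1) to a purchase option using the slide's bands.

    Treating <10% as On-Demand and the exact edge handling are assumptions.
    """
    if utilization < 0.10:
        return "On-Demand"
    if utilization < 0.40:
        return "Light RI"
    if utilization <= 0.75:
        return "Medium RI"
    return "Heavy RI"


def plan_reservations(daily_demand):
    """Count how many instance slots fall into each purchase option.

    Slot n (1-based) is in use on every day where demand >= n.
    """
    if not daily_demand:
        return {}
    days = len(daily_demand)
    plan = Counter()
    for slot in range(1, max(daily_demand) + 1):
        utilization = sum(1 for d in daily_demand if d >= slot) / days
        plan[purchase_option(utilization)] += 1
    return dict(plan)


# Hypothetical 28-day month: baseline of 4, a stretch at 6, two spike days at 10.
demand = [4] * 10 + [6] * 8 + [4] * 8 + [10] * 2
print(plan_reservations(demand))
```

With this series the four always-on slots land in Heavy RI, the two slots used about a third of the month in Light RI, and the four spike-only slots stay On-Demand, which mirrors the layering in the chart.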
  32. Netflix Concept of Reserving Capacity for Maximum Savings [Diagram: two billing stacks, us-west region and us-east region; rows listed top to bottom]
      • Occasional Spikes: On-Demand | On-Demand
      • Heavy RI | Heavy RI
      • Normal Usage: Light RI | Light RI
  33. Netflix Concept of Reserving Capacity for Maximum Savings [Diagram: two billing stacks, us-west region and us-east region; rows listed top to bottom]
      • Occasional Spikes: On-Demand | On-Demand
      • Heavy RI | Light RI
      • Normal Usage: Heavy RI | Light RI
  34. Key Takeaways on Cost-Aware Architectures…
      #1 Business Agility by Rapid Experimentation = Increased Revenue
      #2 Business-driven Auto Scaling Architectures = Savings
      #3 Mix and Match Reserved Instances with On-Demand = Savings
  35. Usage Patterns: Variety of Applications and Environments
      • Every Company has…: LOB and Products Fleet; Marketing Site; Intranet Site; BI and DW; CRM; Training Sites
      • Every Application has…: Production Fleet; Dev Fleet; Test Fleet; Staging/QA; Perf Fleet; DR Site
  36. Consolidated Billing: Single payer for a group of accounts
      • One Bill for multiple accounts
      • Easy Tracking of account charges (e.g., download CSV of cost data)
      • Volume Discounts can be reached faster with combined usage
      • Reserved Instances are shared across accounts (including RDS Reserved DBs)
  37. Over-Reserve the Production Environment (Total Capacity)
      • Production Env. Account: 100 Reserved
      • QA/Staging Env. Account: 0 Reserved
      • Perf Testing Env. Account: 0 Reserved
      • Development Env. Account: 0 Reserved
      • Storage Account: 0 Reserved
  38. Consolidated Billing Borrows Unused Reservations (Total Capacity)
      • Production Env. Account: 68 Used
      • QA/Staging Env. Account: 10 Borrowed
      • Perf Testing Env. Account: 6 Borrowed
      • Development Env. Account: 12 Borrowed
      • Storage Account: 4 Borrowed
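The borrowing shown on these two slides can be modeled as a shared pool: reservations are applied to accounts in priority order, and whatever Production does not use is available to the rest. This is a simplified model for illustration only. The function is hypothetical, and how the reservation benefit is actually allocated across linked accounts is determined by AWS billing. The instance counts are the ones on the slides.

```python
def apply_shared_reservations(total_reserved, usage_by_account):
    """Apply a shared reservation pool to accounts in dict order (priority order).

    Returns {account: (covered_by_reservation, billed_on_demand)}.
    """
    remaining = total_reserved
    result = {}
    for account, used in usage_by_account.items():
        covered = min(used, remaining)
        remaining -= covered
        result[account] = (covered, used - covered)
    return result


# Numbers from the slides: 100 reserved in Production, 68 used there, rest borrowed.
usage = {
    "Production": 68,
    "QA/Staging": 10,
    "Perf Testing": 6,
    "Development": 12,
    "Storage": 4,
}
for account, (covered, on_demand) in apply_shared_reservations(100, usage).items():
    print(f"{account}: {covered} reserved, {on_demand} on-demand")
```

Because the slide's usage adds up to exactly 100, every account is fully covered. If Production grew, the lower-priority accounts would be the first to spill over to On-Demand in this model.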
  39. Consolidated Billing Advantages
      • Production account is guaranteed to get burst capacity
        • Reservation is higher than normal usage level
        • Requests for more capacity always work up to reserved limit
        • Higher availability for handling unexpected peak demands
      • No additional cost
        • Other lower priority accounts soak up unused reservations
        • Totals roll up in the monthly billing cycle
  40. Key Takeaways on Cost-Aware Architectures…
      #1 Business Agility by Rapid Experimentation = Increased Revenue
      #2 Business-driven Auto Scaling Architectures = Savings
      #3 Mix and Match Reserved Instances with On-Demand = Savings
      #4 Consolidated Billing and Shared Reservations = Savings
  41. Continuous optimization in your architecture results in recurring savings as early as your next month’s bill
  42. Right-size your cloud: Use only what you need
      • An instance type for every purpose
      • Assess your memory & CPU requirements
        • Fit your application to the resource
        • Fit the resource to your application
      • Only use a larger instance when needed
  43. Reserved Instance Marketplace: further reduce costs by optimizing
      • Buy a shorter-term instance
      • Buy an instance with a different OS or type
      • Buy a Reserved Instance in a different region
      • Sell your unused Reserved Instance
      • Sell unwanted or over-bought capacity
  44. Instance Type Optimization
      • Older m1 and m2 families
        • Slower CPUs
        • Higher response times
        • Smaller caches (6MB)
        • Oldest m1.xl: 15GB / 8 ECU / $0.48
        • Old m2.xl: 17GB / 6.5 ECU / $0.41
        • ~16 ECU/$/hr
      • Latest m3 family
        • Faster CPUs (Sandy Bridge)
        • Lower response times
        • Bigger caches (20MB)
        • Even faster for Java vs. ECU
        • New m3.xl: 15GB / 13 ECU / $0.50
        • 26 ECU/$/hr – 62% better!
        • Java measured even higher
        • Deploy fewer instances
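The price/performance figures on this slide are ECU divided by hourly price. The short calculation below reworks them from the slide's own numbers. The helper name is made up for illustration.

```python
def ecu_per_dollar(ecu, price_per_hour):
    """Compute units (ECU) obtained per dollar per hour."""
    return ecu / price_per_hour


# (memory GB, ECU, $/hr) as listed on the slide
instance_types = {
    "m1.xl": (15, 8, 0.48),
    "m2.xl": (17, 6.5, 0.41),
    "m3.xl": (15, 13, 0.50),
}

for name, (_mem_gb, ecu, price) in instance_types.items():
    print(f"{name}: {ecu_per_dollar(ecu, price):.1f} ECU/$/hr")

# Improvement of m3.xl over the slide's rounded ~16 ECU/$/hr baseline
improvement = ecu_per_dollar(13, 0.50) / 16 - 1
print(f"{improvement:.1%} better")
```

The older types work out to roughly 16.7 (m1.xl) and 15.9 (m2.xl) ECU/$/hr, so the slide's "62% better" corresponds to comparing 26 against the rounded baseline of 16.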
  45. Key Takeaways on Cost-Aware Architectures…
      #1 Business Agility by Rapid Experimentation = Increased Revenue
      #2 Business-driven Auto Scaling Architectures = Savings
      #3 Mix and Match Reserved Instances with On-Demand = Savings
      #4 Consolidated Billing and Shared Reservations = Savings
      #5 Always-on Instance Type Optimization = Recurring Savings
  46. Follow the Customer (Run web servers) during the day; Follow the Money (Run Hadoop clusters) at night [Chart: No. of Instances Running (0 to 16) by day of week (Mon to Sun); series: Auto Scaling Servers, Hadoop Servers; reference level: No. of Reserved Instances]
  47. [Diagram labels: Total Instances Reserved Table (14 Types, 4 AZ-mappings); Web Application Fleet; Calculator; 40 Unused Reservations; Launch; Hadoop Fleet]
      • Total Instances Running now = 100
      • Total unused Reservations available = 40 in 2 AZs (5 min interval)
  48. Soaking up unused reservations
      • Unused reserved instances are published as a metric
      • Netflix Data Science ETL Workload (starts after midnight)
        • Daily business metrics roll-up
        • EMR clusters started using hundreds of instances
      • Netflix Movie Encoding Workload
        • Long queue of high and low priority encoding jobs
        • Can soak up 1000’s of additional unused instances
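Soaking up reservations comes down to computing, per instance type and Availability Zone, how many reserved instances are not currently running, and letting batch fleets launch only into that headroom. A hedged sketch follows. The function names, the sample instance type and zone names, and the per-zone counts are assumptions for illustration, not Netflix's calculator. Only the totals (100 running, 40 unused in 2 AZs) echo the slides.

```python
def unused_reservations(reserved, running):
    """Per (instance_type, az) count of reserved instances not currently in use."""
    return {
        key: max(count - running.get(key, 0), 0)
        for key, count in reserved.items()
    }


def batch_launch_plan(unused, instances_wanted):
    """Fill a batch request from unused reservations only, largest pool first."""
    plan = {}
    for key, free in sorted(unused.items(), key=lambda kv: kv[1], reverse=True):
        if instances_wanted <= 0:
            break
        take = min(free, instances_wanted)
        if take:
            plan[key] = take
            instances_wanted -= take
    return plan


# Hypothetical snapshot: 140 reserved, 100 running, so 40 unused across 2 AZs.
reserved = {("m2.4xlarge", "us-east-1a"): 70, ("m2.4xlarge", "us-east-1b"): 70}
running = {("m2.4xlarge", "us-east-1a"): 45, ("m2.4xlarge", "us-east-1b"): 55}
unused = unused_reservations(reserved, running)
print(unused)
print(batch_launch_plan(unused, instances_wanted=30))
```

Recomputing the snapshot on a short interval (the slides mention 5 minutes) lets the batch fleet shrink again as the web fleet scales back up during the day.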
  49. Building Cost-Aware Cloud Architectures
      #1 Business Agility by Rapid Experimentation = Increased Revenue
      #2 Business-driven Auto Scaling Architectures = Savings
      #3 Mix and Match Reserved Instances with On-Demand = Savings
      #4 Consolidated Billing and Shared Reservations = Savings
      #5 Always-on Instance Type Optimization = Recurring Savings
      #6 Follow the Customer (Run web servers) during the day; Follow the Money (Run Hadoop clusters) at night
  50. Thank you! Jinesh Varia (jvaria@amazon.com, @jinman) and Adrian Cockcroft (acockcroft@netflix.com, @adrianco)
