Architectures for open and scalable clouds

My presentation for Cloud Connect 2012, covering architectural and design patterns for open and scalable clouds. A technical deck targeted at business audiences with a technical bent.

  1. Architectures for open and scalable clouds. February 14, 2012. Randy Bias, CTO & Co-founder. License: CCA - NoDerivs 3.0 Unported License (usage OK, no modifications, full attribution)
  2. Our Perspective on Cloud Computing: It came from the large Internet players.
  3. A Story of Two Clouds
  4. A Story of Two Clouds
  5. Tenets of Open & Scalable Clouds
     1. Avoid vendor lock-in like bubonic plague
        • See also Open Cloud Initiative (opencloudinitiative.org)
     2. Simplicity scales, complexity fails
        • 10x bigger == 100x more complex
     3. TCO matters; measuring ROI is critical to success
     4. Security is paramount ... but different
     5. Risk acceptance over risk mitigation
     6. Agility & iteration over big bang
  6. This is a BIG Topic
     • What I am covering today is patterns in:
        • Hardware and software
        • Networking, storage, and compute
     • NOT covered today:
        • Cloud operations
        • Infrastructure software engineering
        • Measuring success through operational excellence
        • Security
  7. Open Clouds (briefly)
  8. A Word on ‘Open’
  9. Here we go ...
     • Elements:
        • Open APIs & protocols
        • Open hardware
        • Open networking
        • Open source software (OSS)
     • Combined with:
        • Architectural patterns, best practices, & de facto standards
        • Operational excellence
  10. Open APIs & Protocols
  11. Open Hardware
  12. Open Networking: Published Networking Blueprints
  13. Open Source Software: Open Cloud OS
  14. Open & Scalable Cloud Patterns
  15. Threads
      • Small failure domains are less impacting
      • Loose-coupling minimizes cascade failures
      • Scale-out over scale-up with exceptions
      • More AND cheaper
      • State synchronization is dangerous (remember CAP)
      • Everything has an API
      • Automation ONLY works w/ homogeneity & modularity
      • Lowest common denominator (LCD) services (LBaaS vs F5aaS)
      • People are the number one source of failures
  16. Pattern: Loose coupling. Synchronous, blocking calls mean cascading failures. Async, non-blocking calls mean failure in isolation.
  17. Pattern: Open source software. Excessive software taxation is the past. You can always fork. Black boxes create lock-in.
  18. Pattern: Uptime in software - self management. Hardware fails. Software fails. People fail. Only software can measure itself & respond to failure in near real-time. Applications designed for 99.999% uptime can run anywhere.
  19. Pattern: Scale-out, not UP
      • Scale Up: (Virtual*) servers are like pets. You name them and when they get sick, you nurse them back to health. Example: garfield.company.com
      • attrib: Bill Baker, Distinguished Engineer, Microsoft (* added by yours truly ...)
  20. Pattern: Scale-out, not UP
      • Scale Up: (Virtual*) servers are like pets. You name them and when they get sick, you nurse them back to health. Example: garfield.company.com
      • Scale Out: (Virtual*) servers are like cattle. You number them and when they get sick, you shoot them. Example: web001.company.com
      • attrib: Bill Baker, Distinguished Engineer, Microsoft (* added by yours truly ...)
  21. Pattern: Buy from ODMs. ODMs operate their businesses on 3-10% margins. AMZN, GOOG, and Facebook buy direct without a middleman. Only a few enterprise vendors are pivoting to compete.
  22. Pattern: Less enterprise “value” in x86 servers. Generic servers rule. Full stop. Nothing is better because nothing else is *generic*. “... a data center full of vanity free servers ... more efficient ... less expensive to build and run ...” - OCP
  23. Pattern: Flat Networking. The largest cloud operators all run layer-3 routed, flat networks with no VLANs. Cloud-ready apps don’t need or want VLANs. Enterprise apps can be supported on open clouds using Software-defined Networking (SDN).
  24. Pattern: Software-defined Networking (SDN) (“Network Virtualization”)
      • x86 server is the new Linecard
      • network switch is the new ASIC
      • VXLAN (or NVGRE) is the new Chassis
      • SDN Controller is the new SUP Engine
  25. Pattern: Flat Networking + SDNs. Flat + SDN co-exist & thrive together. [Diagram; labels: Internet, Availability Zone, VMs, VPC Gateway, Virtual L2 Network, Standard Security Group, VPC Security Group, Virtual Private Cloud Networking, Physical Node]
  26. Pattern: RAIS instead of HA pairs/clusters
      • Redundant arrays of inexpensive services (RAIS)
         • Load balanced
         • No state sharing
         • On failure, connections are lost, but failures are rare
      • Ridiculously simple & scalable
      • Most things retry anyway
      • Hardware failures are infrequent & impact only a subset of traffic
         • (N-F)/N, where N = total, F = failed
      • Cascade failures are unlikely and failure domains are small
  27. Service array (RAIS) example. [Diagram; labels: Public IP Blocks, Backbone Routers, OSPF Route Announcements, RAIS (NAT, LB, VPN), Cloud Access Switches, API, Return Traffic (default or source NAT), Cloud Control Plane, AZ (Spine) Switches]
  28. Pattern: Lots of inexpensive 1RU Switches. Simple spine-and-leaf flat routed network. [Diagram; labels: Rack 1, Rack 2, Rack 3]
      • 1RU: 6K-30K VMs / AZ
  29. Pattern: Lots of inexpensive 1RU Switches. Simple spine-and-leaf flat routed network. [Diagram; labels: Rack 1, Rack 2, Rack 3, Multiple Racks]
      • 1RU: 6K-30K VMs / AZ
      • Modular: 40K-200K VMs / AZ
  30. Pattern: Direct-attached Storage (DAS)
      • Cloud-ready apps manage their own data replication.
      • DAS is the smallest failure domain possible with reasonable storage I/O.
      • SAN == massive failure domain.
      • SSDs will be the great equalizer.
  31. Pattern: Elastic Block Device Services. EBS/EBD is a crutch for poorly written apps. Bigger failure domains (AWS outage anyone?), complex, sets high expectations. Sometimes you need a crutch. When you do, overbuild the network, and make sure you have a smart scheduler.
  32. Pattern: More Servers == More Storage I/O. >1M writes/second, triple-redundancy w/ Cassandra on AWS. Linear scale-out == linear costs for performance.
  33. Pattern: Hypervisors are a commodity. Cloud end-users want OS of choice, not HVs. Level up! Managing iron is for mainframe operators. Hypervisor of the future is open source, easily modifiable, & extensible.
  34. Open Cloud System. Simply Scaled. Production Ready. randyb@cloudscaling.com, @randybias
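
The loose-coupling pattern on slide 16 (async, non-blocking calls fail in isolation) can be illustrated with a minimal Python sketch. This is not code from the deck: the service names, delays, and timeout below are hypothetical, and `asyncio` is used only as one convenient way to show the idea.

```python
import asyncio


async def call_service(name: str, delay: float, fail: bool = False) -> str:
    """Hypothetical stand-in for a remote call; sleeps, then succeeds or raises."""
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} failed")
    return f"{name} ok"


async def fan_out(calls, timeout: float):
    """Run all calls concurrently; a slow or failing call only degrades its own slot."""

    async def guarded(coro):
        try:
            return await asyncio.wait_for(coro, timeout)
        except (asyncio.TimeoutError, RuntimeError) as exc:
            # The failure is contained here instead of propagating to the caller.
            return f"degraded: {type(exc).__name__}"

    return await asyncio.gather(*(guarded(c) for c in calls))


def demo():
    calls = [
        call_service("web", 0.01),             # healthy
        call_service("db", 0.01, fail=True),   # raises
        call_service("slow", 1.0),             # exceeds the timeout
    ]
    return asyncio.run(fan_out(calls, timeout=0.1))


if __name__ == "__main__":
    print(demo())
```

The healthy call still returns its result even though one sibling raises and another times out, which is the "failure in isolation" behavior the slide contrasts with blocking call chains.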
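
Slide 26 gives (N-F)/N for a RAIS array, where N is the total number of service nodes and F the number that have failed. A small Python sketch of that arithmetic follows; it assumes traffic is spread evenly across nodes (the slide only says "load balanced"), and the function names are illustrative.

```python
def surviving_fraction(total: int, failed: int) -> float:
    """Fraction of the array still serving traffic: (N - F) / N."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= failed <= total:
        raise ValueError("failed must be between 0 and total")
    return (total - failed) / total


def impacted_fraction(total: int, failed: int) -> float:
    """Fraction of traffic hit by the failure, assuming even load balancing."""
    return 1.0 - surviving_fraction(total, failed)


if __name__ == "__main__":
    # Compare a 2-node HA pair with an 8-node array, each losing one node.
    for n in (2, 8):
        print(n, surviving_fraction(n, 1), impacted_fraction(n, 1))
```

Under the even-load assumption, losing one node out of eight affects an eighth of the traffic, while losing one of a pair affects half, which is why the slide ties small failure domains to larger arrays of simple services.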