O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.

Openstack Reality Check (Version in English)

5.795 visualizações

Publicada em

Openstack Reality Check - 45 minutes of Openstack Hate (english version as held on Netways OSDC Conference, April 2015)

Publicada em: Internet
  • Hi there! Get Your Professional Job-Winning Resume Here - Check our website! http://bit.ly/resumpro
       Responder 
    Tem certeza que deseja  Sim  Não
    Insira sua mensagem aqui
  • This is a very interesting slide deck. But it's first and second points are cloud specific, not OpenStack Specific at all. Of course local storage is bad.. Of course single threaded disk tests are bad in a multi tenant server.. How is this OpenStacks shortcoming?
       Responder 
    Tem certeza que deseja  Sim  Não
    Insira sua mensagem aqui

Openstack Reality Check (Version in English)

  1. 1. 45 Minutes of Openstack Hate: A reality check Kristian Köhntopp, Cloud Architect Old Fart Martin Loschwitz, Cloud Architect Future Integration Representative SysEleven https://www.flickr.com/photos/pinksherbet/188842453 Pink Sherbet Photography (CC BY 2.0)
  2. 2. 2
  3. 3. 3
  4. 4. https://uksysadmin.files.wordpress.com/2011/03/openstackwallpaper1.png
  5. 5. Widely supported by many marketing departments https://www.flickr.com/photos/ahockley/8662640096 Aaron Hockley (cc-by-sa 2.0)
  6. 6. 6 http://www.zdnet.com/article/customers-reporting-interest-in-cloud-containers-linux-and-openstack-for-2015/
 http://www.businesscloudnews.com/2015/04/09/red-hat-dell-redouble-openstack-private-cloud-efforts/
 http://blogs.wsj.com/cio/2015/02/25/the-morning-download-h-p-deal-with-deutsche-bank-is-step-forward-for-openstack/
  7. 7. What Openstack Vendors promise… https://www.flickr.com/photos/debarshiray/8237431709/sizes/l debarshiray (CC-BY-SA)
  8. 8. How Openstack Admins imagine their workplace… (C) Kristian Köhntopp, 2006
  9. 9. What is being delivered… (C) Hendrik Scholz, 2006
  10. 10. What the admin job actually looks like… (C) Kristian Köhntopp, 2015
  11. 11. What is it that we want to do? 11
  12. 12. „Any VM, anywhere.“ 12
  13. 13. „Infrastructure as Code.“ 13
  14. 14. Why would I need that? • 24 Cores (48 Cores HT), 256G RAM, 2* 10GBit und 12* 3TB HDD including a solid BBU • 10k EUR
 • Applications that actually fill that box are rare. So we are cutting it up to sell off the parts. 14
  15. 15. "Any VM, anywhere"15 CPU, RAM StorageNetwork OverlayUnderlay
  16. 16. Storage "Where the Internet lives", http://www.google.com/about/datacenters/gallery/#/tech/12
  17. 17. Storage requirements • VM with Ephemeral Storage • Not a problem, right? Because storage can be an un- RAID-ed local disk. • But then you got no migration. • Mainentance sucks w/o Migration. For you and for your customers. 17
  18. 18. Lesson #1:
 Nope, local storage is not a legal default. 18
  19. 19. Storage • VM with Volume: All writes are always remote. • And redundant. • Relevant metrics: • Bandwidth, IOPS and Latency • MB/sec, multithreaded-fsync()/sec und sequential- fsync()/sec • Where do the limits come from? 19
  20. 20. Storage vs. requirements • Scales easily: • Bandwidth, MT-IOPS • Hard to scale: • Sequential-IOPS (b/c latency, target: 10k) • Default-Benchmark: "MySQL Slave on a volume" • "16 KB single-thread random-write on a datafile." 20
  21. 21. Lesson #2:
 A Single-Threaded Random-Write Benchmark is a fine first evaluation. 21
  22. 22. Storage vs. actual Openstack delivery • OpenStack Default is iSCSI und tgtd • »The problem is left as an exercise to the user.«
 • Storage Vendors are totally in love with that approach. • https://wiki.openstack.org/wiki/CinderSupportMatrix 22
  23. 23. Storage: Ceph - the good • Excellent bandwidth • Good Multithreaded-IOPS • Very robust, if you got spare memory and time for MTTR 23
  24. 24. Storage: Ceph - the bad • CRUSH • Layout follows topology • Change topology, move data needlessly • You do need a background network for storage to facilitate Rebalancing/Recovery/Cluster-Reorg 24
  25. 25. Storage: Ceph - the ugly • Sequential IOPS fatally slow (200 IOPS). • MySQL slave on our Ceph-Volumes @ 200 Commit/s • Windows 8.1 boots from a Ceph-Volume in 15 Minutes 25
  26. 26. Lesson #3:
 Pure Play OpenStack will not work for production.
 Corollary: OpenStack is not a functioning Open Source Project. 26
  27. 27. Storage requirements vs. Multitenancy • IOPS-Requirements need SSD to implement 27 "Magento Indexer" Single-Threaded Database Updates
  28. 28. Storage requirements vs. Multitenancy • IOPS-Requirements need SSD to implement • IOPS Quotas require SSD to implement 27 You can't put quota on what isn't there. Small IOPS = large incremental steps, large variance.
  29. 29. Storage requirements vs. Multitenancy • IOPS-Requirements need SSD to implement • IOPS Quotas require SSD to implement • SSD complicated - Preice vs. guranteed Performance 27 Target price 1 EUR/GB atm. Low performance variation more important than huge peak perf.
  30. 30. Storage requirements vs. Multitenancy • IOPS-Requirements need SSD to implement • IOPS Quotas require SSD to implement • SSD complicated - Preice vs. guranteed Performance • Caches complicated 27 Working Set > Cache means you are working the raw iron.
  31. 31. Lesson #4:
 Caching is as much part of the problem as it is part of the solution. 28
  32. 32. Network Mercury Redstone Connector MR-1 (1960) https://www.flickr.com/photos/jurvetson/5691350527 Steve Jurvetson (CC-BY)
  33. 33. Network requirements • Hosting setup: • Topology: Wait-Free, Oversubscription-Free • Redundancy: SPOF-Free • Multi-Tenancy: Isolation and Quotas • Also: • Capable of carrying the storage traffic 30
  34. 34. –Joe Random Alphauser „We just rolled out a few dozen VMs with Hadoop and played Terasort.“ 31 http://bradhedlund.com/2012/01/25/construct-a-leaf-spine-design-with-40g-or-10g-an-observation-in-scaling-the-fabric/
  35. 35. Notwork: Openstack default offering • Open vSwitch, GRE-Ball, Broadcast-Problem, Chokepoint, SPOF • Current releases are only marginally less braindead. 32 SPOF
  36. 36. Lesson #5:
 Ok, networking doesn't work, too. 33
  37. 37. Let's look around for a non-broken solution34
  38. 38. OpenContrail: the good, … • Open Source Project by Juniper • Uses existing hardware routing infrastructure • scales, no SPOFs, no Chokepoints • based on MPLS, BGP, and other well understood protocols • actually delivers stuff (as opposed to e.g. OpenDaylite) 35
  39. 39. OpenContrail: the bad, … • Juniper bought Contrail, doesn't understand how to market it • little and outdated documentation, bad release management, very bad packaging • funny support organisation 36
  40. 40. OpenContrail: the bad, … • During the OpenContrail build process, the scons-based build-process will • download the full source of tools such as Bind and Curl • patch them wildly • put the resulting binaries into a .deb package • which will of course conflict with half a dozen Ubuntu packages 37
  41. 41. … and the ugly. https://www.flickr.com/photos/59145750@N03/5559004171 Mara Tr. (CC-BY 2.0)
  42. 42. Technology-Jenga • OpenContrail’s "Stack" • vrouter (.ko), if-map, Sandesh (XML over Thrift!) • C++, Python, node.js, irond (Java) • redis, Cassandra, Zookeeper • xmpp, BGP, MPLS • Up Next in OpenContrail 2.2: Kafka • Put that into a job profile, try to find a candidate. Is he named 39
  43. 43. Lesson #6:
 Problem not limited to Contrail scope. 40
  44. 44. Let's look what else is in the box… https://www.flickr.com/photos/markjsebastian/4217877353/ Mark Sebastian (CC BY-SA 2.0)
  45. 45. General requirements as of 2015 • Distributed Anything: • NTP, centralized Logging, centralized Monitoring • functional validation before components before join, clean cluster startup • cluster comms are Kyle-Kingsbury-proof • CA, encrypted data in flight, optionally encrypted data at rest • User-Story regarding Live-Upgrades, with Canaries 42
  46. 46. OpenStack delivers this… https://www.flickr.com/photos/z287marc/3189567558/ z287marc (CC BY 2.0)
  47. 47. A wild hack session appears… https://www.flickr.com/photos/nauright/11341288174/ Romana Klee (CC BY-SA 2.0)
  48. 48. Not an isolated problem… • Start a HEAT stack with a few networks and 20 VMs • … and the cluster switches off. • Nova does not communicate with Cinder, but has a hard timeout. • "Are we stupid or is OpenStack stupid? Let's test a few public clouds…" 45
  49. 49. Result… • Phone ringing… • "Whatever you are doing, could you please do something else?" 46 "Rotes Telefon", http://de.wikipedia.org/wiki/Datei:Jimmy_Carter_Library_and_Museum_99.JPG User:Piotrus (CC-BY 3.0)
  50. 50. Not an isolated problem…47 Implementation tested on Devstack in VMware Fusion on a MacBook Air in St. Oberholz?
  51. 51. Not an isolated problem…48 https://thespeechatimeforchoosing.files.wordpress.com/2012/12/facepalm.jpg Keystone
  52. 52. Not an isolated problem… • "Could you have a look, my Instances won't instantiate." • A longer debug session later it turns out that libvirts on cloud17 is dysfunctional. • Compute-Nodes in OpenStack join the cluster simply by being visible in AMQP (register as a consumer), no functional validation. 49
  53. 53. Not an isolated problem…50 Ceilometer
  54. 54. Not an isolated problem… • During the Juno release cycle, Horizon got patched to be „less confusing for users“ (any GNOME devs around?) • Result: Assigning floating IPs using Horizon became impossible when using OpenContrail • Lesson learned: OpenStack wants to be versatile, but in fact, most development only ever happens for the „streamline“ setups 51
  55. 55. What is the real problem? https://www.flickr.com/photos/jdhancock/8395113234 JD Hancock (CC BY 2.0)
  56. 56. „Any VM, anywhere.“ 53
  57. 57. „As a hoster/enterprise/department, I want a virtualization platform which does…“ 54
  58. 58. Problemspec vs. Produktspec • Requirements regarding • Tenant Isolation, • Billing Model, • Operations Model, • Development Model, • Scalability 55
  59. 59. Prototypes vs. Product "Prototype in the round file", https://www.flickr.com/photos/generated/3313311558 Jared Tarbell (CC-BY 2.0)
  60. 60. Stabilize the core with actual engineering
 vs.
 Moar components 57
  61. 61. “The Big Tent” https://www.flickr.com/photos/fmckinlay/2410261506 Fiona Shields (CC BY-NC-ND 2.0)
  62. 62. "The problem with the Big Tent is that it is full of clowns." https://www.flickr.com/photos/fmckinlay/2410261506 Fiona Shields (CC BY-NC-ND 2.0)
  63. 63. –Maik Zumstrull „Instead of having a functional solution I can now choose between 13 differently deficient components.“ 60
  64. 64. Democracy as a software architecture model • Product definition == target specification • Who is our customer and what do they need? • What are the mandatory/desireable/optional properties of the product to become part of the solution space instead of the problem space? 61
  65. 65. Democracy as a software architecture model • Quality Assurance • "But OpenStack does have Blueprints, CI and code review" • Without a target specification you have no idea about requirements on scale, workload types, security, … 62
  66. 66. Who actually votes on your commit…
  67. 67. Democracy as a software architecture model • Limited, learnable technology stack • One recommended and tested solution with a documented best practice for deployment instead of a plugin architecture. • Reuseable solutions for comparable problems in neighboring subprojects (“Architecture Refactoring”). 64
  68. 68. Monasca
  69. 69. How to win at Hipsterbingo… • »Monasca is an open-source multi-tenant, highly scalable, performant, fault-tolerant monitoring-as-a- service solution that integrates with OpenStack.
 
 It uses a REST API for high-speed metrics processing and querying and has a streaming alarm engine and notification engine.« 66 Bingo!
  70. 70. How to win at Hipsterbingo… • »Uses a number of underlying technologies:
 
 Apache Kafka, Apache Storm, Zookeeper, MySQL, Vagrant, Dropwizard, InfluxDB, Vertica.« 67
  71. 71. Democracy as a software architecture model • Useful vs. Hyped • “Docker is like OpenStack, but from a Dev instead of an Ops perspective. 
 
 Wouldn't it be great to unify the projects?” 68 https://twitter.com/martinisoft/status/527191803603468288
  72. 72. And finally…69
  73. 73. And finally…69
  74. 74. TL;DR please? 70
  75. 75. TL;DR • OpenStack right now: not a product, but a box of nuts and bolts. Significant assembly required. • Not all parts are actually bad, but the quality varies. A lot.
 • No other project has the momentum. • Most other projects do not have multi-tenancy, multi-node. • Many haven't even grasped the actual problem. Including Docker. 71
  76. 76. TL;DR • Deep in the Gartner hype cycle phase 2: • Users and hence real-world requirements missing. • No unified view on acceptable solutions. • Vendors think they can "win" and "own" the market or products, pushing their embrace and extend agendas. 72
  77. 77. TL;DR • Apache-like open development processes are a proven model for taming open source and commoditisation. • Compare Hadoop • Based on methods pioneered by Java 73
  78. 78. 74 ?
  79. 79. 3. Openstack DACH Tag 11. Juni 2015 Berlin http://openstack-dach.org 75

×