
Scaling Uber's Elasticsearch Clusters

Video and slides synchronized; mp3 and slide download available at https://bit.ly/2H8eqwk.

Danny Yuan talks about how a three-person team scaled Uber's Elasticsearch clusters and ingestion pipelines across ingestion, queries, data storage, and operations. He covers topics like federation, query optimization, caching, failure recovery, data fidelity, the transition from a Lambda architecture to a Kappa architecture, and improvements to Elasticsearch internals. Filmed at qconlondon.com.

Danny Yuan is a software engineer at Uber. He’s currently working on streaming systems for Uber’s marketplace platform. Prior to joining Uber, he worked on building Netflix’s cloud platform. His work includes predictive autoscaling, a distributed tracing service, a real-time data pipeline that scaled to process hundreds of billions of events every day, and Netflix’s low-latency crypto services.


Scaling Uber's Elasticsearch Clusters

  1. 1. Scaling Uber’s Elasticsearch as a Geo-Temporal Database Danny Yuan @ Uber
  2. 2. InfoQ.com: News & Community Site Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ uber-elasticsearch-clusters • Over 1,000,000 software developers, architects and CTOs read the site world- wide every month • 250,000 senior developers subscribe to our weekly newsletter • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • 2 dedicated podcast channels: The InfoQ Podcast, with a focus on Architecture and The Engineering Culture Podcast, with a focus on building • 96 deep dives on innovative topics packed as downloadable emags and minibooks • Over 40 new content items per week
  3. 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon London www.qconlondon.com
  4. 4. Use Cases for a Geo-Temporal Database
  5. 5. Real-time Decisions on Global Scale
  6. 6. Dynamic Pricing: Every Hexagon, Every Minute
  7. 7. Dynamic Pricing: Every Hexagon, Every Minute
  8. 8. Metrics: how many UberXs were on a trip in the past 10 minutes
  9. 9. Metrics: how many UberXs were on a trip in the past 10 minutes
  10. 10. Market Analysis: Travel Times
  11. 11. Forecasting: Granular Forecasting of Rider Demand
  12. 12. How Can We Produce Geo-Temporal Data for Ever Changing Business Needs?
  13. 13. Key Question: What Is the Right Abstraction?
  14. 14. Abstraction: Single-Table OLAP on Geo-Temporal Data
  15. 15. Abstraction: Single-Table OLAP on Geo-Temporal Data
 SELECT <agg functions>, <dimensions>
 FROM <data_source>
 WHERE <boolean filter>
 GROUP BY <dimensions>
 HAVING <boolean filter>
 ORDER BY <sorting criteria>
 LIMIT <n>

  16. 16. Abstraction: Single-Table OLAP on Geo-Temporal Data
 SELECT <agg functions>, <dimensions>
 FROM <data_source>
 WHERE <boolean filter>
 GROUP BY <dimensions>
 HAVING <boolean filter>
 ORDER BY <sorting criteria>
 LIMIT <n>

  17. 17. Why Elasticsearch? - Arbitrary boolean query - Sub-second response time - Built-in distributed aggregation functions - High-cardinality queries - Idempotent insertion to deduplicate data - Second-level data freshness - Scales with data volume - Operable by small team
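These bullets are the whole value proposition. As a rough illustration (not from the deck), this is the shape of query they imply, sketched with the Python elasticsearch client and made-up index and field names; the slide 8 metric, UberXs on a trip in the past 10 minutes, maps naturally to a boolean filter plus nested distributed aggregations:

    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    query = {
        "size": 0,  # aggregation-only query, no raw documents returned
        "query": {
            "bool": {
                "filter": [
                    {"term": {"city_id": 42}},
                    {"term": {"vehicle_type": "uberx"}},
                    {"range": {"event_time": {"gte": "now-10m", "lte": "now"}}},
                ]
            }
        },
        "aggs": {
            "per_minute": {
                "date_histogram": {"field": "event_time", "interval": "1m"},
                "aggs": {"active_trips": {"cardinality": {"field": "trip_id"}}},
            }
        },
    }

    result = es.search(index="marketplace_supply-2018.02", body=query)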
  18. 18. Current Scale: An Important Context - Ingestion: 850K to 1.3M messages/second - Ingestion volume: 12 TB/day - Doc scans: 100M to 4B docs/second - Data size: 1 PB - Cluster size: 700 Elasticsearch machines - Ingestion pipeline: 100+ data pipeline jobs
  19. 19. Our Story of Scaling Elasticsearch
  20. 20. Three Dimensions of Scale Ingestion Query Operation
  21. 21. Driving Principles - Optimize for fast iteration - Optimize for simple operations - Optimize for automation and tools - Optimize for being reasonably fast
  22. 22. The Past: We Started Small
  23. 23. Constraints for Being Small - Three-person team - Two data centers - Small set of requirements: common analytics for machines
  24. 24. First Order of Business: Take Care of the Basics
  25. 25. Get Single-Node Right: Follow the 20-80 Rule - One table <—> multiple indices by time range - Disable _source field - Disable _all field - Use doc_values for storage - Disable analyzed field - Tune JVM parameters
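A minimal sketch of what slide 25's checklist could look like as an index definition, assuming Elasticsearch 2.x-era mapping syntax (per-type mappings, string/not_analyzed fields) and invented index and field names; the talk does not show the actual settings:

    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    index_body = {
        "settings": {
            "number_of_shards": 8,        # chosen from stress tests, not left at the default
            "refresh_interval": "30s",    # relaxed refresh rate for write-heavy load
        },
        "mappings": {
            "events": {
                "_source": {"enabled": False},   # disable the _source field
                "_all": {"enabled": False},      # disable the _all field
                "properties": {
                    "city_id":    {"type": "integer", "doc_values": True},
                    "hexagon":    {"type": "string", "index": "not_analyzed", "doc_values": True},
                    "event_time": {"type": "date", "doc_values": True},
                },
            }
        },
    }

    es.indices.create(index="marketplace_demand-2018.02", body=index_body)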
  26. 26. Make Decisions with Numbers - What’s the maximum number of recovery threads? - What’s the maximum size of request queue? - What should the refresh rate be? - How many shards should an index have? - What’s the throttling threshold? - Solution: Set up end-to-end stress testing framework
  27. 27. Deployment in Two Data Centers - Each data center has exclusive set of cities - Should tolerate failure of a single data center - Ingestion should continue to work - Querying any city should return correct results
  28. 28. Deployment in Two Data Centers: trade space for availability
  29. 29. Deployment in Two Data Centers: trade space for availability
  30. 30. Deployment in Two Data Centers: trade space for availability
  31. 31. Discretize Geo Locations: H3
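Slide 31 refers to Uber's open-source H3 library. A one-line illustration with the h3-py binding (geo_to_h3 is the v3 API name; v4 renames it latlng_to_cell), using an arbitrary location and resolution:

    import h3

    lat, lng = 51.5074, -0.1278            # example pickup location (London)
    hexagon = h3.geo_to_h3(lat, lng, 7)    # resolution controls cell size; higher = finer
    doc = {"hexagon": hexagon, "city_id": 42, "event_time": "2018-03-05T12:00:00Z"}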
  32. 32. Optimizations to Ingestion
  33. 33. Optimizations to Ingestion
  34. 34. Dealing with a Large Volume of Data - An event source produces more than 3 TB every day - Key insight: humans do not need overly granular data - Key insight: stream data usually has lots of redundancy
  35. 35. Dealing with a Large Volume of Data - Prune unnecessary fields - Devise algorithms to remove redundancy - 3 TB —> 42 GB, more than 70x reduction! - Bulk write
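A sketch of how those three ideas (pruning, redundancy removal, bulk writes) might combine, assuming the Python elasticsearch client; the field names, the dedup rule, and the driver_id:event_time document ID are illustrative, not Uber's actual pipeline:

    from elasticsearch import Elasticsearch, helpers

    es = Elasticsearch(["http://localhost:9200"])

    incoming_events = [
        {"driver_id": "d1", "hexagon": "87283472bffffff", "status": "on_trip",
         "event_time": "2018-03-05T12:00:01Z", "raw_gps": "...", "device_info": "..."},
        {"driver_id": "d1", "hexagon": "87283472bffffff", "status": "on_trip",
         "event_time": "2018-03-05T12:00:05Z", "raw_gps": "...", "device_info": "..."},
    ]

    KEEP = ("driver_id", "hexagon", "status", "event_time")   # prune everything else
    last_seen = {}                                            # driver_id -> (hexagon, status)

    def prune_and_dedup(events):
        for e in events:
            key = (e["hexagon"], e["status"])
            if last_seen.get(e["driver_id"]) == key:
                continue                                      # redundant update, drop it
            last_seen[e["driver_id"]] = key
            yield {
                "_index": "marketplace_supply-2018.02",
                "_id": e["driver_id"] + ":" + e["event_time"],  # idempotent insert
                "_source": {k: e[k] for k in KEEP},
            }

    helpers.bulk(es, prune_and_dedup(incoming_events))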
  36. 36. Data Modeling Matters
  37. 37. Example: Efficient and Reliable Join - Example: Calculate Completed/Requested ratio with two different event streams
  38. 38. Example: Efficient and Reliable Join: Use Elasticsearch - Calculate Completed/Requested ratio from two Kafka topics - Can we use a streaming join? - Can we join on the query side? - Solution: rendezvous at Elasticsearch on trip ID TripID Pickup Time Completed 1 2018-02-03T… TRUE 2 2018-02-03T… FALSE
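One way to realize the "rendezvous at Elasticsearch on trip ID" idea is to have both Kafka consumers upsert into the same document keyed by trip ID, so the ratio falls out of a single aggregation later. The sketch below assumes a recent Python elasticsearch client and invented index and field names:

    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])
    INDEX = "marketplace_trips-2018.02"

    def on_trip_requested(evt):
        # consumer of the "requested" topic; completed stays absent until that event arrives
        es.update(index=INDEX, id=evt["trip_id"],
                  body={"doc": {"pickup_time": evt["pickup_time"]},
                        "doc_as_upsert": True})

    def on_trip_completed(evt):
        # consumer of the "completed" topic; order of arrival no longer matters
        es.update(index=INDEX, id=evt["trip_id"],
                  body={"doc": {"completed": True}, "doc_as_upsert": True})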
  39. 39. Example: aggregation on state transitions
  40. 40. Optimize Querying Elasticsearch
  41. 41. Hide Query Optimization from Users - Do we really expect every user to write Elasticsearch queries? - What if someone issues a very expensive query? - Solution: Isolation with a query layer
  42. 42. Query Layer with Multiple Clusters
  43. 43. Query Layer with Multiple Clusters
  44. 44. Query Layer with Multiple Clusters - Generate efficient Elasticsearch queries - Reject expensive queries - Route queries (hardcoded at first)
  45. 45. Efficient Query Generation - “GROUP BY a, b”
  46. 46. Rejecting Expensive Queries - 10,000 hexagons / city x 1440 minutes per day x 800 cities - Cardinality: 11 Billion (!) buckets —> Out Of Memory Error
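A sketch of that guard: multiply the per-dimension cardinalities of the GROUP BY before executing and reject anything over a bucket budget. The budget here is an assumption; the cardinalities are the slide's own example:

    MAX_BUCKETS = 100_000    # assumed budget; a real threshold would come from testing

    def estimated_buckets(dimension_cardinalities):
        total = 1
        for c in dimension_cardinalities:
            total *= c
        return total

    # slide 46's example: 10,000 hexagons x 1,440 minutes x 800 cities
    if estimated_buckets([10_000, 1_440, 800]) > MAX_BUCKETS:
        raise ValueError("query rejected: aggregation would create too many buckets")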
  47. 47. Routing Queries "DEMAND": { "CLUSTERS": { "TIER0": { "CLUSTERS": ["ES_CLUSTER_TIER0"] }, "TIER2": { "CLUSTERS": ["ES_CLUSTER_TIER2"] } }, "INDEX": "MARKETPLACE_DEMAND-", "SUFFIXFORMAT": "YYYYMM.WW", "ROUTING": "PRODUCT_ID" }
  48. 48. Routing Queries "DEMAND": { "CLUSTERS": { "TIER0": { "CLUSTERS": ["ES_CLUSTER_TIER0"] }, "TIER2": { "CLUSTERS": ["ES_CLUSTER_TIER2"] } }, "INDEX": "MARKETPLACE_DEMAND-", "SUFFIXFORMAT": "YYYYMM.WW", "ROUTING": "PRODUCT_ID" }
  49. 49. Routing Queries "DEMAND": { "CLUSTERS": { "TIER0": { "CLUSTERS": ["ES_CLUSTER_TIER0"] }, "TIER2": { "CLUSTERS": ["ES_CLUSTER_TIER2"] } }, "INDEX": "MARKETPLACE_DEMAND-", "SUFFIXFORMAT": "YYYYMM.WW", "ROUTING": "PRODUCT_ID" }
  50. 50. Routing Queries "DEMAND": { "CLUSTERS": { "TIER0": { "CLUSTERS": ["ES_CLUSTER_TIER0"] }, "TIER2": { "CLUSTERS": ["ES_CLUSTER_TIER2"] } }, "INDEX": "MARKETPLACE_DEMAND-", "SUFFIXFORMAT": "YYYYMM.WW", "ROUTING": "PRODUCT_ID" }
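The lookup logic below is an assumption about how such metadata could be resolved into a target cluster and index name; only the configuration shape comes from the slides:

    ROUTING_CONFIG = {
        "DEMAND": {
            "CLUSTERS": {
                "TIER0": {"CLUSTERS": ["ES_CLUSTER_TIER0"]},
                "TIER2": {"CLUSTERS": ["ES_CLUSTER_TIER2"]},
            },
            "INDEX": "MARKETPLACE_DEMAND-",
            "SUFFIXFORMAT": "YYYYMM.WW",
            "ROUTING": "PRODUCT_ID",
        }
    }

    def resolve(table, tier, suffix):
        cfg = ROUTING_CONFIG[table]
        cluster = cfg["CLUSTERS"][tier]["CLUSTERS"][0]
        index = cfg["INDEX"] + suffix              # e.g. "MARKETPLACE_DEMAND-201803.10"
        return cluster, index, cfg["ROUTING"]

    print(resolve("DEMAND", "TIER0", "201803.10"))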
  51. 51. Summary of First Iteration
  52. 52. Evolution: Success Breeds Failures
  53. 53. Unexpected Surges
  54. 54. Applications Went Haywire
  55. 55. Solution: Distributed Rate limiting
  56. 56. Solution: Distributed Rate limiting Per-Cluster Rate Limit
  57. 57. Solution: Distributed Rate limiting Per-Instance Rate Limit
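The slides do not show the implementation; one common shape for distributed rate limiting is sketched below, with an assumed Redis-backed shared counter enforcing the per-cluster limit and a local counter enforcing the per-instance limit (limits and key names are made up):

    import time
    import redis

    r = redis.Redis()
    PER_CLUSTER_LIMIT = 5_000     # assumed queries/second across the whole cluster
    PER_INSTANCE_LIMIT = 200      # assumed queries/second for this query-layer node
    local_count, local_window = 0, 0

    def allow(cluster):
        global local_count, local_window
        window = int(time.time())                     # 1-second windows
        if window != local_window:
            local_count, local_window = 0, window
        local_count += 1
        if local_count > PER_INSTANCE_LIMIT:          # per-instance limit, no coordination needed
            return False
        key = "rl:%s:%d" % (cluster, window)
        shared = r.incr(key)                          # per-cluster limit via shared counter
        r.expire(key, 2)
        return shared <= PER_CLUSTER_LIMIT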
  58. 58. Workload Evolved - Users query months of data for modeling and complex analytics - Key insight: Data can be a little stale for long-range queries - Solution: Caching layer and delayed execution
  59. 59. Time Series Cache
  60. 60. Time Series Cache - Redis as the cache store - Cache key is based on normalized query content and time range
  61. 61. Time Series Cache - Redis as the cache store - Cache key is based on normalized query content and time range
  62. 62. Time Series Cache - Redis as the cache store - Cache key is based on normalized query content and time range
  63. 63. Time Series Cache - Redis as the cache store - Cache key is based on normalized query content and time range
  64. 64. Time Series Cache - Redis as the cache store - Cache key is based on normalized query content and time range
  65. 65. Time Series Cache - Redis as the cache store - Cache key is based on normalized query content and time range
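A sketch of that caching scheme, assuming Redis via redis-py and day-sized cache buckets; the key format, TTL, and compute_day callback are illustrative, not Uber's actual implementation:

    import hashlib
    import json
    import redis

    r = redis.Redis()

    def cache_key(query, day):
        normalized = {k: v for k, v in query.items() if k != "time_range"}  # time lives in the key
        blob = json.dumps(normalized, sort_keys=True)                       # canonical query form
        return "ts:" + hashlib.sha1(blob.encode()).hexdigest() + ":" + day

    def get_or_compute(query, days, compute_day):
        results = {}
        for day in days:
            key = cache_key(query, day)
            hit = r.get(key)
            if hit is None:                                  # miss: run this day against Elasticsearch
                hit = json.dumps(compute_day(query, day))
                r.setex(key, 24 * 3600, hit)
            results[day] = json.loads(hit)
        return results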
  66. 66. Delayed Execution - Allow registering long-running queries - Provide cached but stale data for such queries - Dedicated cluster and queued executions - Rationale: three months of data vs a few hours of staleness - Example: [-30d, 0d] —> [-30d, -1d]
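A tiny sketch of the range clamping the slide's example describes, with the one-day staleness bound taken from that example:

    from datetime import date, timedelta

    def clamp_range(start, end, staleness=timedelta(days=1)):
        # e.g. [-30d, 0d] becomes [-30d, -1d]; the fresh tail is served later or from cache
        cutoff = date.today() - staleness
        return start, min(end, cutoff)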
  67. 67. Scale Operations
  68. 68. Driving Principles - Make the system transparent - Optimize for MTTR (mean time to recover) - Strive for consistency - Automation is the most effective way to get consistency
  69. 69. Challenge: Diagnosis - Cluster slowed down with all metrics being normal - Requires additional instrumentation - ES plugin as a solution
  70. 70. Challenge: Cluster Size Becomes an Enemy - Elasticsearch cluster becomes harder to operate as its size increases - MTTR increases as cluster size increases - Multi-tenancy becomes a huge issue - Can’t have too many shards
  71. 71. Federation - 3 clusters —> many smaller clusters - Dynamic routing - Metadata-driven
  72. 72. Federation
  73. 73. Federation
  74. 74. Federation
  75. 75. Federation
  76. 76. Federation
  77. 77. Federation
  78. 78. How Can We Trust the Data?
  79. 79. Self-Serving Trust System
  80. 80. Self-Serving Trust System
  81. 81. Self-Serving Trust System
  82. 82. Self-Serving Trust System
  83. 83. Too Much Manual Maintenance Work
  84. 84. Too Much Manual Maintenance Work - Adjusting queue sizes - Restarting machines - Relocating shards
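As an example of the kind of step such automation could drive, here is shard relocation through the cluster reroute API (node and index names are placeholders, and this is not necessarily how Uber's Auto Ops works):

    from elasticsearch import Elasticsearch

    es = Elasticsearch(["http://localhost:9200"])

    # Move shard 0 of a hot index from an overloaded node to a quieter one.
    es.cluster.reroute(body={
        "commands": [{
            "move": {
                "index": "marketplace_demand-201803.10",
                "shard": 0,
                "from_node": "es-data-017",
                "to_node": "es-data-112",
            }
        }]
    })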
  85. 85. Auto Ops
  86. 86. Auto Ops
  87. 87. Ongoing Work for the Future
  88. 88. Future Work - Strong reliability - Strong consistency among replicas - Multi-tenancy
  89. 89. Summary - Three dimensions of scaling: ingestion, query, and operations - Be simple and practical: successful systems emerge from simple ones - Abstraction and data modeling matter - Invest in thorough instrumentation - Invest in automation and tools
  90. 90. Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/ uber-elasticsearch-clusters
