
Breaking Open Apache Geode: How It Works and Why


  1. Apache Geode Summit 2019 Breaking Open Apache Geode - Dan Smith, Pivotal Dan Da
  2. What is Geode?
  3. What is Geode? ● Distributed key-value store [Diagram: Client; Put (key, value); Server; Server; Server]
  4. What is Geode? ● Distributed key-value store ● Highly available [Diagram: Client; Put (key, value); Server; Server; Server]
  5. What is Geode? ● Distributed key-value store ● Highly available ● Low Latency [Diagram: Client; Put (key, value); Server; Server; < 1 ms; Whoa!]
  6. What is Geode? ● Distributed key-value store ● Highly available ● Low Latency ● Consistent and Partition Tolerant [Diagram: Client; Put (key, value); Server; Server; Oh, no! A network partition!]
  7. What is Geode? ● Two types of regions [Diagram: Client; Put (A); Replicated: three Servers, each shown with A]
  8. What is Geode? ● Two types of regions [Diagram: Client; Put (A); Replicated: three Servers, each shown with A; Partitioned: three Servers shown with A, B, A]
  9. What is Geode? ● Keys and Values are Objects (Java, C++, C#, JSON) ● Has ○ Secondary Indexes & Querying ○ Continuous Queries ○ Transactions ○ Persistence ○ WAN replication ○ Event delivery ○ Parallel functions ○ ...
  10. Components: Membership, Distributed Locks, Replicated Regions, Partitioned Regions, Function Execution, Serialization, Messaging, Persistence, Indexes, Querying, WAN Replication, Statistics
  11. Components: Membership, Distributed Locks, Replicated Regions, Partitioned Regions, Function Execution, Serialization, Messaging, Persistence, Indexes, Querying, WAN Replication, Statistics [Highlighted: Partitioned Regions]
  12. Components: Membership, Distributed Locks, Replicated Regions, Partitioned Regions, Function Execution, Serialization, Messaging, Persistence, Indexes, Querying, WAN Replication, Statistics. Partitioned Regions - Partitioning & Routing - High Availability - Consistency - Recovery and Rebalancing
  13. Partitioned Regions ● A partitioned region is divided into buckets [Diagram: Put (“Marie Tharp”, value); Bucket 0; Bucket 1; Bucket 2; Bucket 3; Bucket N] hash = “Marie Tharp”.hashCode(); bucket = hash % num_buckets
  14. Partitioned Regions ● Buckets are mapped to servers [Diagram: Put (“Marie Tharp”, value); Server 1; Server 2; Server 3; Bucket 0; Bucket 3; Bucket N; Bucket 1; Bucket 2] hash = “Marie Tharp”.hashCode(); bucket = hash % num_buckets
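
The two formulas on slides 13 and 14 can be sketched in a few lines. This is a minimal illustration and not Geode's actual code: `java_string_hash` re-implements Java's `String.hashCode()` so the result does not depend on a JVM, while `NUM_BUCKETS` and the `bucket_to_server` table are hypothetical values chosen for the example.

```python
def java_string_hash(s: str) -> int:
    """Emulate Java's String.hashCode() as a signed 32-bit integer.

    Matches Java for strings made of Basic Multilingual Plane characters.
    """
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # wrap around like a Java int
    return h - (1 << 32) if h >= (1 << 31) else h


def bucket_for(key: str, num_buckets: int) -> int:
    """bucket = hash % num_buckets, as written on the slide.

    Python's % never returns a negative result for a positive modulus.
    In Java the same expression can, so Java code has to normalize it.
    """
    return java_string_hash(key) % num_buckets


NUM_BUCKETS = 8  # hypothetical; small only to keep the example readable

# Hypothetical bucket-to-server assignment; the slides only say one exists.
bucket_to_server = {b: f"server{b % 3 + 1}" for b in range(NUM_BUCKETS)}

bucket = bucket_for("Marie Tharp", NUM_BUCKETS)
print(bucket, bucket_to_server[bucket])
```

The key always hashes to the same bucket, so only the bucket-to-server table has to change when servers come and go.
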
  15. The End
  16. What about? ● How does data get to a bucket? ● How does Geode handle failures? ● How does Geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we improve data distribution?
  17. Placing Buckets ● How does data get to a bucket? ● How does Geode handle failures? ● How does Geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we improve data distribution?
  18. Be Lazy
  19. Partitioned Regions - Lazy Creation [Diagram: Client; Put (key, value); Hash Function; Put in Bucket 2; Routing Table (empty); Server 1; Server 2; Server 3; Proxy]
  20. Partitioned Regions - Lazy Creation [Diagram: Client; Put (key, value); Hash Function; Routing Table (empty); Server 1; Server 2; Server 3; Bucket 2: key=value; Proxy; Create Bucket!]
  21. Partitioned Regions - Lazy Discovery [Diagram: Client; Routing Table (empty); Server 1; Server 2; Server 3; Bucket 2: key=value; Proxy; Reply - Bucket Metadata Changed!]
  22. Partitioned Regions - Lazy Discovery [Diagram: Client; Routing Table; Server 1; Server 2; Server 3; Bucket 2: key=value; Proxy; Get Bucket Locations]
  23. Partitioned Regions - Lazy Discovery [Diagram: Client; Put (key, value); Hash Function; Put in Bucket; Bucket 2: key=value; Routing Table; Bucket 2; Server 1; Server 2; Server 3]
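
Slides 19 to 23 describe a client that starts with an empty routing table, lets a server act as proxy for the first put, and then learns the bucket location. The toy simulation below sketches that flow under stated assumptions: the `Cluster` and `Client` classes, the round-robin placement of new buckets, and the CRC32 stand-in hash are all hypothetical and are not Geode APIs.

```python
import zlib


class Cluster:
    """Toy cluster: buckets are created on first use, not up front."""

    def __init__(self, server_names, num_buckets):
        # server -> {bucket: {key: value}}
        self.servers = {name: {} for name in server_names}
        self.num_buckets = num_buckets
        self.bucket_owner = {}  # bucket -> server, filled lazily
        self._created = 0

    def put(self, contacted, bucket, key, value):
        """Handle a put sent to `contacted`; return True if it hit the owner directly."""
        owner = self.bucket_owner.get(bucket)
        if owner is None:
            # Lazy creation. Round-robin placement is an assumption of this sketch.
            names = list(self.servers)
            owner = names[self._created % len(names)]
            self._created += 1
            self.servers[owner][bucket] = {}
            self.bucket_owner[bucket] = owner
        # If `contacted` is not the owner, it simply acts as a proxy.
        self.servers[owner][bucket][key] = value
        return owner == contacted


class Client:
    def __init__(self, cluster, default_server):
        self.cluster = cluster
        self.default_server = default_server
        self.routing_table = {}  # bucket -> server; starts empty

    def put(self, key, value):
        bucket = zlib.crc32(key.encode()) % self.cluster.num_buckets  # stand-in hash
        target = self.routing_table.get(bucket, self.default_server)
        direct = self.cluster.put(target, bucket, key, value)
        if not direct:
            # The reply says the bucket metadata changed: fetch bucket locations.
            self.routing_table = dict(self.cluster.bucket_owner)
        return target, direct


cluster = Cluster(["server3", "server1", "server2"], num_buckets=4)
client = Client(cluster, default_server="server1")
first = client.put("key", "value")    # server1 proxies the put to the new bucket
second = client.put("key", "value2")  # sent straight to the bucket's owner
print(first, second)
```

Only the first operation on a bucket pays for the extra hop; later operations use the refreshed routing table.
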
  24. High Availability ● How does data get to a bucket? ● How does Geode handle failures? ● How does Geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we improve data distribution?
  25. Duplicate Work
  26. Partitioned Regions - High Availability [Diagram: Client; Put (key, value); Hash Function; Put in Bucket; Routing Table; Bucket 2; Server 1; Server 2; Server 3; Bucket 2: key=value]
  27. Partitioned Regions - High Availability [Diagram: Client; Put (key, value); Hash Function; Put in Bucket; Routing Table; Bucket 2; Server 1; Server 2; Server 3; Bucket 2: key=value; Bucket 2: key=value]
  28. Partitioned Regions - Failover [Diagram: Client; Put (key, value); Hash Function; Put in Bucket; Bucket 2: key=value; Routing Table; Bucket 2; Server 1; Server 2; Server 3; Bucket 2: key=value]
  29. Consistency ● How does data get to a bucket? ● How does Geode handle failures? ● How does Geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we add/remove servers?
  30. Consistency - Ships Passing in the Night [Diagram: Client 1; Put (key, value1); Client 2; Put (key, value2); Server 1; Server 2; Server 3; Bucket 2: key=value1; Bucket 2: key=value2]
  31. Consistency - Ships Passing in the Night [Diagram: Client 1; Put (key, value1); Client 2; Put (key, value2); Server 1; Server 2; Server 3; Bucket 2: key=value2; Bucket 2: key=value1]
  32. Consistency ● How does data get to a bucket? ● How does Geode handle failures? ● How does Geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we improve data distribution?
  33. Wait in Line
  34. Consistency - Ships Passing in the Night [Diagram: Client 1; Put (key, value1); Client 2; Put (key, value2); Server 1; Server 2; Server 3; Bucket 2: key=value2; Bucket 2: key=value1]
  35. Consistency [Diagram: Client 1; Put (key, value1); Client 2; Put (key, value2); Server 1; Server 2; Server 3; Bucket 2: key=value2; Bucket 2: key=value2] Operations on a key are serialized on the primary
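
The "Wait in Line" idea from slides 33 to 35 can be sketched with a lock on the primary copy. This is an illustration only: `BucketCopy` and `PrimaryBucket` are hypothetical names, a thread lock stands in for whatever mechanism the servers really use, and one lock per bucket is a simplification of serializing per key.

```python
import threading


class BucketCopy:
    """A copy of a bucket held by one server."""

    def __init__(self):
        self.data = {}


class PrimaryBucket(BucketCopy):
    """The primary copy: every put queues here before reaching the other copies."""

    def __init__(self, secondaries):
        super().__init__()
        self.secondaries = secondaries
        self._lock = threading.Lock()  # one lock per bucket, for simplicity

    def put(self, key, value):
        with self._lock:
            # Apply locally, then forward in the same order to every secondary.
            self.data[key] = value
            for copy in self.secondaries:
                copy.data[key] = value


secondary = BucketCopy()
primary = PrimaryBucket([secondary])


def run_client(name, count=1000):
    for i in range(count):
        primary.put("key", f"{name}-{i}")


threads = [threading.Thread(target=run_client, args=(n,)) for n in ("client1", "client2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(primary.data == secondary.data)
```

Because both copies see the puts in the order chosen by the primary, they cannot end up holding different values for the same key, which is the failure shown on slide 31.
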
  36. Consistency - Lingering Operations [Diagram: Client; Put (key, value); Hash Function; Put in Bucket; Bucket 2: key=value; Routing Table; Bucket 2; Server 1; Server 2; Server 3; Bucket 2: key=value]
  37. Consistency - Lingering Operations [Diagram: Client; Put (key, value1); Hash Function; Routing Table; Bucket 2; Server 2; Server 3; Bucket 2: key=value; Old, lingering event (key, value, Event ID); Event Tracker (key, value, Event ID)]
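
Slide 37 shows an event tracker that recognizes an old, lingering event by its Event ID. A minimal sketch of that idea follows, assuming an event ID made of a client name, a thread number, and an increasing sequence number. The `EventId` and `TrackedBucket` classes and their fields are hypothetical illustrations, not Geode's classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventId:
    client: str    # which client issued the operation (assumed field)
    thread: int    # which thread on that client (assumed field)
    sequence: int  # increases with every operation from that thread


class TrackedBucket:
    """A bucket copy that remembers the newest event applied per source."""

    def __init__(self):
        self.data = {}
        self.last_seen = {}  # (client, thread) -> highest sequence applied

    def apply(self, event_id, key, value):
        """Apply the put unless it is an old event; return True if applied."""
        source = (event_id.client, event_id.thread)
        if event_id.sequence <= self.last_seen.get(source, -1):
            return False  # lingering duplicate or stale event: drop it
        self.last_seen[source] = event_id.sequence
        self.data[key] = value
        return True


bucket = TrackedBucket()
old = EventId("client", 1, sequence=1)
new = EventId("client", 1, sequence=2)

bucket.apply(old, "key", "value")
bucket.apply(new, "key", "value1")
applied_again = bucket.apply(old, "key", "value")  # the old event arrives late
print(applied_again, bucket.data)
```

The late copy of the first put is rejected, so the bucket keeps `value1` instead of being rolled back to `value`.
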
  38. Consistency - Network Partitions [Diagram: Client 1; Put (key, value1); Client 2; Server 1; Server 2; Bucket 2: key=value2; Bucket 2: key=value2]
  39. Consistency - Network Partitions [Diagram: Client 1; Put (key, value1); Client 2; Server 1; Server 2; Bucket 2: key=value2; Bucket 2: key=value2]
  40. Consistency - Network Partitions [Diagram: Client 1; Put (key, value1); Client 2; Put (key, value2); Server 1; Server 2; Bucket 2: key=value1; Bucket 2: key=value2]
  41. Give Up
  42. Consistency - Network Partitions [Diagram: Client 1; Put (key, value1); Client 2; Put (key, value2); Server 1; Server 2; Bucket 2: key=value2; Bucket 2: key=value2]
  43. Restoring Redundancy ● How does data get to a bucket? ● How does Geode handle failures? ● How does Geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we improve data distribution?
  44. Tell Others What to Do
  45. Partitioned Regions - Redundancy Recovery [Diagram: Server 2: Bucket 2, Redundancy Provider; Server 3: Redundancy Provider; Server 4: Redundancy Provider; Start; Start; Start]
  46. Partitioned Regions - Redundancy Recovery [Diagram: Server 2: Bucket 2, Redundancy Provider; Server 3: Redundancy Provider; Server 4: Redundancy Provider; Got a lock!]
  47. Partitioned Regions - Redundancy Recovery [Diagram: Server 2: Bucket 2, Redundancy Provider; Server 3: Redundancy Provider; Server 4: Redundancy Provider; Bucket 2; Make a copy!; Copy Bucket]
  48. Partitioned Regions - Redundancy Recovery [Diagram: Server 2: Bucket 2, Redundancy Provider; Server 3: Redundancy Provider; Server 4: Redundancy Provider; Bucket 2; Nothing to Do]
  49. Partitioned Regions - Redundancy Recovery [Diagram: Server 2: Bucket 2, Redundancy Provider; Server 3: Redundancy Provider; Server 4: Redundancy Provider; Bucket 2; Nothing to Do]
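
The redundancy recovery sequence on slides 45 to 49 (every server starts a redundancy provider, one gets the lock and orders a copy, the rest find nothing to do) can be sketched as below. The sketch makes several assumptions: a thread lock stands in for a cluster-wide distributed lock, `REDUNDANCY = 1` means two copies per bucket, and the lock holder simply picks the first server that lacks the bucket. None of the names are Geode APIs.

```python
import threading

REDUNDANCY = 1  # assumed: one redundant copy, so two copies of each bucket

all_servers = ["Server 2", "Server 3", "Server 4"]
bucket_hosts = {"Bucket 2": ["Server 2"]}  # only one copy exists at the start
recovery_lock = threading.Lock()  # stand-in for a distributed lock
log = []


def redundancy_provider(me):
    """Run on every server; only the first lock holder finds work to do."""
    with recovery_lock:
        for bucket, hosts in bucket_hosts.items():
            if len(hosts) >= REDUNDANCY + 1:
                log.append((me, "nothing to do"))
                continue
            # The lock holder decides who hosts the new copy and tells it to copy.
            candidates = [s for s in all_servers if s not in hosts]
            target = candidates[0]  # placement choice is an assumption
            hosts.append(target)    # stands in for "Copy Bucket"
            log.append((me, f"told {target} to copy {bucket}"))


threads = [threading.Thread(target=redundancy_provider, args=(s,)) for s in all_servers]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(bucket_hosts)
print(log)
```

Whichever provider wins the lock restores the missing copy; the other two take the lock afterwards, see that redundancy is satisfied, and do nothing.
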
  50. Rebalancing ● How does data get to a bucket? ● How does Geode handle failures? ● How does Geode ensure data is consistent? ● How are lost bucket copies replaced? ● How do we improve data distribution?
  51. Be Greedy (Optimizer)
  52. Rebalancing - What are we optimizing? ● Cost-based optimizer ● Minimizes the variance in bytes stored on each member ● Greedy algorithm ○ Maximize the improvement in variance per byte moved [Diagram: Server 1; Server 2; Server 3; Bucket 1; Bucket 2; Bucket 3; values 60, 0, 0; Variance: 1600]
  53. Rebalancing - What are we optimizing? ● Cost-based optimizer ● Minimizes the variance in bytes stored on each member ● Greedy algorithm ○ Maximize the improvement in variance per byte moved [Diagram: Server 1; Server 2; Server 3; Bucket 1; Bucket 2; Bucket 3; values 45, 15, 0; Variance: 1050]
  54. Rebalancing - What are we optimizing? ● Cost-based optimizer ● Minimizes the variance in bytes stored on each member ● Greedy algorithm ○ Maximize the improvement in variance per byte moved [Diagram: Server 1; Server 2; Server 3; Bucket 1; Bucket 2; Bucket 3; values 30, 15, 15; Variance: 150]
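
The greedy rule on slides 52 to 54 (pick the move with the largest improvement in variance per byte moved) can be tried on the slide's own numbers. This is a sketch under assumptions, not Geode's optimizer: the bucket sizes 30, 15, and 15 are inferred from the per-server values 60/0/0, 45/15/0, and 30/15/15, and "variance" is taken to be the sum of squared deviations from the mean. That reading reproduces the 1050 and 150 shown on slides 53 and 54, but gives 2400 rather than 1600 for the starting layout on slide 52.

```python
def spread(loads):
    """Sum of squared deviations from the mean load (assumed 'variance')."""
    mean = sum(loads) / len(loads)
    return sum((x - mean) ** 2 for x in loads)


def current_loads(servers):
    return {name: sum(buckets.values()) for name, buckets in servers.items()}


def rebalance(servers):
    """servers: {server: {bucket: size}}. Moves buckets in place.

    Returns the spread before the first move and after every move.
    """
    history = [spread(list(current_loads(servers).values()))]
    while True:
        best = None  # (gain per byte, source, target, bucket)
        for src, buckets in servers.items():
            for bucket, size in buckets.items():
                for dst in servers:
                    if dst == src:
                        continue
                    loads = current_loads(servers)
                    loads[src] -= size
                    loads[dst] += size
                    gain = (history[-1] - spread(list(loads.values()))) / size
                    if gain > 0 and (best is None or gain > best[0]):
                        best = (gain, src, dst, bucket)
        if best is None:
            return history  # no move improves the spread any further
        _, src, dst, bucket = best
        servers[dst][bucket] = servers[src].pop(bucket)
        history.append(spread(list(current_loads(servers).values())))


servers = {
    "Server 1": {"Bucket 1": 30, "Bucket 2": 15, "Bucket 3": 15},  # assumed sizes
    "Server 2": {},
    "Server 3": {},
}
history = rebalance(servers)
print(history)
print(current_loads(servers))
```

Moving a 15-byte bucket first is preferred over moving the 30-byte bucket because it buys more improvement per byte moved, which matches the order of the slides.
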
  55. Rebalancing - What does it do? Three phases: 1. Restore redundancy 2. Optimize bucket distribution 3. Optimize primary distribution. Membership changes start from phase 1 again.
  56. Putting it Together ● Start with the simple idea: Hashing ● Using - Laziness, Duplication, Bossiness and Greed ● Get ○ High Availability ○ Low Latency ○ Consistency
  57. Links ● Mailing List: dev-subscribe@geode.apache.org ● Internal Architecture: https://cwiki.apache.org/confluence/x/AolXAw
  58. Q & A
