
Building a 100% ScyllaDB Shard-Aware Application Using Rust

At Numberly we designed an entire data processing application on ScyllaDB's low-level internal sharding using Rust.

Starting from what seemed like a crazy idea, our application design actually delivers strong guarantees: idempotence and distributed, predictable data processing, with virtually infinite scalability thanks to ScyllaDB.

Having ScyllaDB as our only backend, we managed to reduce operational costs while benefiting from core architectural paradigms like:
- predictable data distribution and processing capacity
- idempotence by leveraging deterministic data sharding
- optimized data manipulation using consistent shard-aware partition keys
- virtually infinite scaling alongside ScyllaDB

This talk will walk you through this amazing experience. We will share our thought process, the roadblocks we overcame and the numerous contributions we made to ScyllaDB to reach our goal in production.

Guaranteed 100% made with love in Paris using Scylla and Rust!


Building a 100% ScyllaDB Shard-Aware Application Using Rust

  1. Building a 100% ScyllaDB Shard-Aware Application Using Rust (Alexys Jacob, Joseph Perez, Yassir Barchi)
  2. Speakers: Alexys (CTO), Joseph (Senior Software Engineer), Yassir (Lead Software Engineer)
  3. The Path to a 100% Shard-Aware Application
  4. Project Context at Numberly: the Omnichannel Delivery team was tasked with building a platform that would be the single point of entry for all the messages Numberly operates and routes for its clients.
     ■ Clients & platforms send messages using REST API gateways (Email, SMS, WhatsApp)
     ■ Gateways render and relay the messages to the Central Message Routing Platform
     ■ The platform offers strong and consistent features for all channels: scheduling, accounting, tracing, routing
  5. Before / After (architecture diagram)
  6. Central Messaging Platform Constraints
     ■ High availability: the platform is a single point of failure for _all_ our messages, so it must be resilient.
     ■ Horizontal scalability: the ability to scale to match our message routing needs, no matter the channel.
  7. Central Messaging Platform Guarantees
     ■ Observability: expose per-channel metrics and allow per-message or per-batch tracing.
     ■ Idempotence: the platform guarantees that the same message can't be sent twice.
  8. Design Thinking & Key Concepts: we need to apply some key concepts in our design to keep up with the constraints and guarantees of our platform.
     Reliability
     - Simple: few share-(almost?)-nothing components
     - Low coupling: keep remote dependencies to a minimum
     - Coding language: performant, with explicit patterns and strict paradigms
  9. Design Thinking & Key Concepts
     Scale
     - Application layer: easy to deploy and scale, with strong resilience
     - Data bus: a high-throughput, highly resilient, horizontally scalable message bus with time- and order-preserving capabilities
     - Data querying: low latency, one-or-many query support
     Idempotence
     - Processing isolation: workload distribution should be deterministic
  10. Platform Architecture 101: considering Numberly's stack, the first go-to architecture could have been…
  11. Platform Architecture Not So 101
      ■ Reliability (HA with low coupling): relies on 3 data technologies
      ■ Scalability: easy to deploy (Kubernetes); data horizontal scaling, low-latency querying and ordered bus (ScyllaDB / Kafka / Redis)
      ■ Idempotence (deterministic workload distribution): SUM(ScyllaDB + Kafka + Redis)?!
  12. The Daring Architecture
      ■ Reliability (HA with low coupling): use only ONE data technology
      ■ Scalability: easy to deploy (Kubernetes); data horizontal scaling, low-latency querying and ordered bus (ScyllaDB)
      ■ Idempotence (deterministic workload distribution): ScyllaDB?!
  13. What if I used ScyllaDB's shard-per-core architecture inside my application?
  14. ScyllaDB Shard-Per-Core Architecture: ScyllaDB shard-per-core data distribution and deterministic processing (diagram).
  15. Using ScyllaDB's Shard-Per-Core Architecture: let's align our application with ScyllaDB's shard-per-core deterministic data distribution!
  16. The 100% Shard-Aware Application: using ScyllaDB's shard-awareness at the core of our application, we gain (see the sketch below):
      - Deterministic workload distribution
      - Highly optimized data processing capacity, aligned from the application down to the storage layer
      - Strong latency and isolation guarantees per application instance (pod)
      - Virtually infinite scale, patterned after ScyllaDB's
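
A rough Rust sketch of the token-to-shard mapping used by ScyllaDB's shard-aware drivers, to make the deterministic distribution concrete: the signed murmur3 token is biased into unsigned space, the configured number of most-significant bits is ignored, and the result is scaled into the shard range. The nr_shards and msb_ignore parameters are advertised by each node; the values and tokens used in main are purely illustrative.

```rust
/// Rough sketch of ScyllaDB's token-to-shard mapping as used by shard-aware
/// drivers: bias the signed murmur3 token into unsigned space, drop the
/// `msb_ignore` most significant bits, then scale into the shard range.
fn shard_of(token: i64, nr_shards: u32, msb_ignore: u8) -> u32 {
    let biased = (token as u64) ^ 0x8000_0000_0000_0000; // map the i64 range onto u64
    let scaled = biased << msb_ignore;                    // drop the ignored MSBs
    (((scaled as u128) * (nr_shards as u128)) >> 64) as u32
}

fn main() {
    // The same token always maps to the same shard, which is what the
    // application relies on for deterministic workload distribution.
    for token in [-9_184_987_215_511_271_411_i64, 42, 7_133_999_999_999_999_999] {
        println!("token {token} -> shard {}", shard_of(token, 8, 12));
    }
}
```

Because the mapping depends only on the partition key's token and the node's shard configuration, any component that knows both can predict which CPU core owns a given partition.
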
  17. Building a Shard-Aware Application
  18. The Language Dilemma
  19. The Language Dilemma
      ■ We need a modern language that reflects our desire to build a reliable, secure and efficient platform.
  20. The Language Dilemma
      ■ We need a modern language that reflects our desire to build a reliable, secure and efficient platform.
      ■ The shard calculation algorithm requires performant hashing capabilities and great synergy with the ScyllaDB driver.
  21. The Language Dilemma
      ■ We need a modern language that reflects our desire to build a reliable, secure and efficient platform.
      ■ The shard calculation algorithm requires performant hashing capabilities and great synergy with the ScyllaDB driver.
      reliable + secure + efficient = Rust
  22. The Language Dilemma
      ■ We need a modern language that reflects our desire to build a reliable, secure and efficient platform.
      ■ The shard calculation algorithm requires performant hashing capabilities and great synergy with the ScyllaDB driver.
      reliable + secure + efficient + = Rust
  23. Our Stack is Born
  24. Deterministic Data Ingestion (diagram: Ingester writes to the Store)
      (1) Message events: partition key (channel, customer, message id), clustering key (event date, event action)
  25. Deterministic Data Ingestion (diagram: Ingester writes to the Store)
      (1) Message events: partition key (channel, customer, message id), clustering key (event date, event action)
      (2) Shard buffer: partition key (channel, shard), clustering key (timestamp, customer, message id)
      An illustrative schema sketch follows below.
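
The two tables sketched above can be expressed as CQL; a minimal sketch follows, kept as Rust string constants so they could be fed to the driver at startup. The keyspace, table and column names are assumptions inferred from the slide, not the actual Numberly schema.

```rust
// Illustrative schema only; names are assumptions inferred from the slide.

/// Per-message event store: one partition per (channel, customer, message id),
/// ordered by (event date, event action).
const CREATE_MESSAGE_EVENTS: &str = "
    CREATE TABLE IF NOT EXISTS messaging.message_event (
        channel      text,
        customer_id  uuid,
        message_id   uuid,
        event_date   timestamp,
        event_action text,
        payload      blob,
        PRIMARY KEY ((channel, customer_id, message_id), event_date, event_action)
    )";

/// Processing buffer: one partition per (channel, shard), ordered by timestamp,
/// so each shard handler scans only its own partition and time window.
const CREATE_BUFFER: &str = "
    CREATE TABLE IF NOT EXISTS messaging.buffer (
        channel     text,
        shard       int,
        timestamp   timestamp,
        customer_id uuid,
        message_id  uuid,
        PRIMARY KEY ((channel, shard), timestamp, customer_id, message_id)
    )";
```

The key property is that the buffer's partition key embeds the shard number, so each buffer partition is read and processed by exactly one shard handler.
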
  26. Deterministic Data Processing (diagram: Ingester, Store, Scheduler and Shard (1..N) handlers)
      (1) Message events: partition key (channel, customer, message id), clustering key (event date, event action)
      (2) Shard buffer: partition key (channel, shard), clustering key (timestamp, customer, message id)
  27. Deterministic Data Processing
      (3) Poll a shard's buffer partition:
      SELECT ... FROM buffer WHERE channel = ? AND shard = 2 AND timestamp >= ? AND timestamp <= currentTimestamp() LIMIT ?
  28. Deterministic Data Processing
      (3) Poll a shard's buffer partition:
      SELECT ... FROM buffer WHERE channel = ? AND shard = 2 AND timestamp >= ? AND timestamp <= currentTimestamp() LIMIT ?
      (4) The fetched messages are handled by the owning shard handler (a driver-level sketch of this query follows below).
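
As a concrete illustration of step (3), the sketch below prepares and runs the per-shard buffer query with the scylla Rust driver. It assumes the 0.x-era driver API (SessionBuilder / prepare / execute) plus an illustrative contact point, keyspace and look-back window; it is not the production scheduler code.

```rust
use scylla::{Session, SessionBuilder};
use std::error::Error;
use std::time::{SystemTime, UNIX_EPOCH};

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    // Illustrative contact point; a shard-aware driver opens a connection per shard.
    let session: Session = SessionBuilder::new()
        .known_node("127.0.0.1:9042")
        .build()
        .await?;

    // Prepared once; executed here for shard 2 of the "email" channel.
    let select_buffer = session
        .prepare(
            "SELECT message_id FROM messaging.buffer
             WHERE channel = ? AND shard = ?
               AND timestamp >= ? AND timestamp <= currentTimestamp()
             LIMIT ?",
        )
        .await?;

    // Window start as epoch milliseconds (30 s look-back, illustrative).
    // Depending on the driver version, a dedicated CQL timestamp wrapper
    // type may be required here instead of a plain i64.
    let now_ms = SystemTime::now().duration_since(UNIX_EPOCH)?.as_millis() as i64;
    let window_start_ms = now_ms - 30_000;

    let result = session
        .execute(&select_buffer, ("email", 2_i32, window_start_ms, 1000_i32))
        .await?;
    println!("fetched {} buffered rows", result.rows.map_or(0, |r| r.len()));
    Ok(())
}
```

In the real application each shard handler would keep its prepared statement around and repeatedly scan only its own (channel, shard) partition.
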
  29. Deterministic Message Routing
      (5) The shard handler routes each message to its channel gateway (e.g. the EMAIL MTA).
  30. Could We Replace Kafka With ScyllaDB? (diagram: Store, Scheduler and Shard (1..N) handlers only)
      Sharding table: partition key (channel, shard)
      Buffer table: partition key (channel, shard), clustering key (timestamp, customer, message id)
  31. Trying To Replace Kafka With ScyllaDB
      (3.1) Read the shard's checkpoint:
      SELECT buffer_last_pull_ts FROM sharding WHERE channel = ? AND shard = 2
  32. Replacing Kafka With ScyllaDB
      (3.1) Read the shard's checkpoint:
      SELECT buffer_last_pull_ts FROM sharding WHERE channel = ? AND shard = 2
      (3.2) Scan the buffer over the window [buffer_last_pulled_ts - window_offset, currentTimestamp()]:
      SELECT ... FROM buffer WHERE channel = ? AND shard = 2 AND timestamp >= ? AND timestamp <= currentTimestamp() LIMIT ?
      A sketch of this checkpointed window logic follows below.
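
The checkpoint-plus-window logic of steps (3.1) and (3.2) fits in a few lines of Rust. This is a minimal sketch under assumed names (ShardCheckpoint, overlap_ms standing in for the slide's window_offset); in the real platform the new buffer_last_pull_ts would presumably be written back to the sharding table after each cycle.

```rust
/// Per-(channel, shard) checkpoint used to turn the buffer table into a
/// Kafka-like ordered stream. Names and values are illustrative.
struct ShardCheckpoint {
    /// `buffer_last_pull_ts` as read from the sharding table (epoch ms).
    last_pull_ms: i64,
}

impl ShardCheckpoint {
    /// Compute the next time window to scan: start a little before the last
    /// checkpoint (`overlap_ms`, the slide's `window_offset`) so slightly
    /// late writes are still picked up, and end at "now" (currentTimestamp()).
    /// Idempotent processing makes re-reading the overlap harmless.
    fn next_window(&mut self, now_ms: i64, overlap_ms: i64) -> (i64, i64) {
        let start = self.last_pull_ms - overlap_ms;
        self.last_pull_ms = now_ms; // to be persisted back as buffer_last_pull_ts
        (start, now_ms)
    }
}

fn main() {
    let mut checkpoint = ShardCheckpoint { last_pull_ms: 1_700_000_000_000 };
    // Two successive scheduler ticks, 10 seconds apart, with 2 seconds of overlap:
    for now_ms in [1_700_000_010_000_i64, 1_700_000_020_000] {
        let (start, end) = checkpoint.next_window(now_ms, 2_000);
        println!("scan buffer WHERE timestamp >= {start} AND timestamp <= {end}");
    }
}
```
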
  33. Retrospective
  34. What We Learned on the Road
      ■ Load testing is more than useful: it spotted a lot of non-trivial issues (batch execution delay, timeouts, large partitions, etc.)
      ■ Time-Window Compaction Strategy: treating message buffering as time-series processing allowed us to avoid large partitions! (an illustrative TWCS setup follows below)
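
For reference, switching a buffer-like table to TWCS looks roughly like the statement below (kept as a Rust constant for consistency with the other sketches). The window unit and size are illustrative assumptions, not Numberly's production settings.

```rust
// Illustrative only: enable Time-Window Compaction Strategy on the buffer
// table so SSTables are grouped and compacted by time bucket. The 10-minute
// window is an assumed value.
const ALTER_BUFFER_TWCS: &str = "
    ALTER TABLE messaging.buffer
    WITH compaction = {
        'class': 'TimeWindowCompactionStrategy',
        'compaction_window_unit': 'MINUTES',
        'compaction_window_size': 10
    }";
```
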
  35. What We Contributed to Make it Possible
      ■ Rust driver contributions
  36. What We Contributed to Make it Possible
      ■ Rust driver contributions
      ■ Bug discoveries
  37. What We Contributed to Make it Possible
      ■ Rust driver contributions
      ■ Bug discoveries
      ❤ ScyllaDB support
  38. What We Wish We Could Do
      ■ Long-polling for time series: our architecture implies regular fetching, but we have ideas to improve this.
      ■ A Rust driver with fewer allocations: we did encounter some memory issues and have (a lot of?) ideas to improve the Rust driver!
  39. Going Further with ScyllaDB Features
      ■ CDC Kafka source connector: use CDC to stream message events to the rest of the infrastructure without touching application code.
      ■ Replace LWT with Raft? We use LWT in a few places, e.g. dynamic shard workload attribution (see the sketch below), and can't wait to test strongly consistent tables!
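
As an illustration of the LWT usage mentioned above, a handler could claim a (channel, shard) pair with a conditional insert like the one below. The table and column names are hypothetical, not the actual production schema.

```rust
// Hypothetical LWT for dynamic shard workload attribution: the insert only
// succeeds for the first handler to claim a given (channel, shard) pair,
// thanks to the compare-and-set semantics of IF NOT EXISTS (Paxos-based
// today, hopefully Raft-backed strongly consistent tables tomorrow).
const CLAIM_SHARD: &str = "
    INSERT INTO messaging.shard_attribution (channel, shard, owner)
    VALUES (?, ?, ?)
    IF NOT EXISTS";
```
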
  40. Thank You & Stay in Touch: want to have fun with us? Reach out!
      alexys@numberly.com | joseph@numberly.com | yassir@numberly.com
      @ultrabug | numberly
