About this webinar:
Numberly operates business-critical data pipelines and applications where failure and latency mean "lost money" in the best-case scenario. Most of their data pipelines and applications are deployed on Kubernetes and rely on Kafka and ScyllaDB, with Kafka acting as the message bus and ScyllaDB as the source of data for enrichment. The availability and latency of both systems are thus very important for data pipelines. While most of Numberly’s applications are developed using Python, they found a need to move high-performance applications to Rust in order to benefit from a lower-level programming language.
Learn the lessons from Numberly’s experience, including:
- Rationale in selecting a lower-level language
- Developing using a lower-level Rust code base
- Observability and analyzing latency impacts with Rust
- Tuning everything from Apache Avro to driver client settings
- How to build a mission-critical system combining Apache Kafka and ScyllaDB
- Feedback from half a year of Rust in production
Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline
1. Learning Rust the Hard Way for a Production Kafka + ScyllaDB Pipeline
Alexys Jacob, CTO, Numberly
2. About ScyllaDB
+ For distributed, data-intensive apps that require high performance and low latency
+ 400+ users worldwide
+ Results
+ Comcast: Reduced P99 latencies by 95%
+ FireEye: 1500% improvement in throughput
+ Discord: Reduced C* nodes from ~140 to 6
+ iFood: 9X cost reduction vs. DynamoDB
+ Open Source, Enterprise and Cloud options
+ Fully compatible with Apache Cassandra and Amazon DynamoDB
[Figure: ScyllaDB Universe of 400+ Users]
3. 400+ Companies Use ScyllaDB
+ Seamless experiences across content + devices
+ Make marketing more relevant, effective and measurable
+ Corporate fleet management
+ Real-time analytics
+ 2,000,000 SKU e-commerce management
+ Real-time location tracking for friends/family
+ Video recommendation management
+ IoT for industrial machines
+ Synchronize browser properties for millions
+ Threat intelligence service using JanusGraph
+ Real time fraud detection across 6M transactions/day
+ Uber scale, mission critical chat & messaging app
+ Network security threat detection
+ Power ~50M X1 DVRs with billions of reqs/day
+ Precision healthcare via Edison AI
+ Inventory hub for retail operations
+ Property listings and updates
+ Unified ML feature store across the business
+ Cryptocurrency exchange app
+ Geography-based recommendations
+ Distributed storage for distributed ledger tech
+ Global operations: Avon, Body Shop + more
+ Predictable performance for on sale surges
+ GPS-based exercise tracking
4. Alexys Jacob
@ultrabug
+ CTO, Numberly
+ ScyllaDB awarded Open Source & University contributor
+ Open Source author & contributor
+ Apache Avro, Apache Airflow, MongoDB, MkDocs…
+ Tech speaker & writer
+ Gentoo Linux developer
+ Python Software Foundation contributing member
5. Agenda
+ The thought process to move from Python to Rust
+ Context, promises, arguments and decision
+ Learning Rust the hard way
+ All the stack components I had to work with in Rust
+ Tips, Open Source contributions and code samples
+ What is worth it?
+ Graphs, production numbers
+ Personal notes
7. Project context at Numberly
At Numberly, we move and process (a lot of) data using Kafka streams and pipelines that are enriched using ScyllaDB.
[Diagram: raw data → processor apps (enriching from Scylla) → enriched data → client app, partner API, business app]
8. Pipeline reliability = latency + resilience
[Diagram: raw data → processor apps (enriching from Scylla) → enriched data → client app, partner API, business app]
If a processor or ScyllaDB is slow or fails, our business, partners & clients are at risk.
9. The (rusted) opportunity
A major change in our pipeline processors had to be undertaken, giving us the opportunity to redesign them entirely.
[Diagram: raw data → a single processor app (enriching from Scylla) → enriched data → client app, partner API, business app]
10. “Hey, why not rewrite those 3 Python processor apps into 1 Rust app?”
11. The (never tried before) Rust promises
A language empowering everyone to build reliable and efficient software.
+ Secure
+ Memory and thread safety as first class citizens
+ No runtime or garbage collector
+ Easy to deploy
+ Compiled binaries are self-sufficient
+ No compromises
+ Strongly and statically typed
+ Exhaustivity is mandatory
+ Built-in error management syntax and primitives
+ Plays well with Python
+ PyO3 can be used to run Rust from Python (or the other way around)
12. Efficient software != Faster software
+ “Fast” meanings vary depending on your objectives.
+ Fast to develop?
+ Fast to maintain?
+ Fast to prototype?
+ Fast to process data?
+ Fast to cover all failure cases?
“Selecting a programming language can be a form of premature optimization.”
13. Efficient software != Faster software
+ “Fast” meanings vary depending on your objectives.
+ Fast to develop? Python is way faster + did that for 15 years
+ Fast to maintain? Very few people at Numberly know Rust
+ Fast to prototype? No, code must be complete to compile and run
+ Fast to process data? Sure: to prove it, measure it
+ Fast to cover all failure cases? Definitely: mandatory exhaustivity + error handling primitives
“I did not choose Rust to be ‘faster’. Our Python code was fast enough to deliver its pipeline processing.”
14. Innovation cannot exist if you don’t accept losing time. The question is knowing when and on what project.
15. The Reliable software paradigms
+ What makes me slow will make me stronger.
+ Low level paradigms (ownership, borrowing, lifetimes).
+ Strong type safety.
+ Compilation (debug, release).
+ Dependency management.
+ Exhaustive pattern matching.
+ Error management primitives (Result).
+ Explicit return values (Option).
16. The Reliable software paradigms
+ What makes me slow will make me stronger.
+ Low level paradigms (ownership, borrowing, lifetimes). If it compiles, it’s safe
+ Strong type safety. Predictable, readable, maintainable
+ Compilation (debug, release). Compiler is very helpful vs a random Python exception
+ Dependency management. Finally something looking sane vs Python mess
+ Exhaustive pattern matching. Confidence that you’re not forgetting something
+ Error management primitives (Result). Handle failure right from the language syntax
+ Explicit return values (Option). Clear separation between Some(value) and None
“I chose Rust because it provided me with the programming paradigms at the right abstraction level that I needed to finally understand and better explain the reliability and performance of my application.”
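To make the last three paradigms concrete, here is a minimal sketch, not taken from the talk, of how Result, Option and exhaustive matching combine. The `lookup` function, its keys and the `Outcome` variants are hypothetical names invented for illustration.

```rust
/// Hypothetical failure modes of an enrichment lookup.
#[derive(Debug, PartialEq)]
enum LookupError {
    Timeout,
    Unavailable,
}

/// Hypothetical decisions the processor can take for one message.
#[derive(Debug, PartialEq)]
enum Outcome {
    Enriched(String),
    PassThrough,
    Retry,
    Drop,
}

/// Stand-in for a database read: failure is a Result, absence is an Option.
fn lookup(key: &str) -> Result<Option<String>, LookupError> {
    match key {
        "known" => Ok(Some("profile-42".to_string())),
        "slow" => Err(LookupError::Timeout),
        "down" => Err(LookupError::Unavailable),
        _ => Ok(None),
    }
}

/// Every combination of success, absence and failure must be handled:
/// removing any arm below is a compile error, not a runtime surprise.
fn decide(key: &str) -> Outcome {
    match lookup(key) {
        Ok(Some(profile)) => Outcome::Enriched(profile),
        Ok(None) => Outcome::PassThrough,
        Err(LookupError::Timeout) => Outcome::Retry,
        Err(LookupError::Unavailable) => Outcome::Drop,
    }
}
```

Adding a new `LookupError` variant later would make `decide` fail to compile until the new case is handled, which is the "confidence that you’re not forgetting something" described above.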
18. Production is not a Hello World
+ Learning the syntax and handling errors everywhere
+ Confluent Kafka + Schema Registry + Avro
+ Asynchronous latency-optimized design
+ ScyllaDB multi-datacenter
+ MongoDB
+ Kubernetes deployment
+ Prometheus exporter
+ Grafana dashboarding
+ Sentry
[Diagram: the processor app sits between Confluent Kafka and Scylla]
19. Confluent Kafka Schema Registry
+ Confluent Schema Registry breaks vanilla Apache Avro deserialization.
+ Gerard Klijs’ schema_registry_converter crate helps
+ I discovered performance problems, which we worked on together and which have since been addressed!
+ Latency-overhead-free manual approach:
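The code shown on the slide is not part of this transcript. As a hedged illustration of what a manual approach can look like, the sketch below relies on the Confluent wire format, in which each message starts with a magic byte 0 and a 4-byte big-endian schema ID, followed by the Avro-encoded body. The function and error names are hypothetical; handing the remaining bytes to an Avro decoder with a schema loaded once at startup is an assumption about the surrounding application.

```rust
/// Confluent wire format: magic byte, 4-byte big-endian schema id, Avro body.
const MAGIC_BYTE: u8 = 0;
const HEADER_LEN: usize = 5;

/// Hypothetical error type for malformed frames.
#[derive(Debug, PartialEq)]
enum FramingError {
    TooShort(usize),
    BadMagic(u8),
}

/// Splits a Kafka message payload into its schema id and raw Avro body
/// without copying and without any call to the Schema Registry.
fn split_confluent_frame(payload: &[u8]) -> Result<(u32, &[u8]), FramingError> {
    if payload.len() < HEADER_LEN {
        return Err(FramingError::TooShort(payload.len()));
    }
    if payload[0] != MAGIC_BYTE {
        return Err(FramingError::BadMagic(payload[0]));
    }
    let schema_id = u32::from_be_bytes([payload[1], payload[2], payload[3], payload[4]]);
    Ok((schema_id, &payload[HEADER_LEN..]))
}
```

Checking the returned schema ID against the one the application expects keeps the fast path safe if a producer starts publishing with a different schema.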
20. Apache Avro Rust was broken!
+ The avro-rs crate was given to Apache Avro without an appointed committer.
+ Deserialization of complex schemas was broken...
+ I contributed fixes to Apache Avro (AVRO-3232+3240)
+ Now merged thanks to Martin Grigorov!
+ Rust compiler optimizations give a hell of a boost (once Avro is fixed)
+ Deserializing Avro is faster than JSON!
21. Asynchronous patterns to optimize latency
+ Tricks to make your Kafka consumer strategy more efficient.
+ Deserialize your consumer messages on the consumer loop, not on green-threads
+ Spawning a green-thread has a performance cost
+ Control your green-thread parallelism
+ Defer to green-threads when I/O starts to be required
[Diagram: raw data → Kafka consumer + avro deserializer → one green thread per message → Scylla → enriched data]
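The shape of this pattern can be sketched with the standard library alone. This is not the talk's code: OS threads and a bounded channel stand in for green threads so that the example has no dependencies, and `decode`, `enrich` and `run_pipeline` are hypothetical names. In an async runtime the same structure would use tasks with a bounded channel or a semaphore.

```rust
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical decoded message.
#[derive(Debug)]
struct Event {
    key: u32,
}

/// CPU-only decoding: cheap enough to run inline on the consumer loop.
fn decode(raw: &[u8]) -> Option<Event> {
    if raw.len() != 4 {
        return None;
    }
    Some(Event { key: u32::from_be_bytes([raw[0], raw[1], raw[2], raw[3]]) })
}

/// Stand-in for the I/O-bound step (for example a database lookup).
fn enrich(event: Event) -> String {
    format!("enriched-{}", event.key)
}

fn run_pipeline(raw_messages: Vec<Vec<u8>>, workers: usize) -> Vec<String> {
    // Bounded queue: the consumer loop blocks once all workers are busy,
    // which caps parallelism instead of spawning one worker per message.
    let (tx, rx) = sync_channel::<Event>(workers);
    let rx = Arc::new(Mutex::new(rx));
    let results = Arc::new(Mutex::new(Vec::<String>::new()));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            let results = Arc::clone(&results);
            thread::spawn(move || loop {
                let next = rx.lock().unwrap().recv();
                match next {
                    Ok(event) => results.lock().unwrap().push(enrich(event)),
                    Err(_) => break, // sender dropped: no more work
                }
            })
        })
        .collect();

    // Consumer loop: decode inline, hand off only when I/O is required.
    for raw in &raw_messages {
        if let Some(event) = decode(raw) {
            tx.send(event).unwrap();
        }
    }
    drop(tx);
    for handle in handles {
        handle.join().unwrap();
    }

    let mut out = results.lock().unwrap().clone();
    out.sort();
    out
}
```

Messages that fail to decode never reach a worker, so the cost of handing work off is only paid for messages that actually need I/O.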
23. Scylla Rust (shard-aware) driver
+ The scylla-rust-driver crate is mature enough for production
+ Use a CachingSession to automatically cache your prepared statements
+ Beware: prepared queries are NOT paged, use paged queries with execute_iter() instead!
+ Use at least version 0.4.2 if you run a multi-DC cluster!
24. Exporting metrics properly for Prometheus
+ Effectively measuring latencies down to microseconds.
+ Fine tune your histogram buckets to match your expected latencies!
...
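As an illustration of bucket tuning (a sketch with hypothetical helper names, not the code elided above), exponential bucket bounds can start at the microsecond scale so that sub-millisecond latencies do not all fall into one bucket. Bounds are in seconds, the unit Prometheus conventionally uses for durations.

```rust
/// Builds `count` exponential upper bounds (in seconds), starting at
/// `start_seconds` and multiplying by `factor` each step.
fn latency_buckets(start_seconds: f64, factor: f64, count: usize) -> Vec<f64> {
    assert!(start_seconds > 0.0 && factor > 1.0 && count > 0);
    let mut bounds = Vec::with_capacity(count);
    let mut bound = start_seconds;
    for _ in 0..count {
        bounds.push(bound);
        bound *= factor;
    }
    bounds
}

/// Returns the index of the first bucket whose upper bound covers the
/// observation, or None when it only fits the implicit +Inf bucket.
fn bucket_for(bounds: &[f64], observed_seconds: f64) -> Option<usize> {
    bounds.iter().position(|&upper| observed_seconds <= upper)
}
```

With `latency_buckets(0.000_050, 2.0, 10)` the bounds run from 50µs to 25.6ms, so observations of 75µs, 250µs and 660µs land in three different buckets rather than sharing a single "under 1ms" one.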
25. Grafana dashboarding
+ Graph your precious metrics right!
+ ScyllaDB prepared statement cache size
+ Query and throughput rates
+ Kafka commits occurrence
+ Errors by type
+ Kubernetes pod memory
+ ...
+ Visualizing Prom Histograms
max by (environment)(histogram_quantile(0.50, processing_latency_seconds_bucket{...}))
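To see why bucket boundaries matter for this query, the sketch below is a simplified reimplementation of the estimate: `histogram_quantile` assumes a linear distribution of observations within a bucket. It is an illustration only, not Prometheus code, and it ignores edge cases such as negative bounds; the bucket counts in the example are made up.

```rust
/// Simplified quantile estimate over cumulative histogram buckets.
/// `buckets` holds (upper_bound, cumulative_count), sorted by bound,
/// with a final +Inf bucket. Returns None for unusable input.
fn histogram_quantile(q: f64, buckets: &[(f64, f64)]) -> Option<f64> {
    if !(q > 0.0 && q <= 1.0) || buckets.len() < 2 {
        return None;
    }
    let (last_bound, total) = *buckets.last()?;
    if !last_bound.is_infinite() || total <= 0.0 {
        return None;
    }
    let rank = q * total;
    // First bucket whose cumulative count reaches the target rank.
    let idx = buckets.iter().position(|&(_, count)| count >= rank)?;
    if idx == buckets.len() - 1 {
        // Target falls in +Inf: the best answer is the highest finite bound.
        return Some(buckets[buckets.len() - 2].0);
    }
    let (upper, count) = buckets[idx];
    let (lower, prev_count) = if idx == 0 { (0.0, 0.0) } else { buckets[idx - 1] };
    // Linear interpolation inside the selected bucket.
    Some(lower + (upper - lower) * (rank - prev_count) / (count - prev_count))
}
```

For cumulative counts of 10, 40, 90 and 100 at bounds 100µs, 200µs, 400µs and +Inf, the P50 estimate is 240µs: the result can only be as precise as the width of the bucket it falls in.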
27. Did I really lose time because of Rust?
+ I spent more time analyzing the latency impacts of code patterns and drivers’ options than struggling with Rust syntax.
+ Key figures for this application:
+ Kafka consumer max throughput with processing? 200K msg/s on 20 partitions
+ Avro deserialization P50 latency? 75µs
+ Scylla SELECT P50 latency on 1.5B+ rows tables? 250µs
+ Scylla INSERT P50 latency on 1.5B+ rows tables? 660µs
28. It went better than expected
+ Rust crates ecosystem is mature, similar to Python Package Index.
+ 3 Python apps totalling 54 pods replaced by 1 Rust app totalling 20 pods
+ We helped & worked on making the scylla-rust-driver even better
+ Token aware policy can fall back to non-replicas for higher availability
+ Optimized partition key calculations for prepared statements
+ More to come!
+ This feels like the most reliable and efficient software I ever wrote!
30. Brought to you by
FREE VIRTUAL EVENT | OCTOBER 19-20, 2022
The event for developers who care about
high-performance, low-latency applications.
Register at p99conf.io
Follow us on Twitter: @p99conf #p99conf
31. Thank you for joining us today.
@scylladb | slack.scylladb.com