Fallout is an open-source testing framework based on Jepsen. In this talk we will see how distributed testing works and how to use these tools to verify Pulsar quality: how to easily deploy a reproducible Pulsar cluster on Kubernetes, and how to use ChaosMesh to inject failures. We will also cover the integrated metrics reporting tools, which are very useful for verifying the behaviour of the system across Pulsar versions and environments, especially during maintenance operations (rollout restarts/upgrades) and unexpected failures.
Distributed Tests on Pulsar with Fallout - Pulsar Summit NA 2021
1. Distributed Tests on Pulsar with
Fallout
Enrico Olivelli
DataStax - Luna Streaming Team
Apache Pulsar Committer
Member of Apache BookKeeper and Apache ZooKeeper PMC
Apache Curator VP
2. Agenda
● Testing Distributed Messaging Systems
● Introduction to Fallout
● Fallout architecture
● NoSQLBench
● Anatomy of a Fallout test
● Fallout and Pulsar
● Live Demo
● Future works
3. Testing Distributed Messaging Systems
Classic types of tests: Unit tests, Integration tests, System tests
Distributed System Tests:
● Launch N machines (or clusters!)
● Deploy applications (Helm, Unzip tarballs…)
● Run clients
● Inject failures
● Perform system wide assertions
● Create reports (performance, failures…)
● Compare reports (regression tests)
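The last step, comparing reports, can be automated. A minimal sketch (the report format here is purely hypothetical, not Fallout's actual output) that flags latency regressions between two runs:

```python
# Hypothetical sketch: flag latency regressions between two test runs.
# Assumes lower-is-better metrics (latencies); the report format is
# illustrative, not Fallout's actual output.

def find_regressions(baseline, candidate, tolerance=0.10):
    """Return metrics whose candidate value exceeds baseline by more than tolerance."""
    regressions = {}
    for metric, base_value in baseline.items():
        cand_value = candidate.get(metric)
        if cand_value is not None and cand_value > base_value * (1 + tolerance):
            regressions[metric] = (base_value, cand_value)
    return regressions

baseline = {"p50_ms": 4.0, "p99_ms": 25.0}
candidate = {"p50_ms": 4.1, "p99_ms": 31.0}
print(find_regressions(baseline, candidate))  # {'p99_ms': (25.0, 31.0)}
```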
4. Fallout
Open-source project (ASLv2 licensed), created at DataStax
https://github.com/datastax/fallout
Initially created for Apache Cassandra®; now it is a general-purpose tool.
A layer on top of Jepsen.io https://jepsen.io/
Design Repeatable Experiments with real clusters:
- Declarative language (YAML)
- Deterministic Setup-Run-Check loop
- Supports k8s natively (helm, kubectl, k8s jobs...)
- Integrated with GKE
- Monitor longevity tests
- Aggregate logs from all nodes/pods
5. NoSQLBench
Open-source project (ASLv2 licensed)
https://github.com/nosqlbench/nosqlbench
Allows you to exercise your system:
- Load generator
- Performance measurement
- Distributed execution
Supports many drivers:
- Apache Cassandra, MongoDB...
- JDBC
- Generic HTTP based services
- Messaging: Kafka, Pulsar, JMS 2.0
For every driver it tracks basic metrics (throughput, latency) as well as driver-specific metrics (like transaction commit time for Pulsar).
Integrated with Dropwizard metrics and with Graphite
6. Fallout Architecture
Key components:
- Provisioners: where to run the test
- Configuration Managers: what to run
- Providers: access to the services and information
Workload:
- Modules: actions
- Phases: execution model: concurrent, sequential
- Artifact checkers: summarize metrics, produce charts, verify logs
7. Anatomy of a Fallout test - Provisioner and ConfigurationManager
# Parameters
image:
  name: datastax/pulsar
  version: 2.6.2_1.0.0
...........
---
ensemble:
  server:
    node.count: {{cluster.numNodes}}
    provisioner:
      name: gke
    configuration_manager:
      - name: helm
        properties:
          helm.install.values.file: <<file:pulsar-values.yaml>>
          helm.repo.name: {{helmchart.reponame}}
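The values file referenced by helm.install.values.file carries the Pulsar chart configuration. A minimal excerpt might look like the following; the key names follow the Apache Pulsar Helm chart and are an assumption here, so verify them against the chart version you deploy:

```yaml
# Hypothetical excerpt of pulsar-values.yaml -- key names follow the
# Apache Pulsar Helm chart; check them against the chart version you use.
broker:
  replicaCount: 1
  configData:
    # 2-2-2 replication: ensemble size / write quorum / ack quorum
    managedLedgerDefaultEnsembleSize: "2"
    managedLedgerDefaultWriteQuorum: "2"
    managedLedgerDefaultAckQuorum: "2"
bookkeeper:
  replicaCount: 3
proxy:
  replicaCount: 1
```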
8. Anatomy of a Fallout test - Workload and Checkers
workload:
  phases:
    - create-topic:
        module: kubernetes_job
        properties:
          manifest: <<file:createtopic.yaml>>
    - produce_messages:
        module: nosqlbench
        properties:
          cycles: {{producer.nummessages}}
      consume_messages:
        module: nosqlbench
        properties:
  checkers:
    nofail:
      checker: nofail
  artifact_checkers:
    generate_chart:
      artifact_checker: hdrtool
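The createtopic.yaml manifest referenced above is an ordinary Kubernetes Job. A minimal sketch could run pulsar-admin against the cluster; the image tag, admin URL, and topic name below are illustrative assumptions, and the service name depends on the Helm release:

```yaml
# Hypothetical createtopic.yaml sketch: a Kubernetes Job that creates
# a partitioned topic via pulsar-admin. URLs and names are illustrative.
apiVersion: batch/v1
kind: Job
metadata:
  name: create-topic
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: create-topic
          image: apachepulsar/pulsar:2.7.2
          command:
            - bin/pulsar-admin
            - --admin-url
            - http://pulsar-broker:8080   # service name depends on the Helm release
            - topics
            - create-partitioned-topic
            - persistent://public/default/test-topic
            - --partitions
            - "4"
```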
9. Testing Pulsar with Fallout
Release stability validation:
- Test cluster-wide features in k8s environments (like k8s functions)
- Longevity tests
- Simulate failures: BookKeeper, ZooKeeper, Broker, Proxy
- Simulate rollout restarts
- Simulate upgrades
Benchmarks:
- Hunt for performance regressions, running tests against current ‘master’ branch
- Compare different releases (Apache Pulsar, Luna Streaming …)
- Measure a given setup (configuration + cluster size + machines), in a reproducible way
- Reproduce performance issues
10. Simulating Bookie failure with ChaosMesh
Sample scenario:
- Start a 6-node cluster on GKE
- Deploy Apache Pulsar 2.7.2 using Helm
- 1 broker
- 3 bookies
- 1 proxy
- Replication parameters: 2-2-2 (2 copies)
- Deploy a NoSQLBench pod
- Deploy ChaosMesh (using Helm)
- Create a partitioned topic
- Produce and Consume messages
- Simulate Bookie pod failure (one bookie at a time)
- Track time series for latency
- No error must be reported by Producers and Consumers
Template: https://github.com/datastax/pulsar-fallout/blob/master/benchmarks/template.yaml
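The bookie failure itself is injected by applying a ChaosMesh manifest to the cluster. A minimal pod-kill sketch could look like the following; the namespace and label selector are assumptions that depend on the Pulsar Helm chart you use:

```yaml
# Hypothetical ChaosMesh PodChaos sketch: kill one bookie pod at random.
# Namespace and labels depend on how the Pulsar Helm chart labels its pods.
apiVersion: chaos-mesh.org/v1alpha1
kind: PodChaos
metadata:
  name: kill-one-bookie
spec:
  action: pod-kill
  mode: one                     # kill a single randomly selected matching pod
  selector:
    namespaces:
      - pulsar
    labelSelectors:
      component: bookkeeper     # label depends on the Pulsar Helm chart
```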
11. Simulating Bookie failure with ChaosMesh
Live demo
Key points:
- Fallout UI
- Template
- Parameters
- System wide log aggregation
- Verify Bookie failures in the logs of the Broker
- Show latency generated graph
12. Pulsar Release Validation toolkit
Repository with sample files for basic release validation and NoSQLBench based testing:
https://github.com/datastax/pulsar-fallout
Examples for:
- Deploying Pulsar, from 2.7.0 up to your own custom Docker image
- Using the Apache Pulsar Helm Chart and the Luna Streaming Helm Chart
- Running NoSQLBench
- Using ChaosMesh for failure injection
- Creating custom configurations of Pulsar
- Running client tools (pulsar-perf, pulsar-client, pulsar-admin)
13. Future works
At DataStax we are already using Fallout for Apache Cassandra and Apache Pulsar.
Useful follow ups for the community:
- Contribute the corpus of tests to the Apache repo
- Give the community an easy way to test Apache Pulsar with real distributed system tests
- Integrate Fallout based validation for pre-release validation or PR validation
- Use Fallout Docker images to run tests on GitHub actions
Fallout and NoSQLBench are public open-source projects; everyone can contribute to and enhance these powerful tools.
14. Wrapping up
Fallout:
- Distributed system tests are hard to design and to deploy
- Manually testing a complex project is error-prone
- Fallout is a brand-new framework that makes it easy to write distributed system tests
- Reproducible
- Easy to use (YAML based, declarative style)
- NoSQLBench is the perfect companion for Fallout (but you are not required to use it)
Filling in the gaps in Pulsar testing:
- Systematically test and verify performance
- Ensure that Pulsar runs well on real-world clusters (k8s as a first-class citizen)
- Be able to reproduce real-world workloads in the lab
15. References
LinkedIn - https://www.linkedin.com/in/enrico-olivelli-984b7874/
Twitter: @eolivelli
Apache Pulsar Community: http://pulsar.apache.org/en/contact/ (Slack, ML…)
References:
Fallout - https://github.com/datastax/fallout
Pulsar Templates - https://github.com/datastax/pulsar-fallout
NoSQLBench - https://github.com/nosqlbench/nosqlbench
Great tutorial about Fallout - https://www.youtube.com/watch?v=45iTmTBjU0M (DataStax Fallout - Testing Scalable Distributed Systems with Sean McCarthy)