Testing Cassandra Guarantees under Diverse Failure Modes with Jepsen
Joel Knighton
@joelknighton
DataStax
#CassandraSummit
Who I am
Mathematician
Software hobbyist
Logic enthusiast
Former DataStax Intern
DataStax Cassandra Developer
What I Do
Deconstruct
Formalize
Communicate
Prove
Automate
How We Test #1
Unit Tests
ant test
in-tree
How We Test #2
Distributed Tests
nosetests
On GitHub – available at riptano/cassandra-dtest
Why You’re Here
Jepsen
Kyle Kingsbury (aphyr)
https://aphyr.com/tags/jepsen
What Jepsen Is
A blog series about distributed systems behavior
A talk series about distributed systems behavior
A Clojure library to test the behavior of distributed systems
A collection of tests written using those libraries
What We Hope
Jepsen
💘
Cassandra
What I Did
Jepsen Tests
lein test
On GitHub – available at riptano/jepsen
A Test Incarnate
{:name …                    ; names the results
 :os …                      ; prepares the os
 :db …                      ; configures/starts/stops the db
 :client …                  ; interacts with the db
 :generator …               ; instructions on how to interact
 :conductors {:nemesis …}   ; interacts with the environment
 :checker …}                ; looks at and assesses test run
What You Need
One machine to run the tests
+
n machines to run Cassandra
How A Test Runs
[Diagram: lein test, os, nodes n1 n2 n3 n4 n5]
How A Test Runs
[Diagram: lein test, db, nodes n1 n2 n3 n4 n5]
How A Test Runs
lein test
client 1
client 2
client 3
client 4
client 5
nemesis
n1
n2
n3
n4
n5
read
write 3
start nemesis
write 4
read
stop nemesis
write 1
cas 2 -> 3
…
How A Test Runs
lein test
checker
1 – read
2 – write 3
1 – read 0
n – start nemesis
2 – write timed-out
3 – write 4
n – started nemesis
3 – wrote 4
4 – read
4 – read 4
n – stop nemesis
0 – write 1
1 – cas 2 -> 3
n – stopped nemesis
…
valid?
Latency
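The recorded history on this slide is what the checker consumes. As a minimal sketch of how latency could be derived from such a history, assume each entry is a map with :process, :type, and :time fields (field names and timings chosen for illustration; they are not taken from the slides). The function below pairs every invocation with the next completion from the same process:

```clojure
;; Hypothetical history: invocations followed later by completions.
;; :ok is a confirmed result, :info an uncertain one (e.g. a timeout).
(def history
  [{:process 1 :type :invoke :f :read  :value nil :time 0}
   {:process 2 :type :invoke :f :write :value 3   :time 5}
   {:process 1 :type :ok     :f :read  :value 0   :time 20}
   {:process 2 :type :info   :f :write :value 3   :time 105}])

(defn latencies
  "Returns the elapsed time of each completed operation, in completion order."
  [history]
  (loop [ops history, pending {}, out []]
    (if-let [op (first ops)]
      (let [{p :process, kind :type, t :time} op
            start (get pending p)]
        (cond
          ;; Remember when this process started its operation.
          (= kind :invoke) (recur (rest ops) (assoc pending p t) out)
          ;; A completion for a pending invocation: record the duration.
          start            (recur (rest ops) (dissoc pending p) (conj out (- t start)))
          ;; Anything else (e.g. nemesis events) is skipped.
          :else            (recur (rest ops) pending out)))
      out)))

(latencies history) ; => [20 100]
```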
Single Test Deep-Dive
lein test :only cassandra.collections.set-test/cql-set-isolate-node-decommission
Single Test Name
The test name labels the timestamped folder where test results, logs, and history are stored:
cassandra cql set isolate node decommission
Single Test Nodes
[:n1 :n2 :n3 :n4 :n5]
Single Test Net
net/iptables
(drop! ;use iptables to drop packets)
(heal! ;flush iptables)
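To make the two operations concrete, here is a minimal sketch of the iptables commands a partition could boil down to. The helper names (drop-command, heal-command, isolate-plan) and the IP address are hypothetical, and the code only builds command strings and a plan; it is not Jepsen's net/iptables implementation.

```clojure
(require '[clojure.string :as str])

(defn drop-command
  "Command that makes a node discard all inbound packets from src."
  [src]
  (str/join " " ["iptables" "-A" "INPUT" "-s" src "-j" "DROP"]))

(defn heal-command
  "Command that flushes the rules in the default (filter) table."
  []
  "iptables -F")

(defn isolate-plan
  "Maps each node to the peers whose packets it should drop so that
  victim is cut off from the rest of the cluster in both directions."
  [nodes victim]
  (into {}
        (for [node nodes]
          [node (if (= node victim)
                  (vec (remove #{victim} nodes))
                  [victim])])))

(drop-command "10.0.0.2")
;; => "iptables -A INPUT -s 10.0.0.2 -j DROP"

(isolate-plan [:n1 :n2 :n3] :n2)
;; => {:n1 [:n2], :n2 [:n1 :n3], :n3 [:n2]}
```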
Single Test OS
debian/os
(setup! ;adjust hostfile
;update package manager
;install base packages like curl, iptables, etc.
;make sure network is healed)
(teardown!)
Single Test DB
cassandra.core/db
(setup! ;shutdown and wipe Cassandra if running
;install, configure, and start Cassandra)
(teardown! ;shutdown and wipe Cassandra)
(log-files ;return path to log files)
Single Test Client
cql-set-client
(setup! ;driver connect to all nodes
;create schema)
(invoke! ;add? Run CQL to add to set, handle errors
;read? Read value of CQL set, handle errors)
(teardown! ;disconnect driver)
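A minimal sketch of the invoke! step, assuming the backend is passed in as two plain functions rather than a real driver session. The names invoke-op, :add!, and :read are illustrative, as is the mapping of a timeout to an uncertain (:info) result and of any other error to a definite failure (:fail); none of this is the code of cql-set-client.

```clojure
(import 'java.util.concurrent.TimeoutException)

(defn invoke-op
  "Applies op through backend and returns the op tagged with its outcome."
  [backend op]
  (try
    (case (:f op)
      :add  (do ((:add! backend) (:value op))
                (assoc op :type :ok))
      :read (assoc op :type :ok :value ((:read backend))))
    (catch TimeoutException _
      ;; The write may or may not have been applied.
      (assoc op :type :info :error :timed-out))
    (catch Exception e
      ;; Treated here as a definite failure (an assumption of this sketch).
      (assoc op :type :fail :error (.getMessage e)))))

;; In-memory stand-in for a CQL set column.
(def store (atom #{}))
(def backend {:add! #(swap! store conj %)
              :read #(deref store)})

(invoke-op backend {:type :invoke :f :add :value 1})
;; => {:type :ok, :f :add, :value 1}
(invoke-op backend {:type :invoke :f :read :value nil})
;; => {:type :ok, :f :read, :value #{1}}
```

Keeping uncertain results separate from definite failures matters later: the checker can only call an element lost if its add was acknowledged.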
Single Test Generator
(gen/phases
  (->> (adds)
       (gen/stagger 1/10)
       (gen/delay 1/2)
       std-gen)
  (read-once))
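As a rough illustration of the two phases, the sketch below models a generator as a plain sequence of operation maps: a stream of adds, then a single final read. The functions adds, read-once, and phases are simplified stand-ins that reuse the slide's names; they ignore the timing that gen/stagger and gen/delay introduce and whatever std-gen adds, and they are not the definitions from the test suite.

```clojure
(defn adds
  "Infinite sequence of add operations with increasing values."
  []
  (map (fn [i] {:type :invoke :f :add :value i}) (range)))

(defn read-once
  "A single read operation."
  []
  [{:type :invoke :f :read :value nil}])

(defn phases
  "Exhausts each sequence of operations before moving to the next."
  [& gens]
  (apply concat gens))

;; Bound the add phase, then finish with one read.
(def ops (phases (take 3 (adds)) (read-once)))

(map :f ops) ; => (:add :add :add :read)
```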
Single Test Conductors
{:nemesis (nemesis/partition-random-node)
:decommissioner (c/decommissioner)}
What a Conductor Is
It’s just a client
Single Test Checker
checker/set
(check ;look at history of run
;find ok or uncertain adds
;compare these to final read
;return map with validity and
;ok, lost, unexpected, recovered)
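The comparison described above can be sketched with plain set arithmetic. This is a minimal illustration, assuming a history of maps with :type, :f, and :value, where :ok marks an acknowledged operation and :info an uncertain one; the exact definitions of the four buckets are an interpretation of the slide, not the source of checker/set.

```clojure
(require '[clojure.set :as set])

(defn check-set
  "Compares acknowledged and uncertain adds against the final read."
  [history]
  (let [adds-of    (fn [t]
                     (->> history
                          (filter #(and (= :add (:f %)) (= t (:type %))))
                          (map :value)
                          set))
        acked      (adds-of :ok)
        uncertain  (adds-of :info)
        possible   (set/union acked uncertain)
        final-read (->> history
                        (filter #(and (= :read (:f %)) (= :ok (:type %))))
                        last
                        :value
                        set)
        lost       (set/difference acked final-read)      ; acknowledged but missing
        unexpected (set/difference final-read possible)   ; present but never added
        recovered  (set/intersection final-read uncertain)] ; uncertain but present
    {:valid?     (and (empty? lost) (empty? unexpected))
     :ok         (set/intersection final-read possible)
     :lost       lost
     :unexpected unexpected
     :recovered  recovered}))

(check-set
  [{:type :ok   :f :add  :value 1}
   {:type :info :f :add  :value 2}   ; timed out, outcome unknown
   {:type :ok   :f :add  :value 3}
   {:type :ok   :f :read :value #{1 2}}])
;; => {:valid? false, :ok #{1 2}, :lost #{3}, :unexpected #{}, :recovered #{2}}
```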
Invariants We Test
Do CQL collections (maps, sets) merge cleanly when add-only?
Do counters merge to accurately reflect increments/decrements?
Do LWTs in a single datacenter give us linearizability?
Do materialized views converge to match the base table?
Do batch writes eventually get applied atomically?
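To show how one of these questions turns into a checkable invariant, here is a minimal sketch for the counter case. It assumes a history of :add operations whose :value is a signed delta, with :ok meaning acknowledged, :info meaning uncertain, and :fail meaning definitely not applied; the function names are hypothetical. Uncertain operations widen the range the final value may fall in instead of fixing a single expected value.

```clojure
(defn counter-bounds
  "Lowest and highest final values consistent with the history."
  [history]
  (let [deltas    (fn [t]
                    (->> history
                         (filter #(and (= :add (:f %)) (= t (:type %))))
                         (map :value)))
        definite  (reduce + (deltas :ok))
        uncertain (deltas :info)]
    ;; Uncertain decrements can only lower the value, uncertain increments only raise it.
    {:lower (+ definite (reduce + (filter neg? uncertain)))
     :upper (+ definite (reduce + (filter pos? uncertain)))}))

(defn counter-valid?
  "True when final-value lies within the bounds implied by the history."
  [history final-value]
  (let [{:keys [lower upper]} (counter-bounds history)]
    (<= lower final-value upper)))

(def history
  [{:type :ok   :f :add :value 1}
   {:type :ok   :f :add :value 2}
   {:type :info :f :add :value 5}
   {:type :info :f :add :value -1}
   {:type :fail :f :add :value 10}])

(counter-bounds history)   ; => {:lower 2, :upper 8}
(counter-valid? history 3) ; => true
(counter-valid? history 9) ; => false
```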
Failures We Consider
How does this work under a variety of network partitions?
What about with node crashes?
Even if nodes are flushing and compacting?
And when nodes are being bootstrapped?
Or decommissioned?
While clocks drift?
How We Run
Start the Docker container
Install Java driver, Cassaforte, clj-ssh, and Jepsen
Use environment variables to point to build under test
Run lein test with any desired selectors and profiles
Tunable Options
Should we make a best-effort attempt to scale test length?
Should we enable commitlog compression, the coordinator batchlog on materialized views, or hinted handoff?
Is a different compaction strategy or phi value in the failure detector appropriate for this test?
Should we install from a tagged release, a URL pointing to a tarball, or a local tarball?
Should we leave Cassandra running after the test?
What We’ve Found
Issues with counter undercounting/overcounting (#10143)
Decommission race conditions causing gossip problems (#10231)
Write durability violations when recovering commitlog (#9851)
Problems with merging of collections (#10001)
Batchlog replay failures after decommission/crash (#10068)
Incorrect asserts in counter write-path when timestamps collide
A variety of materialized view issues during development
Work We Shared
Minor Jepsen fixes/features (Jepsen PRs #58, 59, 62)
Docker images to run Jepsen tests (Docker Hub: tjake/jepsen)
Multibox Vagrant configurations to run Jepsen tests (on GitHub)
Upstream library fixes (clj-ssh PR #36)
Cassandra Jepsen tests (on GitHub)
Available on CassCI (on cassci.datastax.com)
Jepsen on CassCI
Lessons I Learned
Tests verifying invariants under failures are valuable and practical
These tests can and should be a part of regular development
Testing complex systems is hard, but there is low-hanging fruit
Jepsen provides one readily available way to accomplish this goal
Considering invariants against a recorded test run is effective
Invariants should be explicit and carefully considered in design
Thanks
Jake Luciani
DataStax
The Cassandra community
Kyle Kingsbury
QUESTIONS?
TLA+ • TLC • TLAPS • Clojure
Formal Methods • Jepsen
CRDTs • Cassandra • Gossip
Consistency Models • Alloy
Model Checking • Testing
@joelknighton
#CassandraSummit

Testing Cassandra Guarantees under Diverse Failure Modes with Jepsen

  • 1. Testing Cassandra Guarantees under Diverse Failure Modes with Jepsen Joel Knighton @joelknighton DataStax #CassandraSummit
  • 2. Who I am
      • Mathematician
      • Software hobbyist
      • Logic enthusiast
      • Former DataStax Intern
      • DataStax Cassandra Developer
  • 4. How We Test #1
      • Unit Tests
      • ant test
      • in-tree
  • 5. How We Test #2
      • Distributed Tests
      • nosetests
      • On GitHub – available at riptano/cassandra-dtest
  • 6. Why You’re Here
      • Jepsen
      • Kyle Kingsbury (aphyr)
      • https://aphyr.com/tags/jepsen
  • 7. What Jepsen Is
      • A blog series about distributed systems behavior
      • A talk series about distributed systems behavior
      • A Clojure library to test the behavior of distributed systems
      • A collection of tests written using those libraries
  • 9. What I Did
      • Jepsen Tests
      • lein test
      • On GitHub – available at riptano/jepsen
  • 10. A Test Incarnate
        {:name …                    ; names the results
         :os …                      ; prepares the os
         :db …                      ; configures/starts/stops the db
         :client …                  ; interacts with the db
         :generator …               ; instructions on how to interact
         :conductors {:nemesis …}   ; interacts with the environment
         :checker …}                ; looks at and assesses test run
  • 11. What You Need
      • One machine to run the tests + n machines to run Cassandra
  • 12. How A Test Runs
      • lein test
      • os
      • n1 n2 n3 n4 n5
  • 13. How A Test Runs
      • lein test
      • db
      • n1 n2 n3 n4 n5
  • 14. How A Test Runs
      • lein test
      • client 1, client 2, client 3, client 4, client 5, nemesis
      • n1 n2 n3 n4 n5
      • read, write 3, start nemesis, write 4, read, stop nemesis, write 1, cas 2 -> 3, …
  • 15. How A Test Runs
      • lein test
      • checker
        1 – read
        2 – write 3
        1 – read 0
        n – start nemesis
        2 – write timed-out
        3 – write 4
        n – started nemesis
        3 – wrote 4
        4 – read
        4 – read 4
        n – stop nemesis
        0 – write 1
        1 – cas 2 -> 3
        n – stopped nemesis
        …
      • valid?
      • Latency
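The history on slide 15 can be pictured as an append-only log in which each operation appears twice: once when a process invokes it, and once when it completes or times out. The following is a minimal Python sketch of that bookkeeping, not Jepsen's Clojure implementation. The `Event` record, the `record` helper, the `FakeRegister` stand-in client, and the `invoke`/`ok`/`info` labels as used here are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Event:
    process: str          # client number, or "n" for the nemesis
    type: str             # "invoke", "ok" (completed) or "info" (outcome unknown)
    f: str                # operation name, e.g. "read" or "write"
    value: Optional[Any]  # argument on invoke, result on completion


class FakeRegister:
    """Stand-in for a database client; a real client would talk to Cassandra."""

    def __init__(self) -> None:
        self.value = 0

    def apply(self, f: str, value: Any) -> Any:
        if f == "write":
            self.value = value
            return value
        if f == "read":
            return self.value
        raise ValueError(f"unknown operation {f!r}")


def record(history: List[Event], client: FakeRegister, process: str,
           f: str, value: Any = None, timeout: bool = False) -> None:
    """Append the invocation, run the operation, then append its completion."""
    history.append(Event(process, "invoke", f, value))
    if timeout:
        # A timed-out write may or may not have been applied, so it is
        # logged as indeterminate rather than as a definite failure.
        history.append(Event(process, "info", f, value))
        return
    history.append(Event(process, "ok", f, client.apply(f, value)))


history: List[Event] = []
register = FakeRegister()
record(history, register, "1", "read")
record(history, register, "2", "write", 3, timeout=True)
record(history, register, "3", "write", 4)
record(history, register, "4", "read")
```

Keeping the invocation and the completion as separate entries is what lets a checker reason about operations whose outcome is unknown, such as the timed-out write.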
  • 16. Single Test Deep-Dive
        lein test :only cassandra.collections.set-test/cql-set-isolate-node-decommission
  • 17. Single Test Name
      • Test name used to label folder where test results, logs, and history will be stored with timestamp
      • cassandra cql set isolate node decommission
  • 18. Single Test Nodes
        [:n1 :n2 :n3 :n4 :n5]
  • 19. Single Test Net
        net/iptables
        (drop!
          ; use iptables to drop packets
        )
        (heal!
          ; flush iptables
        )
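As a rough illustration of what "drop packets" and "flush iptables" on slide 19 could translate to, the Python sketch below only builds the command lines and does not run them. The exact rules Jepsen issues are not shown in the slides, so the rule shape, the helper names, and the IP addresses are assumptions.

```python
from typing import Dict, List

Command = List[str]


def drop_commands(src_ip: str) -> List[Command]:
    """Commands to run on a node so it drops all packets arriving from src_ip."""
    return [["iptables", "-A", "INPUT", "-s", src_ip, "-j", "DROP"]]


def heal_commands() -> List[Command]:
    """Commands to run on a node to flush every rule in the filter table."""
    return [["iptables", "-F"]]


def isolate(node: str, ips: Dict[str, str]) -> Dict[str, List[Command]]:
    """Plan a partition that cuts `node` off from every other node, both ways."""
    plan: Dict[str, List[Command]] = {name: [] for name in ips}
    for other in ips:
        if other == node:
            continue
        plan[node].extend(drop_commands(ips[other]))   # node ignores other
        plan[other].extend(drop_commands(ips[node]))   # other ignores node
    return plan


# Hypothetical addresses for three of the test nodes.
ips = {"n1": "10.0.0.1", "n2": "10.0.0.2", "n3": "10.0.0.3"}
plan = isolate("n1", ips)
```

Returning a plan rather than executing it keeps the sketch safe to run anywhere; a real harness would send each command to the matching node over SSH as root.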
  • 20. Single Test OS
        debian/os
        (setup!
          ; adjust hostfile
          ; update package manager
          ; install base packages like curl, iptables, etc.
          ; make sure network is healed
        )
        (teardown!)
  • 21. Single Test DB
        cassandra.core/db
        (setup!
          ; shutdown and wipe Cassandra if running
          ; install, configure, and start Cassandra
        )
        (teardown!
          ; shutdown and wipe Cassandra
        )
        (log-files
          ; return path to log files
        )
  • 22. Single Test Client
        cql-set-client
        (setup!
          ; driver connect to all nodes
          ; create schema
        )
        (invoke!
          ; add? Run CQL to add to set, handle errors
          ; read? Read value of CQL set, handle errors
        )
        (teardown!
          ; disconnect driver
        )
  • 23. Single Test Generator
        (gen/phases
          (->> (adds)
               (gen/stagger 1/10)
               (gen/delay 1/2)
               std-gen)
          (read-once))
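Read as a pipeline, the generator on slide 23 emits a stream of add operations, spaces them out in time, and then issues one final read once that phase is over. The Python sketch below models that shape only. It assumes that staggering means a random pause averaging the given interval and that delaying means a fixed extra pause. What `std-gen` does is not shown in the slides, so the `time_limit` helper is a hypothetical stand-in added so that the first phase terminates.

```python
import itertools
import random
from typing import Any, Dict, Iterable, Iterator

Op = Dict[str, Any]


def adds() -> Iterator[Op]:
    """Endless stream of add operations, each carrying a fresh integer."""
    for i in itertools.count():
        yield {"f": "add", "value": i, "pause": 0.0}


def stagger(ops: Iterable[Op], mean: float, rng: random.Random) -> Iterator[Op]:
    """Add a random pause between 0 and 2 * mean seconds to each op (assumed semantics)."""
    for op in ops:
        yield {**op, "pause": op["pause"] + rng.uniform(0, 2 * mean)}


def delay(ops: Iterable[Op], seconds: float) -> Iterator[Op]:
    """Add a fixed pause to each op (assumed semantics)."""
    for op in ops:
        yield {**op, "pause": op["pause"] + seconds}


def time_limit(ops: Iterable[Op], budget: float) -> Iterator[Op]:
    """Hypothetical helper: stop once the accumulated pauses exceed the budget."""
    elapsed = 0.0
    for op in ops:
        elapsed += op["pause"]
        if elapsed > budget:
            return
        yield op


def phases(*gens: Iterable[Op]) -> Iterator[Op]:
    """Run each generator to exhaustion before starting the next one."""
    for gen in gens:
        yield from gen


rng = random.Random(42)
add_phase = time_limit(delay(stagger(adds(), 1 / 10, rng), 1 / 2), budget=5.0)
read_once = [{"f": "read", "value": None, "pause": 0.0}]
schedule = list(phases(add_phase, read_once))
```

Ending with a single read after all adds have finished is what gives the set checker a well-defined final state to compare against.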
  • 24. Single Test Conductors
        {:nemesis        (nemesis/partition-random-node)
         :decommissioner (c/decommissioner)}
  • 25. What a Conductor Is
      • It’s just a client
  • 26. Single Test Checker
        checker/set
        (check
          ; look at history of run
          ; find ok or uncertain adds
          ; compare these to final read
          ; return map with validity and
          ; ok, lost, unexpected, recovered
        )
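The steps listed on slide 26 can be sketched with plain set arithmetic. This Python version is a simplified model rather than the `checker/set` source: the dictionary layout of history entries and the `check_set` name are assumptions, and an add is treated as "uncertain" when its completion is logged as `info`.

```python
from typing import Any, Dict, Iterable, List, Set


def check_set(history: List[Dict[str, Any]], final_read: Iterable[int]) -> Dict[str, Any]:
    """Compare acknowledged and uncertain adds against the final read."""
    def adds_of(kind: str) -> Set[int]:
        return {op["value"] for op in history
                if op["f"] == "add" and op["type"] == kind}

    attempted = adds_of("invoke")   # every add that was ever tried
    acked = adds_of("ok")           # adds the database acknowledged
    uncertain = adds_of("info")     # adds that timed out: outcome unknown
    read = set(final_read)

    lost = acked - read             # acknowledged but missing: a violation
    unexpected = read - attempted   # present but never added: a violation
    recovered = read & uncertain    # unknown outcome, yet present in the read

    return {
        "valid": not lost and not unexpected,
        "ok": acked & read,
        "lost": lost,
        "unexpected": unexpected,
        "recovered": recovered,
    }


history = [
    {"type": "invoke", "f": "add", "value": 0},
    {"type": "ok",     "f": "add", "value": 0},
    {"type": "invoke", "f": "add", "value": 1},
    {"type": "info",   "f": "add", "value": 1},   # timed out
    {"type": "invoke", "f": "add", "value": 2},
    {"type": "fail",   "f": "add", "value": 2},   # definitely not applied
]
result = check_set(history, final_read=[0, 1])
```

Only acknowledged adds that are missing, or elements nobody added, make the run invalid. An uncertain add may legitimately appear or not appear in the final read.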
  • 27. Invariants We Test
      • Do CQL collections (maps, sets) merge cleanly when add-only?
      • Do counters merge to accurately reflect increments/decrements?
      • Does LWT in a single datacenter allow us linearizability?
      • Do materialized views converge to matching the base table?
      • Do batch writes eventually get applied atomically?
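One way to make the counter invariant from slide 27 checkable is to bound the final value: it must account for every acknowledged update, while each update with an unknown outcome may or may not be included. The Python sketch below illustrates that idea under the same assumed history layout as a list of dictionaries. It is not the checker used in the Cassandra Jepsen tests, and `counter_bounds` and `check_counter` are hypothetical names.

```python
from typing import Any, Dict, List, Tuple


def counter_bounds(history: List[Dict[str, Any]]) -> Tuple[int, int]:
    """Smallest and largest final value consistent with the recorded updates."""
    acked = sum(op["value"] for op in history if op["type"] == "ok")
    uncertain = [op["value"] for op in history if op["type"] == "info"]
    # Uncertain decrements can only pull the value down, increments only up.
    lower = acked + sum(v for v in uncertain if v < 0)
    upper = acked + sum(v for v in uncertain if v > 0)
    return lower, upper


def check_counter(history: List[Dict[str, Any]], final_read: int) -> Dict[str, Any]:
    """Flag undercounting (below lower) and overcounting (above upper)."""
    lower, upper = counter_bounds(history)
    return {"valid": lower <= final_read <= upper, "lower": lower, "upper": upper}


updates = [
    {"type": "ok",   "value": 5},    # acknowledged increment
    {"type": "ok",   "value": -2},   # acknowledged decrement
    {"type": "info", "value": 3},    # timed-out increment
    {"type": "info", "value": -1},   # timed-out decrement
    {"type": "fail", "value": 10},   # definitely not applied
]
verdict = check_counter(updates, final_read=6)
```

Here the acknowledged updates sum to 3, so any final value from 2 to 6 is acceptable. A read of 7 would indicate overcounting and a read of 1 undercounting.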
  • 28. Failures We Consider
      • How does this work under a variety of network partitions?
      • What about with node crashes?
      • Even if nodes are flushing and compacting?
      • And when nodes are being bootstrapped?
      • Or decommissioned?
      • While clocks drift?
  • 29. How We Run
      • Start the Docker container
      • Install Java driver, Cassaforte, clj-ssh, and Jepsen
      • Use environment variables to point to build under test
      • Run lein test with any desired selectors and profiles
  • 30. Tunable Options
      • Should we make a best-effort attempt to scale test length?
      • Should we enable commitlog compression, the coordinator batchlog on materialized views, or hinted handoff?
      • Is a different compaction strategy or phi value in the failure detector appropriate for this test?
      • Should we install from a tagged release, a URL pointing to a tarball, or a local tarball?
      • Should we leave Cassandra running after the test?
  • 31. What We’ve Found
      • Issues with counter undercounting/overcounting (#10143)
      • Decommission race conditions causing gossip problems (#10231)
      • Write durability violations when recovering commitlog (#9851)
      • Problems with merging of collections (#10001)
      • Batchlog replay failures after decommission/crash (#10068)
      • Incorrect asserts in counter write-path when timestamps collide
      • A variety of materialized view issues during development
  • 32. Work We Shared
      • Minor Jepsen fixes/features (Jepsen PRs #58, 59, 62)
      • Docker images to run Jepsen tests (Docker Hub: tjake/jepsen)
      • Multibox Vagrant configurations to run Jepsen tests (on GitHub)
      • Upstream library fixes (clj-ssh PR #36)
      • Cassandra Jepsen tests (on GitHub)
      • Available on CassCI (on cassci.datastax.com)
  • 34. Lessons I Learned
      • Tests verifying invariants under failures are valuable and practical
      • These tests can and should be a part of regular development
      • Testing complex systems is hard, but there are low-hanging fruit
      • Jepsen provides one readily available way to accomplish this goal
      • Considering invariants against a recorded test run is effective
      • Invariants should be explicit and carefully considered in design
  • 35. Thanks
      • Jake Luciani
      • DataStax
      • The Cassandra community
      • Kyle Kingsbury
  • 36. QUESTIONS?
      • TLA+ • TLC • TLAPS • Clojure • Formal Methods • Jepsen • CRDTs • Cassandra • Gossip • Consistency Models • Alloy • Model Checking • Testing
      • @joelknighton
      • #CassandraSummit