This document discusses Cassandra drivers and how to optimize queries. It begins with an introduction to Cassandra drivers and examples of basic usage in Java, Python and Ruby. It then covers the differences between synchronous and asynchronous queries. Prepared statements and consistency levels are also discussed. The document explores how consistency levels, driver policies and node outages impact performance and latency. Hinted handoff is described as a performance optimization that stores hints for missed writes on down nodes. Lastly, it provides best practices around driver usage.
2. Who am I and what do I do?
• Ben Bromhead
• Co-founder and CTO of Instaclustr -> www.instaclustr.com
• Instaclustr provides Cassandra-as-a-Service in the cloud
• Currently in AWS, Azure and IBM Softlayer
• We currently manage 400+ nodes
3. What this talk will cover
• Driver basics
• Sync vs Async
• Driver connection policies and tuning
4. The driver
• The Cassandra driver contains the logic for connecting to
Cassandra and running queries in a fast and efficient manner
• Focus on the Datastax Open Source drivers:
6. Cassandra Drivers
• All have a similar architecture that consists of:
• Session & pool management
• Chainable policies for managing failure and performance
• Sync vs Async queries
• Failover & Retry
• Tracing
7. Cassandra Drivers
A basic example in Java:
Cluster cluster = Cluster.builder()
.addContactPoints("52.89.183.67")
.withPort(9042)
.build();
Session session = cluster.newSession();
session.execute("SELECT * FROM foo…");
8. Cassandra Drivers
A basic example in Python:
cluster = Cluster(contact_points=["52.89.183.67"], port=9042)
session = cluster.connect()
rows = session.execute("SELECT name, age, email FROM users")
9. Cassandra Drivers
A basic example in Ruby:
cluster = Cassandra.cluster(
:hosts => ["52.89.183.67", "52.89.99.88", "54.69.217.141"],
:datacenter => 'AWS_VPC_US_WEST_2'
)
session = cluster.connect()
rows = session.execute("SELECT name, age, email FROM users")
10. Cassandra Drivers
• Architecture makes the driver similar across languages
• What happens under the hood?
• Cluster object creates configuration (auth, load balancing, contact
points).
• Session object holds the thread pool and manages connections.
• Session object authenticates and maintains connections.
• Session can be shared and is threadsafe!
11. Different ways of querying
• Synchronous:
session.execute("SELECT * FROM foo..”);
• Asynchronous:
ResultSetFuture result = session.executeAsync("SELECT * FROM
foo..”);
result.get();
12. How do these perform?
Operations
0
7500
15000
22500
30000
Read Sync Write Sync Read Async Write Async
Op/s
13. How do these perform?
Latency
0
20
40
60
80
Read Sync Write Sync Read Async Write Async
ms
14. Different ways of querying
Prepared Statements:
PreparedStatement statement = getSession().prepare(
"INSERT INTO simplex.songs " +
"(id, title, album, artist, tags) " +
"VALUES (?, ?, ?, ?, ?);");
15. Different ways of querying
boundStatement = new BoundStatement(statement);
getSession().execute(boundStatement.bind(
UUID.fromString("2cc9ccb7-6221-4ccb-8387-f22b6a1b354d"),
UUID.fromString("756716f7-2e54-4715-9f00-91dcbea6cf50"),
"La Petite Tonkinoise",
"Bye Bye Blackbird",
"Joséphine Baker") );
16. Drivers and consistency
• Within the different ways of querying Cassandra you can also adjust
the consistency level per query.
• Lets have a quick consistency refresh
17. A brief intro to tuneable consistency
• Cassandra is considered to be a db that favours Availability and
Partition Tolerance.
• Let’s you change those characteristics per query to suit your
application requirement
18. Two consistency levers
• Consistency level - How many acknowledgements/responses from
replicas before a query is considered a success.
• Replication Factor (RF) - How many copies of a record do I store.
19. Two consistency levers
• Consistency level - Chosen by the client at query time
• Replication Factor (RF) - Determined client on schema definition
20. Consistency Levels
• ALL - Every replica
• *QUORUM - (EACH_QUORUM, QUORUM, LOCAL_QUORUM)
• Numbered - (ONE, TWO, THREE, LOCAL_ONE)
• *SERIAL - (SERIAL, LOCAL_SERIAL)
• ANY
21. What does it all mean
• At the client level (your application) you have total control
• Define implicit and explicit failure handling
• Isolate queries to a single geography
• Trade consistency for latency (a decision is better than no
decision)
35. How does an outage impact Op/s ?
Operations
0
7500
15000
22500
30000
Read Sync Write Sync Read Async Write Async
ONE QUORUM ALL
36. How does an outage impact latency ?
Latency
0
25
50
75
100
Read Sync Write Sync Read Async Write Async
ONE QUORUM ALL
37. We are now have a replica that is not
consistent
• Anti-entropy repair (only guaranteed way to make things consistent)
• Hinted handoff
• Read repair
38. We are now have a replica that is not
consistent
• Anti-entropy repair (only guaranteed way to make things consistent)
• Hinted handoff - lets cover this quickly
• Read repair
39. What is hinted handoff ?
• A performance optimisation for “catching up” nodes who missed
writes.
47. Some things to keep in mind
• Cassandra will only store hints for a certain period of time, set by
max_hint_window_in_ms. 3 hours by default
• Hints are not a reliable delivery mechanism
• Hint replay will cause counters to overcoat
• CF of ANY will cause a hint to be stored even if no replicas are
available. Sometimes called extreme availability… also called who
knows where and if your data is safe?
48. Hinted handoff performance
• Causes the same volume of writes to occur in a cluster with
reduced capacity (local write amplification on the co-ordinator
node)
• Hints are written to system.hints, each replica has hints stored in a
single partition.
• Hints use TTLs and tombstones.. the hint table is actually a queue!
• When cassandra starts compacting or throwing tombstone
warnings on the system.hints table… things are bad
49. Hinted handoff performance
• Rewritten in Cassandra 3.0 (in beta now)
• Takes a commitlog approach:
• No compaction
• no TTL
• no tombstones
• no memtables
50. How does this relate to the driver?
• With a node outage the “latency” on the down node becomes
hours/days until it becomes consistent
• Cassandra itself takes over the client portion of ensuring the write
makes it to the node that was down.
• You can control whether C* handles this (via repair, HH etc) or
whether your application controls this (have your client receive an
exception instead).
51. Driver policies
• Cassandra driver policies allow you to control failure
• Cassandra driver policies allow you to control how the driver routes
requests
• The driver is your load balancer
• This can reduce your latency and/or increase op/s (in some cases)
55. Last but not least
• Use one Cluster instance per (physical) cluster (per application
lifetime)
• Use at most one Session per keyspace, or use a single Session and
explicitly specify the keyspace in your queries
• If you execute a statement more than once, consider using a
PreparedStatement
• You can reduce the number of network roundtrips and also have
atomic operations by using Batches