Paris Cassandra Meetup - Cassandra for Developers
- 1. Cassandra for Developers
DataStax Drivers in Practice
Michaël Figuière
Drivers & Developer Tools Architect
@mfiguiere
- 2. Cassandra Peer to Peer Architecture
© 2014 DataStax, All Rights Reserved.
[Diagram: a cluster of six nodes]
Every node has the same role;
there is no Master or Slave.
Each node contains a replica of
some partitions of tables.
- 3. Cassandra Peer to Peer Architecture
[Diagram: a cluster of nodes, three of which hold a replica]
Each partition is stored in
several Replicas to ensure
durability and high availability
- 4. Client / Server Communication
[Diagram: four clients connected to a cluster of nodes, three of which hold a replica]
Coordinator node:
Forwards all R/W requests
to corresponding replicas
- 6. Tunable Consistency
Write and wait for an
acknowledgement from one node.
[Diagram: timeline; writing 'B' changes the replicas from A A A to B A A]
- 8. Tunable Consistency
R + W < N
Write and wait for an
acknowledgement from one node.
Read waiting for one node
to answer.
[Diagram: timeline; the replicas go from A A A to B A A]
- 9. Tunable Consistency
R + W = N
Write and wait for
acknowledgements from two nodes.
Read waiting for one node
to answer.
[Diagram: timeline; the replicas go from A A A to B B A]
- 10. Tunable Consistency
R + W > N
Write and wait for
acknowledgements from two nodes.
Read waiting for two nodes
to answer.
[Diagram: timeline; the replicas go from A A A to B B A]
- 11. Tunable Consistency
R = W = QUORUM
QUORUM = (N / 2) + 1, with N / 2 rounded down
[Diagram: timeline; the replicas go from A A A to B B A]
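The rules on the Tunable Consistency slides can be checked with a few lines of arithmetic. This is a minimal sketch, not driver code: the function names `quorum` and `reads_see_latest_write` are made up for illustration, N is the replication factor, and R and W are the numbers of replicas a read or a write waits for.

```python
def quorum(n: int) -> int:
    """Quorum size for replication factor n: n / 2 rounded down, plus 1."""
    return n // 2 + 1


def reads_see_latest_write(r: int, w: int, n: int) -> bool:
    """True when any r replicas read must overlap any w replicas written."""
    return r + w > n


n = 3
# The (R, W) pairs shown on the slides, ending with R = W = QUORUM.
for r, w in [(1, 1), (1, 2), (2, 2), (quorum(n), quorum(n))]:
    guaranteed = reads_see_latest_write(r, w, n)
    print(f"N={n} R={r} W={w} overlap guaranteed: {guaranteed}")
```

With N = 3, only the combinations where R + W > N guarantee that a read contacts at least one replica holding the latest write; QUORUM reads and writes always satisfy that condition.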
- 12. Cassandra Query Language (CQL)
• Similar to SQL, mostly a subset
• Without joins, sub-queries, and aggregations
• Primary Key contains:
• A Partition Key used to select the partition that will store the Row
• Some Clustering Columns, used to define how Rows should be grouped
and sorted on the disk
• Supports Collections
• Supports User Defined Types (UDT)
- 13. CQL: Create Table
CREATE TABLE users (
    login text,
    name text,
    age int,
    …
    PRIMARY KEY (login));
Just like in SQL!
login is the partition key: it is
hashed, and rows are spread over
the cluster in different partitions.
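How a hashed partition key ends up on a given node can be sketched as follows. This is an illustration under stated assumptions, not Cassandra's implementation: MD5 stands in for the real partitioner, and the three-node ring, its tokens, and the names `token_for` and `owner` are hypothetical.

```python
import hashlib
from bisect import bisect_left


def token_for(partition_key: str) -> int:
    """Hash a partition key to a 64-bit token (MD5 is only a stand-in here)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(ring, partition_key: str) -> str:
    """Return the node with the first token >= the key's token, wrapping around."""
    tokens = [token for token, _ in ring]
    index = bisect_left(tokens, token_for(partition_key)) % len(ring)
    return ring[index][1]


# Hypothetical ring: (token, node) pairs sorted by token, evenly spaced.
RING = [(2**64 // 3, "node1"), (2 * 2**64 // 3, "node2"), (2**64 - 1, "node3")]

for login in ["jdoe", "asmith", "mfiguiere"]:
    print(login, "->", owner(RING, login))
```

The same login always hashes to the same token, so every row for that login lands in the same partition, while different logins are scattered across the nodes.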
- 14. CQL: Clustered Table
A TimeUUID is a UUID that
can be sorted chronologically.
CREATE TABLE mailbox (
    login text,
    message_id timeuuid,
    interlocutor text,
    message text,
    PRIMARY KEY((login), message_id)
);
message_id is a clustering column:
all the rows with the same login are
grouped together and sorted by
message_id on disk.
- 15. CQL: Queries
Get message by user and message_id (date)
SELECT * FROM mailbox
WHERE login = 'jdoe'
AND message_id = '2014-09-25 16:00:00';
Get message by user and date interval
SELECT * FROM mailbox WHERE login = 'jdoe'
AND message_id <= '2014-09-25 16:00:00'
AND message_id >= '2014-09-20 16:00:00';
WHERE clauses can only constrain
primary key columns, and range queries are
not possible on the partition key.
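The reason a range query works on a clustering column but not on the partition key can be modelled with a small in-memory structure. This is a sketch only: a dict plays the role of the cluster (exact lookup by partition key), each partition is a list kept sorted by clustering key, and plain `datetime` values stand in for TimeUUIDs. The names `mailbox`, `insert` and `select_range` are invented for the example.

```python
from bisect import bisect_left, bisect_right, insort
from collections import defaultdict
from datetime import datetime

# partition key (login) -> rows as (clustering key, message), kept sorted
mailbox = defaultdict(list)


def insert(login, sent_at, message):
    """Add a row to the login's partition, preserving clustering order."""
    insort(mailbox[login], (sent_at, message))


def select_range(login, start, end):
    """Exact match on the partition key, then a range scan on the clustering key."""
    rows = mailbox.get(login, [])
    keys = [key for key, _ in rows]
    return rows[bisect_left(keys, start):bisect_right(keys, end)]


insert("jdoe", datetime(2014, 9, 26, 9, 0), "too late")
insert("jdoe", datetime(2014, 9, 21, 12, 0), "hello")
insert("jdoe", datetime(2014, 9, 19, 8, 0), "too early")
insert("asmith", datetime(2014, 9, 22, 10, 0), "other partition")

result = select_range(
    "jdoe", datetime(2014, 9, 20, 16, 0), datetime(2014, 9, 25, 16, 0)
)
print(result)
```

Because rows inside a partition are already sorted, the range scan is a contiguous slice; partitions themselves are located by hash, so there is no ordering across logins to scan over.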
- 16. CQL: Collections
CREATE TABLE users (
    login text,
    name text,
    age int,
    friends set<text>,
    hobbies list<text>,
    languages map<int, text>,
    …
    PRIMARY KEY (login)
);
set and list have semantics
similar to those in Java.
It’s not possible to use nested
collections… yet
- 17. Cassandra 2.1: User Defined Type (UDT)
CREATE TABLE users (
    login text,
    …
    street_number int,
    street_name text,
    postcode int,
    country text,
    …
    PRIMARY KEY(login));
CREATE TYPE address (
    street_number int,
    street_name text,
    postcode int,
    country text
);
CREATE TABLE users (
    login text,
    …
    location frozen<address>,
    …
    PRIMARY KEY(login)
);
- 18. Cassandra 2.1: UDT Insert / Update
INSERT INTO users (login, name, location)
VALUES ('jdoe', 'John DOE',
    {
        street_number: 124,
        street_name: 'Congress Avenue',
        postcode: 95054,
        country: 'USA'
    });
UPDATE users SET location =
    {
        street_number: 125,
        street_name: 'Congress Avenue',
        postcode: 95054,
        country: 'USA'
    }
WHERE login = 'jdoe';
- 19. Client / Server Communication
[Diagram: four clients connected to a cluster of nodes, three of which hold a replica]
Coordinator node:
Forwards all R/W requests
to corresponding replicas
- 20. Request Pipelining
[Diagram: exchanges between a client and Cassandra, without request pipelining and with request pipelining]
- 21. Notifications
[Diagram: a client and three nodes, without notifications and with notifications]
- 24. Failover
[Diagram: three client threads send requests through the driver to the nodes; steps numbered 1 to 7 illustrate failover]
- 25. DataStax Drivers Highlights
• Asynchronous architecture using non-blocking I/O
• Prepared Statements Support
• Automatic Failover
• Node Discovery
• Tunable Load Balancing
• Round Robin, Latency Awareness, Multi Data Centers, Replica Awareness
• Cassandra Tracing Support
• Compression & SSL
- 26. DataCenter Aware Balancing
[Diagram: clients and nodes spread across Datacenter A and Datacenter B]
Local nodes are queried
first; if none are available,
the request can be
sent to a remote node.
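The ordering described on this slide, local nodes first and remote nodes only as a fallback, can be sketched as a function that builds the list of nodes to try. This is an illustration, not the driver's load balancing policy: the node dictionaries, the `up` flag and the name `dc_aware_plan` are assumptions made for the example.

```python
def dc_aware_plan(nodes, local_dc):
    """Order live nodes so that local ones are tried before remote ones."""
    live = [node for node in nodes if node["up"]]
    local = [node["name"] for node in live if node["dc"] == local_dc]
    remote = [node["name"] for node in live if node["dc"] != local_dc]
    return local + remote


# Hypothetical cluster state: one node of datacenter A is down.
NODES = [
    {"name": "a1", "dc": "A", "up": True},
    {"name": "b1", "dc": "B", "up": True},
    {"name": "a2", "dc": "A", "up": False},
    {"name": "b2", "dc": "B", "up": True},
]

print(dc_aware_plan(NODES, "A"))
```

A client in datacenter A tries `a1` first and falls back to the nodes of datacenter B only once the local candidates are exhausted.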
- 27. Token Aware Balancing
Nodes that own a Replica
of the partition key being read or
written by the query will
be contacted first.
[Diagram: a client and a cluster of nodes, three of which hold a replica]
Partition Key will be
inferred from Prepared
Statements metadata.
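Token-aware balancing amounts to reordering the candidate nodes so that replicas of the targeted partition come first. The sketch below assumes the set of replicas for the partition key is already known; the node names and the function `token_aware_plan` are hypothetical and do not reflect the driver's API.

```python
def token_aware_plan(nodes, replicas):
    """Put nodes that own a replica first, keeping the original order otherwise."""
    replica_set = set(replicas)
    owners = [node for node in nodes if node in replica_set]
    others = [node for node in nodes if node not in replica_set]
    return owners + others


# Hypothetical six-node cluster; three nodes own a replica of the partition.
NODES = ["node1", "node2", "node3", "node4", "node5", "node6"]
REPLICAS = ["node2", "node4", "node5"]

print(token_aware_plan(NODES, REPLICAS))
```

Sending the request straight to a replica saves a network hop, because the coordinator no longer has to forward it to another node.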
- 28. State of DataStax Drivers
Driver     Cassandra 1.2   Cassandra 2.0   Cassandra 2.1
Java       1.0 - 2.1       2.0 - 2.1       2.1
Python     1.0 - 2.1       2.0 - 2.1       2.1
C#         1.0 - 2.1       2.0 - 2.1       2.1
Node.js    1.0             1.0             Later
C++        1.0-beta4       1.0-beta4       Later
Ruby       1.0-beta3       1.0-beta3       Later
Later versions of Cassandra can use earlier Drivers, but some features won’t be supported
- 29. DataStax Driver in Practice
Java
<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>2.1.0</version>
</dependency>
Python
$ pip install cassandra-driver
C#
PM> Install-Package CassandraCSharpDriver
Ruby
gem install cassandra-driver --pre
Node.js
$ npm install cassandra-driver
- 30. Connect and Write
Cluster cluster = Cluster.builder()
    .addContactPoints("10.1.2.5", "cassandra_node3")
    .build();
Session session = cluster.connect("my_keyspace");
session.execute(
    "INSERT INTO user (user_id, name, email) " +
    "VALUES (12345, 'johndoe', 'john@doe.com')"
);
The rest of the nodes will be
discovered by the driver.
A keyspace is just like a
schema in the SQL world.
- 31. Read
Session is a thread-safe
object. A singleton should
be instantiated at startup.
ResultSet resultSet = session.execute(
    "SELECT * FROM user WHERE user_id IN (1,8,13)"
);
List<Row> rows = resultSet.all();
for (Row row : rows) {
    int userId = row.getInt("user_id"); // assumes user_id is an int column
    String name = row.getString("name");
    String email = row.getString("email");
}
Actually ResultSet also
implements Iterable<Row>.
- 32. Write with Prepared Statements
PreparedStatement objects
are also thread-safe; just create
a singleton at startup.
PreparedStatement insertUser = session.prepare(
    "INSERT INTO user (user_id, name, email) " +
    "VALUES (?, ?, ?)"
);
// setConsistencyLevel() returns a Statement
Statement statement = insertUser
    .bind(12345, "johndoe", "john@doe.com")
    .setConsistencyLevel(ConsistencyLevel.QUORUM);
session.execute(statement);
Parameters can
be named as well.
BoundStatement
is a stateful, NON
thread-safe object.
Consistency Level can be
set for each statement.
- 33. Asynchronous Read
ResultSetFuture future = session.executeAsync(
    "SELECT * FROM user WHERE user_id IN (1,2,3)"
);
ResultSet resultSet = future.get();
List<Row> rows = resultSet.all();
for (Row row : rows) {
    int userId = row.getInt("user_id"); // assumes user_id is an int column
    String name = row.getString("name");
    String email = row.getString("email");
}
executeAsync() will not block and
returns immediately, unless all
the connections are busy.
future.get() will block until
the result is available.
- 34. Asynchronous Read with Callbacks
ResultSetFuture future = session.executeAsync(
    "SELECT * FROM user WHERE user_id IN (1,2,3)"
);
future.addListener(new Runnable() {
    public void run() {
        // Process the results here
    }
}, executor);
ResultSetFuture
implements Guava’s
ListenableFuture.
executor =
    Executors
        .newCachedThreadPool();
…Or any thread pool that
you prefer.
executor =
    MoreExecutors
        .sameThreadExecutor();
Only if your listener code
is trivial and non-blocking,
as it’ll be executed in the
IO Thread.
- 35. Query Builder
A static import of
QueryBuilder is required in
order to use the DSL.
import static
    com.datastax.driver.core.querybuilder.QueryBuilder.*;
Statement selectAll =
    select().all().from("user").where(eq("user_id", userId));
session.execute(selectAll);
Statement insert = insertInto("user")
    .value("user_id", 2)
    .value("name", "johndoe")
    .value("email", "john@doe.com");
session.execute(insert);
- 36. Python
import logging
from cassandra.cluster import Cluster

log = logging.getLogger(__name__)

cluster = Cluster(['10.1.1.3', '10.1.1.4', '10.1.1.5'])
session = cluster.connect('mykeyspace')

def handle_success(rows):
    user = rows[0]
    try:
        # process_user is application code, not part of the driver
        process_user(user.name, user.age, user.id)
    except Exception:
        log.error("Failed to process user %s", user.id)
        # don't re-raise errors in the callback

def handle_error(exception):
    log.error("Failed to fetch user info: %s", exception)

future = session.execute_async("SELECT * FROM users WHERE user_id=3")
future.add_callbacks(handle_success, handle_error)
It’s also possible to retrieve
the result from the future
object synchronously.
- 37. C#
var cluster = Cluster.Builder()
    .AddContactPoints("host1", "host2", "host3")
    .Build();
var session = cluster.Connect("sample_keyspace");
var task = session.ExecuteAsync(statement);
task.ContinueWith((t) =>
{
    var rs = t.Result;
    foreach (var row in rs)
    {
        //Get the values from each row
    }
}, TaskContinuationOptions.OnlyOnRanToCompletion);
Asynchronously
execute a query
using the TPL.
- 38. C / C++
/* Adjacent string literals are concatenated by the compiler */
CassString query = cass_string_init("SELECT keyspace_name "
                                    "FROM system.schema_keyspaces;");
CassStatement* statement = cass_statement_new(query, 0);
CassFuture* result_future = cass_session_execute(session, statement);
if (cass_future_error_code(result_future) == CASS_OK) {
    const CassResult* result = cass_future_get_result(result_future);
    CassIterator* rows = cass_iterator_from_result(result);
    while (cass_iterator_next(rows)) {
        // Process results
    }
    cass_result_free(result);
    cass_iterator_free(rows);
}
cass_future_free(result_future);
Each structure must
be freed with the
appropriate function.
- 39. Node.js
var assert = require('assert');
var cassandra = require('cassandra-driver');
var client = new cassandra.Client({
    contactPoints: ['host1', 'h2'],
    keyspace: 'ks1'
});
var query =
    'SELECT email, last_name FROM user_profiles WHERE key=?';
client.execute(query, ['guy'], function(err, result) {
    assert.ifError(err);
    console.log('got user profile with email ' +
        result.rows[0].email);
});
Here we’re using a
Parameterized Statement,
which is not prepared, but
still allows parameters.
- 40. Ruby
require 'cassandra'

cluster = Cassandra.cluster
session = cluster.connect('system')
future = session.execute_async('SELECT * FROM schema_columnfamilies')
future.on_success do |rows|
  rows.each do |row|
    puts "The keyspace #{row['keyspace_name']} has a table called " \
         "#{row['columnfamily_name']}"
  end
end
future.join
Register a listener on the
future, which will be called
when results are available.
- 41. Object Mapper
• Avoid boilerplate for common use cases
• Map Objects to Statements and ResultSets to Objects
• Do NOT hide Cassandra from the developer
• No “clever tricks” à la Hibernate
• Not JPA compatible, but JPA-ish API
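What "Map Objects to Statements and ResultSets to Objects" means in practice can be shown with a toy mapper. This is a sketch in plain Python and is not the DataStax object mapper: the `User` fields, the `users` table name and the helpers `insert_statement` and `from_row` are invented for illustration, and rows are modelled as dictionaries.

```python
from dataclasses import astuple, dataclass, fields


@dataclass
class User:
    email: str
    name: str


def insert_statement(table, obj):
    """Object -> statement: build a parameterized INSERT and its bound values."""
    columns = [field.name for field in fields(obj)]
    placeholders = ", ".join("?" for _ in columns)
    query = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    return query, astuple(obj)


def from_row(cls, row):
    """Row -> object: pick the mapped columns out of a result row."""
    return cls(**{field.name: row[field.name] for field in fields(cls)})


query, values = insert_statement("users", User("xyz@example.com", "John"))
print(query, values)
print(from_row(User, {"email": "xyz@example.com", "name": "John"}))
```

The generated CQL stays visible to the caller, in line with the goal of removing boilerplate without hiding Cassandra from the developer.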
- 42. Object Mapper in Practice
<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-mapping</artifactId>
    <version>2.1.0</version>
</dependency>
Additional artifact for
object mapping.
Available from Driver 2.1.0.
- 43. Basic Object Mapping
CREATE TYPE address (
    street text,
    city text,
    zip int
);
CREATE TABLE users (
    email text PRIMARY KEY,
    address address
);
@UDT(keyspace = "ks", name = "address")
public class Address {
    private String street;
    private String city;
    private int zip;
    // getters and setters omitted...
}
@Table(keyspace = "ks", name = "users")
public class User {
    @PartitionKey
    private String email;
    private Address address;
    // getters and setters omitted...
}
- 44. Basic Object Mapping
MappingManager manager = new MappingManager(session);
Mapper<User> mapper = manager.mapper(User.class);
User myProfile = mapper.get("xyz@example.com");
ListenableFuture saveFuture = mapper.saveAsync(anotherProfile);
mapper.delete("xyz@example.com");
Mapper, just like Session, is
a thread-safe object. Create
a singleton at startup.
get() returns a mapped row
for the given Primary Key.
ListenableFuture from
Guava. Completed when the
write is acknowledged.
- 45. Accessors
@Accessor
interface UserAccessor {
    @Query("SELECT * FROM user_profiles LIMIT :max")
    Result<User> firstN(@Param("max") int limit);
}
UserAccessor accessor = manager.createAccessor(UserAccessor.class);
Result<User> users = accessor.firstN(10);
for (User user : users) {
    System.out.println(
        user.getAddress().getZip()
    );
}
Result is like ResultSet
but specialized for a
mapped class…
…so we iterate over it
just like we would with a
ResultSet.
- 46. We’re Hiring!
Cassandra Tech Day - Paris
November 4th
Cassandra Summit Europe - London
December 3-4th
@mfiguiere