Cassandra for Developers 
DataStax Drivers in Practice 
Michaël Figuière 
Drivers & Developer Tools Architect 
@mfiguiere
Cassandra Peer to Peer Architecture 
© 2014 DataStax, All Rights Reserved. 
[Diagram: six peer nodes]
Every node has the same role;
there is no master or slave.
Each node contains a replica of
some partitions of the tables.
Cassandra Peer to Peer Architecture 
[Diagram: a cluster of nodes, three of which are replicas of a given partition]
Each partition is stored in 
several Replicas to ensure 
durability and high availability
Client / Server Communication 
[Diagram: four clients connected to a cluster of nodes, three of which are replicas]
Coordinator node: 
Forwards all R/W requests 
to corresponding replicas
Tunable Consistency 
[Diagram: timeline of 3 replicas, each holding the value A (A A A)]
Tunable Consistency 
Write 'B' and wait for an acknowledgement from one node.
[Diagram: timeline; the replicas go from A A A to B A A]
Tunable Consistency 
R + W < N
Write and wait for an acknowledgement from one node.
Read and wait for one node to answer.
[Diagram: timeline; the replicas go from A A A to B A A, so a read answered by a single node can still return A]
Tunable Consistency 
R + W = N
Write and wait for acknowledgements from two nodes.
Read and wait for one node to answer.
[Diagram: timeline; the replicas go from A A A to B B A]
Tunable Consistency 
R + W > N
Write and wait for acknowledgements from two nodes.
Read and wait for two nodes to answer.
[Diagram: timeline; the replicas go from A A A to B B A]
Tunable Consistency 
R = W = QUORUM
[Diagram: timeline; the replicas go from A A A to B B A]
QUORUM = floor(N / 2) + 1
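The relation between R, W and N on the slides above can be checked with a little arithmetic. This is only a minimal sketch, not driver code: the function names `quorum` and `reads_see_latest_write` are made up for illustration, and N = 3 matches the three replicas drawn on the slides.

```python
def quorum(n: int) -> int:
    """Quorum size for n replicas: floor(n / 2) + 1."""
    return n // 2 + 1


def reads_see_latest_write(r: int, w: int, n: int) -> bool:
    """True when every set of R replicas overlaps every set of W replicas (R + W > N)."""
    return r + w > n


n = 3  # replication factor used on the slides
for r, w in [(1, 1), (1, 2), (2, 2)]:
    print(f"R={r} W={w} N={n} overlap guaranteed: {reads_see_latest_write(r, w, n)}")
print("QUORUM for N=3:", quorum(n))
```

With R = W = QUORUM the overlap condition always holds, which is why that combination is a common choice when a read must observe the latest acknowledged write.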
Cassandra Query Language (CQL) 
• Similar to SQL, mostly a subset 
• Without joins, sub-queries, and aggregations 
• Primary Key contains: 
• A Partition Key used to select the partition that will store the Row 
• Some Clustering Columns, used to define how Rows should be grouped 
and sorted on the disk 
• Supports collections
• Supports User Defined Types (UDT)
CQL: Create Table 
CREATE TABLE users ( 
login text, 
name text, 
age int, 
… 
PRIMARY KEY (login)); 
Just like in SQL!
login is the partition key: it is hashed,
and rows are spread over the cluster
across different partitions.
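The idea of hashing the partition key to decide where a row lives can be sketched in a few lines. This is a simplified illustration under stated assumptions: the node names are hypothetical, MD5 stands in for the partitioner's hash function (Cassandra's default partitioner uses Murmur3), and a plain modulo replaces the token ranges that a real cluster assigns to its nodes.

```python
import hashlib

NODES = ["node1", "node2", "node3", "node4"]  # hypothetical node names


def token(partition_key: str) -> int:
    """Hash the partition key to a 64-bit integer token (MD5 as a stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(partition_key: str) -> str:
    """Pick a node for the key; real clusters use token ranges, not a modulo."""
    return NODES[token(partition_key) % len(NODES)]


for login in ["jdoe", "asmith", "mfiguiere"]:
    print(login, "->", owner(login))
```

The same login always hashes to the same token, so every read and write for that login is routed to the same place.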
CQL: Clustered Table 
A TimeUUID is a UUID that 
can be sorted chronologically 
CREATE TABLE mailbox ( 
login text, 
message_id timeuuid, 
interlocutor text, 
message text, 
PRIMARY KEY((login), message_id) 
); 
message_id is a clustering column:
all the rows with the same login
are grouped together and
sorted by message_id on disk.
CQL: Queries 
Get a message by user and message_id (a timeuuid, which encodes the date)
SELECT * FROM mailbox
WHERE login = 'jdoe'
AND message_id = ?; -- bound to the timeuuid of the message
Get messages by user and date interval
SELECT * FROM mailbox WHERE login = 'jdoe'
AND message_id <= maxTimeuuid('2014-09-25 16:00:00')
AND message_id >= minTimeuuid('2014-09-20 16:00:00');
WHERE clauses can only constrain
primary key columns, and range queries are
not possible on the partition key.
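Why a range query on a clustering column is cheap can be illustrated with an in-memory model: rows are grouped by partition key and kept sorted by the clustering value, so an interval is a contiguous slice. This is a toy sketch, not how Cassandra stores data on disk; the names `mailbox`, `insert` and `select_range` are hypothetical, and a plain `datetime` stands in for the TimeUUID.

```python
import bisect
from collections import defaultdict
from datetime import datetime

# partition key (login) -> list of (sent_at, message), kept sorted by sent_at
mailbox = defaultdict(list)


def insert(login, sent_at, message):
    """Insert a row, keeping the partition sorted by its clustering value."""
    bisect.insort(mailbox[login], (sent_at, message))


def select_range(login, start, end):
    """Return the rows of one partition with start <= sent_at <= end."""
    rows = mailbox.get(login, [])
    keys = [sent_at for sent_at, _ in rows]
    lo = bisect.bisect_left(keys, start)   # first row at or after start
    hi = bisect.bisect_right(keys, end)    # one past the last row at or before end
    return rows[lo:hi]


insert("jdoe", datetime(2014, 9, 19, 12, 0), "too old")
insert("jdoe", datetime(2014, 9, 25, 16, 0), "see you")
insert("jdoe", datetime(2014, 9, 22, 9, 30), "hello")
insert("asmith", datetime(2014, 9, 23, 8, 0), "other partition")

for sent_at, message in select_range(
    "jdoe", datetime(2014, 9, 20, 16, 0), datetime(2014, 9, 25, 16, 0)
):
    print(sent_at.isoformat(), message)
```

The lookup first selects a single partition by equality and only then slices by range, which mirrors the rule that range queries apply to clustering columns and not to the partition key.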
CQL: Collections 
CREATE TABLE users (
    login text,
    name text,
    age int,
    friends set<text>,
    hobbies list<text>,
    languages map<int, text>,
    …
    PRIMARY KEY (login)
);
set and list have semantics similar
to their Java counterparts.
It’s not possible to use nested
collections… yet
Cassandra 2.1: User Defined Type (UDT) 
CREATE TABLE users ( 
login text, 
… 
street_number int, 
street_name text, 
postcode int, 
country text, 
… 
PRIMARY KEY(login)); 
CREATE TYPE address ( 
street_number int, 
street_name text, 
postcode int, 
country text 
); 
CREATE TABLE users ( 
login text, 
… 
location frozen<address>, 
… 
PRIMARY KEY(login) 
); 
Cassandra 2.1: UDT Insert / Update 
INSERT INTO users (login, name, location)
VALUES ('jdoe', 'John DOE',
    {
        street_number: 124,
        street_name: 'Congress Avenue',
        postcode: 95054,
        country: 'USA'
    });
UPDATE users SET location =
    {
        street_number: 125,
        street_name: 'Congress Avenue',
        postcode: 95054,
        country: 'USA'
    }
WHERE login = 'jdoe';
Request Pipelining 
[Diagram: exchanges between a client and Cassandra, without and with request pipelining]
Notifications 
[Diagram: a client and three nodes, without and with notifications]
Asynchronous Driver Architecture 
[Diagram: three client threads share a single driver, which connects to four nodes; a second build of the slide numbers the steps of a request from 1 to 6]
Failover 
[Diagram: the same architecture of client threads, driver and nodes, with the steps of a request numbered from 1 to 7]
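The failover idea can be sketched as a loop over candidate nodes: if one node cannot be reached, the request is sent to the next one. This is a simplified illustration and not the driver's actual implementation; `NodeDown`, `execute_with_failover` and `fake_send` are hypothetical names, and the transport is simulated.

```python
class NodeDown(Exception):
    """Raised by the hypothetical send() when a node cannot be reached."""


def execute_with_failover(query, nodes, send):
    """Try each node in order and return the first successful response."""
    failed = []
    for node in nodes:
        try:
            return send(node, query)
        except NodeDown:
            failed.append(node)  # remember the failure and move on to the next node
    raise RuntimeError(f"all nodes failed: {failed}")


def fake_send(node, query):
    # Simulated transport: node1 is down, every other node answers.
    if node == "node1":
        raise NodeDown(node)
    return f"{node} handled {query!r}"


print(execute_with_failover("SELECT * FROM user", ["node1", "node2", "node3"], fake_send))
```

The application thread only sees the final result or a single error once every candidate has been tried.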
DataStax Drivers Highlights 
• Asynchronous architecture using non-blocking I/O
• Prepared Statements Support 
• Automatic Failover 
• Node Discovery 
• Tunable Load Balancing 
• Round Robin, Latency Awareness, Multi Data Centers, Replica Awareness 
• Cassandra Tracing Support 
• Compression & SSL 
DataCenter Aware Balancing 
[Diagram: clients and nodes in Datacenter A; further nodes and a client in Datacenter B]
Local nodes are queried
first; if none are available,
the request can be
sent to a remote node.
Token Aware Balancing 
Nodes that own a replica
of the partition key being read or
written by the query will
be contacted first.
[Diagram: a client and a cluster of nodes, three of which are replicas]
The partition key is
inferred from Prepared
Statement metadata.
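Datacenter-aware and token-aware balancing can be combined into a single ordering of candidate nodes. The sketch below is only an assumption about how such an ordering could look, not the driver's policy code: `Host`, `query_plan` and the host names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    datacenter: str


def query_plan(hosts, replicas, local_dc):
    """Order hosts: local replicas first, then other local nodes, then remote nodes."""
    def rank(host):
        if host.datacenter != local_dc:
            return 2
        return 0 if host.name in replicas else 1
    return sorted(hosts, key=rank)  # sorted() is stable, so ties keep their input order


hosts = [Host("a1", "A"), Host("b1", "B"), Host("a2", "A"), Host("a3", "A")]
plan = query_plan(hosts, replicas={"a3", "b1"}, local_dc="A")
print([h.name for h in plan])  # prints ['a3', 'a1', 'a2', 'b1']
```

Contacting a local replica first lets that node act as coordinator and serve the data itself, which saves a network hop.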
State of DataStax Drivers 
Driver    Cassandra 1.2    Cassandra 2.0    Cassandra 2.1
Java      1.0 - 2.1        2.0 - 2.1        2.1
Python    1.0 - 2.1        2.0 - 2.1        2.1
C#        1.0 - 2.1        2.0 - 2.1        2.1
Node.js   1.0              1.0              Later
C++       1.0-beta4        1.0-beta4        Later
Ruby      1.0-beta3        1.0-beta3        Later
Later versions of Cassandra can use earlier Drivers, but some features won’t be supported
DataStax Driver in Practice 
Java
<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>2.1.0</version>
</dependency>
Python
$ pip install cassandra-driver
C#
PM> Install-Package CassandraCSharpDriver
Ruby
gem install cassandra-driver --pre
Node.js
$ npm install cassandra-driver
Connect and Write 
Cluster cluster = Cluster.builder()
    .addContactPoints("10.1.2.5", "cassandra_node3")
    .build();
// The rest of the nodes will be discovered by the driver.
// A keyspace is just like a schema in the SQL world.
Session session = cluster.connect("my_keyspace");
session.execute(
    "INSERT INTO user (user_id, name, email) " +
    "VALUES (12345, 'johndoe', 'john@doe.com')"
);
Read 
// Session is a thread-safe object. A singleton should
// be instantiated at startup.
ResultSet resultSet = session.execute(
    "SELECT * FROM user WHERE user_id IN (1,8,13)"
);
// ResultSet also implements Iterable<Row>.
List<Row> rows = resultSet.all();
for (Row row : rows) {
    int userId = row.getInt("user_id");
    String name = row.getString("name");
    String email = row.getString("email");
}
Write with Prepared Statements 
// PreparedStatement objects are also thread-safe; just create
// a singleton at startup. Parameters can be named as well.
PreparedStatement insertUser = session.prepare(
    "INSERT INTO user (user_id, name, email) " +
    "VALUES (?, ?, ?)"
);
// BoundStatement is a stateful, NON thread-safe object.
BoundStatement statement = insertUser
    .bind(12345, "johndoe", "john@doe.com");
// The consistency level can be set for each statement.
statement.setConsistencyLevel(ConsistencyLevel.QUORUM);
session.execute(statement);
Asynchronous Read 
// Will not block. Returns immediately.
ResultSetFuture future = session.executeAsync(
    "SELECT * FROM user WHERE user_id IN (1,2,3)"
);
// Will block until the result is available.
ResultSet resultSet = future.get();
List<Row> rows = resultSet.all();
for (Row row : rows) {
    int userId = row.getInt("user_id");
    String name = row.getString("name");
    String email = row.getString("email");
}
Asynchronous Read with Callbacks 
ResultSetFuture future = session.executeAsync( 
"SELECT * FROM user WHERE user_id IN (1,2,3)" 
); 
future.addListener(new Runnable() { 
public void run() { 
// Process the results here 
} 
}, executor); 
ResultSetFuture implements Guava’s ListenableFuture.
executor = Executors.newCachedThreadPool();
…or any thread pool that you prefer.
executor = MoreExecutors.sameThreadExecutor();
Only if your listener code is trivial and non-blocking,
as it’ll be executed in the I/O thread.
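The listener pattern itself is not specific to Guava. As a rough analogy only, the sketch below uses Python's standard `concurrent.futures` module rather than any Cassandra driver: `fetch_users` is a hypothetical stand-in for a query, and `on_done` plays the role of the listener.

```python
from concurrent.futures import ThreadPoolExecutor

results = []


def fetch_users():
    """Stand-in for a query; a real driver would do network I/O here."""
    return [{"user_id": 1, "name": "johndoe"}]


def on_done(future):
    # Runs in the worker thread that completed the future (or right away in the
    # caller if the future is already done), so keep it short and non-blocking.
    results.extend(future.result())


with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(fetch_users)    # returns immediately
    future.add_done_callback(on_done)    # register the listener without blocking
# Leaving the with-block waits for the submitted work, and its callback, to finish.
print(results)
```

As with `sameThreadExecutor()` on the slide, a callback that blocks here would hold up the thread that delivers results, so heavier processing belongs on a separate pool.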
Query Builder 
A static import of
QueryBuilder is required in
order to use the DSL:
import static 
com.datastax.driver.core.querybuilder.QueryBuilder.*; 
Statement selectAll = 
select().all().from("user").where(eq("user_id", userId)); 
session.execute(selectAll); 
Statement insert = insertInto("user") 
.value("user_id", 2) 
.value("name", "johndoe") 
.value("email", "john@doe.com"); 
session.execute(insert); 
Python 
import logging
from cassandra.cluster import Cluster

log = logging.getLogger(__name__)

cluster = Cluster(['10.1.1.3', '10.1.1.4', '10.1.1.5'])
session = cluster.connect('mykeyspace')

def handle_success(rows):
    user = rows[0]
    try:
        process_user(user.name, user.age, user.id)
    except Exception:
        log.error("Failed to process user %s", user.id)
        # don't re-raise errors in the callback

def handle_error(exception):
    log.error("Failed to fetch user info: %s", exception)

future = session.execute_async("SELECT * FROM users WHERE user_id=3")
future.add_callbacks(handle_success, handle_error)
# It's also possible to retrieve the result from the
# future object synchronously, with future.result().
C# 
var cluster = Cluster.Builder()
    .AddContactPoints("host1", "host2", "host3")
    .Build();
var session = cluster.Connect("sample_keyspace");
// Asynchronously execute a query using the TPL.
var task = session.ExecuteAsync(statement);
task.ContinueWith((t) =>
{
    var rs = t.Result;
    foreach (var row in rs)
    {
        // Get the values from each row
    }
}, TaskContinuationOptions.OnlyOnRanToCompletion);
C / C++ 
CassString query = cass_string_init(
    "SELECT keyspace_name FROM system.schema_keyspaces;");
CassStatement* statement = cass_statement_new(query, 0);
CassFuture* result_future = cass_session_execute(session, statement);
if (cass_future_error_code(result_future) == CASS_OK) {
    const CassResult* result = cass_future_get_result(result_future);
    CassIterator* rows = cass_iterator_from_result(result);
    while (cass_iterator_next(rows)) {
        // Process results
    }
    cass_iterator_free(rows);
    cass_result_free(result);
}
// Each structure must be freed with the appropriate function.
cass_future_free(result_future);
cass_statement_free(statement);
Node.js 
var assert = require('assert');
var cassandra = require('cassandra-driver');
var client = new cassandra.Client({
    contactPoints: ['host1', 'h2'],
    keyspace: 'ks1'
});
// Here we're using a parameterized statement,
// which is not prepared, but still allows parameters.
var query =
    'SELECT email, last_name FROM user_profiles WHERE key=?';
client.execute(query, ['guy'], function(err, result) {
    assert.ifError(err);
    console.log('got user profile with email ' +
        result.rows[0].email);
});
Ruby 
require 'cassandra'

cluster = Cassandra.cluster
session = cluster.connect('system')
future = session.execute_async('SELECT * FROM schema_columnfamilies')
# Register a listener on the future, which will be called
# when results are available.
future.on_success do |rows|
  rows.each do |row|
    puts "The keyspace #{row['keyspace_name']} has a table called #{row['columnfamily_name']}"
  end
end
future.join
Object Mapper 
• Avoid boilerplate for common use cases 
• Map Objects to Statements and ResultSets to Objects 
• Do NOT hide Cassandra from the developer 
• No “clever tricks” à la Hibernate 
• Not JPA compatible, but JPA-ish API 
Object Mapper in Practice 
<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-mapping</artifactId>
    <version>2.1.0</version>
</dependency>
Additional artifact for
object mapping.
Available from Driver 2.1.0
Basic Object Mapping 
CREATE TYPE address (
    street text,
    city text,
    zip int
);
CREATE TABLE users (
    email text PRIMARY KEY,
    address frozen<address>
);
@UDT(keyspace = "ks", name = "address")
public class Address {
    private String street;
    private String city;
    private int zip;
    // getters and setters omitted...
}
@Table(keyspace = "ks", name = "users")
public class User {
    @PartitionKey
    private String email;
    @Frozen
    private Address address;
    // getters and setters omitted...
}
Basic Object Mapping 
MappingManager manager = new MappingManager(session);
// Mapper, just like Session, is a thread-safe object.
// Create a singleton at startup.
Mapper<User> mapper = manager.mapper(User.class);
// get() returns a mapped row for the given primary key.
User myProfile = mapper.get("xyz@example.com");
// ListenableFuture from Guava. Completed when the
// write is acknowledged.
ListenableFuture<Void> saveFuture = mapper.saveAsync(anotherProfile);
mapper.delete("xyz@example.com");
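What "map objects to statements and result sets to objects" means can be illustrated without any driver. The following is a toy sketch under stated assumptions, not the DataStax mapper: `insert_statement` and `from_row` are hypothetical helpers, the `User` class here has a made-up `name` field, and rows are modelled as plain dictionaries.

```python
from dataclasses import astuple, dataclass, fields


@dataclass
class User:
    email: str
    name: str


def insert_statement(table, entity):
    """Build a parameterized INSERT and its bound values from a dataclass instance."""
    columns = [f.name for f in fields(entity)]
    markers = ", ".join("?" for _ in columns)
    cql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({markers})"
    return cql, astuple(entity)


def from_row(cls, row):
    """Build an entity from a row given as a column-name -> value mapping."""
    return cls(**{f.name: row[f.name] for f in fields(cls)})


cql, values = insert_statement("users", User("xyz@example.com", "XYZ"))
print(cql)     # prints INSERT INTO users (email, name) VALUES (?, ?)
print(values)  # prints ('xyz@example.com', 'XYZ')
print(from_row(User, {"email": "xyz@example.com", "name": "XYZ"}))
```

The generated statement stays visible and uses ordinary bind markers, in keeping with the stated goal of not hiding Cassandra from the developer.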
Accessors 
@Accessor
interface UserAccessor {
    @Query("SELECT * FROM users LIMIT :max")
    Result<User> firstN(@Param("max") int limit);
}
UserAccessor accessor = manager.createAccessor(UserAccessor.class);
// Result is like ResultSet but specialized for a mapped class…
Result<User> users = accessor.firstN(10);
// …so we iterate over it just like we would with a ResultSet.
for (User user : users) {
    System.out.println(
        user.getAddress().getZip()
    );
}
We’re Hiring! 
Cassandra Tech Day - Paris 
November 4th 
Cassandra Summit Europe - London 
December 3-4th 
@mfiguiere

More Related Content

What's hot

06 response-headers
06 response-headers06 response-headers
06 response-headers
snopteck
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynote
jbellis
 
Getting Started with Datatsax .Net Driver
Getting Started with Datatsax .Net DriverGetting Started with Datatsax .Net Driver
Getting Started with Datatsax .Net Driver
DataStax Academy
 
Integrating OpenStack with Active Directory
Integrating OpenStack with Active DirectoryIntegrating OpenStack with Active Directory
Integrating OpenStack with Active Directory
cjellick
 

What's hot (20)

Introduction to apache zoo keeper
Introduction to apache zoo keeper Introduction to apache zoo keeper
Introduction to apache zoo keeper
 
Apache ZooKeeper
Apache ZooKeeperApache ZooKeeper
Apache ZooKeeper
 
06 response-headers
06 response-headers06 response-headers
06 response-headers
 
Hazelcast
HazelcastHazelcast
Hazelcast
 
Jafka guide
Jafka guideJafka guide
Jafka guide
 
Ice mini guide
Ice mini guideIce mini guide
Ice mini guide
 
Cassandra 3.0 advanced preview
Cassandra 3.0 advanced previewCassandra 3.0 advanced preview
Cassandra 3.0 advanced preview
 
Cassandra Summit 2013 Keynote
Cassandra Summit 2013 KeynoteCassandra Summit 2013 Keynote
Cassandra Summit 2013 Keynote
 
DataStax NYC Java Meetup: Cassandra with Java
DataStax NYC Java Meetup: Cassandra with JavaDataStax NYC Java Meetup: Cassandra with Java
DataStax NYC Java Meetup: Cassandra with Java
 
Zookeeper
ZookeeperZookeeper
Zookeeper
 
SCWCD : Thread safe servlets : CHAP : 8
SCWCD : Thread safe servlets : CHAP : 8SCWCD : Thread safe servlets : CHAP : 8
SCWCD : Thread safe servlets : CHAP : 8
 
DevOpsDays Warsaw 2015: Running High Performance And Fault Tolerant Elasticse...
DevOpsDays Warsaw 2015: Running High Performance And Fault Tolerant Elasticse...DevOpsDays Warsaw 2015: Running High Performance And Fault Tolerant Elasticse...
DevOpsDays Warsaw 2015: Running High Performance And Fault Tolerant Elasticse...
 
Cassandra summit keynote 2014
Cassandra summit keynote 2014Cassandra summit keynote 2014
Cassandra summit keynote 2014
 
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
Beyond the Query: A Cassandra + Solr + Spark Love Triangle Using Datastax Ent...
 
Keystone deep dive 1
Keystone deep dive 1Keystone deep dive 1
Keystone deep dive 1
 
Getting Started with Datatsax .Net Driver
Getting Started with Datatsax .Net DriverGetting Started with Datatsax .Net Driver
Getting Started with Datatsax .Net Driver
 
Cassandra EU - Data model on fire
Cassandra EU - Data model on fireCassandra EU - Data model on fire
Cassandra EU - Data model on fire
 
Integrating OpenStack with Active Directory
Integrating OpenStack with Active DirectoryIntegrating OpenStack with Active Directory
Integrating OpenStack with Active Directory
 
Introduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandraIntroduction to data modeling with apache cassandra
Introduction to data modeling with apache cassandra
 
Kerberizing spark. Spark Summit east
Kerberizing spark. Spark Summit eastKerberizing spark. Spark Summit east
Kerberizing spark. Spark Summit east
 

Similar to Paris Cassandra Meetup - Cassandra for Developers

Encode x NEAR: Technical Overview of NEAR 1
Encode x NEAR: Technical Overview of NEAR 1Encode x NEAR: Technical Overview of NEAR 1
Encode x NEAR: Technical Overview of NEAR 1
KlaraOrban
 
Paris Cassandra Meetup - Overview of New Cassandra Drivers
Paris Cassandra Meetup - Overview of New Cassandra DriversParis Cassandra Meetup - Overview of New Cassandra Drivers
Paris Cassandra Meetup - Overview of New Cassandra Drivers
Michaël Figuière
 

Similar to Paris Cassandra Meetup - Cassandra for Developers (20)

Fraud Detection for Israel BigThings Meetup
Fraud Detection  for Israel BigThings MeetupFraud Detection  for Israel BigThings Meetup
Fraud Detection for Israel BigThings Meetup
 
Multi-cluster k8ssandra
Multi-cluster k8ssandraMulti-cluster k8ssandra
Multi-cluster k8ssandra
 
Ruby Driver Explained: DataStax Webinar May 5th 2015
Ruby Driver Explained: DataStax Webinar May 5th 2015Ruby Driver Explained: DataStax Webinar May 5th 2015
Ruby Driver Explained: DataStax Webinar May 5th 2015
 
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...
Cassandra Summit 2014: Highly Scalable Web Application in the Cloud with Cass...
 
Schema-based multi-tenant architecture using Quarkus &amp; Hibernate-ORM.pdf
Schema-based multi-tenant architecture using Quarkus &amp; Hibernate-ORM.pdfSchema-based multi-tenant architecture using Quarkus &amp; Hibernate-ORM.pdf
Schema-based multi-tenant architecture using Quarkus &amp; Hibernate-ORM.pdf
 
Spark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and FurureSpark Cassandra Connector: Past, Present and Furure
Spark Cassandra Connector: Past, Present and Furure
 
The Apache Cassandra ecosystem
The Apache Cassandra ecosystemThe Apache Cassandra ecosystem
The Apache Cassandra ecosystem
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and Drivers
 
Virtual training intro to InfluxDB - June 2021
Virtual training  intro to InfluxDB  - June 2021Virtual training  intro to InfluxDB  - June 2021
Virtual training intro to InfluxDB - June 2021
 
What's New in Apache Hive
What's New in Apache HiveWhat's New in Apache Hive
What's New in Apache Hive
 
DataStax 6 and Beyond
DataStax 6 and BeyondDataStax 6 and Beyond
DataStax 6 and Beyond
 
Architecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with HadoopArchitecting a Fraud Detection Application with Hadoop
Architecting a Fraud Detection Application with Hadoop
 
Fraud Detection Architecture
Fraud Detection ArchitectureFraud Detection Architecture
Fraud Detection Architecture
 
Cassandra Day Atlanta 2015: BetterCloud: Leveraging Apache Cassandra
Cassandra Day Atlanta 2015: BetterCloud: Leveraging Apache CassandraCassandra Day Atlanta 2015: BetterCloud: Leveraging Apache Cassandra
Cassandra Day Atlanta 2015: BetterCloud: Leveraging Apache Cassandra
 
Encode x NEAR: Technical Overview of NEAR 1
Encode x NEAR: Technical Overview of NEAR 1Encode x NEAR: Technical Overview of NEAR 1
Encode x NEAR: Technical Overview of NEAR 1
 
Paris Cassandra Meetup - Overview of New Cassandra Drivers
Paris Cassandra Meetup - Overview of New Cassandra DriversParis Cassandra Meetup - Overview of New Cassandra Drivers
Paris Cassandra Meetup - Overview of New Cassandra Drivers
 
Fraud Detection using Hadoop
Fraud Detection using HadoopFraud Detection using Hadoop
Fraud Detection using Hadoop
 
GumGum: Multi-Region Cassandra in AWS
GumGum: Multi-Region Cassandra in AWSGumGum: Multi-Region Cassandra in AWS
GumGum: Multi-Region Cassandra in AWS
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 
Flight on Zeppelin with Apache Spark & Cassandra
Flight on Zeppelin with Apache Spark & CassandraFlight on Zeppelin with Apache Spark & Cassandra
Flight on Zeppelin with Apache Spark & Cassandra
 

More from Michaël Figuière

ApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
ApacheCon Europe 2012 - Real Time Big Data in practice with CassandraApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
ApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
Michaël Figuière
 
NoSQL Matters 2012 - Real Time Big Data in practice with Cassandra
NoSQL Matters 2012 - Real Time Big Data in practice with CassandraNoSQL Matters 2012 - Real Time Big Data in practice with Cassandra
NoSQL Matters 2012 - Real Time Big Data in practice with Cassandra
Michaël Figuière
 
GTUG Nantes (Dec 2011) - BigTable et NoSQL
GTUG Nantes (Dec 2011) - BigTable et NoSQLGTUG Nantes (Dec 2011) - BigTable et NoSQL
GTUG Nantes (Dec 2011) - BigTable et NoSQL
Michaël Figuière
 
Duchess France (Nov 2011) - Atelier Apache Mahout
Duchess France (Nov 2011) - Atelier Apache MahoutDuchess France (Nov 2011) - Atelier Apache Mahout
Duchess France (Nov 2011) - Atelier Apache Mahout
Michaël Figuière
 
JUG Summer Camp (Sep 2011) - Les applications et architectures d’entreprise d...
JUG Summer Camp (Sep 2011) - Les applications et architectures d’entreprise d...JUG Summer Camp (Sep 2011) - Les applications et architectures d’entreprise d...
JUG Summer Camp (Sep 2011) - Les applications et architectures d’entreprise d...
Michaël Figuière
 
BreizhCamp (Jun 2011) - Haute disponibilité et élasticité avec Cassandra
BreizhCamp (Jun 2011) - Haute disponibilité et élasticité avec CassandraBreizhCamp (Jun 2011) - Haute disponibilité et élasticité avec Cassandra
BreizhCamp (Jun 2011) - Haute disponibilité et élasticité avec Cassandra
Michaël Figuière
 
Mix-IT (Apr 2011) - Intelligence Collective avec Apache Mahout
Mix-IT (Apr 2011) - Intelligence Collective avec Apache MahoutMix-IT (Apr 2011) - Intelligence Collective avec Apache Mahout
Mix-IT (Apr 2011) - Intelligence Collective avec Apache Mahout
Michaël Figuière
 
Xebia Knowledge Exchange (mars 2011) - Machine Learning with Apache Mahout
Xebia Knowledge Exchange (mars 2011) - Machine Learning with Apache MahoutXebia Knowledge Exchange (mars 2011) - Machine Learning with Apache Mahout
Xebia Knowledge Exchange (mars 2011) - Machine Learning with Apache Mahout
Michaël Figuière
 
Breizh JUG (mar 2011) - NoSQL : Des Grands du Web aux Entreprises
Breizh JUG (mar 2011) - NoSQL : Des Grands du Web aux EntreprisesBreizh JUG (mar 2011) - NoSQL : Des Grands du Web aux Entreprises
Breizh JUG (mar 2011) - NoSQL : Des Grands du Web aux Entreprises
Michaël Figuière
 
FOSDEM (feb 2011) - A real-time search engine with Lucene and S4
FOSDEM (feb 2011) -  A real-time search engine with Lucene and S4FOSDEM (feb 2011) -  A real-time search engine with Lucene and S4
FOSDEM (feb 2011) - A real-time search engine with Lucene and S4
Michaël Figuière
 
Xebia Knowledge Exchange (feb 2011) - Large Scale Web Development
Xebia Knowledge Exchange (feb 2011) - Large Scale Web DevelopmentXebia Knowledge Exchange (feb 2011) - Large Scale Web Development
Xebia Knowledge Exchange (feb 2011) - Large Scale Web Development
Michaël Figuière
 
Xebia Knowledge Exchange (jan 2011) - Trends in Enterprise Applications Archi...
Xebia Knowledge Exchange (jan 2011) - Trends in Enterprise Applications Archi...Xebia Knowledge Exchange (jan 2011) - Trends in Enterprise Applications Archi...
Xebia Knowledge Exchange (jan 2011) - Trends in Enterprise Applications Archi...
Michaël Figuière
 
Lorraine JUG (dec 2010) - NoSQL, des grands du Web aux entreprises
Lorraine JUG (dec 2010) - NoSQL, des grands du Web aux entreprisesLorraine JUG (dec 2010) - NoSQL, des grands du Web aux entreprises
Lorraine JUG (dec 2010) - NoSQL, des grands du Web aux entreprises
Michaël Figuière
 
Tours JUG (oct 2010) - NoSQL, des grands du Web aux entreprises
Tours JUG (oct 2010) - NoSQL, des grands du Web aux entreprisesTours JUG (oct 2010) - NoSQL, des grands du Web aux entreprises
Tours JUG (oct 2010) - NoSQL, des grands du Web aux entreprises
Michaël Figuière
 
Paris JUG (sept 2010) - NoSQL : Des concepts à la réalité
Paris JUG (sept 2010) - NoSQL : Des concepts à la réalitéParis JUG (sept 2010) - NoSQL : Des concepts à la réalité
Paris JUG (sept 2010) - NoSQL : Des concepts à la réalité
Michaël Figuière
 
Xebia Knowledge Exchange (mars 2010) - Lucene : From theory to real world
Xebia Knowledge Exchange (mars 2010) - Lucene : From theory to real worldXebia Knowledge Exchange (mars 2010) - Lucene : From theory to real world
Xebia Knowledge Exchange (mars 2010) - Lucene : From theory to real world
Michaël Figuière
 
Xebia Knowledge Exchange (may 2010) - NoSQL : Using the right tool for the ri...
Xebia Knowledge Exchange (may 2010) - NoSQL : Using the right tool for the ri...Xebia Knowledge Exchange (may 2010) - NoSQL : Using the right tool for the ri...
Xebia Knowledge Exchange (may 2010) - NoSQL : Using the right tool for the ri...
Michaël Figuière
 

More from Michaël Figuière (17)

ApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
ApacheCon Europe 2012 - Real Time Big Data in practice with CassandraApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
ApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
 
NoSQL Matters 2012 - Real Time Big Data in practice with Cassandra
NoSQL Matters 2012 - Real Time Big Data in practice with CassandraNoSQL Matters 2012 - Real Time Big Data in practice with Cassandra
NoSQL Matters 2012 - Real Time Big Data in practice with Cassandra
 
GTUG Nantes (Dec 2011) - BigTable et NoSQL
GTUG Nantes (Dec 2011) - BigTable et NoSQLGTUG Nantes (Dec 2011) - BigTable et NoSQL
GTUG Nantes (Dec 2011) - BigTable et NoSQL
 
Duchess France (Nov 2011) - Atelier Apache Mahout
Duchess France (Nov 2011) - Atelier Apache MahoutDuchess France (Nov 2011) - Atelier Apache Mahout
Duchess France (Nov 2011) - Atelier Apache Mahout
 
JUG Summer Camp (Sep 2011) - Les applications et architectures d’entreprise d...
JUG Summer Camp (Sep 2011) - Les applications et architectures d’entreprise d...JUG Summer Camp (Sep 2011) - Les applications et architectures d’entreprise d...
JUG Summer Camp (Sep 2011) - Les applications et architectures d’entreprise d...
 
BreizhCamp (Jun 2011) - Haute disponibilité et élasticité avec Cassandra
BreizhCamp (Jun 2011) - Haute disponibilité et élasticité avec CassandraBreizhCamp (Jun 2011) - Haute disponibilité et élasticité avec Cassandra
BreizhCamp (Jun 2011) - Haute disponibilité et élasticité avec Cassandra
 
Mix-IT (Apr 2011) - Intelligence Collective avec Apache Mahout
Mix-IT (Apr 2011) - Intelligence Collective avec Apache MahoutMix-IT (Apr 2011) - Intelligence Collective avec Apache Mahout
Mix-IT (Apr 2011) - Intelligence Collective avec Apache Mahout
 
Xebia Knowledge Exchange (mars 2011) - Machine Learning with Apache Mahout
Xebia Knowledge Exchange (mars 2011) - Machine Learning with Apache MahoutXebia Knowledge Exchange (mars 2011) - Machine Learning with Apache Mahout
Xebia Knowledge Exchange (mars 2011) - Machine Learning with Apache Mahout
 
Breizh JUG (mar 2011) - NoSQL : Des Grands du Web aux Entreprises
Paris Cassandra Meetup - Cassandra for Developers

  • 1. Cassandra for Developers: DataStax Drivers in Practice. Michaël Figuière, Drivers & Developer Tools Architect, @mfiguiere
  • 2. Cassandra Peer to Peer Architecture. © 2014 DataStax, All Rights Reserved. [Diagram: six nodes.] Every node has the same role; there is no master or slave. Each node contains a replica of some partitions of tables.
  • 3. Cassandra Peer to Peer Architecture. [Diagram: six nodes, three of them marked as replicas.] Each partition is stored in several replicas to ensure durability and high availability.
  • 4. Client / Server Communication. [Diagram: four clients and six nodes, three of them marked as replicas.] Coordinator node: forwards all read/write requests to the corresponding replicas.
  • 5. Tunable Consistency. [Diagram: 3 replicas, each holding the value A, along a time axis.]
  • 6. Tunable Consistency. Write 'B' and wait for an acknowledgement from one node. [Diagram: replicas go from A A A to B A A over time.]
  • 7. Tunable Consistency. Write 'B' and wait for an acknowledgement from one node. [Diagram: replicas go from A A A to B A A over time.]
  • 8. Tunable Consistency: R + W < N. Write and wait for an acknowledgement from one node; read waiting for one node to answer. [Diagram: replicas go from A A A to B A A over time.]
  • 9. Tunable Consistency: R + W = N. Write and wait for acknowledgements from two nodes; read waiting for one node to answer. [Diagram: replicas go from A A A to B B A over time.]
  • 10. Tunable Consistency: R + W > N. Write and wait for acknowledgements from two nodes; read waiting for two nodes to answer. [Diagram: replicas go from A A A to B B A over time.]
  • 11. Tunable Consistency: R = W = QUORUM, where QUORUM = (N / 2) + 1. [Diagram: replicas go from A A A to B B A over time.]
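The cases on slides 8 to 11 come down to one inequality: a read is only guaranteed to touch a replica that acknowledged the latest write when R + W > N. The Java sketch below works through that arithmetic for N = 3. The class and method names are illustrative and not part of any driver, and the quorum formula is assumed to use integer division.

```java
public class QuorumMath {
    // QUORUM = (N / 2) + 1, with integer division (assumed reading of slide 11).
    public static int quorum(int replicas) {
        return replicas / 2 + 1;
    }

    // True when every set of R replicas read must overlap every set of W replicas written.
    public static boolean readsOverlapWrites(int r, int w, int n) {
        return r + w > n;
    }

    public static void main(String[] args) {
        int n = 3;
        System.out.println(readsOverlapWrites(1, 1, n)); // R + W < N: false
        System.out.println(readsOverlapWrites(1, 2, n)); // R + W = N: false
        System.out.println(readsOverlapWrites(2, 2, n)); // R + W > N: true
        int q = quorum(n);                                // 2 when N = 3
        System.out.println(readsOverlapWrites(q, q, n)); // R = W = QUORUM: true
    }
}
```

With N = 3, QUORUM is 2, so quorum reads and quorum writes always share at least one replica.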
  • 12. Cassandra Query Language (CQL)
      • Similar to SQL, mostly a subset
      • Without joins, sub-queries, and aggregations
      • Primary Key contains:
          • A Partition Key, used to select the partition that will store the Row
          • Some Clustering Columns, used to define how Rows should be grouped and sorted on disk
      • Supports Collections
      • Supports User Defined Types (UDT)
  • 13. CQL: Create Table
      CREATE TABLE users (
        login text,
        name text,
        age int,
        …
        PRIMARY KEY (login)
      );
    Just like in SQL! login is the partition key: it will be hashed, and rows will be spread over the cluster on different partitions.
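How a hashed partition key ends up on particular nodes can be sketched with a toy token ring. This is a simplified model under stated assumptions, not Cassandra's implementation: it uses String.hashCode() as a stand-in for the real partitioner (Murmur3Partitioner by default), gives each node a single token, and picks replicas by walking the ring clockwise. ToyRing and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class ToyRing {
    // token -> node name; a node owns the keys whose token is <= its own token
    private final TreeMap<Integer, String> ring = new TreeMap<Integer, String>();

    public void addNode(String name, int token) {
        ring.put(token, name);
    }

    // Stand-in hash: Cassandra's partitioner is different, this only illustrates the idea.
    public static int token(String partitionKey) {
        return partitionKey.hashCode();
    }

    // First node at or after the key's token, then the next ones clockwise.
    public List<String> replicasFor(String partitionKey, int replicationFactor) {
        List<String> replicas = new ArrayList<String>();
        int wanted = Math.min(replicationFactor, ring.size());
        Integer current = ring.ceilingKey(token(partitionKey));
        while (replicas.size() < wanted) {
            if (current == null) {
                current = ring.firstKey(); // wrap around the ring
            }
            replicas.add(ring.get(current));
            current = ring.higherKey(current);
        }
        return replicas;
    }
}
```

The same login always hashes to the same token, so every row of a partition lands on the same set of replicas.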
  • 14. CQL: Clustered Table
      CREATE TABLE mailbox (
        login text,
        message_id timeuuid,
        interlocutor text,
        message text,
        PRIMARY KEY ((login), message_id)
      );
    A TimeUUID is a UUID that can be sorted chronologically. message_id is a clustering column: all the rows with the same login will be grouped together and sorted by message_id on disk.
  • 15. CQL: Queries
    Get message by user and message_id (date):
      SELECT * FROM mailbox
      WHERE login = 'jdoe'
        AND message_id = '2014-09-25 16:00:00';
    Get message by user and date interval:
      SELECT * FROM mailbox
      WHERE login = 'jdoe'
        AND message_id <= '2014-09-25 16:00:00'
        AND message_id >= '2014-09-20 16:00:00';
    WHERE clauses can only be constraints on the primary key, and range queries are not possible on the partition key. The date literals are slide shorthand: to compare a timeuuid column with a date, CQL provides the minTimeuuid() and maxTimeuuid() functions.
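The mailbox table and its range query can be pictured as a map of partitions, each holding its rows sorted by the clustering column. The Java sketch below models that layout in memory. It is an analogy rather than Cassandra's storage engine: ToyMailbox is a hypothetical class, and a long stands in for the timeuuid message_id.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ToyMailbox {
    // partition key (login) -> rows sorted by clustering column (message_id)
    private final Map<String, TreeMap<Long, String>> partitions =
            new HashMap<String, TreeMap<Long, String>>();

    public void insert(String login, long messageId, String message) {
        TreeMap<Long, String> rows = partitions.get(login);
        if (rows == null) {
            rows = new TreeMap<Long, String>();
            partitions.put(login, rows);
        }
        rows.put(messageId, message);
    }

    // Equality on the partition key, inclusive range on the clustering column.
    public List<String> range(String login, long from, long to) {
        TreeMap<Long, String> rows = partitions.get(login);
        if (rows == null || from > to) {
            return new ArrayList<String>();
        }
        return new ArrayList<String>(rows.subMap(from, true, to, true).values());
    }
}
```

The range lookup is cheap because it stays inside one partition and follows the sort order. A range over logins would have to visit every partition, which matches the restriction stated on slide 15.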
  • 16. CQL: Collections
      CREATE TABLE users (
        login text,
        name text,
        age int,
        friends set<text>,
        hobbies list<text>,
        languages map<int, text>,
        …
        PRIMARY KEY (login)
      );
    set and list have semantics similar to their Java counterparts. It's not possible to use nested collections… yet.
  • 17. Cassandra 2.1: User Defined Type (UDT)
    Without a UDT:
      CREATE TABLE users (
        login text,
        …
        street_number int,
        street_name text,
        postcode int,
        country text,
        …
        PRIMARY KEY (login)
      );
    With a UDT:
      CREATE TYPE address (
        street_number int,
        street_name text,
        postcode int,
        country text
      );
      CREATE TABLE users (
        login text,
        …
        location frozen<address>,
        …
        PRIMARY KEY (login)
      );
  • 18. Cassandra 2.1: UDT Insert / Update
      INSERT INTO users (login, name, location)
      VALUES ('jdoe', 'John DOE', {
        street_number: 124,
        street_name: 'Congress Avenue',
        postcode: 95054,
        country: 'USA'
      });
      UPDATE users
      SET location = {
        street_number: 125,
        street_name: 'Congress Avenue',
        postcode: 95054,
        country: 'USA'
      }
      WHERE login = 'jdoe';
  • 19. Client / Server Communication. [Diagram: four clients and six nodes, three of them marked as replicas.] Coordinator node: forwards all read/write requests to the corresponding replicas.
  • 20. Request Pipelining. [Diagram: message exchange between Client and Cassandra, without and with request pipelining.]
  • 21. Notifications. [Diagram: a client and three nodes, without and with notifications.]
  • 22. Asynchronous Driver Architecture. [Diagram: three client threads calling the driver, which talks to four nodes.]
  • 23. Asynchronous Driver Architecture. [Diagram: same layout, with steps numbered 1 to 6.]
  • 24. Failover. [Diagram: same layout, with steps numbered 1 to 7.]
  • 25. DataStax Drivers Highlights
      • Asynchronous architecture using non-blocking I/O
      • Prepared Statements support
      • Automatic failover
      • Node discovery
      • Tunable load balancing
          • Round Robin, Latency Awareness, Multi Data Centers, Replica Awareness
      • Cassandra tracing support
      • Compression & SSL
  • 26. DataCenter Aware Balancing. [Diagram: clients and nodes in Datacenter A and Datacenter B.] Local nodes are queried first; if none are available, the request can be sent to a remote node.
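The "local first, remote as fallback" rule can be sketched as a function that builds the ordered list of nodes to try for a request. This is a minimal illustration, not the driver's load balancing policy: DcAwarePlan, Node, and queryPlan are hypothetical names, and the node state is passed in as plain data.

```java
import java.util.ArrayList;
import java.util.List;

public class DcAwarePlan {
    public static final class Node {
        final String name;
        final String datacenter;
        final boolean up;

        public Node(String name, String datacenter, boolean up) {
            this.name = name;
            this.datacenter = datacenter;
            this.up = up;
        }
    }

    // Nodes to try, in order: live local nodes first, then live remote nodes.
    public static List<String> queryPlan(List<Node> nodes, String localDatacenter) {
        List<String> local = new ArrayList<String>();
        List<String> remote = new ArrayList<String>();
        for (Node node : nodes) {
            if (!node.up) {
                continue; // skip nodes known to be down
            }
            if (node.datacenter.equals(localDatacenter)) {
                local.add(node.name);
            } else {
                remote.add(node.name);
            }
        }
        local.addAll(remote);
        return local;
    }
}
```

When every local node is down, the plan contains only remote nodes, which is the fallback described on the slide.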
  • 27. Token Aware Balancing. [Diagram: a client and six nodes, three of them marked as replicas.] Nodes that own a replica of the partition key (PK) being read or written by the query will be contacted first. The partition key is inferred from the Prepared Statement's metadata.
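Token awareness can be seen as a reordering step applied on top of another balancing strategy: given the candidate nodes and the set of replicas for the query's partition key, replicas move to the front. The sketch below shows only that reordering. TokenAwarePlan and queryPlan are hypothetical names, and the replica set is assumed to have been computed elsewhere.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class TokenAwarePlan {
    // Replicas of the partition key first, then the remaining candidates,
    // keeping the order chosen by the underlying strategy within each group.
    public static List<String> queryPlan(List<String> candidates, Set<String> replicas) {
        List<String> plan = new ArrayList<String>();
        for (String node : candidates) {
            if (replicas.contains(node)) {
                plan.add(node);
            }
        }
        for (String node : candidates) {
            if (!replicas.contains(node)) {
                plan.add(node);
            }
        }
        return plan;
    }
}
```

Contacting a replica first lets the coordinator serve the request from its own data instead of forwarding it to another node.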
  • 28. State of DataStax Drivers
      Driver    Cassandra 1.2   Cassandra 2.0   Cassandra 2.1
      Java      1.0 - 2.1       2.0 - 2.1       2.1
      Python    1.0 - 2.1       2.0 - 2.1       2.1
      C#        1.0 - 2.1       2.0 - 2.1       2.1
      Node.js   1.0             1.0             Later
      C++       1.0-beta4       1.0-beta4       Later
      Ruby      1.0-beta3       1.0-beta3       Later
    Later versions of Cassandra can use earlier drivers, but some features won't be supported.
  • 29. DataStax Driver in Practice
    Java:
      <dependency>
        <groupId>com.datastax.cassandra</groupId>
        <artifactId>cassandra-driver-core</artifactId>
        <version>2.1.0</version>
      </dependency>
    Python:
      $ pip install cassandra-driver
    C#:
      PM> Install-Package CassandraCSharpDriver
    Ruby:
      $ gem install cassandra-driver --pre
    Node.js:
      $ npm install cassandra-driver
  • 30. Connect and Write
      Cluster cluster = Cluster.builder()
          .addContactPoints("10.1.2.5", "cassandra_node3")
          .build();
      Session session = cluster.connect("my_keyspace");
      session.execute(
          "INSERT INTO user (user_id, name, email) VALUES (12345, 'johndoe', 'john@doe.com')"
      );
    The rest of the nodes will be discovered by the driver. A keyspace is just like a schema in the SQL world.
  • 31. Read
      ResultSet resultSet = session.execute(
          "SELECT * FROM user WHERE user_id IN (1,8,13)"
      );
      List<Row> rows = resultSet.all();
      for (Row row : rows) {
          int userId = row.getInt("user_id");   // assuming user_id is an int column, as the numeric literals suggest
          String name = row.getString("name");
          String email = row.getString("email");
      }
    Session is a thread-safe object: a singleton should be instantiated at startup. ResultSet also implements Iterable<Row>, so it can be iterated directly.
  • 32. Write with Prepared Statements
      PreparedStatement insertUser = session.prepare(
          "INSERT INTO user (user_id, name, email) VALUES (?, ?, ?)"
      );
      BoundStatement statement = insertUser.bind(12345, "johndoe", "john@doe.com");
      statement.setConsistencyLevel(ConsistencyLevel.QUORUM);
      session.execute(statement);
    PreparedStatement objects are also thread-safe: just create a singleton at startup. Parameters can be named as well. BoundStatement is a stateful, NON thread-safe object. The consistency level can be set for each statement.
  • 33. Asynchronous Read
      ResultSetFuture future = session.executeAsync(
          "SELECT * FROM user WHERE user_id IN (1,2,3)"
      );
      ResultSet resultSet = future.get();
      List<Row> rows = resultSet.all();
      for (Row row : rows) {
          int userId = row.getInt("user_id");   // assuming user_id is an int column
          String name = row.getString("name");
          String email = row.getString("email");
      }
    executeAsync() will not block: it returns immediately, unless all the connections are busy. future.get() will block until the result is available.
  • 34. Asynchronous Read with Callbacks
      ResultSetFuture future = session.executeAsync(
          "SELECT * FROM user WHERE user_id IN (1,2,3)"
      );
      future.addListener(new Runnable() {
          public void run() {
              // Process the results here
          }
      }, executor);
    ResultSetFuture implements Guava's ListenableFuture. Two choices for the executor:
      executor = MoreExecutors.sameThreadExecutor();
    Only if your listener code is trivial and non-blocking, as it will be executed in the I/O thread.
      executor = Executors.newCachedThreadPool();
    …Or any thread pool that you prefer.
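The effect of the executor argument can be tried without a cluster. The sketch below uses the JDK's CompletableFuture as a stand-in for the driver's ResultSetFuture, which is an assumption made purely so the example runs on its own; fakeExecuteAsync and processOn are hypothetical helpers. It shows that the callback runs on a thread of the executor you pass, not on the thread that completed the query.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

public class CallbackSketch {
    // Stand-in for session.executeAsync(...): completes on a background thread.
    static CompletableFuture<String> fakeExecuteAsync(final String query) {
        return CompletableFuture.supplyAsync(() -> "rows for: " + query);
    }

    // Runs the result-processing callback on the given executor and returns
    // the name of the thread that ran it, followed by what it processed.
    public static String processOn(Executor executor, String query) {
        return fakeExecuteAsync(query)
                .thenApplyAsync(
                        rows -> Thread.currentThread().getName() + " processed " + rows,
                        executor)
                .join(); // blocking here only to return the value from the sketch
    }
}
```

Passing a dedicated pool keeps slow or blocking callback code away from the threads that handle network I/O, which is the concern raised on slide 34.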
  • 35. Query Builder
      import static com.datastax.driver.core.querybuilder.QueryBuilder.*;

      Statement selectAll =
          select().all().from("user").where(eq("user_id", userId));
      session.execute(selectAll);

      Statement insert = insertInto("user")
          .value("user_id", 2)
          .value("name", "johndoe")
          .value("email", "john@doe.com");
      session.execute(insert);
    A static import of QueryBuilder is required in order to use the DSL.
  • 36. Python
      import logging
      from cassandra.cluster import Cluster

      log = logging.getLogger(__name__)

      cluster = Cluster(['10.1.1.3', '10.1.1.4', '10.1.1.5'])
      session = cluster.connect('mykeyspace')

      def handle_success(rows):
          user = rows[0]
          try:
              process_user(user.name, user.age, user.id)  # process_user is application code
          except Exception:
              log.error("Failed to process user %s", user.id)
              # don't re-raise errors in the callback

      def handle_error(exception):
          log.error("Failed to fetch user info: %s", exception)

      future = session.execute_async("SELECT * FROM users WHERE user_id=3")
      future.add_callbacks(handle_success, handle_error)
    It's also possible to retrieve the result from the future object synchronously.
  • 37. C#
      var cluster = Cluster.Builder()
          .AddContactPoints("host1", "host2", "host3")
          .Build();
      var session = cluster.Connect("sample_keyspace");
      var task = session.ExecuteAsync(statement);
      task.ContinueWith((t) =>
      {
          var rs = t.Result;
          foreach (var row in rs)
          {
              // Get the values from each row
          }
      }, TaskContinuationOptions.OnlyOnRanToCompletion);
    Asynchronously execute a query using the TPL.
  • 38. C / C++
      CassString query = cass_string_init("SELECT keyspace_name FROM system.schema_keyspaces;");
      CassStatement* statement = cass_statement_new(query, 0);
      CassFuture* result_future = cass_session_execute(session, statement);
      if (cass_future_error_code(result_future) == CASS_OK) {
        const CassResult* result = cass_future_get_result(result_future);
        CassIterator* rows = cass_iterator_from_result(result);
        while (cass_iterator_next(rows)) {
          // Process results
        }
        cass_result_free(result);
        cass_iterator_free(rows);
      }
      cass_future_free(result_future);
    Each structure must be freed with the appropriate function.
  • 39. Node.js
      var assert = require('assert');
      var cassandra = require('cassandra-driver');
      var client = new cassandra.Client({
        contactPoints: ['host1', 'h2'],
        keyspace: 'ks1'
      });
      var query = 'SELECT email, last_name FROM user_profiles WHERE key=?';
      client.execute(query, ['guy'], function(err, result) {
        assert.ifError(err);
        console.log('got user profile with email ' + result.rows[0].email);
      });
    Here we're using a Parameterized Statement, which is not prepared but still allows parameters.
  • 40. Ruby
      require 'cassandra'

      cluster = Cassandra.cluster
      session = cluster.connect('system')
      future = session.execute_async('SELECT * FROM schema_columnfamilies')
      future.on_success do |rows|
        rows.each do |row|
          puts "The keyspace #{row['keyspace_name']} has a table called #{row['columnfamily_name']}"
        end
      end
      future.join
    on_success registers a listener on the future, which will be called when results are available.
  • 41. Object Mapper
      • Avoid boilerplate for common use cases
      • Map Objects to Statements and ResultSets to Objects
      • Do NOT hide Cassandra from the developer
      • No “clever tricks” à la Hibernate
      • Not JPA compatible, but JPA-ish API
  • 42. Object Mapper in Practice
      <dependency>
        <groupId>com.datastax.cassandra</groupId>
        <artifactId>cassandra-driver-mapping</artifactId>
        <version>2.1.0</version>
      </dependency>
    Additional artifact for object mapping. Available from Driver 2.1.0.
  • 43. Basic Object Mapping
      CREATE TYPE address (
        street text,
        city text,
        zip int
      );
      CREATE TABLE users (
        email text PRIMARY KEY,
        address frozen<address>
      );

      @UDT(keyspace = "ks", name = "address")
      public class Address {
          private String street;
          private String city;
          private int zip;
          // getters and setters omitted...
      }

      @Table(keyspace = "ks", name = "users")
      public class User {
          @PartitionKey
          private String email;
          @Frozen
          private Address address;
          // getters and setters omitted...
      }
  • 44. Basic Object Mapping
      MappingManager manager = new MappingManager(session);
      Mapper<User> mapper = manager.mapper(User.class);
      User myProfile = mapper.get("xyz@example.com");
      ListenableFuture<Void> saveFuture = mapper.saveAsync(anotherProfile);
      mapper.delete("xyz@example.com");
    Mapper, just like Session, is a thread-safe object: create a singleton at startup. get() returns a mapped row for the given Primary Key. saveAsync() returns a ListenableFuture from Guava, completed when the write is acknowledged.
  • 45. Accessors
      @Accessor
      interface UserAccessor {
          @Query("SELECT * FROM user_profiles LIMIT :max")
          Result<User> firstN(@Param("max") int limit);
      }

      UserAccessor accessor = manager.createAccessor(UserAccessor.class);
      Result<User> users = accessor.firstN(10);
      for (User user : users) {
          System.out.println(user.getAddress().getZip());
      }
    Result is like ResultSet but specialized for a mapped class, so we iterate over it just like we would with a ResultSet.
  • 46. We’re Hiring! Cassandra Tech Day, Paris, November 4th. Cassandra Summit Europe, London, December 3-4th. @mfiguiere