3. GFT Group 03.09.2015 3
Who I am and What I do
INTRODUCTION
GFT Group is a business change and
technology consultancy trusted by
the world’s leading financial services
institutions bringing together advisory,
creative and technology capabilities
with innovation culture and specialist
knowledge of the finance sector.
Bruno Tinoco
Father and Java Developer with more
than 15 years of software development
experience using the JavaEE platform for
different companies from Financial to
Travel and Distribution industries.
Currently working as a Software Engineer
at GFT Group on Deutsche Bank projects.
Previously I worked as an IT Architect on IBM
GBS projects.
Follow us
https://twitter.com/gft_br
https://twitter.com/brunocrt
What Cassandra is all about
INTRODUCTION
Distributed Database (CAP)
Fault tolerant cluster
No nominated master (every node is a peer)
Native JSON
Distributed hash table (Partition Keys)
Replication factor
Tunable consistency (ONE, QUORUM, ALL)
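To make the link between partition keys, the hash ring and the replication factor more concrete, here is a minimal sketch in plain Java. It is not how Cassandra itself is implemented: a real cluster uses the Murmur3 partitioner and virtual nodes, while this toy ring uses CRC32 (an assumption, chosen only to stay within the standard library) and one token per node. The class name ToyTokenRing and the node names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Toy token ring: NOT Cassandra's partitioner, only an illustration of the idea. */
final class ToyTokenRing {
    // token -> node name, kept sorted so the ring can be walked clockwise
    private final TreeMap<Long, String> ring = new TreeMap<>();

    /** Hash stand-in (assumption): CRC32 instead of Cassandra's Murmur3. */
    static long token(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    void addNode(String nodeName) {
        ring.put(token(nodeName), nodeName);
    }

    /** Nodes holding a partition: the owner plus the next (rf - 1) nodes clockwise. */
    List<String> replicasFor(String partitionKey, int replicationFactor) {
        List<String> replicas = new ArrayList<>();
        if (ring.isEmpty()) {
            return replicas;
        }
        int wanted = Math.min(replicationFactor, ring.size());
        // First node whose token is >= the key's token; wrap to the start if none.
        Long current = ring.ceilingKey(token(partitionKey));
        if (current == null) {
            current = ring.firstKey();
        }
        while (replicas.size() < wanted) {
            replicas.add(ring.get(current));
            current = ring.higherKey(current);
            if (current == null) {
                current = ring.firstKey();
            }
        }
        return replicas;
    }

    public static void main(String[] args) {
        ToyTokenRing ring = new ToyTokenRing();
        for (String node : new String[] {"node-a", "node-b", "node-c", "node-d"}) {
            ring.addNode(node);
        }
        // With replication factor 3, the partition lives on three distinct nodes.
        System.out.println(ring.replicasFor("user:42", 3));
    }
}
```

The same partition key always lands on the same replicas, which is why a badly chosen partition key can concentrate load on a few nodes.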
Relational concept -> Cassandra concept
Schema -> Keyspace
Table -> Column Family (Table)
Primary key -> ~Row Key (Partition key + Clustering column)
SQL -> CQL
No JOINs
No foreign keys
No rollback/locking
Eventual/tunable consistency
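The arithmetic behind tunable consistency can be sketched in a few lines of Java. A QUORUM is floor(RF / 2) + 1 replicas, and a read is guaranteed to overlap the latest write when the replicas read plus the replicas written exceed the replication factor. The class and method names below are illustrative only and are not part of any driver API.

```java
/** Sketch of the arithmetic behind tunable consistency (names are illustrative). */
final class ConsistencyMath {
    /** Replicas that must respond at QUORUM: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True when every read overlaps the latest write: R + W > RF. */
    static boolean readsSeeLatestWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println("QUORUM replicas for RF=3: " + q);
        System.out.println("QUORUM read + QUORUM write overlap: " + readsSeeLatestWrite(q, q, rf));
        System.out.println("ONE read + ONE write overlap: " + readsSeeLatestWrite(1, 1, rf));
    }
}
```

Reading and writing at ONE is the fastest combination but only eventually consistent; QUORUM on both sides trades some latency for overlapping replicas.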
Setup your project
HANDS-ON EXPERIENCE
Define your persistence model
Kundera Cassandra Driver (JPA)
<dependency>
  <groupId>com.impetus.kundera.client</groupId>
  <artifactId>kundera-cassandra</artifactId>
  <version>3.4</version>
</dependency>
You can start mapping your entity classes using
standard JPA annotations…
DataStax Driver * (Native Queries, CQL)
<dependency>
  <groupId>com.datastax.cassandra</groupId>
  <artifactId>cassandra-driver-core</artifactId>
  <version>3.0.0</version>
</dependency>
You must learn the driver API and
create the schema yourself first
* Also supports object mapping through
cassandra-driver-mapping
Others
Hector
Pelops
Astyanax
Setup your tools
HANDS-ON EXPERIENCE
CQLSH
DataStax DevCenter
DataStax OpsCenter (Monitoring)
How about data modelling
HANDS-ON EXPERIENCE
Think about sorted Maps
Data Types (List, JSON)
N-to-N Relationships
SortedMap<RowKey, SortedMap<ColumnKey, ColumnValue>>
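The nested sorted map can be tried directly in Java. The sketch below is only an in-memory picture of the idea, not driver code: the outer key plays the role of the row (partition) key and the inner sorted map holds the columns, so a range of columns inside one row comes back already ordered. The class name SortedMapTable and the sensor data are hypothetical.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;

/** In-memory picture of a table: row key -> sorted columns. Illustration only. */
final class SortedMapTable {
    private final SortedMap<String, SortedMap<String, String>> rows = new TreeMap<>();

    void put(String rowKey, String columnKey, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(columnKey, value);
    }

    /** Range scan inside one row: columns in [fromColumn, toColumn), in sorted order. */
    SortedMap<String, String> slice(String rowKey, String fromColumn, String toColumn) {
        SortedMap<String, String> columns = rows.get(rowKey);
        if (columns == null) {
            return Collections.emptySortedMap();
        }
        return columns.subMap(fromColumn, toColumn);
    }

    public static void main(String[] args) {
        SortedMapTable readings = new SortedMapTable();
        // Inserted out of order on purpose; the inner map keeps columns sorted.
        readings.put("sensor-1", "2015-09-03", "21.5");
        readings.put("sensor-1", "2015-09-01", "19.0");
        readings.put("sensor-1", "2015-09-02", "20.1");
        System.out.println(readings.slice("sensor-1", "2015-09-01", "2015-09-03"));
    }
}
```

Lookups by row key and ordered slices within a row are cheap in this shape; anything that needs to scan across rows is not, which is the main constraint to design around.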
Keep in mind
HANDS-ON EXPERIENCE
Start from your use case (UI needs)
Do not try to normalize data (hardest part)
Define your partition key carefully (performance)
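One way to picture these three rules together is a sketch where each query gets its own structure and every write updates all of them. The example below uses plain Java maps as stand-ins for two denormalized tables; the class DenormalizedUsers, its fields and the two queries are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Query-first modelling sketch: one "table" per query. Names are hypothetical. */
final class DenormalizedUsers {
    // Query 1: "find a user by e-mail" -> partition key = email
    private final Map<String, String> nameByEmail = new HashMap<>();
    // Query 2: "list users in a country" -> partition key = country
    private final Map<String, List<String>> emailsByCountry = new HashMap<>();

    /** Every write updates all query tables; duplication replaces the JOIN. */
    void register(String email, String name, String country) {
        nameByEmail.put(email, name);
        emailsByCountry.computeIfAbsent(country, c -> new ArrayList<>()).add(email);
    }

    /** Single lookup by partition key, no scan. */
    String findName(String email) {
        return nameByEmail.get(email);
    }

    /** Single lookup by partition key, no scan and no join. */
    List<String> listByCountry(String country) {
        return emailsByCountry.getOrDefault(country, Collections.emptyList());
    }

    public static void main(String[] args) {
        DenormalizedUsers users = new DenormalizedUsers();
        users.register("ana@example.com", "Ana", "BR");
        users.register("joao@example.com", "Joao", "BR");
        users.register("karl@example.com", "Karl", "DE");
        System.out.println(users.listByCountry("BR"));
    }
}
```

The cost of this design is paid at write time (the same data is stored twice), in exchange for each read being a single partition lookup.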
More Information...
REFERENCES
Cassandra Community (Forum, Tutorials)
http://www.planetcassandra.org/
Apache Cassandra Home (Download)
https://cassandra.apache.org/
Netflix Cassandra benchmark
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
DataStax Documentation (Tutorials, Guide, Sample code)
https://docs.datastax.com/
CQLSH
TOOLS
Python script that comes with the default Cassandra installation
Interactive mode (shell)
$CASSANDRA_HOME/bin/cqlsh localhost -u user -p user123
cqlsh>
Execute mode
$CASSANDRA_HOME/bin/cqlsh localhost -u user -p user123 -e "SELECT * FROM demo.users"
File mode
$CASSANDRA_HOME/bin/cqlsh localhost -u user -p user123 -f my_cql_commands.cql
DataStax DevCenter
Free GUI tool for connecting to Cassandra clusters
TOOLS
DataStax Driver
Supports the native binary (CQL) protocol; Thrift is not supported
Connection Pool
Supports annotations (Tables/Indexes/Types) through cassandra-driver-mapping
Supports Native Cassandra concepts
JAVA DRIVERS
Kundera Driver
JPA Compliant
Annotation based
Auto schema creation
Connection Pool
Supports other NoSQL databases (e.g. MongoDB)
JAVA DRIVERS
Other Drivers
Hector
The most stable of the Java APIs, ready for prime-time.
Astyanax
A clean Java API from Netflix. It isn't as widely used as Hector, but it is solid.
Pelops
PlayORM (ORM without the constraints?)
It looks like it is trying to solve the impedance mismatch between traditional JPA-based ORMs
and NoSQL by introducing JQL. It looks promising.
Decision Considerations
Low latency overhead, an async API, and reliability/stability in production environments.
(A more user-friendly API can always be provided by the DAL that wraps the client.)
Connection pooling and partition awareness are other good features to have.
Ability to detect newly added nodes.
Good support as well.
JAVA DRIVERS