Cassandra Data Access in Java
eBuddy use of Cassandra
XMS
Cassandra in eBuddy Messaging Platform
● User Data Service
● User Discovery Service
● Persistent Session Store
● Message History
● Location-based Discovery
Some Statistics
● Current size of data
● 1.4 TiB total (replication of 3x); 467 GiB actual data
● 12 million sessions (11 million users plus groups)
● Almost a billion rows in one column family (inverse social graph)
Data Access - Overview
Design Objectives
● Data Source Agnostic
● Testable
● Thread Safe
● Strong Typing
● Supports “transactions”, i.e. units of work in batch
● Efficient Mapping to Application Domain Model
● Follows Familiar Patterns (e.g. Spring JDBC Template)
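The "units of work in batch" objective combined with a template-style API could look roughly like the sketch below. It is not eBuddy's code: `BatchContext`, `BatchTemplate`, and the string-based mutations are hypothetical names chosen for illustration, and the `sink` stands in for whatever data source actually receives the writes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical batch context: collects writes so they can be sent together. */
final class BatchContext {
    private final List<String> mutations = new ArrayList<>();

    void addMutation(String mutation) {
        mutations.add(mutation);
    }

    List<String> mutations() {
        return Collections.unmodifiableList(mutations);
    }
}

/** Hypothetical template: runs a unit of work, then flushes all queued writes at once. */
final class BatchTemplate {
    // Stands in for the real data source (assumption for this sketch).
    private final Consumer<List<String>> sink;

    BatchTemplate(Consumer<List<String>> sink) {
        this.sink = sink;
    }

    void executeInBatch(Consumer<BatchContext> unitOfWork) {
        BatchContext context = new BatchContext();
        unitOfWork.accept(context);       // nothing is sent while the work runs
        sink.accept(context.mutations()); // one flush for the whole unit of work
    }
}
```

If the unit of work throws, the flush is never reached, so no partial batch is sent. That is grouping, not a true ACID transaction, which is presumably why the slide puts "transactions" in quotes.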
Data Access in Layers
“Operations” Layer
Writing
● Use Generic Typing
● Has Interface (for testability, etc.)
● Handles Exceptions
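A minimal sketch of these three points, under stated assumptions: the interface name `ColumnFamilyOperations`, the exception `DataAccessException`, and the in-memory implementation are all hypothetical, not the deck's actual classes. The generic parameters give strong typing for row key, column name, and value; the interface lets tests swap in a fake; and storage failures are rethrown as one unchecked exception type.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Unchecked exception that hides the storage library's own exception types (hypothetical). */
class DataAccessException extends RuntimeException {
    DataAccessException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical write contract, generic in row key (K), column name (N) and value (V). */
interface ColumnFamilyOperations<K, N, V> {
    void write(K rowKey, N columnName, V value);

    void delete(K rowKey, N columnName);
}

/** In-memory stand-in that a unit test could use in place of a Cassandra-backed class. */
final class InMemoryOperations<K, N, V> implements ColumnFamilyOperations<K, N, V> {
    private final Map<K, Map<N, V>> rows = new ConcurrentHashMap<>();

    @Override
    public void write(K rowKey, N columnName, V value) {
        try {
            rows.computeIfAbsent(rowKey, k -> new ConcurrentHashMap<>()).put(columnName, value);
        } catch (RuntimeException e) {
            // Translate whatever the store throws into the data access layer's own type.
            throw new DataAccessException("write failed for row " + rowKey, e);
        }
    }

    @Override
    public void delete(K rowKey, N columnName) {
        try {
            Map<N, V> columns = rows.get(rowKey);
            if (columns != null) {
                columns.remove(columnName);
            }
        } catch (RuntimeException e) {
            throw new DataAccessException("delete failed for row " + rowKey, e);
        }
    }

    /** Read helper so tests can inspect what was written. */
    V read(K rowKey, N columnName) {
        Map<N, V> columns = rows.get(rowKey);
        return columns == null ? null : columns.get(columnName);
    }
}
```

Here `ConcurrentHashMap` rejects null keys and values, which gives the sketch a real failure to translate; a Cassandra-backed implementation would wrap its client library's exceptions in the same way.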
Reading
● Use Mappers
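The mapper idea resembles Spring's `RowMapper`: the read operation fetches raw rows and delegates the conversion of each row to a small callback. The sketch below assumes that shape; `RowMapper`, `UserProfile`, `UserProfileMapper`, and `RowReader` are hypothetical names, and a plain `Map` stands in for the rows returned by Cassandra.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical mapper contract: turns one raw row into one domain object. */
interface RowMapper<T, K, N, V> {
    T mapRow(K rowKey, Map<N, V> columns);
}

/** Example domain class; the fields are invented for illustration. */
final class UserProfile {
    final String userId;
    final String displayName;

    UserProfile(String userId, String displayName) {
        this.userId = userId;
        this.displayName = displayName;
    }
}

/** Maps a row keyed by user id, with a "name" column, to a UserProfile. */
final class UserProfileMapper implements RowMapper<UserProfile, String, String, String> {
    @Override
    public UserProfile mapRow(String rowKey, Map<String, String> columns) {
        return new UserProfile(rowKey, columns.get("name"));
    }
}

/** Stand-in for the read side of the operations layer. */
final class RowReader {
    private RowReader() {
    }

    static <T, K, N, V> List<T> mapAll(Map<K, Map<N, V>> rows, RowMapper<T, K, N, V> mapper) {
        List<T> result = new ArrayList<>();
        for (Map.Entry<K, Map<N, V>> row : rows.entrySet()) {
            result.add(mapper.mapRow(row.getKey(), row.getValue()));
        }
        return result;
    }
}
```

Keeping the mapper separate from the query means the same read operation can return different domain types depending on which mapper the caller supplies.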
Serializers
● Constructed with serializers that convert to types needed by data access layer
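Cassandra stores keys, column names, and values as bytes, so a serializer is essentially a two-way conversion between a Java type and a `ByteBuffer`. The following is a sketch of that idea only; `ValueSerializer`, `Utf8Serializer`, and `LongSerializer` are hypothetical names defined here and are not the Hector library's classes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical serializer contract: Java type to bytes and back. */
interface ValueSerializer<T> {
    ByteBuffer toBytes(T value);

    T fromBytes(ByteBuffer bytes);
}

/** Encodes strings as UTF-8. */
final class Utf8Serializer implements ValueSerializer<String> {
    @Override
    public ByteBuffer toBytes(String value) {
        return ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public String fromBytes(ByteBuffer bytes) {
        // duplicate() so decoding does not move the caller's buffer position
        return StandardCharsets.UTF_8.decode(bytes.duplicate()).toString();
    }
}

/** Encodes longs as 8 big-endian bytes (ByteBuffer's default byte order). */
final class LongSerializer implements ValueSerializer<Long> {
    @Override
    public ByteBuffer toBytes(Long value) {
        ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES);
        buffer.putLong(value);
        buffer.flip(); // switch from writing to reading
        return buffer;
    }

    @Override
    public Long fromBytes(ByteBuffer bytes) {
        return bytes.duplicate().getLong();
    }
}
```

An operations object constructed with, say, a `Utf8Serializer` for keys and a `LongSerializer` for values would then expose a `String`/`Long` API to the data access layer while handling bytes internally.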
Reading
Data Access Layer
Data Access Object
● Data Access Object (DAO) is singleton
● Transforms from data model to domain model
● Operations object configured with serializers to convert from data model to domain model
● Defines the mappers for read operations
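Put together, a DAO along these lines might look like the sketch below. Everything in it is an assumption made for illustration: the `Session` domain class, the `device` column, the static-instance form of singleton (the deck may equally mean a container-managed singleton), and the in-memory map that stands in for a column family.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Domain model class; fields invented for illustration. */
final class Session {
    final String userId;
    final String device;

    Session(String userId, String device) {
        this.userId = userId;
        this.device = device;
    }
}

/** Hypothetical DAO: one shared, thread-safe instance that hides the data model. */
final class SessionDao {
    private static final SessionDao INSTANCE = new SessionDao();
    private static final String DEVICE_COLUMN = "device";

    // Stand-in for a column family: row key -> (column name -> raw bytes).
    private final Map<String, Map<String, ByteBuffer>> columnFamily = new ConcurrentHashMap<>();

    private SessionDao() {
    }

    static SessionDao getInstance() {
        return INSTANCE;
    }

    /** Domain model to data model: one row per user, one column per property. */
    void save(Session session) {
        columnFamily.computeIfAbsent(session.userId, k -> new ConcurrentHashMap<>())
                .put(DEVICE_COLUMN, encode(session.device));
    }

    /** Data model to domain model: the "mapper" for this read operation. */
    Optional<Session> findByUserId(String userId) {
        Map<String, ByteBuffer> columns = columnFamily.get(userId);
        if (columns == null || !columns.containsKey(DEVICE_COLUMN)) {
            return Optional.empty();
        }
        return Optional.of(new Session(userId, decode(columns.get(DEVICE_COLUMN))));
    }

    private static ByteBuffer encode(String value) {
        return ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
    }

    private static String decode(ByteBuffer bytes) {
        return StandardCharsets.UTF_8.decode(bytes.duplicate()).toString();
    }
}
```

Callers only ever see `Session` objects; row keys, column names, and byte encoding stay inside the DAO.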
Next Steps
CQL3
DataStax:
"We believe that CQL3 is a simpler and overall better API for Cassandra than the thrift API is. Therefore, new projects/applications are encouraged to use CQL3"
At eBuddy, we are still using the Thrift API and the Java Hector library. We are currently looking at CQL3 and whether we want to use it going forward and whether we will "upgrade" existing code.
Structured Data
● Object Mapping Frameworks
● Mapped vs. Embedded Objects
● Nested Properties ("path" access)
Object Mapping Frameworks
● Simple mapper frameworks with (some) JPA support
● Hector Object Mapper
● Kundera
● Firebrand (not JPA)
● has most features, e.g. supports both embedded and mapped object graphs
https://github.com/impetus-opensource/Kundera
http://github.com/hector-client/hector
http://firebrandocm.org
Hierarchical Properties
● Use DynamicComposites to model keys that have a variable number of components
put("accounts|msn|x.y.z|sign_in", "0");
put("accounts|msn|x.y.z|key", "value");
get("accounts") --> retrieved as a map:
{"accounts":
  {"msn":
    {"x.y.z":
      {"sign_in": "0",
       "key": "value"}}}}
● Use a slice query to retrieve properties using partial path:
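The original slide's example is not in the transcript. As a rough illustration of the idea, the sketch below simulates it without Cassandra: a `TreeMap` stands in for one row whose column names are kept sorted, `|`-joined strings stand in for composite column names, and a range lookup plays the role of the slice query. The class `HierarchicalProperties` and its methods are hypothetical, and real composite comparators compare component by component rather than as plain strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

final class HierarchicalProperties {
    private static final char SEPARATOR = '|';

    // Sorted map stands in for one row whose column names are kept in sorted order.
    private final NavigableMap<String, String> columns = new TreeMap<>();

    void put(String path, String value) {
        columns.put(path, value);
    }

    /** Returns every property below the given partial path as a nested map. */
    Map<String, Object> get(String partialPath) {
        // Slice bounds: all names starting with "<partialPath>|".
        // '}' is the character right after '|', so it makes a tight exclusive upper bound.
        String from = partialPath + SEPARATOR;
        String to = partialPath + (char) (SEPARATOR + 1);
        SortedMap<String, String> slice = columns.subMap(from, to);

        Map<String, Object> root = new LinkedHashMap<>();
        for (Map.Entry<String, String> column : slice.entrySet()) {
            insert(root, column.getKey().split("\\|"), column.getValue());
        }
        return root;
    }

    // Assumes a path is either a leaf or a branch, never both.
    @SuppressWarnings("unchecked")
    private static void insert(Map<String, Object> root, String[] parts, String value) {
        Map<String, Object> node = root;
        for (int i = 0; i < parts.length - 1; i++) {
            node = (Map<String, Object>) node.computeIfAbsent(
                    parts[i], k -> new LinkedHashMap<String, Object>());
        }
        node.put(parts[parts.length - 1], value);
    }
}
```

With the two `put` calls from the slide, `get("accounts|msn")` returns only the `msn` subtree, nested under its full path, while properties of other accounts in the same row are skipped by the range bounds rather than filtered afterwards.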
Questions?