Sergey Titov
Software Architect
@sergtitov
Agenda
• Cassandra Architecture
• CAP theorem and Consistency
• Scalability
• Astyanax client
• Data Modeling
• Queries
• DataStax OpsCenter
• Resources
Cassandra architecture
• Ring
• P2P
• Gossip
• Key hash-based sharding
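Key hash-based sharding on a ring can be pictured with a small sketch. The class and method names below are hypothetical, and CRC32 is only a stand-in for the cluster's real partitioner hash, chosen because it ships with the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Toy token ring: each node owns the hashes up to and including its token. */
class TokenRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    /** Registers a node at the given token position (hypothetical API). */
    void addNode(long token, String node) {
        ring.put(token, node);
    }

    /** CRC32 is an assumption standing in for the real partitioner hash. */
    static long hash(String rowKey) {
        CRC32 crc = new CRC32();
        crc.update(rowKey.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    /** Walks clockwise to the first node at or after the token, wrapping to the start. */
    String ownerOfToken(long token) {
        Map.Entry<Long, String> entry = ring.ceilingEntry(token);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    /** Hashes the row key and looks up the owning node. */
    String ownerOf(String rowKey) {
        return ownerOfToken(hash(rowKey));
    }
}
```

The sketch assumes at least one node has been added; a real cluster also places replicas on further nodes along the ring, which is omitted here.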
CAP Theorem
Consistency in Cassandra
• ACID - Atomicity Consistency Isolation Durability
• BASE - Basically Available Soft-state Eventual consistency
• Isolation on the row level
• Atomic batches starting with Cassandra 1.2
• Consistency level for READs and WRITEs set for every request
• Tunable consistency
• Log: CL_WRITE = ANY or ONE
• Strong: CL_READ + CL_WRITE > REPLICATION_FACTOR
• Recommended default: LOCAL_QUORUM
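The "Strong" rule above is plain replica-count arithmetic. The following minimal sketch uses hypothetical helper names and counts replicas rather than calling any Cassandra API.

```java
/** Replica-count arithmetic behind tunable consistency (illustrative helpers). */
class ConsistencyMath {
    /** A quorum is a strict majority of the replicas. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True when every read replica set must overlap every write replica set. */
    static boolean isStrong(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }
}
```

With a replication factor of 3, quorum reads plus quorum writes give 2 + 2 > 3, so the rule holds; reading and writing at ONE gives 1 + 1, which does not.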
Consistency in Cassandra - continued
Level: Description
• ANY: A write must be written to at least one node. If all replica nodes for the given row key are down, the write can still succeed once a hinted handoff has been written. Note that if all replica nodes are down at write time, an ANY write will not be readable until the replica nodes for that row key have recovered.
• ONE: A write must be written to the commit log and memory table of at least one replica node.
• QUORUM: A write must be written to the commit log and memory table on a quorum of replica nodes.
• LOCAL_QUORUM: A write must be written to the commit log and memory table on a quorum of replica nodes in the same data center as the coordinator node. Avoids latency of inter-data center communication.
• EACH_QUORUM: A write must be written to the commit log and memory table on a quorum of replica nodes in all data centers.
• ALL: A write must be written to the commit log and memory table on all replica nodes in the cluster for that row key.
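The level descriptions above translate into how many replica acknowledgements a write waits for. The sketch below is an illustrative model, not driver code: the enum, its method and the per-data-center map are all assumptions made for the example.

```java
import java.util.Map;

/** Replica acknowledgements a write waits for at each level (illustrative model). */
enum WriteLevel {
    ANY, ONE, QUORUM, LOCAL_QUORUM, EACH_QUORUM, ALL;

    private static int quorum(int replicas) {
        return replicas / 2 + 1;
    }

    /**
     * @param rfByDc  replication factor per data center (hypothetical input shape)
     * @param localDc data center of the coordinator node; must be a key of rfByDc
     */
    int requiredReplicaAcks(Map<String, Integer> rfByDc, String localDc) {
        int total = rfByDc.values().stream().mapToInt(Integer::intValue).sum();
        switch (this) {
            case ANY:          return 0; // a stored hint is enough, no replica has to answer
            case ONE:          return 1;
            case QUORUM:       return quorum(total);
            case LOCAL_QUORUM: return quorum(rfByDc.get(localDc));
            case EACH_QUORUM:  return rfByDc.values().stream().mapToInt(WriteLevel::quorum).sum();
            default:           return total; // ALL
        }
    }
}
```

For two data centers with three replicas each, this yields 4 acknowledgements for QUORUM, 2 for LOCAL_QUORUM, 4 for EACH_QUORUM (two per data center) and 6 for ALL.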
Write Data Flow
Multiple Data Centers
Scalability
Astyanax client
• Based on Hector
• High-level, simple object-oriented interface to Cassandra.
• Fail-over behavior on the client side.
• Connection pool abstraction (round robin connection pool)
• Monitoring to get event notification from the connection pool.
• Complete encapsulation of the underlying Thrift API.
• Automatic retry of downed hosts.
• Automatic discovery of additional hosts in the cluster.
• Suspension of hosts for a short period of time after timeouts.
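Two of the behaviors listed above, the round robin connection pool and the short suspension of hosts after timeouts, can be sketched together. This is not the Astyanax API; the class and its methods are hypothetical, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Round-robin host selection that skips hosts suspended after a timeout. */
class RoundRobinHosts {
    private final List<String> hosts;
    private final long suspendMillis;
    private final Map<String, Long> suspendedUntil = new HashMap<>();
    private int cursor;

    RoundRobinHosts(List<String> hosts, long suspendMillis) {
        this.hosts = hosts;
        this.suspendMillis = suspendMillis;
    }

    /** Called by the client after a request to this host timed out. */
    synchronized void reportTimeout(String host, long nowMillis) {
        suspendedUntil.put(host, nowMillis + suspendMillis);
    }

    /** Returns the next usable host, or null if every host is currently suspended. */
    synchronized String next(long nowMillis) {
        for (int i = 0; i < hosts.size(); i++) {
            String host = hosts.get(Math.floorMod(cursor++, hosts.size()));
            if (suspendedUntil.getOrDefault(host, 0L) <= nowMillis) {
                return host;
            }
        }
        return null;
    }
}
```

A suspended host rejoins the rotation on its own once the suspension window has passed, which matches the "short period of time" wording above without needing a separate recovery step.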
Astyanax – token aware client
Data Modeling in Cassandra
• Column Families are NOT tables!
• Map<RowKey, SortedMap<ColumnKey, ColumnValue>>
• Values could be and often are stored in column names
• Number of columns could be different for different rows
• There can be up to 2 billion columns in one row!
• Use UUIDs
• Separate read-heavy from write-heavy data
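The Map<RowKey, SortedMap<ColumnKey, ColumnValue>> picture above can be tried out directly in memory. The sketch below is only a mental model with hypothetical names; it uses String for keys and values and says nothing about how Cassandra stores data on disk.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** In-memory picture of a column family: row key -> columns sorted by name. */
class ColumnFamilySketch {
    private final Map<String, SortedMap<String, String>> rows = new HashMap<>();

    /** Adds or overwrites one column in one row; rows are created on first write. */
    void put(String rowKey, String columnName, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(columnName, value);
    }

    /** Columns of one row in column-name order; empty if the row does not exist. */
    SortedMap<String, String> row(String rowKey) {
        return rows.getOrDefault(rowKey, new TreeMap<>());
    }
}
```

Writing login times as column names with empty values, for example put("logins:tom", "2014-03-01T10:00", ""), shows how a value can live in the column name and how rows end up with different numbers of columns.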
Data Modeling in Cassandra - continued
• Client joins
• Denormalize data
• Wide rows
• Materialized views
• Model around queries
• Row key is “shard” key
Modeling nested entities and documents
Motivation
• Parent-child decomposition performs poorly in Cassandra.
• No JOIN operator in CQL!
• The only solution is to store a tree-like structure with nested “children”
• Cassandra doesn’t have built-in support for a document object
Solution
• Column Families are NOT tables
• Domain object fields are traversed along with the nested entities
• Collection and Map fields (at any nesting depth) are unwrapped
into plain key-value pairs (mapped to Cassandra column name – value)
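Modeling nested entities and documents. Example
import java.util.List;
import java.util.Map;
import java.util.UUID;

// @Id, @Column and @NestedCollection are the mapping annotations shown on the slide;
// the library that defines them is not named there.
class Parent {
    @Id
    private UUID id;
    @Column
    private String stringField1;
    @NestedCollection                       // unwrapped into "imageMap:<key>" columns
    private Map<String, byte[]> imageMap;
    @NestedCollection                       // unwrapped into "children:<index>:<field>" columns
    private List<Child> children;
}

class Child {
    @Column
    private Integer kidsNumber;
}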
Modeling nested entities and documents. Example
Let’s use JSON notation. If Parent is
{
  "id": "edc39a6c-355f-4ad0-a4de-b2103dbd610d",
  "stringField1": "value1",
  "imageMap": {
    "name1": "SW1hZ2VEYXRhMQ==",
    "name2": "SW1hZ2VEYXRhMg=="
  },
  "children": [
    { "kidsNumber": 1 },
    { "kidsNumber": 2 }
  ]
}
the corresponding Cassandra columns will be:
• "id" -> "edc39a6c-355f-4ad0-a4de-b2103dbd610d"
• "stringField1" -> "value1"
• "imageMap:name1" -> "SW1…MQ=="
• "imageMap:name2" -> "SW1…Mg=="
• "children:0:kidsNumber" -> 1
• "children:1:kidsNumber" -> 2
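The unwrapping shown in the example above can be sketched as a recursive walk over maps and lists. The class below is a hypothetical illustration, not the mapper used in the talk, and it assumes that keys never contain the ":" separator themselves.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Flattens nested maps and lists into "a:b:c" column names. */
class NestedFlattener {
    /** Returns one entry per leaf value, keyed by its path from the root. */
    static Map<String, Object> flatten(Map<String, ?> document) {
        Map<String, Object> columns = new TreeMap<>();
        walk("", document, columns);
        return columns;
    }

    private static void walk(String prefix, Object node, Map<String, Object> out) {
        if (node instanceof Map) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                walk(join(prefix, String.valueOf(e.getKey())), e.getValue(), out);
            }
        } else if (node instanceof List) {
            List<?> items = (List<?>) node;
            for (int i = 0; i < items.size(); i++) {
                walk(join(prefix, Integer.toString(i)), items.get(i), out); // list index becomes a path part
            }
        } else {
            out.put(prefix, node); // leaf: becomes one column name -> value pair
        }
    }

    private static String join(String prefix, String part) {
        return prefix.isEmpty() ? part : prefix + ":" + part;
    }
}
```

Because the whole object ends up as columns of a single row, it can be read back with one row lookup instead of a client-side join.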
Range queries in Cassandra
Motivation
• No CQL equivalent for the SQL clause:
WHERE "field_name" >= value1 AND "field_name" <= value2
• For indexed fields the only possible query is
WHERE "field_name" [<,>,<=,>=,=] "value", and "field_name" can be
specified only once in a CQL query
Solution
• Every Cassandra column name is a byte buffer ~ byte[] columnName
• Column names (unlike column values)
can be filtered by a specified range,
i.e. if two boundary values
• byte[] lowMargin,
• byte[] highMargin
are defined, it is possible to select the columns with
WHERE columnName >= lowMargin AND columnName <= highMargin
• Since about 2 billion columns can be stored under one row key,
this allows fast search within lists of fewer than 2 * 10^9 entries
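The column-name range filter described above behaves like a range view over a sorted map. In this minimal sketch the row is modelled as a NavigableMap and, as a simplifying assumption, column names are Strings compared lexicographically instead of raw byte buffers.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

/** Column-name range slice over one row, modelled with a sorted map. */
class ColumnSlice {
    /** Selects columns with lowMargin <= columnName <= highMargin (both bounds inclusive). */
    static NavigableMap<String, String> slice(NavigableMap<String, String> row,
                                              String lowMargin, String highMargin) {
        return row.subMap(lowMargin, true, highMargin, true);
    }

    /** Convenience factory for an empty row (hypothetical helper). */
    static NavigableMap<String, String> newRow() {
        return new TreeMap<>();
    }
}
```

The lookup cost depends on the sorted order rather than on scanning every column, which is why storing the searchable value in the column name makes range queries practical.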
Composite Column Families
Motivation
• Raw untyped column names are inconvenient to process.
• If two or more components of a column name are serialized
into the same byte buffer, it is hard to build a fast search on a single part.
For instance, let’s introduce a column name consisting of two components:
• person_name: String
• time_stamp: Date
How do we build a column range that returns all previously persisted
combinations with person_name = "Tom" and time_stamp >= "1999-01-01" and
time_stamp <= "2012-01-01"?
Solution
Cassandra has a built-in CompositeType comparator, which can be defined for
any number of components and sorts columns by component 0 first, then component 1, …
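The component-by-component ordering can be imitated with an ordinary Java comparator. The sketch below models only the sort order and the resulting range query for the person_name / time_stamp example; the class is hypothetical and does not reproduce Cassandra's byte-level composite encoding.

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.NavigableSet;
import java.util.TreeSet;

/** Two-component column name: (person_name, time_stamp). */
final class NameAndDate {
    final String personName;
    final LocalDate timeStamp;

    NameAndDate(String personName, LocalDate timeStamp) {
        this.personName = personName;
        this.timeStamp = timeStamp;
    }

    /** Orders by component 0 first, then component 1, mirroring a composite comparator. */
    static final Comparator<NameAndDate> ORDER =
            Comparator.comparing((NameAndDate c) -> c.personName)
                      .thenComparing(c -> c.timeStamp);

    /** Creates an empty, correctly ordered column set (hypothetical helper). */
    static NavigableSet<NameAndDate> newColumns() {
        return new TreeSet<>(ORDER);
    }

    /** All columns for one person whose time stamp lies in [from, to]. */
    static NavigableSet<NameAndDate> range(NavigableSet<NameAndDate> columns,
                                           String person, LocalDate from, LocalDate to) {
        return columns.subSet(new NameAndDate(person, from), true,
                              new NameAndDate(person, to), true);
    }
}
```

Because all "Tom" columns sit next to each other and are ordered by date within that group, the query from the motivation becomes a single contiguous range between ("Tom", 1999-01-01) and ("Tom", 2012-01-01).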
Composite Column Families - mapping
import java.util.UUID;

public class ReferenceCategoryValue {
    @Id
    private String category;      // maps to the row key

    // The following three fields are serialized into the column name.
    @Component(ordinal = 0)
    private UUID id;
    @Component(ordinal = 1)
    private String description;
    @Component(ordinal = 2)
    private String code;

    @Value
    private String value;         // the value saved for the column
}
DataStax OpsCenter
Resources
• DataStax Documentation
• Free Cassandra Academy
• Tutorials
• Apache Cassandra Home Page
• Cassandra Summit Presentations
• 2014 Summit Videos
• Netflix blog
• Astyanax
• eBay Cassandra Data Modeling best practices, part 1 and part 2

Cassandra Overview
