Why Cassandra?
tyfs.rocks, 26.07.2017
tayfun.sevimli
The History of Cassandra
Where is Cassandra?
Cassandra Architecture – CAP Theorem
Cassandra was designed to fall in the “AP” corner of the CAP theorem, which
states that a distributed system can guarantee at most two of the following
three properties at the same time: Consistency, Availability and Partition
tolerance. Cassandra is therefore a good fit for solutions that need a
distributed database with high availability and that must keep serving
requests when some nodes in the cluster are offline or unreachable, which is
common in distributed systems.
Cassandra Architecture – Data Model
Cassandra is classified as a column-based database: its basic storage
structure is a set of columns, each of which is a pair of column key and
column value. Every row is identified by a unique key called the partition
key. Each set of columns is called a column family, which is similar to a
table in a relational database.
Cassandra Architecture – Data Model
SortedMap<RowKey, SortedMap<ColumnKey, ColumnValue>>
▪ A map gives efficient key lookup, and its sorted nature gives efficient scans. In Cassandra, row keys and column keys can be
used for efficient lookups and range scans.
▪ The number of column keys is unbounded, so a row can be very wide.
▪ A column key can itself carry the data, so a column can exist without a value (a valueless column).
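The nested sorted-map structure above can be illustrated with a small Python sketch. This is a toy model only, not Cassandra's storage engine; the `SortedMap` class, the `rows` store and the `write` helper are hypothetical names introduced for illustration.

```python
import bisect


class SortedMap:
    """Toy sorted map: hash lookup plus ordered keys for range scans."""

    def __init__(self):
        self._keys = []      # keys kept in sorted order
        self._values = {}    # key -> value for direct lookup

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, key, default=None):
        return self._values.get(key, default)

    def range(self, start, end):
        """Yield (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for key in self._keys[lo:hi]:
            yield key, self._values[key]


# Outer map: row key -> inner sorted map of column key -> column value.
rows = SortedMap()


def write(row_key, column_key, value=b""):
    """Store one column; an empty value models a valueless column."""
    columns = rows.get(row_key)
    if columns is None:
        columns = SortedMap()
        rows.put(row_key, columns)
    columns.put(column_key, value)
```

Because the number of column keys per row is not fixed, repeated calls to `write` with the same row key and new column keys produce a wide row, and `range` on the inner map corresponds to a column range scan.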
Cassandra Architecture – Write Path
Cassandra Write Path
▪ Every node first writes the mutation to the commit log
and then writes it to the memtable.
▪ Writing to the commit log makes the write durable,
because the memtable is an in-memory structure that
reaches disk only when it is flushed. A memtable is
flushed to disk when:
• it reaches its maximum allocated size in memory,
• the maximum number of minutes a memtable may stay in
memory elapses, or
• a user flushes it manually.
▪ A memtable is flushed to an immutable structure called
an SSTable (Sorted String Table). The commit log is
replayed if data in the memtable is lost due to node
failure.
▪ Each SSTable is stored on disk as a set of files that
includes a bloom filter, a key index and a data file.
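The write path described above can be sketched as a toy in-memory model. This is a simplified illustration under stated assumptions, not Cassandra's implementation: the `Node` class, its `memtable_limit` parameter (counting entries rather than bytes) and the clearing of the commit log after a flush are all hypothetical simplifications.

```python
class Node:
    """Toy model of the write path: commit log, then memtable, then flush."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []   # append-only list standing in for the on-disk log
        self.memtable = {}     # in-memory structure, lost on a crash
        self.sstables = []     # immutable, sorted snapshots of flushed memtables
        self.memtable_limit = memtable_limit  # assumed size threshold (entry count)

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. durability first
        self.memtable[key] = value            # 2. then the in-memory write
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # size-triggered flush

    def flush(self):
        """Turn the memtable into an immutable, sorted SSTable."""
        if not self.memtable:
            return
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # simplification: flushed entries no longer need replay

    def recover(self):
        """Replay the commit log to rebuild a memtable lost in a crash."""
        self.memtable = {}
        for key, value in self.commit_log:
            self.memtable[key] = value
```

Calling `flush()` directly models a manual flush; the time-based trigger is left out to keep the sketch short.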
Cassandra Architecture – Read Path
Cassandra Read Path
▪ Every column family stores its data in a number of
SSTables, so the data for a particular row can be spread
across several SSTables and the memtable. For every
read request, Cassandra therefore reads from all
applicable SSTables of the column family and scans the
memtable for matching data fragments. These fragments
are merged and the result is returned to the
coordinator.
▪ If the contacted replicas hold different versions of the
data, the coordinator returns the latest version to the
client and issues a read repair to the node or nodes
holding the older version. The read repair pushes the
newer version of the data to those nodes.
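The merge and read repair steps above can be sketched as follows. This is a minimal illustration, assuming each stored fragment is a `(timestamp, value)` pair and that the newest timestamp wins; the replica layout (a dict with `memtable` and `sstables` entries) and both function names are hypothetical.

```python
def local_read(memtable, sstables, key):
    """Merge the fragments for key from the memtable and every SSTable.

    Each source maps key -> (timestamp, value); the newest timestamp wins.
    """
    fragments = [source[key] for source in [memtable, *sstables] if key in source]
    return max(fragments, key=lambda f: f[0]) if fragments else None


def coordinator_read(replicas, key):
    """Return the latest value across replicas and repair stale replicas."""
    responses = [local_read(r["memtable"], r["sstables"], key) for r in replicas]
    found = [version for version in responses if version is not None]
    if not found:
        return None
    latest = max(found, key=lambda f: f[0])
    for replica, version in zip(replicas, responses):
        if version is None or version[0] < latest[0]:
            replica["memtable"][key] = latest  # read repair: push the newer version
    return latest[1]
```

After `coordinator_read` returns, every contacted replica holds the newest version of the key, which is the effect the read repair is meant to have.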
Cassandra Architecture – Cluster Topology
Cluster Concepts
▪ A node is a Cassandra instance (in
production: one node per machine).
▪ A partition is one ordered and replicable
unit of data on a node.
▪ A rack is a logical set of nodes.
▪ A data center is a logical set of racks.
▪ A cluster is the full set of nodes that
map to a single complete token ring.
▪ Nodes communicate peer-to-peer using a
gossip protocol.
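The idea of nodes mapping to a token ring can be sketched with a toy ring. This is an illustration under assumptions, not Cassandra's partitioner or replication strategy: the `token` function uses MD5 purely as a convenient stand-in hash, one token per node is assumed, and replicas are simply the next nodes clockwise, ignoring racks and data centers. The `TokenRing` class and its methods are hypothetical names.

```python
import bisect
import hashlib


def token(partition_key: str) -> int:
    """Map a partition key to a ring position (MD5 is only a stand-in hash)."""
    return int.from_bytes(hashlib.md5(partition_key.encode()).digest()[:8], "big")


class TokenRing:
    """Toy token ring: each node owns the range ending at its token."""

    def __init__(self, node_tokens):
        # node_tokens: dict of node name -> token (one token per node assumed)
        self._ring = sorted((tok, name) for name, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._ring]

    def replicas_for_token(self, tok, rf):
        """Return rf distinct nodes, walking clockwise from the token's owner."""
        start = bisect.bisect_left(self._tokens, tok) % len(self._ring)
        count = min(rf, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]

    def replicas(self, partition_key, rf):
        return self.replicas_for_token(token(partition_key), rf)
```

With this layout, any node can compute which nodes hold a given partition key, which is one reason a ring needs no master.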
Cassandra Architecture – Data Consistency
Tunable Data Consistency
The consistency level (CL) sets how many nodes must
acknowledge a read or write request.
▪ Choose anywhere between strong and
eventual consistency.
▪ Possible CLs: ANY, ONE, QUORUM
(floor(RF/2) + 1, where RF is the replication factor), ALL.
▪ Tunable per request.
▪ Multi-datacenter support.
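The quorum arithmetic above can be made concrete with a short sketch. The function names are hypothetical, and the rule that a read and a write overlap on at least one replica when their acknowledgement counts sum to more than RF is the usual quorum argument, stated here as the working assumption for "strong" consistency.

```python
def required_acks(level: str, rf: int) -> int:
    """Number of replica acknowledgements needed at a consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1   # floor(RF/2) + 1
    if level == "ALL":
        return rf
    raise ValueError(f"unsupported level in this sketch: {level}")


def is_strongly_consistent(read_level: str, write_level: str, rf: int) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    return required_acks(read_level, rf) + required_acks(write_level, rf) > rf
```

For example, with RF = 3, QUORUM reads and QUORUM writes each need 2 acknowledgements, so they always share at least one replica, while ONE/ONE does not give that guarantee.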
Cassandra Architecture – CQL Language
Cassandra Query Language
▪ Syntax very similar to RDBMS SQL
▪ Objects are created via DDL
▪ Core DML commands INSERT,
UPDATE and DELETE are supported
▪ Data is queried with SELECT commands
Cassandra Architecture – Security
Cassandra Security Features
▪ Authentication based on internally
managed role names and passwords
▪ Authorization based on object
permission management
▪ Authentication and authorization
based on JMX
usernames and passwords
▪ SSL encryption
Why Cassandra ?
• Scales linearly under massive write load
▪ Cassandra can handle very large amounts of data, which makes it attractive to companies such as mobile phone and
messaging service providers that hold huge amounts of data.
• Highly Fault Tolerant
▪ Masterless cluster with no single point of failure. In simple terms, users should not notice when a server, an entire rack
of servers, or even an entire data center fails. There is also the potential for zero-downtime rolling upgrades.
• Easy Replication / Data Distribution
• Homogeneous Environment
▪ No master-slave or sharding setup; all nodes in the ring are equal.
• Ease of Administration
▪ Masterless and fault-tolerant; supports temporary loss of nodes with minimal impact on production performance.
• Wide Community
Use Cases of Cassandra
• Messaging & Event Sourcing
▪ Cassandra can handle very large amounts of data, so it suits companies that provide mobile phone and messaging
services and hold huge amounts of data.
• IoT & High Speed Applications
▪ Cassandra can absorb data at high speed, so it suits applications where data arrives at very
high speed from many devices or sensors.
• Product Catalogs and Retail Apps
▪ Many retailers use Cassandra for durable shopping cart protection and fast product catalog input and output.
• Social Media Analytics & Recommendations
▪ Many online companies and social media providers use Cassandra for analysis and for
recommendations to their customers.
Cassandra for Akka Persistence
• Linear scalability
▪ Expected massive load
• No SPOF
▪ Fault-tolerant, resilient
• Always-On Multi-Data Center
▪ Data distribution & replication
▪ Cluster spanning multiple data centers
• Akka Persistence
▪ CQRS with Event Sourcing
▪ Supported, up-to-date Akka plugin
(Lightbend)
• Akka Streams
▪ Batch processing over streaming
Cassandra Benchmarks
University of Toronto, NoSQL Database Performance Benchmarks, 2012
[Figure panels: write latency for the read/write workload; throughput for the read/scan/write workload; read latency for the read/write workload; throughput for the read/write workload]
Cassandra Benchmarks
Netflix, Benchmarking Cassandra Scalability on AWS, 2011
[Figure]
Cassandra Benchmarks
End Point (database and open source consulting company), 2014
[Figures on two slides]
Resources
• Apache Cassandra Web Site
• Planet Cassandra Community
• DataStax Web Site
• The Distributed Architecture Behind Apache Cassandra, Bruno TINOCO
• Introduction to Apache Cassandra's Architecture, Akhil Mehra
• An Overview of Apache Cassandra, DataStax
• NoSQL Performance Benchmarks, DataStax
• Top 10 Reasons to Use Cassandra, Michael COLBY
• Security in Cassandra, IBM Developer Works

Mais conteúdo relacionado

Mais procurados

Apache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and BasicsApache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and BasicsOleg Magazov
 
NoSQL databases - An introduction
NoSQL databases - An introductionNoSQL databases - An introduction
NoSQL databases - An introductionPooyan Mehrparvar
 
A Seminar on NoSQL Databases.
A Seminar on NoSQL Databases.A Seminar on NoSQL Databases.
A Seminar on NoSQL Databases.Navdeep Charan
 
Evaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseEvaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseDataStax
 
Cassandra architecture
Cassandra architectureCassandra architecture
Cassandra architectureT Jake Luciani
 
Big data vahidamiri-datastack.ir
Big data vahidamiri-datastack.irBig data vahidamiri-datastack.ir
Big data vahidamiri-datastack.irdatastack
 
Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Benoit Perroud
 
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEMCASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEMIJCI JOURNAL
 
Big data architecture on cloud computing infrastructure
Big data architecture on cloud computing infrastructureBig data architecture on cloud computing infrastructure
Big data architecture on cloud computing infrastructuredatastack
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLRamakant Soni
 
Comparison between mongo db and cassandra using ycsb
Comparison between mongo db and cassandra using ycsbComparison between mongo db and cassandra using ycsb
Comparison between mongo db and cassandra using ycsbsonalighai
 
NoSQL Databases: An Introduction and Comparison between Dynamo, MongoDB and C...
NoSQL Databases: An Introduction and Comparison between Dynamo, MongoDB and C...NoSQL Databases: An Introduction and Comparison between Dynamo, MongoDB and C...
NoSQL Databases: An Introduction and Comparison between Dynamo, MongoDB and C...Vivek Adithya Mohankumar
 

Mais procurados (20)

Cassandra Architecture FTW
Cassandra Architecture FTWCassandra Architecture FTW
Cassandra Architecture FTW
 
Cassandra
CassandraCassandra
Cassandra
 
Apache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and BasicsApache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and Basics
 
NoSQL databases - An introduction
NoSQL databases - An introductionNoSQL databases - An introduction
NoSQL databases - An introduction
 
Nosql databases
Nosql databasesNosql databases
Nosql databases
 
A Seminar on NoSQL Databases.
A Seminar on NoSQL Databases.A Seminar on NoSQL Databases.
A Seminar on NoSQL Databases.
 
Cassandra Database
Cassandra DatabaseCassandra Database
Cassandra Database
 
Cassandra training
Cassandra trainingCassandra training
Cassandra training
 
Apache Cassandra
Apache CassandraApache Cassandra
Apache Cassandra
 
Evaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud DatabaseEvaluating Apache Cassandra as a Cloud Database
Evaluating Apache Cassandra as a Cloud Database
 
Cassandra architecture
Cassandra architectureCassandra architecture
Cassandra architecture
 
Big data vahidamiri-datastack.ir
Big data vahidamiri-datastack.irBig data vahidamiri-datastack.ir
Big data vahidamiri-datastack.ir
 
Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26Apache Cassandra @Geneva JUG 2013.02.26
Apache Cassandra @Geneva JUG 2013.02.26
 
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEMCASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
 
Big data architecture on cloud computing infrastructure
Big data architecture on cloud computing infrastructureBig data architecture on cloud computing infrastructure
Big data architecture on cloud computing infrastructure
 
NOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQL
 
Comparison between mongo db and cassandra using ycsb
Comparison between mongo db and cassandra using ycsbComparison between mongo db and cassandra using ycsb
Comparison between mongo db and cassandra using ycsb
 
NoSQL Databases: An Introduction and Comparison between Dynamo, MongoDB and C...
NoSQL Databases: An Introduction and Comparison between Dynamo, MongoDB and C...NoSQL Databases: An Introduction and Comparison between Dynamo, MongoDB and C...
NoSQL Databases: An Introduction and Comparison between Dynamo, MongoDB and C...
 
Nosql seminar
Nosql seminarNosql seminar
Nosql seminar
 
Selecting best NoSQL
Selecting best NoSQL Selecting best NoSQL
Selecting best NoSQL
 

Semelhante a Why Cassandra?

cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningVitsRangannavar
 
Learning Cassandra NoSQL
Learning Cassandra NoSQLLearning Cassandra NoSQL
Learning Cassandra NoSQLPankaj Khattar
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
04-Introduction-to-CassandraDB-.pdf
04-Introduction-to-CassandraDB-.pdf04-Introduction-to-CassandraDB-.pdf
04-Introduction-to-CassandraDB-.pdfhothyfa
 
Cassandra - A Distributed Database System
Cassandra - A Distributed Database System Cassandra - A Distributed Database System
Cassandra - A Distributed Database System Md. Shohel Rana
 
Migrating Oracle database to Cassandra
Migrating Oracle database to CassandraMigrating Oracle database to Cassandra
Migrating Oracle database to CassandraUmair Mansoob
 
5266732.ppt
5266732.ppt5266732.ppt
5266732.ppthothyfa
 
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRAA NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRAijfcstjournal
 
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRAA NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRAijfcstjournal
 
Cassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideCassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideMohammed Fazuluddin
 
cassandra_presentation_final
cassandra_presentation_finalcassandra_presentation_final
cassandra_presentation_finalSergioBruno21
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandraPL dream
 
Cassandra presentation
Cassandra presentationCassandra presentation
Cassandra presentationSergey Enin
 
DSM - Comparison of Hbase and Cassandra
DSM - Comparison of Hbase and CassandraDSM - Comparison of Hbase and Cassandra
DSM - Comparison of Hbase and CassandraShrikant Samarth
 
Column db dol
Column db dolColumn db dol
Column db dolpoojabi
 
Learn Cassandra at edureka!
Learn Cassandra at edureka!Learn Cassandra at edureka!
Learn Cassandra at edureka!Edureka!
 

Semelhante a Why Cassandra? (20)

cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
Data Storage Management
Data Storage ManagementData Storage Management
Data Storage Management
 
Learning Cassandra NoSQL
Learning Cassandra NoSQLLearning Cassandra NoSQL
Learning Cassandra NoSQL
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
04-Introduction-to-CassandraDB-.pdf
04-Introduction-to-CassandraDB-.pdf04-Introduction-to-CassandraDB-.pdf
04-Introduction-to-CassandraDB-.pdf
 
Cassandra - A Distributed Database System
Cassandra - A Distributed Database System Cassandra - A Distributed Database System
Cassandra - A Distributed Database System
 
Migrating Oracle database to Cassandra
Migrating Oracle database to CassandraMigrating Oracle database to Cassandra
Migrating Oracle database to Cassandra
 
Cassandra Learning
Cassandra LearningCassandra Learning
Cassandra Learning
 
5266732.ppt
5266732.ppt5266732.ppt
5266732.ppt
 
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRAA NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
 
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRAA NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
 
Cassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction GuideCassandra - A Basic Introduction Guide
Cassandra - A Basic Introduction Guide
 
cassandra_presentation_final
cassandra_presentation_finalcassandra_presentation_final
cassandra_presentation_final
 
Dsm project-h base-cassandra
Dsm project-h base-cassandraDsm project-h base-cassandra
Dsm project-h base-cassandra
 
Storage cassandra
Storage   cassandraStorage   cassandra
Storage cassandra
 
Cassandra presentation
Cassandra presentationCassandra presentation
Cassandra presentation
 
DSM - Comparison of Hbase and Cassandra
DSM - Comparison of Hbase and CassandraDSM - Comparison of Hbase and Cassandra
DSM - Comparison of Hbase and Cassandra
 
Column db dol
Column db dolColumn db dol
Column db dol
 
Cassndra (4).pptx
Cassndra (4).pptxCassndra (4).pptx
Cassndra (4).pptx
 
Learn Cassandra at edureka!
Learn Cassandra at edureka!Learn Cassandra at edureka!
Learn Cassandra at edureka!
 

Último

Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Onlineanilsa9823
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxolyaivanovalion
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 

Último (20)

Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Edukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFxEdukaciniai dropshipping via API with DroFx
Edukaciniai dropshipping via API with DroFx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 

Why Cassandra?

  • 2. The History of Cassandra tyfs.rocks 226.07.2017
  • 4. Cassandra Architecture – CAP Theorem tyfs.rocks 426.07.2017 Cassandra was designed to fall in the “AP” intersection of the CAP theorem that states that any distributed system can only guarantee two of the following capabilities at same time; Consistency, Availability and Partition Tolerance. In this way Cassandra is a best fit for a solution seeking a distributed database that brings high availability to a system and is also very tolerant to partition to its data when some node in the cluster is offline, which is common in distributed systems.
  • 5. Cassandra Architecture – Data Model tyfs.rocks 526.07.2017 Cassandra is classified as a column based database, which means that its basic structure to store data is based upon a set of columns, which are comprised, by a pair of column key and column value. Every row is identified by a unique key, a string without a size limit, called partition key. Each set of columns are called column families, similar to a relational database table.
  • 6. Cassandra Architecture – Data Model tyfs.rocks 626.07.2017 SortedMap<RowKey,SortedMap<ColumnKey, ColumnValue>>  A map gives efficient key lookup, and the sorted nature gives efficient scans. In Cassandra, we can use row keys and column keys to do efficient lookups and range scans.  The number of column keys is unbounded. This means, you can have wide rows.  A key can itself hold a value, meaning In other words, you can have a valueless column.
  • 7. Cassandra Architecture – Write Path tyfs.rocks 726.07.2017 Cassandra Write Path  Every node first writes the mutation to the commit log and then writes the mutation to the memtable.  Writing to the commit log ensures durability of the write as the memtable is an in-memory structure and is only written to disk when the memtable is flushed to disk. A memtable is flushed to disk when: • It reaches its maximum allocated size in memory • The number of minutes a memtable can stay in memory elapses. • Manually flushed by a user  A memtable is flushed to an immutable structure called and SSTable (Sorted String Table). The commit log is used for playback purposes in case data from the memtable is lost due to node failure.  Every SSTable creates three files on disk which include a bloom filter, a key index and a data file.
  • 8. Cassandra Architecture – Read Path tyfs.rocks 826.07.2017 Cassandra Read Path  Every Column Family stores data in a number of SSTables. Thus Data for a particular row can be located in a number of SSTables and the memtable. Thus for every read request Cassandra needs to read data from all applicable SSTables ( all SSTables for a column family) and scan the memtable for applicable data fragments. This data is then merged and returned to the coordinator.  If the contacted replicas has a different version of the data the coordinator returns the latest version to the client and issues a read repair command to the node/nodes with the older version of the data. The read repair operation pushes the newer version of the data to nodes with the older version.
  • 9. Cassandra Architecture – Cluster Topology tyfs.rocks 926.07.2017 Cluster Concepts  a node is a cassandra instance (in production: one node per machine)  a partition is one ordered and replicable unit of data on a node  a rack is a logical set of nodes  a Data Center is a logical set or racks  Cluster is the full set of nodes which map to a single complete token ring  peer-to-peer communication gossip protocol
  • 10. Cassandra Architecture – Data Consistency tyfs.rocks 1026.07.2017 Tunable Data Consistency How many nodes must acknowledge a read/write request  choose between STRONG to EVENTUAL  possible CL: ANY, ONE, QUORUM (RF/2+1), ALL  tunable per request support  multi-datacenter support
  • 11. Cassandra Architecture – CQL Language tyfs.rocks 1126.07.2017 Cassandra Query Language  very similar to RDBMS SQL syntax  create objects via DDL  core DML commands insert, update, delete supported  query data with Select commands
  • 12. Cassandra Architecture – Security tyfs.rocks 1226.07.2017 Cassandra Security Features  Authentication based on internally controlled rolename/passwords  Authorization based on object permission management  Authentication and authorization based on JMX username/passwords  SSL encryption
  • 13. Why Cassandra ? tyfs.rocks 1326.07.2017 • Scales linearly with massive write  Cassandra is a great database which can handle a big amount of data. So it is preferred for the companies that provide Mobile phones and messaging services. These companies have a huge amount of data, so Cassandra is best for them. • Highly Fault Tolerant  Masterless cluster with no single point of failure. In simple terms, your users will never know if a server, an entire rack of servers, or even if an entire data center fails. There is also the potential for zero downtime rolling upgrades. • Easy Replication / Data Distribution • Homogenous Environment  No master-slave or sharding setup and that all nodes in the ring are equal. • Ease of Administration  Masterless, fault-tolerant, supports temporary loss of nodes with minimal impact to production performance. • Wide Community  No master-slave or sharding setup and that all nodes in the ring are equal.
  • 14. Use Cases of Cassandra tyfs.rocks 1426.07.2017 • Messaging & Event Sourcing  Cassandra is a great database which can handle a big amount of data. So it is preferred for the companies that provide Mobile phones and messaging services. These companies have a huge amount of data, so Cassandra is best for them. • IoT & High Speed Applications  Cassandra can handle the high speed data so it is a great database for the applications where data is coming at very high speed from different devices or sensors. • Product Catalogs and Retail Apps  Cassandra is used by many retailers for durable shopping cart protection and fast product catalog input and output. • Social Media Analytics & Recommendations  Cassandra is a great database for many online companies and social media providers for analysis and recommendation to their customers.
  • 15. Cassandra for Akka Persistence
    • Linear scalability
      ▪ Expected massive load
    • No SPOF
      ▪ Fault-tolerant, resilient
    • Always-on multi-data center
      ▪ Data distribution & replication
      ▪ Cluster over multiple data centers
    • Akka Persistence
      ▪ CQRS with event sourcing
      ▪ Akka's supported, up-to-date plugin (Lightbend)
    • Akka Streams
      ▪ Batch processing over streaming
  • 16. Cassandra Benchmarks
    University of Toronto, NoSQL Database Performance Benchmarks, 2012
    [Charts: write latency for workload read/write; throughput for workload read/scan/write; read latency for workload read/write; throughput for workload read/write]
  • 17. Cassandra Benchmarks
    Netflix, Benchmarking Cassandra Scalability on AWS, 2011
  • 18. Cassandra Benchmarks
    EndPoint, database and open source consulting company, 2014
  • 19. Cassandra Benchmarks
    EndPoint, database and open source consulting company, 2014
  • 20. Resources
    • Apache Cassandra Web Site
    • Planet Cassandra Community
    • DataStax Web Site
    • The Distributed Architecture Behind Apache Cassandra, Bruno Tinoco
    • Introduction to Apache Cassandra's Architecture, Akhil Mehra
    • An Overview of Apache Cassandra, DataStax
    • NoSQL Performance Benchmarks, DataStax
    • Top 10 Reasons to Use Cassandra, Michael Colby
    • Security in Cassandra, IBM developerWorks

Editor's Notes

  1. Cassandra was designed to fall in the “AP” intersection of the CAP theorem, which states that a distributed system can guarantee only two of the following three properties at the same time: Consistency, Availability, and Partition tolerance. Cassandra is therefore a good fit for a solution that needs a distributed database offering high availability and tolerance to partitions, for example when some node in the cluster is offline, which is common in distributed systems.
  6. Each node processes the request individually. Every node first writes the mutation to the commit log and then writes it to the memtable. Writing to the commit log ensures durability, because the memtable is an in-memory structure and reaches disk only when it is flushed. A memtable is flushed to disk when:
     ▪ it reaches its maximum allocated size in memory;
     ▪ the number of minutes a memtable can stay in memory elapses;
     ▪ a user flushes it manually.
     A memtable is flushed to an immutable structure called an SSTable (Sorted String Table). The commit log is used for playback in case data in the memtable is lost due to node failure, for example if the machine has a power outage before the memtable could be flushed.
     Every SSTable creates three files on disk: a bloom filter, a key index, and a data file. Over time a number of SSTables are created, so a read request may need to consult multiple SSTables. Compaction is the process of combining SSTables so that related data can be found in a single SSTable, which makes reads much faster.
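     The write path described in this note can be sketched as a toy model. This is only an illustration under simplifying assumptions, not Cassandra's actual code: the class `ToyNode`, the `memtable_limit` threshold (counted in keys rather than bytes), and the in-memory lists standing in for on-disk files are all hypothetical.

```python
class ToyNode:
    """Toy model of one node's write path (illustrative only)."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []   # stands in for the append-only log on disk
        self.memtable = {}     # in-memory structure, latest value per key
        self.sstables = []     # immutable, key-sorted snapshots, oldest first
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def write(self, key, value):
        # 1. Append the mutation to the commit log first, for durability.
        self.commit_log.append((key, value))
        # 2. Then apply the mutation to the memtable.
        self.memtable[key] = value
        # Flush once the memtable reaches its maximum allowed size.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Write the memtable out as an immutable, sorted SSTable."""
        if not self.memtable:
            return
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # Flushed mutations no longer need to be replayed.
        self.commit_log = []

    def recover(self):
        """Rebuild a lost memtable by replaying the commit log."""
        self.memtable = {}
        for key, value in self.commit_log:
            self.memtable[key] = value

    def compact(self):
        """Merge all SSTables into one; newer tables win on duplicate keys."""
        merged = {}
        for table in self.sstables:  # oldest first, so later values overwrite
            merged.update(table)
        self.sstables = [tuple(sorted(merged.items()))]
```

     In this sketch a crash is simulated by clearing `memtable` and calling `recover()`, which mirrors how the commit log protects writes that have not yet been flushed. The time-based flush trigger is left out for brevity.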
  7. At the cluster level a read operation is similar to a write operation. As with the write path, the client can connect to any node in the cluster. The chosen node is called the coordinator and is responsible for returning the requested data.
     A row key must be supplied for every read operation. The coordinator uses the row key to determine the first replica. The replication strategy, in conjunction with the replication factor, determines all other applicable replicas. As with the write path, the consistency level determines the number of replicas that must respond before data is successfully returned.
     Let's assume that the request has a consistency level of QUORUM and a replication factor of three, thus requiring the coordinator to wait for successful replies from at least two nodes. If the contacted replicas hold different versions of the data, the coordinator returns the latest version to the client and issues a read repair command to the node or nodes with the older version. The read repair operation pushes the newer version of the data to the nodes with the older version.
     On a per-SSTable basis the operation becomes a bit more complicated. The illustration outlines the key steps that take place when reading data from an SSTable. Every SSTable has an associated bloom filter, which makes it possible to ascertain quickly whether data for the requested row key exists in that SSTable. This reduces IO when performing a row key lookup. A bloom filter is always held in memory, since its whole purpose is to save disk IO. Cassandra also keeps a copy of the bloom filter on disk, which enables it to recreate the bloom filter in memory quickly. Cassandra does not store the bloom filter on the Java heap; instead it makes a separate allocation for it in memory.
     If the bloom filter returns a negative response, no data is returned from that SSTable. This is a common case, as the compaction operation tries to group all data related to a row key into as few SSTables as possible. If the bloom filter returns a positive response, the partition key cache is scanned to ascertain the compression offset for the requested row key. Cassandra then fetches the compressed data from disk and returns the result set. If the partition key cache does not contain a corresponding entry, the partition summary is scanned. The partition summary is a subset of the partition index and helps determine the approximate location of the index entry in the partition index. The partition index is then scanned to locate the compression offset, which is then used to find the appropriate data on disk.
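     The two levels of the read path described in this note can be sketched in a few lines. This is a simplified illustration, not Cassandra's implementation: all names (`quorum`, `coordinator_read`, `BloomFilter`, `read_from_sstables`) are hypothetical, replicas are modelled as plain dictionaries mapping a key to a `(timestamp, value)` pair, the coordinator simply contacts the first replicas in the list, and the key cache, partition summary, and partition index are omitted.

```python
import hashlib


def quorum(replication_factor):
    """A majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def coordinator_read(replicas, key, required):
    """Read `key` from `required` replicas, return the latest value,
    and push it to contacted replicas holding an older version."""
    contacted = replicas[:required]
    responses = [r[key] for r in contacted if key in r]
    if not responses:
        return None
    latest = max(responses, key=lambda tv: tv[0])  # highest timestamp wins
    for replica in contacted:
        if replica.get(key) != latest:
            replica[key] = latest  # read repair
    return latest[1]


class BloomFilter:
    """Minimal bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def read_from_sstables(sstables, key):
    """`sstables` is a list of (BloomFilter, dict) pairs, newest first."""
    for bloom, data in sstables:
        if not bloom.might_contain(key):
            continue  # definite miss: skip this SSTable without touching "disk"
        if key in data:
            return data[key]
    return None
```

     With a replication factor of three, `quorum(3)` evaluates to 2, which matches the example in the note. The bloom filter check is what lets a read skip SSTables that cannot contain the key.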
  11. Authentication based on internally controlled role names and passwords
      Cassandra authentication is role-based and stored internally in Cassandra system tables. Administrators can create, alter, drop, or list roles using CQL commands, with an associated password. Roles can be created with superuser, non-superuser, and login privileges. Internal authentication is used to access Cassandra keyspaces and tables, by cqlsh and DevCenter to authenticate connections to Cassandra clusters, and by sstableloader to load SSTables.
      Authorization based on object permission management
      Authorization grants access privileges to Cassandra cluster operations based on role authentication. Authorization can grant permission to access the entire database or restrict a role to individual table access. A role can be given permission to authorize other roles, and roles can be granted to roles. The CQL commands GRANT and REVOKE are used to manage authorization.
      Authentication and authorization based on JMX usernames and passwords
      JMX (Java Management Extensions) technology provides a simple and standard way of managing and monitoring resources related to an instance of a Java Virtual Machine (JVM). This is achieved by instrumenting resources with Java objects known as Managed Beans (MBeans) that are registered with an MBean server. JMX authentication stores usernames and associated passwords in two files, one for passwords and one for access. JMX authentication is used by nodetool and by external monitoring tools such as jconsole. In Cassandra 3.6 and later, JMX authentication and authorization can be accomplished using Cassandra's internal authentication and authorization capabilities.
      SSL encryption
      Cassandra provides secure communication between a client and a database cluster, and between nodes in a cluster. Enabling SSL encryption ensures that data in flight is not compromised and is transferred securely. Client-to-node and node-to-node encryption are configured independently. Cassandra tools (cqlsh, nodetool, DevCenter) can be configured to use SSL encryption. The DataStax drivers can be configured to secure traffic between the driver and Cassandra.
      General security measures
      Typically, production Cassandra clusters have all non-essential firewall ports closed. Some ports must be open in order for nodes to communicate in the cluster.
  12. Goals for the Tests
      ▪ Select workloads that are typical of today's modern applications.
      ▪ Use data volumes that are representative of ‘big data’ datasets that exceed the RAM capacity of each node.
      ▪ Ensure that all data is written in a manner that allows no data loss (i.e. durable writes), which is what most production environments require.
      Tested Workloads
      The following workloads were included in the benchmark:
      ▪ Read-mostly workload, based on YCSB's workload B: 95% read to 5% update ratio
      ▪ Read/write combination, based on YCSB's workload A: 50% read to 50% update ratio
      ▪ Read-modify-write, based on YCSB's workload F: 50% read to 50% read-modify-write
      ▪ Mixed operational and analytical: 60% read, 25% update, 10% insert, and 5% scan
      ▪ Insert-mostly combined with read: 90% insert to 10% read ratio