SlideShare uma empresa Scribd logo
1 de 43
Baixar para ler offline
Nick Bailey
@nickmbailey
Intro to Cassandra Architecture
1
4.1 Cassandra - Introduction
Why does Cassandra Exist?
Dynamo Paper(2007)
• How do we build a data store that is:
• Reliable
• Performant
• “Always On”
• Nothing new and shiny
• 24 papers cited
Also the basis for Riak and Voldemort
BigTable(2006)
• Richer data model
• 1 key. Lots of values
• Fast sequential access
• 38 Papers cited
Cassandra(2008)
• Distributed features of Dynamo
• Data Model and storage from
BigTable
• February 17, 2010 it graduated to
a top-level Apache project
Cassandra - More than one server
• All nodes participate in a cluster
• Shared nothing
• Add or remove as needed
• More capacity? Add a server

7
8
Cassandra HBase Redis MySQL
THROUGHPUTOPS/SEC)
VLDB benchmark
Cassandra - Fully Replicated
• Client writes local
• Data syncs across WAN
• Replication per Data Center
9
Cassandra for Applications
APACHE
CASSANDRA
Summary
•The evolution of the internet and online data created new
problems
•Apache Cassandra was based on a variety of
technologies to solve these problems
•The goals of Apache Cassandra are all about staying
online and performant
•Apache Cassandra is a database best used for
applications, close to your users
4.1.2 Cassandra - Basic Architecture
Row
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Partition
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Partition with Clustering
Cluster
1
Partition
Key 1
Column
1
Column
2
Column
3
Cluster
2
Partition
Key 1
Column
1
Column
2
Column
3
Cluster
3
Partition
Key 1
Column
1
Column
2
Column
3
Cluster
4
Partition
Key 1
Column
1
Column
2
Column
3
Table Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Column
2
Column
3
Column
4
Column
1
Column
2
Column
3
Column
4
Column
1
Column
2
Column
3
Column
4
Partition
Key 2
Partition
Key 2
Partition
Key 2
Keyspace
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 1
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Column
1
Partition
Key 2
Column
2
Column
3
Column
4
Table 1 Table 2
Keyspace 1
Node
Server
Token
Server
•Each partition is a 128 bit value
•Consistent hash between 2-63 and 264
•Each node owns a range of those
values
•The token is the beginning of that
range to the next node’s token value
•Virtual Nodes break these down
further
Data
Token Range
0 …
The cluster Server
Token Range
0 0-100
0-100
The cluster Server
Token Range
0 0-50
51 51-100
Server
0-50
51-100
The cluster Server
Token Range
0 0-25
26 26-50
51 51-75
76 76-100
Server
ServerServer
0-25
76-100
26-5051-75
Summary
•Tables store rows of data by column
•Partitions are similar data grouped by a partition key
•Keyspaces contain tables and are grouped by data center
•Tokens show node placement in the range of cluster data
4.1.3 Cassandra - Replication, High Availability and Multi-datacenter
Replication
10.0.0.
1
DC1
DC1: RF=1
Node Primary
10.0.0.1 00-25
10.0.0.2 26-50
10.0.0.3 51-75
10.0.0.4 76-100
10.0.0.1
00-25
10.0.0.4
76-100
10.0.0.2
26-50
10.0.0.3
51-75
Replication
10.0.0.1
00-25
10.0.0.4
76-100
10.0.0.2
26-50
10.0.0.3
51-75
DC1
DC1: RF=2
Node Primary Replica
10.0.0.1 00-25 76-100
10.0.0.2 26-50 00-25
10.0.0.3 51-75 26-50
10.0.0.4 76-100 51-75
76-100
00-25
26-50
51-75
Replication DC1
DC1: RF=3
Node Primary Replica Replica
10.0.0.1 00-25 76-100 51-75
10.0.0.2 26-50 00-25 76-100
10.0.0.3 51-75 26-50 00-25
10.0.0.4 76-100 51-75 26-50
10.0.0.1
00-25
10.0.0.4
76-100
10.0.0.2
26-50
10.0.0.3
51-75
76-100
51-75
00-25
76-100
26-50
00-25
51-75
26-50
Replication DC1
DC1: RF=3
Node Primary Replica Replica
10.0.0.1 00-25 76-100 51-75
10.0.0.2 26-50 00-25 76-100
10.0.0.3 51-75 26-50 00-25
10.0.0.4 76-100 51-75 26-50
10.0.0.1
00-25
10.0.0.4
76-100
10.0.0.2
26-50
10.0.0.3
51-75
76-100
51-75
00-25
76-100
26-50
00-25
51-75
26-50
Client
Write to
partition 15
???
Consistency level
Consistency Level Number of Nodes Acknowledged
One One - Read repair triggered
Local One One - Read repair in local DC
Quorum 51%
Local Quorum 51% in local DC
Consistency DC1
DC1: RF=3
Node Primary Replica Replica
10.0.0.1 00-25 76-100 51-75
10.0.0.2 26-50 00-25 76-100
10.0.0.3 51-75 26-50 00-25
10.0.0.4 76-100 51-75 26-50
10.0.0.1
00-25
10.0.0.4
76-100
10.0.0.2
26-50
10.0.0.3
51-75
76-100
51-75
00-25
76-100
26-50
00-25
51-75
26-50
Client
Write to
partition 15
CL= One
Consistency DC1
DC1: RF=3
Node Primary Replica Replica
10.0.0.1 00-25 76-100 51-75
10.0.0.2 26-50 00-25 76-100
10.0.0.3 51-75 26-50 00-25
10.0.0.4 76-100 51-75 26-50
10.0.0.1
00-25
10.0.0.4
76-100
10.0.0.2
26-50
10.0.0.3
51-75
76-100
51-75
00-25
76-100
26-50
00-25
51-75
26-50
Client
Write to
partition 15
CL= One
Consistency DC1
DC1: RF=3
Node Primary Replica Replica
10.0.0.1 00-25 76-100 51-75
10.0.0.2 26-50 00-25 76-100
10.0.0.3 51-75 26-50 00-25
10.0.0.4 76-100 51-75 26-50
10.0.0.1
00-25
10.0.0.4
76-100
10.0.0.2
26-50
10.0.0.3
51-75
76-100
51-75
00-25
76-100
26-50
00-25
51-75
26-50
Client
Write to
partition 15
CL= Quorum
Multi-datacenter
DC1
DC1: RF=3
Node Primary Replica Replica
10.0.0.1 00-25 76-100 51-75
10.0.0.2 26-50 00-25 76-100
10.0.0.3 51-75 26-50 00-25
10.0.0.4 76-100 51-75 26-50
10.0.0.1
00-25
10.0.0.4
76-100
10.0.0.2
26-50
10.0.0.3
51-75
76-100
51-75
00-25
76-100
26-50
00-25
51-75
26-50
Client
Write to
partition 15
DC2
10.1.0.1
00-25
10.1.0.4
76-100
10.1.0.2
26-50
10.1.0.3
51-75
76-100
51-75
00-25
76-100
26-50
00-25
51-75
26-50
Node Primary Replica Replica
10.0.0.1 00-25 76-100 51-75
10.0.0.2 26-50 00-25 76-100
10.0.0.3 51-75 26-50 00-25
10.0.0.4 76-100 51-75 26-50
DC2: RF=3
Multi-datacenter
DC1
DC1: RF=3
Node Primary Replica Replica
10.0.0.1 00-25 76-100 51-75
10.0.0.2 26-50 00-25 76-100
10.0.0.3 51-75 26-50 00-25
10.0.0.4 76-100 51-75 26-50
10.0.0.1
00-25
10.0.0.4
76-100
10.0.0.2
26-50
10.0.0.3
51-75
76-100
51-75
00-25
76-100
26-50
00-25
51-75
26-50
Client
Write to
partition 15
DC2
10.1.0.1
00-25
10.1.0.4
76-100
10.1.0.2
26-50
10.1.0.3
51-75
76-100
51-75
00-25
76-100
26-50
00-25
51-75
26-50
Node Primary Replica Replica
10.0.0.1 00-25 76-100 51-75
10.0.0.2 26-50 00-25 76-100
10.0.0.3 51-75 26-50 00-25
10.0.0.4 76-100 51-75 26-50
DC2: RF=3
Multi-datacenter
DC1
DC1: RF=3
Node Primary Replica Replica
10.0.0.1 00-25 76-100 51-75
10.0.0.2 26-50 00-25 76-100
10.0.0.3 51-75 26-50 00-25
10.0.0.4 76-100 51-75 26-50
10.0.0.1
00-25
10.0.0.4
76-100
10.0.0.2
26-50
10.0.0.3
51-75
76-100
51-75
00-25
76-100
26-50
00-25
51-75
26-50
Client
Write to
partition 15
DC2
10.1.0.1
00-25
10.1.0.4
76-100
10.1.0.2
26-50
10.1.0.3
51-75
76-100
51-75
00-25
76-100
26-50
00-25
51-75
26-50
Node Primary Replica Replica
10.0.0.1 00-25 76-100 51-75
10.0.0.2 26-50 00-25 76-100
10.0.0.3 51-75 26-50 00-25
10.0.0.4 76-100 51-75 26-50
DC2: RF=3
Summary
•Replication Factor indicates how many times your data is
copied
•Consistency Level specifies how many replicas are
consistent at read or write
•Replication along with Consistency Factor are critical for
uptime
4.2.1.1.3 Cassandra - Read and Write Path (Node Architecture)
Writes
CREATE TABLE raw_weather_data (

wsid text,

year int,

month int,

day int,

hour int,

temperature double,

dewpoint double,

pressure double,

wind_direction int,

wind_speed double,

sky_condition int,

sky_condition_text text,

one_hour_precip double,

six_hour_precip double,

PRIMARY KEY ((wsid), year, month, day, hour)

) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, hour DESC);
Writes
CREATE TABLE raw_weather_data (

wsid text,

year int,

month int,

day int,

hour int,

temperature double,

PRIMARY KEY ((wsid), year, month, day, hour)

) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, hour DESC);
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)

VALUES (‘10010:99999’,2005,12,1,10,-5.6);
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)

VALUES (‘10010:99999’,2005,12,1,9,-5.1);
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)

VALUES (‘10010:99999’,2005,12,1,8,-4.9);
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)

VALUES (‘10010:99999’,2005,12,1,7,-5.3);
Write Path
Client
INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)

VALUES (‘10010:99999’,2005,12,1,7,-5.3);
year 1wsid 1 month 1 day 1 hour 1
year 2wsid 2 month 2 day 2 hour 2
Memtable
SSTable
SSTable
SSTable
SSTable
Node
Commit Log Data * Compaction *
Temp
Temp
Memory
Disk
Read Path
Client
SSTable
SSTable
SSTable
Node
Data
SELECT wsid,hour,temperature

FROM raw_weather_data

WHERE wsid='10010:99999'

AND year = 2005 AND month = 12 AND day = 1 

AND hour >= 7 AND hour <= 10;
year 1wsid 1 month 1 day 1 hour 1
year 2wsid 2 month 2 day 2 hour 2
Memtable
Temp
Temp
Memory
Disk
Summary
•By default, writes are durable
•Client receives ack when consistency level is achieved
•Reads must always go to disk
•Compaction is data housekeeping
43
Questions?

Mais conteúdo relacionado

Mais procurados

Troubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the BeastTroubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the BeastDataWorks Summit
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache CassandraPatrick McFadin
 
Understanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraUnderstanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraDataStax
 
Oracle RAC 19c: Best Practices and Secret Internals
Oracle RAC 19c: Best Practices and Secret InternalsOracle RAC 19c: Best Practices and Secret Internals
Oracle RAC 19c: Best Practices and Secret InternalsAnil Nair
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra Nikiforos Botis
 
cassandra
cassandracassandra
cassandraAkash R
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basicsnickmbailey
 
ETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk LoadingETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk Loadingalex_araujo
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseenissoz
 
Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandraNguyen Quang
 
Oracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best PracticesOracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best PracticesBobby Curtis
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overviewSean Murphy
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenLorenzo Alberton
 
Appache Cassandra
Appache Cassandra  Appache Cassandra
Appache Cassandra nehabsairam
 
Deep Dive on Amazon Aurora MySQL Performance Tuning (DAT429-R1) - AWS re:Inve...
Deep Dive on Amazon Aurora MySQL Performance Tuning (DAT429-R1) - AWS re:Inve...Deep Dive on Amazon Aurora MySQL Performance Tuning (DAT429-R1) - AWS re:Inve...
Deep Dive on Amazon Aurora MySQL Performance Tuning (DAT429-R1) - AWS re:Inve...Amazon Web Services
 

Mais procurados (20)

Cassandra 101
Cassandra 101Cassandra 101
Cassandra 101
 
Troubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the BeastTroubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the Beast
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Storing time series data with Apache Cassandra
Storing time series data with Apache CassandraStoring time series data with Apache Cassandra
Storing time series data with Apache Cassandra
 
Understanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache CassandraUnderstanding Data Partitioning and Replication in Apache Cassandra
Understanding Data Partitioning and Replication in Apache Cassandra
 
Oracle RAC 19c: Best Practices and Secret Internals
Oracle RAC 19c: Best Practices and Secret InternalsOracle RAC 19c: Best Practices and Secret Internals
Oracle RAC 19c: Best Practices and Secret Internals
 
Presentation of Apache Cassandra
Presentation of Apache Cassandra Presentation of Apache Cassandra
Presentation of Apache Cassandra
 
cassandra
cassandracassandra
cassandra
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basics
 
ETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk LoadingETL With Cassandra Streaming Bulk Loading
ETL With Cassandra Streaming Bulk Loading
 
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBaseHBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
 
Introduction to cassandra
Introduction to cassandraIntroduction to cassandra
Introduction to cassandra
 
Oracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best PracticesOracle GoldenGate 21c New Features and Best Practices
Oracle GoldenGate 21c New Features and Best Practices
 
Cassandra overview
Cassandra overviewCassandra overview
Cassandra overview
 
kafka
kafkakafka
kafka
 
Cassandra Database
Cassandra DatabaseCassandra Database
Cassandra Database
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
 
Cassandra
CassandraCassandra
Cassandra
 
Appache Cassandra
Appache Cassandra  Appache Cassandra
Appache Cassandra
 
Deep Dive on Amazon Aurora MySQL Performance Tuning (DAT429-R1) - AWS re:Inve...
Deep Dive on Amazon Aurora MySQL Performance Tuning (DAT429-R1) - AWS re:Inve...Deep Dive on Amazon Aurora MySQL Performance Tuning (DAT429-R1) - AWS re:Inve...
Deep Dive on Amazon Aurora MySQL Performance Tuning (DAT429-R1) - AWS re:Inve...
 

Destaque

Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architectureMarkus Klems
 
Cassandra Summit 2014: Cassandra at Instagram 2014
Cassandra Summit 2014: Cassandra at Instagram 2014Cassandra Summit 2014: Cassandra at Instagram 2014
Cassandra Summit 2014: Cassandra at Instagram 2014DataStax Academy
 
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...DataStax Academy
 
Cassandra Day Denver 2014: Introduction to Apache Cassandra
Cassandra Day Denver 2014: Introduction to Apache CassandraCassandra Day Denver 2014: Introduction to Apache Cassandra
Cassandra Day Denver 2014: Introduction to Apache CassandraDataStax Academy
 
Cassandra and Spark
Cassandra and SparkCassandra and Spark
Cassandra and Sparknickmbailey
 
Open source or proprietary, choose wisely!
Open source or proprietary,  choose wisely!Open source or proprietary,  choose wisely!
Open source or proprietary, choose wisely!Patrick McFadin
 
Cassandra architecture
Cassandra architectureCassandra architecture
Cassandra architectureT Jake Luciani
 
Cassandra Materialized Views
Cassandra Materialized ViewsCassandra Materialized Views
Cassandra Materialized ViewsCarl Yeksigian
 
Python and cassandra
Python and cassandraPython and cassandra
Python and cassandraJon Haddad
 
Introduction to cassandra 2014
Introduction to cassandra 2014Introduction to cassandra 2014
Introduction to cassandra 2014Patrick McFadin
 
Introduction to Cassandra & Data model
Introduction to Cassandra & Data modelIntroduction to Cassandra & Data model
Introduction to Cassandra & Data modelDuyhai Doan
 
Cassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural LessonsCassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural LessonsDataStax
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayDataStax Academy
 
Cassandra Data Model
Cassandra Data ModelCassandra Data Model
Cassandra Data Modelebenhewitt
 
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source EffortsCassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source EffortsAcunu
 
Introduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and ConsistencyIntroduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and ConsistencyBenjamin Black
 
Apache cassandra architecture internals
Apache cassandra architecture internalsApache cassandra architecture internals
Apache cassandra architecture internalsBhuvan Rawal
 
Cassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesCassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesPatrick McFadin
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for SysadminsNathan Milford
 
An Introduction to Distributed Search with Cassandra and Solr
An Introduction to Distributed Search with Cassandra and SolrAn Introduction to Distributed Search with Cassandra and Solr
An Introduction to Distributed Search with Cassandra and SolrDataStax Academy
 

Destaque (20)

Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
 
Cassandra Summit 2014: Cassandra at Instagram 2014
Cassandra Summit 2014: Cassandra at Instagram 2014Cassandra Summit 2014: Cassandra at Instagram 2014
Cassandra Summit 2014: Cassandra at Instagram 2014
 
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
Cassandra Community Webinar | Getting Started with Apache Cassandra with Patr...
 
Cassandra Day Denver 2014: Introduction to Apache Cassandra
Cassandra Day Denver 2014: Introduction to Apache CassandraCassandra Day Denver 2014: Introduction to Apache Cassandra
Cassandra Day Denver 2014: Introduction to Apache Cassandra
 
Cassandra and Spark
Cassandra and SparkCassandra and Spark
Cassandra and Spark
 
Open source or proprietary, choose wisely!
Open source or proprietary,  choose wisely!Open source or proprietary,  choose wisely!
Open source or proprietary, choose wisely!
 
Cassandra architecture
Cassandra architectureCassandra architecture
Cassandra architecture
 
Cassandra Materialized Views
Cassandra Materialized ViewsCassandra Materialized Views
Cassandra Materialized Views
 
Python and cassandra
Python and cassandraPython and cassandra
Python and cassandra
 
Introduction to cassandra 2014
Introduction to cassandra 2014Introduction to cassandra 2014
Introduction to cassandra 2014
 
Introduction to Cassandra & Data model
Introduction to Cassandra & Data modelIntroduction to Cassandra & Data model
Introduction to Cassandra & Data model
 
Cassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural LessonsCassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
Cassandra Community Webinar: From Mongo to Cassandra, Architectural Lessons
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right Way
 
Cassandra Data Model
Cassandra Data ModelCassandra Data Model
Cassandra Data Model
 
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source EffortsCassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
Cassandra EU 2012 - Netflix's Cassandra Architecture and Open Source Efforts
 
Introduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and ConsistencyIntroduction to Cassandra: Replication and Consistency
Introduction to Cassandra: Replication and Consistency
 
Apache cassandra architecture internals
Apache cassandra architecture internalsApache cassandra architecture internals
Apache cassandra architecture internals
 
Cassandra 2.0 and timeseries
Cassandra 2.0 and timeseriesCassandra 2.0 and timeseries
Cassandra 2.0 and timeseries
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for Sysadmins
 
An Introduction to Distributed Search with Cassandra and Solr
An Introduction to Distributed Search with Cassandra and SolrAn Introduction to Distributed Search with Cassandra and Solr
An Introduction to Distributed Search with Cassandra and Solr
 

Semelhante a Intro to Cassandra Architecture

Analyzing Time-Series Data with Apache Spark and Cassandra - StampedeCon 2016
Analyzing Time-Series Data with Apache Spark and Cassandra - StampedeCon 2016Analyzing Time-Series Data with Apache Spark and Cassandra - StampedeCon 2016
Analyzing Time-Series Data with Apache Spark and Cassandra - StampedeCon 2016StampedeCon
 
Nike Tech Talk: Double Down on Apache Cassandra and Spark
Nike Tech Talk:  Double Down on Apache Cassandra and SparkNike Tech Talk:  Double Down on Apache Cassandra and Spark
Nike Tech Talk: Double Down on Apache Cassandra and SparkPatrick McFadin
 
NewSQL - Deliverance from BASE and back to SQL and ACID
NewSQL - Deliverance from BASE and back to SQL and ACIDNewSQL - Deliverance from BASE and back to SQL and ACID
NewSQL - Deliverance from BASE and back to SQL and ACIDTony Rogerson
 
Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0DataStax
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataPatrick McFadin
 
Apache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fireApache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the firePatrick McFadin
 
Successful Architectures for Fast Data
Successful Architectures for Fast DataSuccessful Architectures for Fast Data
Successful Architectures for Fast DataPatrick McFadin
 
UKOUG Tech15 - Deploying Oracle 12c Cloud Control in Maximum Availability Arc...
UKOUG Tech15 - Deploying Oracle 12c Cloud Control in Maximum Availability Arc...UKOUG Tech15 - Deploying Oracle 12c Cloud Control in Maximum Availability Arc...
UKOUG Tech15 - Deploying Oracle 12c Cloud Control in Maximum Availability Arc...Zahid Anwar (OCM)
 
Redis Reliability, Performance & Innovation
Redis Reliability, Performance & InnovationRedis Reliability, Performance & Innovation
Redis Reliability, Performance & InnovationRedis Labs
 
Unleash oracle 12c performance with cisco ucs
Unleash oracle 12c performance with cisco ucsUnleash oracle 12c performance with cisco ucs
Unleash oracle 12c performance with cisco ucssolarisyougood
 
VMworld 2016: How to Deploy VMware NSX with Cisco Infrastructure
VMworld 2016: How to Deploy VMware NSX with Cisco InfrastructureVMworld 2016: How to Deploy VMware NSX with Cisco Infrastructure
VMworld 2016: How to Deploy VMware NSX with Cisco InfrastructureVMworld
 
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...DataStax
 
Deploying OpenStack Private Cloud on NEC DX1000 MicroServer Chassis
Deploying OpenStack Private Cloud on NEC DX1000 MicroServer ChassisDeploying OpenStack Private Cloud on NEC DX1000 MicroServer Chassis
Deploying OpenStack Private Cloud on NEC DX1000 MicroServer ChassisPrincipled Technologies
 
Oracle Fleet Patching and Provisioning Deep Dive Webcast Slides
Oracle Fleet Patching and Provisioning Deep Dive Webcast SlidesOracle Fleet Patching and Provisioning Deep Dive Webcast Slides
Oracle Fleet Patching and Provisioning Deep Dive Webcast SlidesLudovico Caldara
 
Vijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-featuresVijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-featuresmkorremans
 
Presentation why v mware v-sphere runs better in cisco ucs
Presentation   why v mware v-sphere runs better in cisco ucsPresentation   why v mware v-sphere runs better in cisco ucs
Presentation why v mware v-sphere runs better in cisco ucssolarisyougood
 
Comparison of ACFS and DBFS
Comparison of ACFS and DBFSComparison of ACFS and DBFS
Comparison of ACFS and DBFSDanielHillinger
 

Semelhante a Intro to Cassandra Architecture (20)

Analyzing Time-Series Data with Apache Spark and Cassandra - StampedeCon 2016
Analyzing Time-Series Data with Apache Spark and Cassandra - StampedeCon 2016Analyzing Time-Series Data with Apache Spark and Cassandra - StampedeCon 2016
Analyzing Time-Series Data with Apache Spark and Cassandra - StampedeCon 2016
 
Devops kc
Devops kcDevops kc
Devops kc
 
Nike Tech Talk: Double Down on Apache Cassandra and Spark
Nike Tech Talk:  Double Down on Apache Cassandra and SparkNike Tech Talk:  Double Down on Apache Cassandra and Spark
Nike Tech Talk: Double Down on Apache Cassandra and Spark
 
BigData Developers MeetUp
BigData Developers MeetUpBigData Developers MeetUp
BigData Developers MeetUp
 
NewSQL - Deliverance from BASE and back to SQL and ACID
NewSQL - Deliverance from BASE and back to SQL and ACIDNewSQL - Deliverance from BASE and back to SQL and ACID
NewSQL - Deliverance from BASE and back to SQL and ACID
 
Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series data
 
Apache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fireApache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fire
 
Successful Architectures for Fast Data
Successful Architectures for Fast DataSuccessful Architectures for Fast Data
Successful Architectures for Fast Data
 
UKOUG Tech15 - Deploying Oracle 12c Cloud Control in Maximum Availability Arc...
UKOUG Tech15 - Deploying Oracle 12c Cloud Control in Maximum Availability Arc...UKOUG Tech15 - Deploying Oracle 12c Cloud Control in Maximum Availability Arc...
UKOUG Tech15 - Deploying Oracle 12c Cloud Control in Maximum Availability Arc...
 
Redis Reliability, Performance & Innovation
Redis Reliability, Performance & InnovationRedis Reliability, Performance & Innovation
Redis Reliability, Performance & Innovation
 
Unleash oracle 12c performance with cisco ucs
Unleash oracle 12c performance with cisco ucsUnleash oracle 12c performance with cisco ucs
Unleash oracle 12c performance with cisco ucs
 
VMworld 2016: How to Deploy VMware NSX with Cisco Infrastructure
VMworld 2016: How to Deploy VMware NSX with Cisco InfrastructureVMworld 2016: How to Deploy VMware NSX with Cisco Infrastructure
VMworld 2016: How to Deploy VMware NSX with Cisco Infrastructure
 
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...
Operations, Consistency, Failover for Multi-DC Clusters (Alexander Dejanovski...
 
Deploying OpenStack Private Cloud on NEC DX1000 MicroServer Chassis
Deploying OpenStack Private Cloud on NEC DX1000 MicroServer ChassisDeploying OpenStack Private Cloud on NEC DX1000 MicroServer Chassis
Deploying OpenStack Private Cloud on NEC DX1000 MicroServer Chassis
 
Oracle Fleet Patching and Provisioning Deep Dive Webcast Slides
Oracle Fleet Patching and Provisioning Deep Dive Webcast SlidesOracle Fleet Patching and Provisioning Deep Dive Webcast Slides
Oracle Fleet Patching and Provisioning Deep Dive Webcast Slides
 
Vijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-featuresVijfhart thema-avond-oracle-12c-new-features
Vijfhart thema-avond-oracle-12c-new-features
 
Presentation why v mware v-sphere runs better in cisco ucs
Presentation   why v mware v-sphere runs better in cisco ucsPresentation   why v mware v-sphere runs better in cisco ucs
Presentation why v mware v-sphere runs better in cisco ucs
 
Comparison of ACFS and DBFS
Comparison of ACFS and DBFSComparison of ACFS and DBFS
Comparison of ACFS and DBFS
 

Mais de nickmbailey

Clojure at DataStax: The Long Road From Python to Clojure
Clojure at DataStax: The Long Road From Python to ClojureClojure at DataStax: The Long Road From Python to Clojure
Clojure at DataStax: The Long Road From Python to Clojurenickmbailey
 
Lightning fast analytics with Spark and Cassandra
Lightning fast analytics with Spark and CassandraLightning fast analytics with Spark and Cassandra
Lightning fast analytics with Spark and Cassandranickmbailey
 
Cassandra and Clojure
Cassandra and ClojureCassandra and Clojure
Cassandra and Clojurenickmbailey
 
An Introduction to Cassandra on Linux
An Introduction to Cassandra on LinuxAn Introduction to Cassandra on Linux
An Introduction to Cassandra on Linuxnickmbailey
 
Introduction to Cassandra and Data Modeling
Introduction to Cassandra and Data ModelingIntroduction to Cassandra and Data Modeling
Introduction to Cassandra and Data Modelingnickmbailey
 
CFS: Cassandra backed storage for Hadoop
CFS: Cassandra backed storage for HadoopCFS: Cassandra backed storage for Hadoop
CFS: Cassandra backed storage for Hadoopnickmbailey
 
Clojure and the Web
Clojure and the WebClojure and the Web
Clojure and the Webnickmbailey
 

Mais de nickmbailey (7)

Clojure at DataStax: The Long Road From Python to Clojure
Clojure at DataStax: The Long Road From Python to ClojureClojure at DataStax: The Long Road From Python to Clojure
Clojure at DataStax: The Long Road From Python to Clojure
 
Lightning fast analytics with Spark and Cassandra
Lightning fast analytics with Spark and CassandraLightning fast analytics with Spark and Cassandra
Lightning fast analytics with Spark and Cassandra
 
Cassandra and Clojure
Cassandra and ClojureCassandra and Clojure
Cassandra and Clojure
 
An Introduction to Cassandra on Linux
An Introduction to Cassandra on LinuxAn Introduction to Cassandra on Linux
An Introduction to Cassandra on Linux
 
Introduction to Cassandra and Data Modeling
Introduction to Cassandra and Data ModelingIntroduction to Cassandra and Data Modeling
Introduction to Cassandra and Data Modeling
 
CFS: Cassandra backed storage for Hadoop
CFS: Cassandra backed storage for HadoopCFS: Cassandra backed storage for Hadoop
CFS: Cassandra backed storage for Hadoop
 
Clojure and the Web
Clojure and the WebClojure and the Web
Clojure and the Web
 

Último

Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfDrew Moseley
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceBrainSell Technologies
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxAndreas Kunz
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmSujith Sukumaran
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfStefano Stabellini
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
How To Manage Restaurant Staff -BTRESTRO
How To Manage Restaurant Staff -BTRESTROHow To Manage Restaurant Staff -BTRESTRO
How To Manage Restaurant Staff -BTRESTROmotivationalword821
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Velvetech LLC
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentationvaddepallysandeep122
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 

Último (20)

Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdf
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
How To Manage Restaurant Staff -BTRESTRO
How To Manage Restaurant Staff -BTRESTROHow To Manage Restaurant Staff -BTRESTRO
How To Manage Restaurant Staff -BTRESTRO
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
PREDICTING RIVER WATER QUALITY ppt presentation
PREDICTING  RIVER  WATER QUALITY  ppt presentationPREDICTING  RIVER  WATER QUALITY  ppt presentation
PREDICTING RIVER WATER QUALITY ppt presentation
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 

Intro to Cassandra Architecture

  • 1. Nick Bailey @nickmbailey Intro to Cassandra Architecture 1
  • 2. 4.1 Cassandra - Introduction
  • 4. Dynamo Paper(2007) • How do we build a data store that is: • Reliable • Performant • “Always On” • Nothing new and shiny • 24 papers cited Also the basis for Riak and Voldemort
  • 5. BigTable(2006) • Richer data model • 1 key. Lots of values • Fast sequential access • 38 Papers cited
  • 6. Cassandra(2008) • Distributed features of Dynamo • Data Model and storage from BigTable • February 17, 2010 it graduated to a top-level Apache project
  • 7. Cassandra - More than one server • All nodes participate in a cluster • Shared nothing • Add or remove as needed • More capacity? Add a server
 7
  • 8. 8 Cassandra HBase Redis MySQL THROUGHPUTOPS/SEC) VLDB benchmark
  • 9. Cassandra - Fully Replicated • Client writes local • Data syncs across WAN • Replication per Data Center 9
  • 11. Summary •The evolution of the internet and online data created new problems •Apache Cassandra was based on a variety of technologies to solve these problems •The goals of Apache Cassandra are all about staying online and performant •Apache Cassandra is a database best used for applications, close to your users
  • 12. 4.1.2 Cassandra - Basic Architecture
  • 15. Partition with Clustering Cluster 1 Partition Key 1 Column 1 Column 2 Column 3 Cluster 2 Partition Key 1 Column 1 Column 2 Column 3 Cluster 3 Partition Key 1 Column 1 Column 2 Column 3 Cluster 4 Partition Key 1 Column 1 Column 2 Column 3
  • 16. Table Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Column 2 Column 3 Column 4 Column 1 Column 2 Column 3 Column 4 Column 1 Column 2 Column 3 Column 4 Partition Key 2 Partition Key 2 Partition Key 2
  • 17. Keyspace Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 1 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Column 1 Partition Key 2 Column 2 Column 3 Column 4 Table 1 Table 2 Keyspace 1
  • 19. Token Server •Each partition is a 128 bit value •Consistent hash between 2-63 and 264 •Each node owns a range of those values •The token is the beginning of that range to the next node’s token value •Virtual Nodes break these down further Data Token Range 0 …
  • 20. The cluster Server Token Range 0 0-100 0-100
  • 21. The cluster Server Token Range 0 0-50 51 51-100 Server 0-50 51-100
  • 22. The cluster Server Token Range 0 0-25 26 26-50 51 51-75 76 76-100 Server ServerServer 0-25 76-100 26-5051-75
  • 23. Summary •Tables store rows of data by column •Partitions are similar data grouped by a partition key •Keyspaces contain tables and are grouped by data center •Tokens show node placement in the range of cluster data
  • 24. 4.1.3 Cassandra - Replication, High Availability and Multi-datacenter
  • 25. Replication 10.0.0. 1 DC1 DC1: RF=1 Node Primary 10.0.0.1 00-25 10.0.0.2 26-50 10.0.0.3 51-75 10.0.0.4 76-100 10.0.0.1 00-25 10.0.0.4 76-100 10.0.0.2 26-50 10.0.0.3 51-75
  • 26. Replication 10.0.0.1 00-25 10.0.0.4 76-100 10.0.0.2 26-50 10.0.0.3 51-75 DC1 DC1: RF=2 Node Primary Replica 10.0.0.1 00-25 76-100 10.0.0.2 26-50 00-25 10.0.0.3 51-75 26-50 10.0.0.4 76-100 51-75 76-100 00-25 26-50 51-75
  • 27. Replication DC1 DC1: RF=3 Node Primary Replica Replica 10.0.0.1 00-25 76-100 51-75 10.0.0.2 26-50 00-25 76-100 10.0.0.3 51-75 26-50 00-25 10.0.0.4 76-100 51-75 26-50 10.0.0.1 00-25 10.0.0.4 76-100 10.0.0.2 26-50 10.0.0.3 51-75 76-100 51-75 00-25 76-100 26-50 00-25 51-75 26-50
  • 28. Replication DC1 DC1: RF=3 Node Primary Replica Replica 10.0.0.1 00-25 76-100 51-75 10.0.0.2 26-50 00-25 76-100 10.0.0.3 51-75 26-50 00-25 10.0.0.4 76-100 51-75 26-50 10.0.0.1 00-25 10.0.0.4 76-100 10.0.0.2 26-50 10.0.0.3 51-75 76-100 51-75 00-25 76-100 26-50 00-25 51-75 26-50 Client Write to partition 15 ???
  • 29. Consistency level Consistency Level Number of Nodes Acknowledged One One - Read repair triggered Local One One - Read repair in local DC Quorum 51% Local Quorum 51% in local DC
  • 30. Consistency DC1 DC1: RF=3 Node Primary Replica Replica 10.0.0.1 00-25 76-100 51-75 10.0.0.2 26-50 00-25 76-100 10.0.0.3 51-75 26-50 00-25 10.0.0.4 76-100 51-75 26-50 10.0.0.1 00-25 10.0.0.4 76-100 10.0.0.2 26-50 10.0.0.3 51-75 76-100 51-75 00-25 76-100 26-50 00-25 51-75 26-50 Client Write to partition 15 CL= One
  • 31. Consistency DC1 DC1: RF=3 Node Primary Replica Replica 10.0.0.1 00-25 76-100 51-75 10.0.0.2 26-50 00-25 76-100 10.0.0.3 51-75 26-50 00-25 10.0.0.4 76-100 51-75 26-50 10.0.0.1 00-25 10.0.0.4 76-100 10.0.0.2 26-50 10.0.0.3 51-75 76-100 51-75 00-25 76-100 26-50 00-25 51-75 26-50 Client Write to partition 15 CL= One
  • 32. Consistency DC1 DC1: RF=3 Node Primary Replica Replica 10.0.0.1 00-25 76-100 51-75 10.0.0.2 26-50 00-25 76-100 10.0.0.3 51-75 26-50 00-25 10.0.0.4 76-100 51-75 26-50 10.0.0.1 00-25 10.0.0.4 76-100 10.0.0.2 26-50 10.0.0.3 51-75 76-100 51-75 00-25 76-100 26-50 00-25 51-75 26-50 Client Write to partition 15 CL= Quorum
  • 33. Multi-datacenter DC1 DC1: RF=3 Node Primary Replica Replica 10.0.0.1 00-25 76-100 51-75 10.0.0.2 26-50 00-25 76-100 10.0.0.3 51-75 26-50 00-25 10.0.0.4 76-100 51-75 26-50 10.0.0.1 00-25 10.0.0.4 76-100 10.0.0.2 26-50 10.0.0.3 51-75 76-100 51-75 00-25 76-100 26-50 00-25 51-75 26-50 Client Write to partition 15 DC2 10.1.0.1 00-25 10.1.0.4 76-100 10.1.0.2 26-50 10.1.0.3 51-75 76-100 51-75 00-25 76-100 26-50 00-25 51-75 26-50 Node Primary Replica Replica 10.0.0.1 00-25 76-100 51-75 10.0.0.2 26-50 00-25 76-100 10.0.0.3 51-75 26-50 00-25 10.0.0.4 76-100 51-75 26-50 DC2: RF=3
  • 34. Multi-datacenter DC1 DC1: RF=3 Node Primary Replica Replica 10.0.0.1 00-25 76-100 51-75 10.0.0.2 26-50 00-25 76-100 10.0.0.3 51-75 26-50 00-25 10.0.0.4 76-100 51-75 26-50 10.0.0.1 00-25 10.0.0.4 76-100 10.0.0.2 26-50 10.0.0.3 51-75 76-100 51-75 00-25 76-100 26-50 00-25 51-75 26-50 Client Write to partition 15 DC2 10.1.0.1 00-25 10.1.0.4 76-100 10.1.0.2 26-50 10.1.0.3 51-75 76-100 51-75 00-25 76-100 26-50 00-25 51-75 26-50 Node Primary Replica Replica 10.0.0.1 00-25 76-100 51-75 10.0.0.2 26-50 00-25 76-100 10.0.0.3 51-75 26-50 00-25 10.0.0.4 76-100 51-75 26-50 DC2: RF=3
  • 35. Multi-datacenter DC1 DC1: RF=3 Node Primary Replica Replica 10.0.0.1 00-25 76-100 51-75 10.0.0.2 26-50 00-25 76-100 10.0.0.3 51-75 26-50 00-25 10.0.0.4 76-100 51-75 26-50 10.0.0.1 00-25 10.0.0.4 76-100 10.0.0.2 26-50 10.0.0.3 51-75 76-100 51-75 00-25 76-100 26-50 00-25 51-75 26-50 Client Write to partition 15 DC2 10.1.0.1 00-25 10.1.0.4 76-100 10.1.0.2 26-50 10.1.0.3 51-75 76-100 51-75 00-25 76-100 26-50 00-25 51-75 26-50 Node Primary Replica Replica 10.0.0.1 00-25 76-100 51-75 10.0.0.2 26-50 00-25 76-100 10.0.0.3 51-75 26-50 00-25 10.0.0.4 76-100 51-75 26-50 DC2: RF=3
  • 36. Summary •Replication Factor indicates how many times your data is copied •Consistency Level specifies how many replicas are consistent at read or write •Replication along with Consistency Factor are critical for uptime
  • 37. 4.2.1.1.3 Cassandra - Read and Write Path (Node Architecture)
  • 38. Writes CREATE TABLE raw_weather_data (
 wsid text,
 year int,
 month int,
 day int,
 hour int,
 temperature double,
 dewpoint double,
 pressure double,
 wind_direction int,
 wind_speed double,
 sky_condition int,
 sky_condition_text text,
 one_hour_precip double,
 six_hour_precip double,
 PRIMARY KEY ((wsid), year, month, day, hour)
 ) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, hour DESC);
  • 39. Writes CREATE TABLE raw_weather_data (
 wsid text,
 year int,
 month int,
 day int,
 hour int,
 temperature double,
 PRIMARY KEY ((wsid), year, month, day, hour)
 ) WITH CLUSTERING ORDER BY (year DESC, month DESC, day DESC, hour DESC); INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
 VALUES (‘10010:99999’,2005,12,1,10,-5.6); INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
 VALUES (‘10010:99999’,2005,12,1,9,-5.1); INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
 VALUES (‘10010:99999’,2005,12,1,8,-4.9); INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
 VALUES (‘10010:99999’,2005,12,1,7,-5.3);
  • 40. Write Path Client INSERT INTO raw_weather_data(wsid,year,month,day,hour,temperature)
 VALUES (‘10010:99999’,2005,12,1,7,-5.3); year 1wsid 1 month 1 day 1 hour 1 year 2wsid 2 month 2 day 2 hour 2 Memtable SSTable SSTable SSTable SSTable Node Commit Log Data * Compaction * Temp Temp Memory Disk
  • 41. Read Path Client SSTable SSTable SSTable Node Data SELECT wsid,hour,temperature
 FROM raw_weather_data
 WHERE wsid='10010:99999'
 AND year = 2005 AND month = 12 AND day = 1 
 AND hour >= 7 AND hour <= 10; year 1wsid 1 month 1 day 1 hour 1 year 2wsid 2 month 2 day 2 hour 2 Memtable Temp Temp Memory Disk
  • 42. Summary •By default, writes are durable •Client receives ack when consistency level is achieved •Reads must always go to disk •Compaction is data housekeeping