Cassandra Core Concepts
and why Netflix runs Cassandra on the cloud
Erick Ramirez @flightc, DataStax Engineering
Erick Ramirez© 2015 DataStax, All Rights Reserved.
@flightc
Welcome
• Introducing Cassandra
• Why Netflix runs Cassandra on the cloud
• Feel free to ask questions
Relational data model
• Normalised schema, table joins, ACID
• Joins are very expensive on billions of rows
• Sharding tables across systems is complex
• Performance preferred over “always on”
• Requires massive high-end systems
Big data requirements
• Distribute data across multiple nodes
• Relaxed consistency
• Relaxed schema
• Scale, scale, scale!
NoSQL landscape
• Graph, Key-value, Document, Column family
• Consistency - same result regardless of node
• Availability - every request gets a response, even under high read/write volumes
• Partition tolerance - survive network isolation
CAP theorem
What is Cassandra?
• Massively scalable NoSQL database
• Fully distributed, no single-point-of-failure
• Open sourced by Facebook
• Linear horizontal scaling
Modelling Cassandra
• Use Cassandra Query Language (CQL)
• SQL-like syntax and approach
• CREATE, ALTER, DROP
• SELECT, INSERT, UPDATE, DELETE
Modelling Cassandra
CREATE TABLE users (
    userid text,
    name text,
    email text,
    PRIMARY KEY (userid)
);
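As a minimal sketch of the statements listed on the previous slide, the CQL below works against the users table defined above. The row values ('jsmith', the example.com addresses) and the added country column are hypothetical, invented only for illustration. It assumes a keyspace has already been created and selected.

```cql
-- Insert a row; the values here are made up for the example.
INSERT INTO users (userid, name, email)
VALUES ('jsmith', 'John Smith', 'jsmith@example.com');

-- Read the row back by its primary key.
SELECT name, email FROM users WHERE userid = 'jsmith';

-- Change one column of the existing row.
UPDATE users SET email = 'john.smith@example.com' WHERE userid = 'jsmith';

-- Add a column to the table (hypothetical column name).
ALTER TABLE users ADD country text;

-- Remove the row.
DELETE FROM users WHERE userid = 'jsmith';
```

Every read and write here names the primary key (userid) in its WHERE clause, which is how Cassandra locates the row. Unlike in most SQL databases, a plain INSERT or UPDATE in Cassandra does not check whether the row already exists; both simply write the given values.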
Why Cassandra
• All nodes are the same - no SPOF
• Real-time, durable writes
• Linear scaling on commodity servers
• Real-time replication across data centres
• Always on - no offline operation
• Because you have a scale problem
Why not Cassandra
• RDBMS excels in ACID transactions
• You need to justify your purchase of massive high-end servers
Common use cases
• Personalisation/recommendations (Netflix, eBay)
• Messaging (Instagram)
• IoT (Riptide IO)
• Fraud detection (Barracuda)
• Playlists and collections (Spotify)
• Graph (SpotRight)
A Cassandra cluster
Cassandra Summit 2015
academy.datastax.com
Thank you
• Erick Ramirez
• @flightc

Meetup core concepts-erick-ramirez-20150729
