Introduction to NoSQL
and Cassandra
Janos Geronimo
Overview
• NoSQL
• Brief History of Cassandra
• Architecture
• Terminology
• Cassandra Query Language
• Basic CRUD Operations using CQL (Possibly in
MULE)
• References, For Further Reading/Implementation
pt2.
NoSQL
• Originally meant "non-SQL" or "non-relational".
• Sometimes read as "Not only SQL", to emphasize that such
systems may support SQL-like query languages.
• Driven by the growing needs of Web 2.0 companies such
as Facebook, Google and Amazon, which handle very large
volumes of data (big data or real-time data) and need
fast responses to users (for example by serving small,
cached pieces of data).
• Suited to data that is not easily modelled in a
traditional relational database.
An Example Use Case of
NoSQL
Let’s create a new social engagement (dating) site
wherein Users can create posts, add pictures, videos
and music to them. Other users can comment on the
posts and give points (likes, thumbs up, thumbs down)
to rate the posts. The landing page (Home) will have a
feed of posts that users can share and interact with.
How we will map it using
SQL
How do we display a Post by a certain user using SQL?
How we will map it using
NoSQL
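As a rough illustration of the idea (not the mapping from the original slide), the Python sketch below stores each user's posts as one nested, denormalized record, so showing a user's posts is a single key lookup instead of a join across several tables. The field names, the sample values and the `posts_for` helper are all hypothetical.

```python
# Hypothetical denormalized storage: one entry per user, posts nested inside.
posts_by_user = {
    "juan": [
        {
            "post_id": 1,
            "text": "Hello!",
            "media": [{"type": "picture", "url": "hello.jpg"}],
            "comments": [{"user": "ben", "text": "Welcome"}],
            "ratings": {"likes": 2, "thumbs_up": 1, "thumbs_down": 0},
        }
    ]
}


def posts_for(user):
    """Return a user's posts with a single key lookup, no joins."""
    return posts_by_user.get(user, [])


for post in posts_for("juan"):
    print(post["text"], len(post["comments"]), post["ratings"]["likes"])
```

The trade-off is that comments and ratings are duplicated into the post record, so writes must keep those copies up to date.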
Use of NoSQL and SQL
Brief Comparison of SQL
and NoSQL
Brief History of Cassandra
• Cassandra was developed at Facebook to power inbox search
(messaging).
• Facebook open-sourced it in July 2008.
• It was accepted into the Apache Incubator in March 2009.
• It became an Apache top-level project in February 2010.
• The name "Cassandra" comes from Greek mythology: a gifted
prophet who could see the future but whom, unfortunately, no one
believed. It is said that one reason behind the name was that
NoSQL was not seen as a "believable" solution to today's and
future data needs.
Features of Cassandra
• Highly Scalable - add more nodes to a cluster, or add another cluster, to accommodate more clients and data.
• Masterless Design - all nodes are the same, which provides operational simplicity and easy scale-out.
• "Always-on" / Continuous Availability - offers redundancy of both data and node function, has no single point
of failure, and stays available for business-critical applications that cannot afford downtime.
• Linear-scale performance - throughput increases with the number of nodes in the cluster.
• Flexible Data Storage - supports structured (RDBMS-like) and semi-structured data storage (column name-value
or key-value; table x row x column).
• Data Replication - each row is copied to as many nodes as the keyspace's replication factor specifies. Nodes
use the Gossip Protocol to exchange cluster state and to detect whether a node is alive.
• Active "everywhere" design - all nodes may be written to and read from.
• Strong data protection - a commit log design protects acknowledged writes against loss if a node crashes, and
built-in security with backup/restore keeps data protected and safe.
• Cassandra Query Language (CQL) - the primary language for communicating with the Cassandra database.
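To show how a masterless ring can spread copies of a row, here is a minimal Python sketch. It is a simplification under stated assumptions, not Cassandra's implementation: it hashes with MD5 rather than Cassandra's default partitioner, gives each node a single token instead of virtual nodes, and the node names and the `replicas_for` helper are hypothetical.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a position on the ring (MD5 is used only for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def replicas_for(partition_key, nodes, replication_factor):
    """Return the nodes that would hold copies of the given partition key."""
    # Each node owns one token; the ring is the nodes sorted by token.
    ring = sorted((token(name), name) for name in nodes)
    tokens = [t for t, _ in ring]
    # The first replica is the first node whose token is >= the key's token,
    # wrapping around to the start of the ring.
    start = bisect.bisect_left(tokens, token(partition_key)) % len(ring)
    count = min(replication_factor, len(ring))
    # Further replicas are the next nodes clockwise on the ring.
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


nodes = ["node-a", "node-b", "node-c", "node-d"]
owners = replicas_for("userId:1", nodes, replication_factor=3)
print(owners)  # three distinct nodes out of the four
```

Because every node can compute the same placement, no master is needed to decide where a row lives, and losing one of the three owners still leaves two copies.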
Cassandra Architecture
Cassandra - Data Read and
Write
Terminologies
• Keyspace - a container for your application data. It is
similar to a schema in Oracle or PostgreSQL, or a database in
other RDBMSs.
• Column Family / Table - a container for rows. The column is
the most basic unit in the Cassandra data model; each column
consists of a name, a value, and a timestamp, plus an optional
Time To Live (TTL).
• By ignoring the timestamp, you can represent a column as a
name-value pair.
• You can also configure a default TTL for a whole Column Family.
• Within a partition, Cassandra always stores rows sorted by the
clustering columns of their Primary Key.
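The role of the timestamp and the TTL can be sketched in a few lines of Python. This is a simplified model, not Cassandra's storage code: the `Column` class and `reconcile` function are hypothetical names, timestamps are assumed to be microseconds, and the sample values are illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Column:
    name: str
    value: str
    timestamp: int              # write time in microseconds
    ttl: Optional[int] = None   # seconds to live; None means it never expires

    def is_live(self, now):
        """A column with a TTL stops being visible once the TTL has elapsed."""
        if self.ttl is None:
            return True
        return now < self.timestamp + self.ttl * 1_000_000


def reconcile(a, b):
    """Pick between two versions of the same column: the newer timestamp wins."""
    return a if a.timestamp >= b.timestamp else b


old = Column("address", "Manila", timestamp=1_000_000)
new = Column("address", "Quezon City", timestamp=2_000_000)
session = Column("password", "luxerey", timestamp=1_000_000, ttl=100)

print(reconcile(old, new).value)         # Quezon City
print(session.is_live(now=50_000_000))   # True
print(session.is_live(now=200_000_000))  # False
```

Keeping a timestamp on every column is what lets replicas that received writes in different orders agree on the same final value.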
Terminologies (cont.)
Contents of Column Family /
Table
[Figure: a Column Family shown as Rows, each made up of Columns]
Cassandra Query Language
• The basic way to interact with Cassandra is the CQL
shell (cqlsh).
• You can administer cluster nodes, roles and clients
(users) via the CQL shell.
• CQL3 borrowed many SQL features, such as ORDER BY
and filtering, but it still has no JOINs or
subqueries.
Create a Keyspace
CREATE KEYSPACE users
WITH replication = {
'class' : 'SimpleStrategy',
// SimpleStrategy suits a single data center;
// use 'NetworkTopologyStrategy' for multiple data centers
'replication_factor' : 1
// number of copies of each row across nodes
};
Create a Column Family
(Table)
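-- COLUMNFAMILY can be used in place of TABLE
CREATE TABLE users.user_profile (
userId int,
checked_at timestamp,
departmentId int,
firstName text,
lastName text,
address text,
password text, -- set by the UPDATE examples
PRIMARY KEY (userId, checked_at)) -- compound primary key
WITH CLUSTERING ORDER BY (checked_at ASC);
* userId is the partition key and checked_at is the clustering column. Only clustering
columns can sort results (ORDER BY), and only when the partition key is restricted in WHERE.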
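The effect of a compound primary key can be sketched in Python: the first part (userId) selects a partition, and the second part (checked_at) fixes the row's position inside it. This is a toy in-memory model with hypothetical `insert` and `select` helpers, not how Cassandra stores data on disk.

```python
import bisect
from collections import defaultdict

# partition key (userId) -> list of (clustering key, row), kept sorted by clustering key
partitions = defaultdict(list)


def insert(user_id, checked_at, row):
    """Place a row inside its partition at the position given by checked_at."""
    rows = partitions[user_id]
    keys = [key for key, _ in rows]
    index = bisect.bisect_left(keys, checked_at)
    if index < len(rows) and rows[index][0] == checked_at:
        rows[index] = (checked_at, row)        # same primary key: overwrite
    else:
        rows.insert(index, (checked_at, row))  # new row, inserted in sorted position


def select(user_id, descending=False):
    """Read one partition; rows come back already ordered by checked_at."""
    rows = partitions.get(user_id, [])
    return list(reversed(rows)) if descending else list(rows)


insert(1, "2016-06-21T09:12", {"departmentId": 110})
insert(1, "2016-06-21T09:10", {"departmentId": 108})
insert(1, "2016-06-21T09:11", {"departmentId": 109})
print([key for key, _ in select(1)])
# ['2016-06-21T09:10', '2016-06-21T09:11', '2016-06-21T09:12']
```

Since rows are ordered when they are written, reading a partition in clustering order (or its reverse) needs no sort at query time, which is why sorting by any other column is not offered.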
Inserting Data
INSERT INTO users.user_profile (userId, checked_at, departmentId, lastName, firstName, address)
VALUES (1, '2016-06-21T09:10+1300', 108, 'Dela Cruz', 'Juan', 'Manila');
INSERT INTO users.user_profile (userId, checked_at, departmentId, lastName, firstName, address)
VALUES (2, '2016-06-21T09:11+1300', 109, 'Tambling', 'Ben', 'Manila');
INSERT INTO users.user_profile (userId, checked_at, departmentId, lastName, firstName, address)
VALUES (3, '2016-06-21T09:12+1300', 110, 'Badiday', 'Inday', 'Manila');
INSERT INTO users.user_profile (userId, checked_at, departmentId, lastName, firstName, address)
VALUES (4, '2016-06-21T09:13+1300', 111, 'Ayala', 'Joey', 'Manila');
INSERT INTO users.user_profile (userId, checked_at, departmentId, lastName, firstName, address)
VALUES (3, '2016-06-21T09:12+1300', 109, 'Badiday', 'Inday', 'Manila') IF NOT EXISTS;
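The last statement reuses the primary key of an earlier row, which is where IF NOT EXISTS matters. The Python sketch below models the difference on a plain dictionary; it is an illustration only, with hypothetical helper names, and it ignores the coordination between replicas that a real conditional write involves.

```python
table = {}  # (userId, checked_at) -> row


def insert(key, row):
    """Plain INSERT: an upsert that silently overwrites columns of an existing row."""
    table[key] = {**table.get(key, {}), **row}


def insert_if_not_exists(key, row):
    """INSERT ... IF NOT EXISTS: write only when no row has this primary key."""
    if key in table:
        return False   # not applied; the existing row is left untouched
    table[key] = dict(row)
    return True


key = (3, "2016-06-21T09:12+1300")
insert(key, {"departmentId": 110, "lastName": "Badiday"})
applied = insert_if_not_exists(key, {"departmentId": 109, "lastName": "Badiday"})
print(applied, table[key]["departmentId"])  # False 110
```

Without the condition, the second write would have replaced departmentId 110 with 109 and reported nothing unusual.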
Selecting Data
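SELECT * FROM users.user_profile WHERE userId = 1;
-- The IN list may hold any number of ids. ORDER BY accepts only the
-- clustering column (checked_at); combined with IN on the partition
-- key it also requires paging to be disabled.
SELECT * FROM users.user_profile WHERE userId IN (1, 2, 3)
ORDER BY checked_at ASC;
-- departmentId is not part of the primary key, so filtering on it
-- needs a secondary index or, on recent versions, ALLOW FILTERING.
SELECT * FROM users.user_profile WHERE userId = 1
AND departmentId = 110 ALLOW FILTERING;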
Updating Data
UPDATE users.user_profile SET password='luxerey' WHERE
userid=1 AND checked_at='2016-06-21T09:14+1300';
* Per column, you can individually set its time to live in seconds
(useful for sessions, auth keys).
UPDATE users.user_profile USING TTL 100 SET
password='luxerey' WHERE userid=1 AND
checked_at='2016-06-21T09:14+1300';
Deleting Data (Row and
Columns)
* You can delete a specific column:
DELETE password FROM users.user_profile where userid = 1 AND
checked_at='2016-06-21T09:14+1300';
* Or you can delete a whole row:
DELETE FROM users.user_profile WHERE userid=1 AND
checked_at='2016-06-21T09:14+1300';
References
• DataStax -
http://www.datastax.com/documentation/cql/3.0/cql/cql_reference
• Planet Cassandra - http://www.planetcassandra.org/blog/cql-cassandra-query-language/
• https://www.ibm.com/developerworks/library/os-apache-cassandra/
• http://mechanics.flite.com/blog/2013/11/05/breaking-down-the-cql-where-clause/
• http://hector-client.github.io/hector/build/html/index.html
• http://www.ecyrd.com/cassandracalculator/
