SlideShare uma empresa Scribd logo
1 de 20
Cassandra
How Stuff Works
Sergey Enin
(sergey_enin@epam.com)
AGENDA

Agenda

Introduction
Architecture

Partitioning & Replication
Data management
Data model
2
Introduction

3
INTRODUCTION: SELECTED CASES

Selected Cases
Who use Cassandra?
eBay has Cassandra supporting multiple
applications (Social Signals, Hunch, and many
time series use cases) with clusters spanning
several data centers.
Netflix is using Cassandra on AWS as a key
infrastructure component of its globally distributed
streaming product.
Shazam uses Cassandra cluster to power their
recommendations system.

and many others…

Check - http://www.datastax.com/cassandrausers

4
INTRODUCTION: MOST ADVANTAGES

Most advantages
Most advantages of Cassandra are:
• Fast writes.
• Tunable consistency.
• Decentralization.

• Integration with Hadoop.
5
Architecture

6
ARCHITECTURE: FAST WRITES

Fast writes
Cassandra is very fast on writes, cause of
use of Log-structured merge tree.

Process of inserting new record into Cassandra

7
ARCHITECTURE: FAST WRITE

How LSM-tree is done: Memtables and SSTables
2
1

3

1

Commit log – all data is written to the
commit log for durability.

2

SSTables are immutable. A row is typically stored across multiple
SSTable files.

3

Each SSTable has a bloom filter associated with it. The
bloom filter is used to check if a requested row key exists in
the SSTable before doing any disk seeks.

4

Deleted data is not immediately removed from disk.
A deleted column can reappear. Tombstones.
8
ARCHITECTURE: NETWORK ARCHITECTURE

Network architecture

• All nodes – are peers
(no master).
• Client specify set of Cassandra nodes and get
connected to first live node.
• Nodes are using gossip protocol.
9
Partitioning & replication

10
PARTITIONING & REPLICATION: DATA PARTITIONING

Data partitioning

Partitioner – determines, where first replica would live in the ring.

•

RandomPartitioner – default strategy, provides ±same load of all
nodes.

•

ByteOrderedPartitioner - orders rows lexically by key bytes, allows
range scans, not recommended.

11
PARTITIONING & REPLICATION: REPLICATION

Replication
Replication = replication factor
+ replica placement strategy
Replica placement strategy:
SimpleStrategy:
•
default strategy;
•
not taking
network topology
into account;

NetworkTopology
Strategy:
•
preferred, when
you have information
about network map
of your nodes;
12
Data management

13
DATA MANAGEMENT: DATA ACCESSING

Data accessing
READ + WRITES:

• Tunable consistency. Consistency level specify
how many nodes should answer for read/write
request(but writes goes to all replicas).
• Batches - sets a global consistency level and
client-supplied timestamp for all columns
written by the statements in the batch.

14
DATA MANAGEMENT: ACID

ACID
ACID

• Atomicity – writes are atomic at row level.
• Consistency – tunable consistency.
• Isolation – writes are invisible until they are
complete.
• Durability – writes are durable.
• Read-repair, anti-entropy node repair, hinted
handoff.

15
Data model

16
DATA MODEL: CASSANDRA`S DATA MODEL

Cassandra`s data model
Relational databases – you design
schema, based on entities and
relationships.
Cassandra – you design schema, based
on what queries you would like to
perform.

17
DATA MODEL: INDEXES

Indexes
An index is a data structure that allows for
fast, efficient lookup of data matching a given
condition.
Primary key – the unique key used to identify
each row in a table.
Secondary indexes – refer to indexes on
column values.

18
DATA MODEL: CQL3

CQL3
cqlsh> INSERT INTO users
(user_name, password)
VALUES ('jsmith', 'ch@ngem3a');
cqlsh> SELECT * FROM users WHERE
user_name='jsmith';
user_name | password | state
-----------+-----------+------jsmith | ch@ngem3a | null
Confidential

19
THANK YOU!

Thank you!

20

Mais conteúdo relacionado

Mais procurados

Connecting kafka message systems with scylla
Connecting kafka message systems with scylla   Connecting kafka message systems with scylla
Connecting kafka message systems with scylla Maheedhar Gunturu
 
Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Dave Gardner
 
Cassandra Community Webinar: CMB - An Open Message Bus for the Cloud
Cassandra Community Webinar: CMB - An Open Message Bus for the CloudCassandra Community Webinar: CMB - An Open Message Bus for the Cloud
Cassandra Community Webinar: CMB - An Open Message Bus for the CloudDataStax
 
DynomiteDB - No spof High-availability Redis cluster solution
DynomiteDB -  No spof High-availability Redis cluster solutionDynomiteDB -  No spof High-availability Redis cluster solution
DynomiteDB - No spof High-availability Redis cluster solutionLeandro Totino Pereira
 
Cassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the ScenesCassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the ScenesDataStax Academy
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...DataStax Academy
 
Beyond Aurora. Scale-out SQL databases for AWS
Beyond Aurora. Scale-out SQL databases for AWS Beyond Aurora. Scale-out SQL databases for AWS
Beyond Aurora. Scale-out SQL databases for AWS Clustrix
 
HIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProHIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProRedis Labs
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internalsnarsiman
 
Cassandra for mission critical data
Cassandra for mission critical dataCassandra for mission critical data
Cassandra for mission critical dataOleksandr Semenov
 
Achieve new levels of performance for Magento e-commerce sites.
Achieve new levels of performance for Magento e-commerce sites.Achieve new levels of performance for Magento e-commerce sites.
Achieve new levels of performance for Magento e-commerce sites.Clustrix
 
Beginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra OpsBeginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra OpsDataStax Academy
 
Discover some "Big Data" architectural concepts with Redis
Discover some  "Big Data" architectural concepts with  Redis Discover some  "Big Data" architectural concepts with  Redis
Discover some "Big Data" architectural concepts with Redis Maturin BADO
 
Cisco UCS Integrated Infrastructure for Big Data with Cassandra
Cisco UCS Integrated Infrastructure for Big Data with CassandraCisco UCS Integrated Infrastructure for Big Data with Cassandra
Cisco UCS Integrated Infrastructure for Big Data with CassandraDataStax Academy
 
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScyllaDB
 
vSphere With OpenStack
vSphere With OpenStackvSphere With OpenStack
vSphere With OpenStackKenneth Hui
 
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScyllaDB
 
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2ScyllaDB
 
Scaling Big Data with Hadoop and Mesos
Scaling Big Data with Hadoop and MesosScaling Big Data with Hadoop and Mesos
Scaling Big Data with Hadoop and MesosDiscover Pinterest
 

Mais procurados (20)

Connecting kafka message systems with scylla
Connecting kafka message systems with scylla   Connecting kafka message systems with scylla
Connecting kafka message systems with scylla
 
Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2Running Cassandra on Amazon EC2
Running Cassandra on Amazon EC2
 
Cassandra Community Webinar: CMB - An Open Message Bus for the Cloud
Cassandra Community Webinar: CMB - An Open Message Bus for the CloudCassandra Community Webinar: CMB - An Open Message Bus for the Cloud
Cassandra Community Webinar: CMB - An Open Message Bus for the Cloud
 
DynomiteDB - No spof High-availability Redis cluster solution
DynomiteDB -  No spof High-availability Redis cluster solutionDynomiteDB -  No spof High-availability Redis cluster solution
DynomiteDB - No spof High-availability Redis cluster solution
 
Cassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the ScenesCassandra Summit 2014: Active-Active Cassandra Behind the Scenes
Cassandra Summit 2014: Active-Active Cassandra Behind the Scenes
 
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
Tales From The Front: An Architecture For Multi-Data Center Scalable Applicat...
 
Beyond Aurora. Scale-out SQL databases for AWS
Beyond Aurora. Scale-out SQL databases for AWS Beyond Aurora. Scale-out SQL databases for AWS
Beyond Aurora. Scale-out SQL databases for AWS
 
HIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoProHIgh Performance Redis- Tague Griffith, GoPro
HIgh Performance Redis- Tague Griffith, GoPro
 
Cassandra internals
Cassandra internalsCassandra internals
Cassandra internals
 
Cassandra for mission critical data
Cassandra for mission critical dataCassandra for mission critical data
Cassandra for mission critical data
 
Zabbix at scale with Elasticsearch
Zabbix at scale with ElasticsearchZabbix at scale with Elasticsearch
Zabbix at scale with Elasticsearch
 
Achieve new levels of performance for Magento e-commerce sites.
Achieve new levels of performance for Magento e-commerce sites.Achieve new levels of performance for Magento e-commerce sites.
Achieve new levels of performance for Magento e-commerce sites.
 
Beginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra OpsBeginning Operations: 7 Deadly Sins for Apache Cassandra Ops
Beginning Operations: 7 Deadly Sins for Apache Cassandra Ops
 
Discover some "Big Data" architectural concepts with Redis
Discover some  "Big Data" architectural concepts with  Redis Discover some  "Big Data" architectural concepts with  Redis
Discover some "Big Data" architectural concepts with Redis
 
Cisco UCS Integrated Infrastructure for Big Data with Cassandra
Cisco UCS Integrated Infrastructure for Big Data with CassandraCisco UCS Integrated Infrastructure for Big Data with Cassandra
Cisco UCS Integrated Infrastructure for Big Data with Cassandra
 
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDBScylla Summit 2022: New AWS Instances Perfect for ScyllaDB
Scylla Summit 2022: New AWS Instances Perfect for ScyllaDB
 
vSphere With OpenStack
vSphere With OpenStackvSphere With OpenStack
vSphere With OpenStack
 
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond CassandraScylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
 
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
 
Scaling Big Data with Hadoop and Mesos
Scaling Big Data with Hadoop and MesosScaling Big Data with Hadoop and Mesos
Scaling Big Data with Hadoop and Mesos
 

Destaque

Sergey Enin-ScrumAlliance_CSM_Certificate
Sergey Enin-ScrumAlliance_CSM_CertificateSergey Enin-ScrumAlliance_CSM_Certificate
Sergey Enin-ScrumAlliance_CSM_CertificateSergey Enin
 
Approaching graph db
Approaching graph dbApproaching graph db
Approaching graph dbSergey Enin
 
Cassandra Read/Write Paths
Cassandra Read/Write PathsCassandra Read/Write Paths
Cassandra Read/Write Pathsjdsumsion
 
An Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBAn Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBMongoDB
 
Intro To MongoDB
Intro To MongoDBIntro To MongoDB
Intro To MongoDBAlex Sharp
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBRavi Teja
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 

Destaque (13)

122201661137PM
122201661137PM122201661137PM
122201661137PM
 
Sergey Enin-ScrumAlliance_CSM_Certificate
Sergey Enin-ScrumAlliance_CSM_CertificateSergey Enin-ScrumAlliance_CSM_Certificate
Sergey Enin-ScrumAlliance_CSM_Certificate
 
BigDataNerds
BigDataNerdsBigDataNerds
BigDataNerds
 
Approaching graph db
Approaching graph dbApproaching graph db
Approaching graph db
 
Cassandra Read/Write Paths
Cassandra Read/Write PathsCassandra Read/Write Paths
Cassandra Read/Write Paths
 
Graph Databases
Graph DatabasesGraph Databases
Graph Databases
 
Graph database
Graph databaseGraph database
Graph database
 
An Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDBAn Enterprise Architect's View of MongoDB
An Enterprise Architect's View of MongoDB
 
Mongo DB
Mongo DBMongo DB
Mongo DB
 
Intro To MongoDB
Intro To MongoDBIntro To MongoDB
Intro To MongoDB
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Cassandra NoSQL Tutorial
Cassandra NoSQL TutorialCassandra NoSQL Tutorial
Cassandra NoSQL Tutorial
 

Semelhante a Cassandra presentation

Cassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataCassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataChen Robert
 
04-Introduction-to-CassandraDB-.pdf
04-Introduction-to-CassandraDB-.pdf04-Introduction-to-CassandraDB-.pdf
04-Introduction-to-CassandraDB-.pdfhothyfa
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...Qian Lin
 
NoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed DatabaseNoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed DatabaseJoe Alex
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overviewPritamKathar
 
Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...
Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...
Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...DataStax
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesshnkr_rmchndrn
 
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?Clustrix
 
Introduction to ClustrixDB
Introduction to ClustrixDBIntroduction to ClustrixDB
Introduction to ClustrixDBI Goo Lee
 
The No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra ModelThe No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra ModelRishikese MR
 
Apache Cassandra and The Multi-Cloud by Amanda Moran
Apache Cassandra and The Multi-Cloud by Amanda MoranApache Cassandra and The Multi-Cloud by Amanda Moran
Apache Cassandra and The Multi-Cloud by Amanda MoranData Con LA
 
Breakthrough OLAP performance with Cassandra and Spark
Breakthrough OLAP performance with Cassandra and SparkBreakthrough OLAP performance with Cassandra and Spark
Breakthrough OLAP performance with Cassandra and SparkEvan Chan
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture OverviewChristopher Foot
 

Semelhante a Cassandra presentation (20)

Cassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting dataCassandra implementation for collecting data and presenting data
Cassandra implementation for collecting data and presenting data
 
Cassandra
CassandraCassandra
Cassandra
 
Cassandra tutorial
Cassandra tutorialCassandra tutorial
Cassandra tutorial
 
04-Introduction-to-CassandraDB-.pdf
04-Introduction-to-CassandraDB-.pdf04-Introduction-to-CassandraDB-.pdf
04-Introduction-to-CassandraDB-.pdf
 
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
A Survey of Advanced Non-relational Database Systems: Approaches and Applicat...
 
NoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed DatabaseNoSQL A brief look at Apache Cassandra Distributed Database
NoSQL A brief look at Apache Cassandra Distributed Database
 
Cassandra an overview
Cassandra an overviewCassandra an overview
Cassandra an overview
 
Cassandra training
Cassandra trainingCassandra training
Cassandra training
 
Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...
Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...
Designing & Optimizing Micro Batching Systems Using 100+ Nodes (Ananth Ram, R...
 
Navigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skiesNavigating NoSQL in cloudy skies
Navigating NoSQL in cloudy skies
 
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
Tech Talk Series, Part 2: Why is sharding not smart to do in MySQL?
 
NoSQL_Night
NoSQL_NightNoSQL_Night
NoSQL_Night
 
No sql
No sqlNo sql
No sql
 
Why Cassandra?
Why Cassandra?Why Cassandra?
Why Cassandra?
 
Introduction to ClustrixDB
Introduction to ClustrixDBIntroduction to ClustrixDB
Introduction to ClustrixDB
 
The No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra ModelThe No SQL Principles and Basic Application Of Casandra Model
The No SQL Principles and Basic Application Of Casandra Model
 
Cassandra Database
Cassandra DatabaseCassandra Database
Cassandra Database
 
Apache Cassandra and The Multi-Cloud by Amanda Moran
Apache Cassandra and The Multi-Cloud by Amanda MoranApache Cassandra and The Multi-Cloud by Amanda Moran
Apache Cassandra and The Multi-Cloud by Amanda Moran
 
Breakthrough OLAP performance with Cassandra and Spark
Breakthrough OLAP performance with Cassandra and SparkBreakthrough OLAP performance with Cassandra and Spark
Breakthrough OLAP performance with Cassandra and Spark
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture Overview
 

Último

"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 

Último (20)

"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 

Cassandra presentation

  • 1. Cassandra How Stuff Works Sergey Enin (sergey_enin@epam.com)
  • 4. INTRODUCTION: SELECTED CASES Selected Cases Who use Cassandra? eBay has Cassandra supporting multiple applications (Social Signals, Hunch, and many time series use cases) with clusters spanning several data centers. Netflix is using Cassandra on AWS as a key infrastructure component of its globally distributed streaming product. Shazam uses Cassandra cluster to power their recommendations system. and many others… Check - http://www.datastax.com/cassandrausers 4
  • 5. INTRODUCTION: MOST ADVANTAGES Most advantages Most advantages of Cassandra are: • Fast writes. • Tunable consistency. • Decentralization. • Integration with Hadoop. 5
  • 7. ARCHITECTURE: FAST WRITES Fast writes Cassandra is very fast on writes, cause of use of Log-structured merge tree. Process of inserting new record into Cassandra 7
  • 8. ARCHITECTURE: FAST WRITE How LSM-tree is done: Memtables and SSTables 2 1 3 1 Commit log – all data is written to the commit log for durability. 2 SSTables are immutable. A row is typically stored across multiple SSTable files. 3 Each SSTable has a bloom filter associated with it. The bloom filter is used to check if a requested row key exists in the SSTable before doing any disk seeks. 4 Deleted data is not immediately removed from disk. A deleted column can reappear. Tombstones. 8
  • 9. ARCHITECTURE: NETWORK ARCHITECTURE Network architecture • All nodes – are peers (no master). • Client specify set of Cassandra nodes and get connected to first live node. • Nodes are using gossip protocol. 9
  • 11. PARTITIONING & REPLICATION: DATA PARTITIONING Data partitioning Partitioner – determines, where first replica would live in the ring. • RandomPartitioner – default strategy, provides ±same load of all nodes. • ByteOrderedPartitioner - orders rows lexically by key bytes, allows range scans, not recommended. 11
  • 12. PARTITIONING & REPLICATION: REPLICATION Replication Replication = replication factor + replica placement strategy Replica placement strategy: SimpleStrategy: • default strategy; • not taking network topology into account; NetworkTopology Strategy: • preferred, when you have information about network map of your nodes; 12
  • 14. DATA MANAGEMENT: DATA ACCESSING Data accessing READ + WRITES: • Tunable consistency. Consistency level specify how many nodes should answer for read/write request(but writes goes to all replicas). • Batches - sets a global consistency level and client-supplied timestamp for all columns written by the statements in the batch. 14
  • 15. DATA MANAGEMENT: ACID ACID ACID • Atomicity – writes are atomic at row level. • Consistency – tunable consistency. • Isolation – writes are invisible until they are complete. • Durability – writes are durable. • Read-repair, anti-entropy node repair, hinted handoff. 15
  • 17. DATA MODEL: CASSANDRA`S DATA MODEL Cassandra`s data model Relational databases – you design schema, based on entities and relationships. Cassandra – you design schema, based on what queries you would like to perform. 17
  • 18. DATA MODEL: INDEXES Indexes An index is a data structure that allows for fast, efficient lookup of data matching a given condition. Primary key – the unique key used to identify each row in a table. Secondary indexes – refer to indexes on column values. 18
  • 19. DATA MODEL: CQL3 CQL3 cqlsh> INSERT INTO users (user_name, password) VALUES ('jsmith', 'ch@ngem3a'); cqlsh> SELECT * FROM users WHERE user_name='jsmith'; user_name | password | state -----------+-----------+------jsmith | ch@ngem3a | null Confidential 19