CCS334 BIG DATA ANALYTICS Session 3 Distributed models.pptx

Asst.Prof. M.Gokilavani, Professor at KL University
CCS334 BIG DATA ANALYTICS
(R-21 III (I Sem))
Department of Artificial Intelligence and Data Science
Session 3
by
Asst.Prof.M.Gokilavani
NIET
9/19/2023 Department of AI & DS 1
TEXT BOOKS
• Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, "Big Data,
Big Analytics: Emerging Business Intelligence and Analytic Trends for
Today's Businesses", Wiley, 2013.
• Eric Sammer, "Hadoop Operations", O'Reilly, 2012.
• Pramod J. Sadalage, "NoSQL Distilled", 2013.
REFERENCES
• E. Capriolo, D. Wampler, and J. Rutherglen, "Programming Hive",
O'Reilly, 2012.
• Lars George, "HBase: The Definitive Guide", O'Reilly, 2011.
• Eben Hewitt, "Cassandra: The Definitive Guide", O'Reilly, 2010.
Topics covered in Unit 2 session
UNIT II NOSQL DATA MANAGEMENT
Introduction to NoSQL – aggregate data models – key-value and
document data models – relationships – graph databases – schemaless
databases – materialized views – distribution models – master-slave
replication – consistency – Cassandra – Cassandra data model
– Cassandra examples – Cassandra clients.
Distribution Models
• Depending on the distribution model, the data store can give us the ability:
• To handle larger quantities of data
• To process greater read or write traffic
• To remain available in the face of network slowdowns or
breakages
• There are two paths for distribution:
• Sharding: Sharding distributes different data across multiple
servers, so each server acts as the single source for a subset of
the data.
• Replication: Replication copies data across multiple servers,
so each piece of data can be found in multiple places.
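The difference between the two paths can be sketched in a few lines of Python. This is only an illustration: the function names `shard` and `replicate`, the server names, and the round-robin placement are assumptions made for the example, not part of any particular database.

```python
def shard(records, servers):
    """Sharding: each record is stored on exactly one server."""
    placement = {server: {} for server in servers}
    for index, key in enumerate(sorted(records)):
        owner = servers[index % len(servers)]  # round-robin, for illustration only
        placement[owner][key] = records[key]
    return placement


def replicate(records, servers):
    """Replication: every server stores a copy of every record."""
    return {server: dict(records) for server in servers}


records = {"alice": 1, "bob": 2, "carol": 3, "dave": 4}
servers = ["server-1", "server-2"]

sharded = shard(records, servers)        # each server holds a disjoint subset
replicated = replicate(records, servers)  # each server holds everything
```

With sharding, losing a server loses access to its subset; with replication, any surviving server can still answer for all the data.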
Distribution Models
• Single server
• Sharding
• Master-Slave Replication
• Peer-to-Peer Replication
• Combining Sharding & Replication
Single Server
• It is the first and simplest distribution option.
• Even though NoSQL databases are designed to run on a cluster, they can
be used in a single-server application.
• This makes sense if a NoSQL database is better suited to the
application's data model.
• Graph databases are the most obvious example.
• If data usage is mostly about processing aggregates, then a key-value or
document store may be useful.
Sharding
• Often, a data store is busy because different people are accessing
different parts of the dataset.
• In these cases we can support horizontal scalability by putting different
parts of the data onto different servers (sharding).
• The concept of sharding is not new as a part of application logic.
• For example, put all customers with surnames A-D on one shard and
E-G on another.
• This complicates the programming model, as the application code
needs to distribute the load across the shards.
• In the ideal setting, each user talks to one server and the load is
balanced.
• Of course, the ideal case is rare.
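Sharding in application logic can be sketched as a small routing function. The shard names and the `SHARD_RANGES` table below are hypothetical; only the A-D and E-G ranges come from the slide.

```python
# Hypothetical routing table: (first letter, last letter, shard name).
SHARD_RANGES = [
    ("A", "D", "shard-1"),
    ("E", "G", "shard-2"),
]


def shard_for_surname(surname):
    """Return the shard that owns customers with this surname."""
    initial = surname[:1].upper()
    for first, last, shard_name in SHARD_RANGES:
        if first <= initial <= last:
            return shard_name
    # Surnames outside the configured ranges have no home yet.
    raise KeyError(f"no shard configured for surname {surname!r}")
```

Every query in the application has to call a function like this before it can talk to the database, which is the extra programming burden the slide refers to. A range scheme like this can also become unbalanced if many customers share the same initials.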
Sharding and NoSQL
• In general, many NoSQL databases offer auto-sharding.
• This can make it much easier to use sharding in an application.
• Sharding is especially valuable for performance because it improves
both read and write performance.
• It scales reads and writes across the different nodes of the same cluster.
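One simple way a data store could place keys automatically is to hash each key and take the result modulo the number of nodes. The sketch below assumes that scheme purely for illustration; `node_for_key` and the node names are made up, and real databases use their own placement strategies.

```python
import hashlib
from collections import Counter


def node_for_key(key, nodes):
    """Pick a node for a key from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[bucket]


nodes = ["node-a", "node-b", "node-c"]
keys = [f"customer:{i}" for i in range(1000)]

# Count how many keys land on each node.
load = Counter(node_for_key(key, nodes) for key in keys)
```

Because the placement depends only on the key, the application no longer needs its own routing table, and both reads and writes for different keys go to different nodes.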
Master – Slave Replication
• In this setting one node is designated as the master, or primary, and the
others as slaves.
• The master is the authoritative source for the data and is designed to
process updates and send them to the slaves.
• The slaves are used for read operations.
• This allows us to scale for read-intensive datasets.
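The roles described above can be sketched with a toy in-memory model. `MasterSlaveStore` is a hypothetical class written for this example; it copies updates to the slaves immediately and assumes at least one slave.

```python
import itertools


class MasterSlaveStore:
    """Toy in-memory model: one master takes writes, slaves serve reads."""

    def __init__(self, slave_count):
        self.master = {}
        self.slaves = [{} for _ in range(slave_count)]
        self._next_slave = itertools.cycle(range(slave_count))

    def write(self, key, value):
        # All updates go through the master, the authoritative copy.
        self.master[key] = value
        # The master then sends the update to every slave.
        for slave in self.slaves:
            slave[key] = value

    def read(self, key):
        # Reads are spread over the slaves in round-robin order.
        slave = self.slaves[next(self._next_slave)]
        return slave.get(key)


store = MasterSlaveStore(slave_count=2)
store.write("user:1", "Ada")
first_read = store.read("user:1")   # served by slave 0
second_read = store.read("user:1")  # served by slave 1
```

Adding slaves increases read capacity, while every write still passes through the single master.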
Master – Slave Replication
• We can scale horizontally by adding more slaves.
• However, we are limited by the master's ability to process incoming
updates.
• Advantage: read resilience. Even if the master fails, the slaves can
still handle read requests.
• Disadvantage: writes are not allowed until the master is restored or a
new master is appointed.
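Read resilience can be illustrated with a toy model in which the master is switched off. The `Cluster` class, the `master_up` flag, and the `MasterUnavailable` exception are all hypothetical names chosen for this sketch.

```python
class MasterUnavailable(Exception):
    """Raised when a write arrives while the master is down."""


class Cluster:
    def __init__(self, slave_count):
        self.master_up = True
        self.master = {}
        self.slaves = [{} for _ in range(slave_count)]

    def write(self, key, value):
        if not self.master_up:
            raise MasterUnavailable("writes are rejected until a master is available")
        self.master[key] = value
        for slave in self.slaves:
            slave[key] = value

    def read(self, key, slave_index=0):
        # Reads never touch the master, so they survive a master failure.
        return self.slaves[slave_index].get(key)


cluster = Cluster(slave_count=2)
cluster.write("order:7", "paid")
cluster.master_up = False  # simulate a master failure

value_after_failure = cluster.read("order:7")
try:
    cluster.write("order:8", "new")
    write_accepted = True
except MasterUnavailable:
    write_accepted = False
```

After the simulated failure the read still succeeds from a slave, while the write is rejected.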
Master – Slave Replication
• Another characteristic is that a slave can be appointed as the new master.
• Masters can be appointed manually or automatically.
• To achieve read resilience, the read and write paths must be
different.
• This is normally done by using separate database connections.
• Disadvantage: master-slave replication has the advantages discussed
above, but it comes with the problem of inconsistency.
• Readers reading from the slaves may see data that has not yet been updated.
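The inconsistency problem appears when a slave applies updates later than the master. The sketch below models that delay explicitly; `LaggingReplica` and its `receive`/`sync` methods are hypothetical and exist only to make the delay visible.

```python
class LaggingReplica:
    """Toy slave that applies the master's updates only when sync() is called."""

    def __init__(self):
        self.data = {}
        self.pending = []

    def receive(self, key, value):
        # The update has arrived but is not yet visible to readers.
        self.pending.append((key, value))

    def sync(self):
        for key, value in self.pending:
            self.data[key] = value
        self.pending.clear()


master = {"price": 100}
slave = LaggingReplica()
slave.receive("price", 100)
slave.sync()

# A new write reaches the master first.
master["price"] = 120
slave.receive("price", 120)

stale_read = slave.data["price"]  # the slave has not applied the update yet
slave.sync()
fresh_read = slave.data["price"]  # after replication catches up
```

Between the write and the call to `sync()`, a client reading from the slave sees the old value even though the master already holds the new one.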
Master – Slave Replication in MongoDB
Peer-to-Peer Replication
• Master-slave replication helps with read scalability but does not help
with write scalability.
• Moreover, it provides resilience for reads but not for writes.
• The master is still a single point of failure.
• Peer-to-peer replication attacks these problems by not having a master.
• All replicas are equal (each accepts both writes and reads).
• With peer-to-peer replication, nodes can fail without losing write
capability or data.
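A minimal sketch of the idea, assuming a hypothetical `PeerCluster` in which any live peer accepts a write and copies it to the other live peers. It deliberately ignores conflicting concurrent writes, which real peer-to-peer systems also have to handle.

```python
class Peer:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.data = {}


class PeerCluster:
    def __init__(self, names):
        self.peers = [Peer(name) for name in names]

    def write(self, key, value):
        live = [peer for peer in self.peers if peer.alive]
        if not live:
            raise RuntimeError("no live peers")
        # Any live peer can accept the write; it is then copied to the others.
        for peer in live:
            peer.data[key] = value
        return live[0].name  # the peer that accepted the write

    def read(self, key):
        for peer in self.peers:
            if peer.alive:
                return peer.data.get(key)
        raise RuntimeError("no live peers")


cluster = PeerCluster(["p1", "p2", "p3"])
cluster.write("cart:9", ["book"])
cluster.peers[0].alive = False  # one node fails
accepted_by = cluster.write("cart:9", ["book", "pen"])
```

After `p1` fails, the write is still accepted by another peer, because no single node plays the role of master.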
Combining Sharding and Replication
• When master-slave replication is combined with sharding, there are
multiple masters, but each data item has a single master.
• Depending on the configuration, we can choose the master for each
group of data.
• Peer-to-peer replication combined with sharding is a common strategy
for column-family databases.
• This is commonly set up by replicating each shard across several nodes.
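Combining the two techniques can be sketched as follows: a hash picks a starting node for each key, and the key is then stored on that node and the next few nodes. The node names, the replication factor of 3, and the function `replicas_for_key` are assumptions for illustration only.

```python
import hashlib

NODES = ["n1", "n2", "n3", "n4"]
REPLICATION_FACTOR = 3  # assumed value, chosen for illustration


def replicas_for_key(key, nodes=NODES, replication_factor=REPLICATION_FACTOR):
    """Return the nodes that hold this key, starting at its hashed position."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    count = min(replication_factor, len(nodes))
    return [nodes[(start + offset) % len(nodes)] for offset in range(count)]


owners = replicas_for_key("customer:42")
# In a master-slave arrangement the first owner could act as the key's
# single master; in a peer-to-peer arrangement all owners are equal.
master, followers = owners[0], owners[1:]
```

Different keys get different owner lists, so the data is sharded across the cluster, and each key still lives on several nodes, so it is replicated as well.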
Topics to be covered in the next session (Session 4)
• Cassandra data model
Thank you!!!