SlideShare uma empresa Scribd logo
1 de 15
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/1
Outline
‱ Introduction
‱ Background
‱ Distributed Database Design
‱ Database Integration
‱ Semantic Data Control
‱ Distributed Query Processing
➡ Overview
➡ Query decomposition and localization
➡ Distributed query optimization
‱ Multidatabase Query Processing
‱ Distributed Transaction Management
‱ Data Replication
‱ Parallel Database Systems
‱ Distributed Object DBMS
‱ Peer-to-Peer Data Management
‱ Web Data Management
‱ Current Issues
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/2
Query Processing in a DDBMS
high level user query
query
processor
Low-level data manipulation
commands for D-DBMS
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/3
Query Processing Components
‱ Query language that is used
➡ SQL: “intergalactic dataspeak”
‱ Query execution methodology
➡ The steps that one goes through in executing high-level (declarative) user
queries.
‱ Query optimization
➡ How do we determine the “best” execution plan?
‱ We assume a homogeneous D-DBMS
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/4
SELECT ENAME
FROM EMP,ASG
WHERE EMP.ENO = ASG.ENO
AND RESP = "Manager"
Strategy 1
ENAME( RESP=“Manager” EMP.ENO=ASG.ENO(EMP×ASG))
Strategy 2
ENAME(EMP ⋈ENO ( RESP=“Manager” (ASG))
Strategy 2 avoids Cartesian product, so may be “better”
Selecting Alternatives
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/5
What is the Problem?
Site 1 Site 2 Site 3 Site 4 Site 5
EMP1= ENO≀“E3”(EMP) EMP2= ENO>“E3”(EMP)ASG2= ENO>“E3”(ASG)ASG1= ENO≀“E3”(ASG) Result
Site 5
Site 1 Site 2 Site 3 Site 4
ASG1 EMP1 EMP2ASG2Site 4Site 3
Site 1 Site 2
Site 5
EMP’
1=EMP1 ⋈ENO ASG’
1
'
2
EMPEMPresult
'
1
1Manager""RESP1
ASGσASG
'
2Manager""RESP2
ASGσASG
'
'
1
ASG
'
2
ASG
'
1
EMP
'
2
EMP
result= (EMP1 × EMP2)⋈ENOσRESP=“Manager”(ASG1× ASG2)
EMP’
2=EMP2 ⋈ENO ASG’
2
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/6
Cost of Alternatives
‱ Assume
➡ size(EMP) = 400, size(ASG) = 1000
➡ tuple access cost = 1 unit; tuple transfer cost = 10 units
‱ Strategy 1
➡ produce ASG': (10+10) tuple access cost 20
➡ transfer ASG' to the sites of EMP: (10+10) tuple transfer cost 200
➡ produce EMP': (10+10) tuple access cost 2 40
➡ transfer EMP' to result site: (10+10) tuple transfer cost 200
Total Cost 460
‱ Strategy 2
➡ transfer EMP to site 5: 400 tuple transfer cost 4,000
➡ transfer ASG to site 5: 1000 tuple transfer cost 10,000
➡ produce ASG': 1000 tuple access cost 1,000
➡ join EMP and ASG': 400 20 tuple access cost 8,000
Total Cost 23,000
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/7
Query Optimization Objectives
‱ Minimize a cost function
I/O cost + CPU cost + communication cost
These might have different weights in different distributed environments
‱ Wide area networks
➡ communication cost may dominate or vary much
✩ bandwidth
✩ speed
✩ high protocol overhead
‱ Local area networks
➡ communication cost not that dominant
➡ total cost function should be considered
‱ Can also maximize throughput
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/8
Complexity of Relational
Operations
‱ Assume
➡ relations of cardinality n
➡ sequential scan
Operation Complexity
Select
Project
(without duplicate elimination)
O(n)
Project
(with duplicate elimination)
Group
O(n log n)
Join
Semi-join
Division
Set Operators
O(n log n)
Cartesian Product O(n2)
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/9
Query Optimization Issues –
Types Of Optimizers
‱ Exhaustive search
➡ Cost-based
➡ Optimal
➡ Combinatorial complexity in the number of relations
‱ Heuristics
➡ Not optimal
➡ Regroup common sub-expressions
➡ Perform selection, projection first
➡ Replace a join by a series of semijoins
➡ Reorder operations to reduce intermediate relation size
➡ Optimize individual operations
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/10
Query Optimization Issues –
Optimization Granularity
‱ Single query at a time
➡ Cannot use common intermediate results
‱ Multiple queries at a time
➡ Efficient if many similar queries
➡ Decision space is much larger
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/11
Query Optimization Issues –
Optimization Timing
‱ Static
➡ Compilation  optimize prior to the execution
➡ Difficult to estimate the size of the intermediate results error
propagation
➡ Can amortize over many executions
➡ R*
‱ Dynamic
➡ Run time optimization
➡ Exact information on the intermediate relation sizes
➡ Have to reoptimize for multiple executions
➡ Distributed INGRES
‱ Hybrid
➡ Compile using a static algorithm
➡ If the error in estimate sizes > threshold, reoptimize at run time
➡ Mermaid
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/12
Query Optimization Issues –
Statistics
‱ Relation
➡ Cardinality
➡ Size of a tuple
➡ Fraction of tuples participating in a join with another relation
‱ Attribute
➡ Cardinality of domain
➡ Actual number of distinct values
‱ Common assumptions
➡ Independence between different attribute values
➡ Uniform distribution of attribute values within their domain
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/13
Query Optimization Issues –
Decision Sites
‱ Centralized
➡ Single site determines the “best” schedule
➡ Simple
➡ Need knowledge about the entire distributed database
‱ Distributed
➡ Cooperation among sites to determine the schedule
➡ Need only local information
➡ Cost of cooperation
‱ Hybrid
➡ One site determines the global schedule
➡ Each site optimizes the local subqueries
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/14
Query Optimization Issues –
Network Topology
‱ Wide area networks (WAN) – point-to-point
➡ Characteristics
✩ Low bandwidth
✩ Low speed
✩ High protocol overhead
➡ Communication cost will dominate; ignore all other cost factors
➡ Global schedule to minimize communication cost
➡ Local schedules according to centralized query optimization
‱ Local area networks (LAN)
➡ Communication cost not that dominant
➡ Total cost function should be considered
➡ Broadcasting can be exploited (joins)
➡ Special algorithms exist for star networks
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/15
Distributed Query Processing
Methodology
Calculus Query on Distributed Relations
CONTROL
SITE
LOCAL
SITES
Query
Decomposition
Data
Localization
Algebraic Query on Distributed
Relations
Global
Optimization
Fragment Query
Local
Optimization
Optimized Fragment Query
with Communication Operations
Optimized Local Queries
GLOBAL
SCHEMA
FRAGMENT
SCHEMA
STATS ON
FRAGMENTS
LOCAL
SCHEMAS

Mais conteĂșdo relacionado

Mais procurados

b+ tree
b+ treeb+ tree
b+ treebitistu
 
Quantifiers and its Types
Quantifiers and its TypesQuantifiers and its Types
Quantifiers and its TypesHumayunNaseer4
 
Database Normalization
Database NormalizationDatabase Normalization
Database NormalizationArun Sharma
 
Dbms 10: Conversion of ER model to Relational Model
Dbms 10: Conversion of ER model to Relational ModelDbms 10: Conversion of ER model to Relational Model
Dbms 10: Conversion of ER model to Relational ModelAmiya9439793168
 
Normalization in DBMS
Normalization in DBMSNormalization in DBMS
Normalization in DBMSHitesh Mohapatra
 
Red Black Trees
Red Black TreesRed Black Trees
Red Black TreesVarun Mahajan
 
Database design & Normalization (1NF, 2NF, 3NF)
Database design & Normalization (1NF, 2NF, 3NF)Database design & Normalization (1NF, 2NF, 3NF)
Database design & Normalization (1NF, 2NF, 3NF)Jargalsaikhan Alyeksandr
 
ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...
ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...
ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...Hiroyuki KASAI
 
Databases: Normalisation
Databases: NormalisationDatabases: Normalisation
Databases: NormalisationDamian T. Gordon
 
Chapter 11 cluster advanced, Han & Kamber
Chapter 11 cluster advanced, Han & KamberChapter 11 cluster advanced, Han & Kamber
Chapter 11 cluster advanced, Han & KamberHouw Liong The
 
Properties of Square Numbers (Class 8) (Audio in Hindi)
Properties of Square Numbers (Class 8) (Audio in Hindi)Properties of Square Numbers (Class 8) (Audio in Hindi)
Properties of Square Numbers (Class 8) (Audio in Hindi)Parth Nagpal
 
Binary Search Tree in Data Structure
Binary Search Tree in Data StructureBinary Search Tree in Data Structure
Binary Search Tree in Data StructureDharita Chokshi
 
Presentation of DBMS (database management system) part 1
Presentation of DBMS (database management system) part 1Presentation of DBMS (database management system) part 1
Presentation of DBMS (database management system) part 1Junaid Nadeem
 
Database Management System
Database Management SystemDatabase Management System
Database Management SystemNishant Munjal
 
479f3df10a8c0 mathsproject quadrilaterals
479f3df10a8c0 mathsproject quadrilaterals479f3df10a8c0 mathsproject quadrilaterals
479f3df10a8c0 mathsproject quadrilateralsvineeta yadav
 

Mais procurados (20)

b+ tree
b+ treeb+ tree
b+ tree
 
Quantifiers and its Types
Quantifiers and its TypesQuantifiers and its Types
Quantifiers and its Types
 
Database Normalization
Database NormalizationDatabase Normalization
Database Normalization
 
Dbms 10: Conversion of ER model to Relational Model
Dbms 10: Conversion of ER model to Relational ModelDbms 10: Conversion of ER model to Relational Model
Dbms 10: Conversion of ER model to Relational Model
 
Triangle
TriangleTriangle
Triangle
 
Normalization in DBMS
Normalization in DBMSNormalization in DBMS
Normalization in DBMS
 
Red Black Trees
Red Black TreesRed Black Trees
Red Black Trees
 
Database design & Normalization (1NF, 2NF, 3NF)
Database design & Normalization (1NF, 2NF, 3NF)Database design & Normalization (1NF, 2NF, 3NF)
Database design & Normalization (1NF, 2NF, 3NF)
 
ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...
ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...
ICML2016: Low-rank tensor completion: a Riemannian manifold preconditioning a...
 
Databases: Normalisation
Databases: NormalisationDatabases: Normalisation
Databases: Normalisation
 
ER to Relational Mapping
ER to Relational MappingER to Relational Mapping
ER to Relational Mapping
 
Chapter 11 cluster advanced, Han & Kamber
Chapter 11 cluster advanced, Han & KamberChapter 11 cluster advanced, Han & Kamber
Chapter 11 cluster advanced, Han & Kamber
 
Heaps
HeapsHeaps
Heaps
 
Properties of Square Numbers (Class 8) (Audio in Hindi)
Properties of Square Numbers (Class 8) (Audio in Hindi)Properties of Square Numbers (Class 8) (Audio in Hindi)
Properties of Square Numbers (Class 8) (Audio in Hindi)
 
Binary Search Tree in Data Structure
Binary Search Tree in Data StructureBinary Search Tree in Data Structure
Binary Search Tree in Data Structure
 
Presentation of DBMS (database management system) part 1
Presentation of DBMS (database management system) part 1Presentation of DBMS (database management system) part 1
Presentation of DBMS (database management system) part 1
 
Trees
TreesTrees
Trees
 
Dependency preserving
Dependency preservingDependency preserving
Dependency preserving
 
Database Management System
Database Management SystemDatabase Management System
Database Management System
 
479f3df10a8c0 mathsproject quadrilaterals
479f3df10a8c0 mathsproject quadrilaterals479f3df10a8c0 mathsproject quadrilaterals
479f3df10a8c0 mathsproject quadrilaterals
 

Destaque

Database ,7 query localization
Database ,7 query localizationDatabase ,7 query localization
Database ,7 query localizationAli Usman
 
Query decomposition in data base
Query decomposition in data baseQuery decomposition in data base
Query decomposition in data baseSalman Memon
 
Database ,16 P2P
Database ,16 P2P Database ,16 P2P
Database ,16 P2P Ali Usman
 
Database ,10 Transactions
Database ,10 TransactionsDatabase ,10 Transactions
Database ,10 TransactionsAli Usman
 
Database , 13 Replication
Database , 13 ReplicationDatabase , 13 Replication
Database , 13 ReplicationAli Usman
 
Database ,2 Background
 Database ,2 Background Database ,2 Background
Database ,2 BackgroundAli Usman
 
Database , 8 Query Optimization
Database , 8 Query OptimizationDatabase , 8 Query Optimization
Database , 8 Query OptimizationAli Usman
 
Database , 4 Data Integration
Database , 4 Data IntegrationDatabase , 4 Data Integration
Database , 4 Data IntegrationAli Usman
 
Structured Query Language (SQL) - Lecture 5 - Introduction to Databases (1007...
Structured Query Language (SQL) - Lecture 5 - Introduction to Databases (1007...Structured Query Language (SQL) - Lecture 5 - Introduction to Databases (1007...
Structured Query Language (SQL) - Lecture 5 - Introduction to Databases (1007...Beat Signer
 
Query optimization and challenges in DDBMS with Review Algorithms.
Query optimization and challenges in DDBMS with Review Algorithms.Query optimization and challenges in DDBMS with Review Algorithms.
Query optimization and challenges in DDBMS with Review Algorithms.Beingprp
 
Coyaima ie. juan xxiii manual de convivencia
Coyaima ie. juan xxiii manual de convivenciaCoyaima ie. juan xxiii manual de convivencia
Coyaima ie. juan xxiii manual de convivenciasebasecret
 
Anwar e-sabiri(complete)
Anwar e-sabiri(complete)Anwar e-sabiri(complete)
Anwar e-sabiri(complete)Ali Usman
 
Ethernet Technology
Ethernet Technology Ethernet Technology
Ethernet Technology Ali Usman
 
Hank Iving Media Plan
Hank Iving Media PlanHank Iving Media Plan
Hank Iving Media Planconfar90
 
Virgen de ChiquinquirĂĄ en Colombia
Virgen de ChiquinquirĂĄ en ColombiaVirgen de ChiquinquirĂĄ en Colombia
Virgen de ChiquinquirĂĄ en ColombiaMaria Daud
 
Mariquita iet francisco nuñez pedrozo manual convivencia antiguo
Mariquita iet francisco nuñez pedrozo manual convivencia antiguoMariquita iet francisco nuñez pedrozo manual convivencia antiguo
Mariquita iet francisco nuñez pedrozo manual convivencia antiguosebasecret
 
College Students
College StudentsCollege Students
College Studentsconfar90
 
Network internet
Network internetNetwork internet
Network internetKumar
 

Destaque (20)

Database ,7 query localization
Database ,7 query localizationDatabase ,7 query localization
Database ,7 query localization
 
Query decomposition in data base
Query decomposition in data baseQuery decomposition in data base
Query decomposition in data base
 
Database ,16 P2P
Database ,16 P2P Database ,16 P2P
Database ,16 P2P
 
Database ,10 Transactions
Database ,10 TransactionsDatabase ,10 Transactions
Database ,10 Transactions
 
Database , 13 Replication
Database , 13 ReplicationDatabase , 13 Replication
Database , 13 Replication
 
Database ,2 Background
 Database ,2 Background Database ,2 Background
Database ,2 Background
 
Database , 8 Query Optimization
Database , 8 Query OptimizationDatabase , 8 Query Optimization
Database , 8 Query Optimization
 
Database , 4 Data Integration
Database , 4 Data IntegrationDatabase , 4 Data Integration
Database , 4 Data Integration
 
Introduction to database
Introduction to databaseIntroduction to database
Introduction to database
 
Structured Query Language (SQL) - Lecture 5 - Introduction to Databases (1007...
Structured Query Language (SQL) - Lecture 5 - Introduction to Databases (1007...Structured Query Language (SQL) - Lecture 5 - Introduction to Databases (1007...
Structured Query Language (SQL) - Lecture 5 - Introduction to Databases (1007...
 
Query optimization and challenges in DDBMS with Review Algorithms.
Query optimization and challenges in DDBMS with Review Algorithms.Query optimization and challenges in DDBMS with Review Algorithms.
Query optimization and challenges in DDBMS with Review Algorithms.
 
Coyaima ie. juan xxiii manual de convivencia
Coyaima ie. juan xxiii manual de convivenciaCoyaima ie. juan xxiii manual de convivencia
Coyaima ie. juan xxiii manual de convivencia
 
Anwar e-sabiri(complete)
Anwar e-sabiri(complete)Anwar e-sabiri(complete)
Anwar e-sabiri(complete)
 
BrunnerForbes2
BrunnerForbes2BrunnerForbes2
BrunnerForbes2
 
Ethernet Technology
Ethernet Technology Ethernet Technology
Ethernet Technology
 
Hank Iving Media Plan
Hank Iving Media PlanHank Iving Media Plan
Hank Iving Media Plan
 
Virgen de ChiquinquirĂĄ en Colombia
Virgen de ChiquinquirĂĄ en ColombiaVirgen de ChiquinquirĂĄ en Colombia
Virgen de ChiquinquirĂĄ en Colombia
 
Mariquita iet francisco nuñez pedrozo manual convivencia antiguo
Mariquita iet francisco nuñez pedrozo manual convivencia antiguoMariquita iet francisco nuñez pedrozo manual convivencia antiguo
Mariquita iet francisco nuñez pedrozo manual convivencia antiguo
 
College Students
College StudentsCollege Students
College Students
 
Network internet
Network internetNetwork internet
Network internet
 

Semelhante a Database , 6 Query Introduction

6-Query_Intro (5).pdf
6-Query_Intro (5).pdf6-Query_Intro (5).pdf
6-Query_Intro (5).pdfJaveriaShoaib4
 
Hadoop Map Reduce OS
Hadoop Map Reduce OSHadoop Map Reduce OS
Hadoop Map Reduce OSVedant Mane
 
Database ,14 Parallel DBMS
Database ,14 Parallel DBMSDatabase ,14 Parallel DBMS
Database ,14 Parallel DBMSAli Usman
 
Database ,18 Current Issues
Database ,18 Current IssuesDatabase ,18 Current Issues
Database ,18 Current IssuesAli Usman
 
PPT-UEU-Database-Objek-Terdistribusi-Pertemuan-8.pptx
PPT-UEU-Database-Objek-Terdistribusi-Pertemuan-8.pptxPPT-UEU-Database-Objek-Terdistribusi-Pertemuan-8.pptx
PPT-UEU-Database-Objek-Terdistribusi-Pertemuan-8.pptxneju3
 
HFM vs Essbase BSO: A Comparative Anatomy
HFM vs Essbase BSO: A Comparative AnatomyHFM vs Essbase BSO: A Comparative Anatomy
HFM vs Essbase BSO: A Comparative Anatomyaa026593
 
1 introduction
1 introduction1 introduction
1 introductionAmrit Kaur
 
Meta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarMeta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarKognitio
 
MapReduce:Simplified Data Processing on Large Cluster Presented by Areej Qas...
MapReduce:Simplified Data Processing on Large Cluster  Presented by Areej Qas...MapReduce:Simplified Data Processing on Large Cluster  Presented by Areej Qas...
MapReduce:Simplified Data Processing on Large Cluster Presented by Areej Qas...areej qasrawi
 
Designing analytics for big data
Designing analytics for big dataDesigning analytics for big data
Designing analytics for big dataJ Singh
 
1 introduction DDBS
1 introduction DDBS1 introduction DDBS
1 introduction DDBSnaimanighat
 
Database , 1 Introduction
 Database , 1 Introduction Database , 1 Introduction
Database , 1 IntroductionAli Usman
 
fmewt19 - Around the world stories master deck
fmewt19 - Around the world stories master deckfmewt19 - Around the world stories master deck
fmewt19 - Around the world stories master deckConsortech
 
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...Sumeet Singh
 
Architecting a Scalable Hadoop Platform: Top 10 considerations for success
Architecting a Scalable Hadoop Platform: Top 10 considerations for successArchitecting a Scalable Hadoop Platform: Top 10 considerations for success
Architecting a Scalable Hadoop Platform: Top 10 considerations for successDataWorks Summit
 
Introduction of MapReduce
Introduction of MapReduceIntroduction of MapReduce
Introduction of MapReduceHC Lin
 
Deep Learning at Scale
Deep Learning at ScaleDeep Learning at Scale
Deep Learning at ScaleMateusz Dymczyk
 
Tools and best practices for sustainable software
Tools and best practices for sustainable softwareTools and best practices for sustainable software
Tools and best practices for sustainable softwareGreen Software Development
 

Semelhante a Database , 6 Query Introduction (20)

6-Query_Intro (5).pdf
6-Query_Intro (5).pdf6-Query_Intro (5).pdf
6-Query_Intro (5).pdf
 
Hadoop Map Reduce OS
Hadoop Map Reduce OSHadoop Map Reduce OS
Hadoop Map Reduce OS
 
Database ,14 Parallel DBMS
Database ,14 Parallel DBMSDatabase ,14 Parallel DBMS
Database ,14 Parallel DBMS
 
Database ,18 Current Issues
Database ,18 Current IssuesDatabase ,18 Current Issues
Database ,18 Current Issues
 
PPT-UEU-Database-Objek-Terdistribusi-Pertemuan-8.pptx
PPT-UEU-Database-Objek-Terdistribusi-Pertemuan-8.pptxPPT-UEU-Database-Objek-Terdistribusi-Pertemuan-8.pptx
PPT-UEU-Database-Objek-Terdistribusi-Pertemuan-8.pptx
 
HFM vs Essbase BSO: A Comparative Anatomy
HFM vs Essbase BSO: A Comparative AnatomyHFM vs Essbase BSO: A Comparative Anatomy
HFM vs Essbase BSO: A Comparative Anatomy
 
1 introduction
1 introduction1 introduction
1 introduction
 
try
trytry
try
 
Meta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinarMeta scale kognitio hadoop webinar
Meta scale kognitio hadoop webinar
 
MapReduce:Simplified Data Processing on Large Cluster Presented by Areej Qas...
MapReduce:Simplified Data Processing on Large Cluster  Presented by Areej Qas...MapReduce:Simplified Data Processing on Large Cluster  Presented by Areej Qas...
MapReduce:Simplified Data Processing on Large Cluster Presented by Areej Qas...
 
Designing analytics for big data
Designing analytics for big dataDesigning analytics for big data
Designing analytics for big data
 
1 introduction DDBS
1 introduction DDBS1 introduction DDBS
1 introduction DDBS
 
Database , 1 Introduction
 Database , 1 Introduction Database , 1 Introduction
Database , 1 Introduction
 
Manjeet Singh.pptx
Manjeet Singh.pptxManjeet Singh.pptx
Manjeet Singh.pptx
 
fmewt19 - Around the world stories master deck
fmewt19 - Around the world stories master deckfmewt19 - Around the world stories master deck
fmewt19 - Around the world stories master deck
 
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...
Hadoop Summit Brussels 2015: Architecting a Scalable Hadoop Platform - Top 10...
 
Architecting a Scalable Hadoop Platform: Top 10 considerations for success
Architecting a Scalable Hadoop Platform: Top 10 considerations for successArchitecting a Scalable Hadoop Platform: Top 10 considerations for success
Architecting a Scalable Hadoop Platform: Top 10 considerations for success
 
Introduction of MapReduce
Introduction of MapReduceIntroduction of MapReduce
Introduction of MapReduce
 
Deep Learning at Scale
Deep Learning at ScaleDeep Learning at Scale
Deep Learning at Scale
 
Tools and best practices for sustainable software
Tools and best practices for sustainable softwareTools and best practices for sustainable software
Tools and best practices for sustainable software
 

Mais de Ali Usman

Cisco Packet Tracer Overview
Cisco Packet Tracer OverviewCisco Packet Tracer Overview
Cisco Packet Tracer OverviewAli Usman
 
Islamic Arts and Architecture
Islamic Arts and  ArchitectureIslamic Arts and  Architecture
Islamic Arts and ArchitectureAli Usman
 
Database , 17 Web
Database , 17 WebDatabase , 17 Web
Database , 17 WebAli Usman
 
Database , 15 Object DBMS
Database , 15 Object DBMSDatabase , 15 Object DBMS
Database , 15 Object DBMSAli Usman
 
Database , 12 Reliability
Database , 12 ReliabilityDatabase , 12 Reliability
Database , 12 ReliabilityAli Usman
 
Database ,11 Concurrency Control
Database ,11 Concurrency ControlDatabase ,11 Concurrency Control
Database ,11 Concurrency ControlAli Usman
 
Database , 5 Semantic
Database , 5 SemanticDatabase , 5 Semantic
Database , 5 SemanticAli Usman
 
Database, 3 Distribution Design
Database, 3 Distribution DesignDatabase, 3 Distribution Design
Database, 3 Distribution DesignAli Usman
 
Processor Specifications
Processor SpecificationsProcessor Specifications
Processor SpecificationsAli Usman
 
Fifty Year Of Microprocessor
Fifty Year Of MicroprocessorFifty Year Of Microprocessor
Fifty Year Of MicroprocessorAli Usman
 
Discrete Structures lecture 2
 Discrete Structures lecture 2 Discrete Structures lecture 2
Discrete Structures lecture 2Ali Usman
 
Discrete Structures. Lecture 1
 Discrete Structures. Lecture 1  Discrete Structures. Lecture 1
Discrete Structures. Lecture 1 Ali Usman
 
Muslim Contributions in Medicine-Geography-Astronomy
Muslim Contributions in Medicine-Geography-AstronomyMuslim Contributions in Medicine-Geography-Astronomy
Muslim Contributions in Medicine-Geography-AstronomyAli Usman
 
Muslim Contributions in Geography
Muslim Contributions in GeographyMuslim Contributions in Geography
Muslim Contributions in GeographyAli Usman
 
Muslim Contributions in Astronomy
Muslim Contributions in AstronomyMuslim Contributions in Astronomy
Muslim Contributions in AstronomyAli Usman
 
Processor Specifications
Processor SpecificationsProcessor Specifications
Processor SpecificationsAli Usman
 
Ptcl modem (user manual)
Ptcl modem (user manual)Ptcl modem (user manual)
Ptcl modem (user manual)Ali Usman
 
Nimat-ul-ALLAH shah wali
Nimat-ul-ALLAH shah wali Nimat-ul-ALLAH shah wali
Nimat-ul-ALLAH shah wali Ali Usman
 
Muslim Contributions in Mathematics
Muslim Contributions in MathematicsMuslim Contributions in Mathematics
Muslim Contributions in MathematicsAli Usman
 
Osi protocols
Osi protocolsOsi protocols
Osi protocolsAli Usman
 

Mais de Ali Usman (20)

Cisco Packet Tracer Overview
Cisco Packet Tracer OverviewCisco Packet Tracer Overview
Cisco Packet Tracer Overview
 
Islamic Arts and Architecture
Islamic Arts and  ArchitectureIslamic Arts and  Architecture
Islamic Arts and Architecture
 
Database , 17 Web
Database , 17 WebDatabase , 17 Web
Database , 17 Web
 
Database , 15 Object DBMS
Database , 15 Object DBMSDatabase , 15 Object DBMS
Database , 15 Object DBMS
 
Database , 12 Reliability
Database , 12 ReliabilityDatabase , 12 Reliability
Database , 12 Reliability
 
Database ,11 Concurrency Control
Database ,11 Concurrency ControlDatabase ,11 Concurrency Control
Database ,11 Concurrency Control
 
Database , 5 Semantic
Database , 5 SemanticDatabase , 5 Semantic
Database , 5 Semantic
 
Database, 3 Distribution Design
Database, 3 Distribution DesignDatabase, 3 Distribution Design
Database, 3 Distribution Design
 
Processor Specifications
Processor SpecificationsProcessor Specifications
Processor Specifications
 
Fifty Year Of Microprocessor
Fifty Year Of MicroprocessorFifty Year Of Microprocessor
Fifty Year Of Microprocessor
 
Discrete Structures lecture 2
 Discrete Structures lecture 2 Discrete Structures lecture 2
Discrete Structures lecture 2
 
Discrete Structures. Lecture 1
 Discrete Structures. Lecture 1  Discrete Structures. Lecture 1
Discrete Structures. Lecture 1
 
Muslim Contributions in Medicine-Geography-Astronomy
Muslim Contributions in Medicine-Geography-AstronomyMuslim Contributions in Medicine-Geography-Astronomy
Muslim Contributions in Medicine-Geography-Astronomy
 
Muslim Contributions in Geography
Muslim Contributions in GeographyMuslim Contributions in Geography
Muslim Contributions in Geography
 
Muslim Contributions in Astronomy
Muslim Contributions in AstronomyMuslim Contributions in Astronomy
Muslim Contributions in Astronomy
 
Processor Specifications
Processor SpecificationsProcessor Specifications
Processor Specifications
 
Ptcl modem (user manual)
Ptcl modem (user manual)Ptcl modem (user manual)
Ptcl modem (user manual)
 
Nimat-ul-ALLAH shah wali
Nimat-ul-ALLAH shah wali Nimat-ul-ALLAH shah wali
Nimat-ul-ALLAH shah wali
 
Muslim Contributions in Mathematics
Muslim Contributions in MathematicsMuslim Contributions in Mathematics
Muslim Contributions in Mathematics
 
Osi protocols
Osi protocolsOsi protocols
Osi protocols
 

Último

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilV3cube
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Último (20)

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Database , 6 Query Introduction

  • 1. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/1 Outline ‱ Introduction ‱ Background ‱ Distributed Database Design ‱ Database Integration ‱ Semantic Data Control ‱ Distributed Query Processing ➡ Overview ➡ Query decomposition and localization ➡ Distributed query optimization ‱ Multidatabase Query Processing ‱ Distributed Transaction Management ‱ Data Replication ‱ Parallel Database Systems ‱ Distributed Object DBMS ‱ Peer-to-Peer Data Management ‱ Web Data Management ‱ Current Issues
  • 2. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/2 Query Processing in a DDBMS high level user query query processor Low-level data manipulation commands for D-DBMS
  • 3. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/3 Query Processing Components ‱ Query language that is used ➡ SQL: “intergalactic dataspeak” ‱ Query execution methodology ➡ The steps that one goes through in executing high-level (declarative) user queries. ‱ Query optimization ➡ How do we determine the “best” execution plan? ‱ We assume a homogeneous D-DBMS
  • 4. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/4 SELECT ENAME FROM EMP,ASG WHERE EMP.ENO = ASG.ENO AND RESP = "Manager" Strategy 1 ENAME( RESP=“Manager” EMP.ENO=ASG.ENO(EMP×ASG)) Strategy 2 ENAME(EMP ⋈ENO ( RESP=“Manager” (ASG)) Strategy 2 avoids Cartesian product, so may be “better” Selecting Alternatives
  • 5. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/5 What is the Problem? Site 1 Site 2 Site 3 Site 4 Site 5 EMP1= ENO≀“E3”(EMP) EMP2= ENO>“E3”(EMP)ASG2= ENO>“E3”(ASG)ASG1= ENO≀“E3”(ASG) Result Site 5 Site 1 Site 2 Site 3 Site 4 ASG1 EMP1 EMP2ASG2Site 4Site 3 Site 1 Site 2 Site 5 EMP’ 1=EMP1 ⋈ENO ASG’ 1 ' 2 EMPEMPresult ' 1 1Manager""RESP1 ASGσASG ' 2Manager""RESP2 ASGσASG ' ' 1 ASG ' 2 ASG ' 1 EMP ' 2 EMP result= (EMP1 × EMP2)⋈ENOσRESP=“Manager”(ASG1× ASG2) EMP’ 2=EMP2 ⋈ENO ASG’ 2
  • 6. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/6 Cost of Alternatives ‱ Assume ➡ size(EMP) = 400, size(ASG) = 1000 ➡ tuple access cost = 1 unit; tuple transfer cost = 10 units ‱ Strategy 1 ➡ produce ASG': (10+10) tuple access cost 20 ➡ transfer ASG' to the sites of EMP: (10+10) tuple transfer cost 200 ➡ produce EMP': (10+10) tuple access cost 2 40 ➡ transfer EMP' to result site: (10+10) tuple transfer cost 200 Total Cost 460 ‱ Strategy 2 ➡ transfer EMP to site 5: 400 tuple transfer cost 4,000 ➡ transfer ASG to site 5: 1000 tuple transfer cost 10,000 ➡ produce ASG': 1000 tuple access cost 1,000 ➡ join EMP and ASG': 400 20 tuple access cost 8,000 Total Cost 23,000
  • 7. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/7 Query Optimization Objectives ‱ Minimize a cost function I/O cost + CPU cost + communication cost These might have different weights in different distributed environments ‱ Wide area networks ➡ communication cost may dominate or vary much ✩ bandwidth ✩ speed ✩ high protocol overhead ‱ Local area networks ➡ communication cost not that dominant ➡ total cost function should be considered ‱ Can also maximize throughput
  • 8. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/8 Complexity of Relational Operations ‱ Assume ➡ relations of cardinality n ➡ sequential scan Operation Complexity Select Project (without duplicate elimination) O(n) Project (with duplicate elimination) Group O(n log n) Join Semi-join Division Set Operators O(n log n) Cartesian Product O(n2)
  • 9. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/9 Query Optimization Issues – Types Of Optimizers ‱ Exhaustive search ➡ Cost-based ➡ Optimal ➡ Combinatorial complexity in the number of relations ‱ Heuristics ➡ Not optimal ➡ Regroup common sub-expressions ➡ Perform selection, projection first ➡ Replace a join by a series of semijoins ➡ Reorder operations to reduce intermediate relation size ➡ Optimize individual operations
  • 10. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/10 Query Optimization Issues – Optimization Granularity ‱ Single query at a time ➡ Cannot use common intermediate results ‱ Multiple queries at a time ➡ Efficient if many similar queries ➡ Decision space is much larger
  • 11. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/11 Query Optimization Issues – Optimization Timing ‱ Static ➡ Compilation  optimize prior to the execution ➡ Difficult to estimate the size of the intermediate results error propagation ➡ Can amortize over many executions ➡ R* ‱ Dynamic ➡ Run time optimization ➡ Exact information on the intermediate relation sizes ➡ Have to reoptimize for multiple executions ➡ Distributed INGRES ‱ Hybrid ➡ Compile using a static algorithm ➡ If the error in estimate sizes > threshold, reoptimize at run time ➡ Mermaid
  • 12. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/12 Query Optimization Issues – Statistics ‱ Relation ➡ Cardinality ➡ Size of a tuple ➡ Fraction of tuples participating in a join with another relation ‱ Attribute ➡ Cardinality of domain ➡ Actual number of distinct values ‱ Common assumptions ➡ Independence between different attribute values ➡ Uniform distribution of attribute values within their domain
  • 13. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/13 Query Optimization Issues – Decision Sites ‱ Centralized ➡ Single site determines the “best” schedule ➡ Simple ➡ Need knowledge about the entire distributed database ‱ Distributed ➡ Cooperation among sites to determine the schedule ➡ Need only local information ➡ Cost of cooperation ‱ Hybrid ➡ One site determines the global schedule ➡ Each site optimizes the local subqueries
  • 14. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/14 Query Optimization Issues – Network Topology ‱ Wide area networks (WAN) – point-to-point ➡ Characteristics ✩ Low bandwidth ✩ Low speed ✩ High protocol overhead ➡ Communication cost will dominate; ignore all other cost factors ➡ Global schedule to minimize communication cost ➡ Local schedules according to centralized query optimization ‱ Local area networks (LAN) ➡ Communication cost not that dominant ➡ Total cost function should be considered ➡ Broadcasting can be exploited (joins) ➡ Special algorithms exist for star networks
  • 15. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.6/15 Distributed Query Processing Methodology Calculus Query on Distributed Relations CONTROL SITE LOCAL SITES Query Decomposition Data Localization Algebraic Query on Distributed Relations Global Optimization Fragment Query Local Optimization Optimized Fragment Query with Communication Operations Optimized Local Queries GLOBAL SCHEMA FRAGMENT SCHEMA STATS ON FRAGMENTS LOCAL SCHEMAS