Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/1
Outline
• Introduction
➡ What is a distributed DBMS
➡ Distributed DBMS Architecture
• Background
• Distributed Database Design
• Database Integration
• Semantic Data Control
• Distributed Query Processing
• Multidatabase query processing
• Distributed Transaction Management
• Data Replication
• Parallel Database Systems
• Distributed Object DBMS
• Peer-to-Peer Data Management
• Web Data Management
• Current Issues
File Systems
[Figure: programs 1 to 3, each with its own data description, accessing Files 1 to 3.]
Database Management
[Figure: application programs 1 to 3 (each with data semantics) access a single database through the DBMS, which handles description, manipulation, and control.]
Motivation
[Figure: database technology (integration) and computer networks (distribution) combine into distributed database systems (integration). Note: integration ≠ centralization.]
Distributed Computing
• A number of autonomous processing elements (not necessarily
homogeneous) that are interconnected by a computer network and that
cooperate in performing their assigned tasks.
• What is being distributed?
➡ Processing logic
➡ Function
➡ Data
➡ Control
What is a Distributed Database System?
A distributed database (DDB) is a collection of multiple, logically
interrelated databases distributed over a computer network.
A distributed database management system (D–DBMS) is the software
that manages the DDB and provides an access mechanism that makes this
distribution transparent to the users.
Distributed database system (DDBS) = DDB + D–DBMS
What is not a DDBS?
• A timesharing computer system
• A loosely (separate primary memory and shared secondary memory) or
tightly coupled (shared memory) multiprocessor system
• A database system which resides at one of the nodes of a network of
computers - this is a centralized database on a network node
Centralized DBMS on a Network
[Figure: Sites 1 to 5 connected by a communication network.]
Distributed DBMS Environment
[Figure: Sites 1 to 5 connected by a communication network.]
Implicit Assumptions
• Data stored at a number of sites → each site logically consists of a single processor.
• Processors at different sites are interconnected by a computer network → not a multiprocessor system
➡ Parallel database systems
• Distributed database is a database, not a collection of files → data logically related as exhibited in the users’ access patterns
➡ Relational data model
• D-DBMS is a full-fledged DBMS
➡ Not remote file system, not a TP system
Data Delivery Alternatives
• We characterize the data delivery alternatives along three orthogonal dimensions:
➡ Delivery modes
➡ Frequency
➡ Communication methods
• Note: not all combinations make sense
Data delivery
• Delivery modes
➡ Pull-only {the transfer of data from servers to clients is initiated by a client pull}
➡ Push-only {the transfer of data from servers to clients is initiated by a server push in the absence of any specific request from clients; the push may be periodic, irregular, or conditional}
➡ Hybrid (mix of pull and push)
• Frequency
➡ Periodic (A client request for IBM’s stock price every week is an example of a periodic pull.)
➡ Conditional (An application that sends out stock prices only when they change is an example of conditional push.)
➡ Ad-hoc or irregular
• Communication methods {unicast, one-to-many}
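The difference between a client pull and a conditional server push can be sketched in a few lines of Python. This is only an illustration: the `StockServer` class, its methods, and the prices are hypothetical names and values invented for the example, not part of any real system.

```python
from typing import Callable, Dict, List, Tuple


class StockServer:
    """Toy server holding prices; supports client pull and conditional push."""

    def __init__(self) -> None:
        self.prices: Dict[str, float] = {}
        self.subscribers: List[Callable[[str, float], None]] = []

    def get(self, symbol: str) -> float:
        # Pull: the transfer is initiated by the client.
        return self.prices[symbol]

    def subscribe(self, callback: Callable[[str, float], None]) -> None:
        # Clients register once; afterwards the server initiates transfers.
        self.subscribers.append(callback)

    def set_price(self, symbol: str, price: float) -> None:
        # Conditional push: notify subscribers only when the value changes.
        changed = self.prices.get(symbol) != price
        self.prices[symbol] = price
        if changed:
            for notify in self.subscribers:
                notify(symbol, price)


server = StockServer()
pushed: List[Tuple[str, float]] = []
server.subscribe(lambda sym, p: pushed.append((sym, p)))

server.set_price("IBM", 100.0)  # value changed: pushed
server.set_price("IBM", 100.0)  # same value: nothing pushed
server.set_price("IBM", 101.5)  # value changed: pushed

pulled = server.get("IBM")      # pull initiated by the client
```

A periodic pull would simply call `get` on a timer; a hybrid scheme would combine both paths.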
Distributed DBMS Promises
Transparent management of distributed, fragmented, and replicated data
Improved reliability/availability through distributed transactions
Improved performance
Easier and more economical system expansion
Transparency
• Transparency is the separation of the higher level semantics of a system from the lower level implementation issues.
• Fundamental issue is to provide data independence in the distributed environment
➡ Network (distribution) transparency
➡ Replication transparency
➡ Fragmentation transparency
✦ horizontal fragmentation: selection
✦ vertical fragmentation: projection
✦ hybrid
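The mapping of horizontal fragmentation to selection and vertical fragmentation to projection can be illustrated with a minimal Python sketch. The sample rows, the `CITY` attribute, and the helper function names are assumptions made up for this example; relations are modeled as plain lists of dictionaries.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]

# Hypothetical sample data; the CITY attribute is invented for this example.
EMP: List[Row] = [
    {"ENO": "E1", "ENAME": "Doe", "TITLE": "Engineer", "CITY": "Paris"},
    {"ENO": "E2", "ENAME": "Smith", "TITLE": "Analyst", "CITY": "Boston"},
    {"ENO": "E3", "ENAME": "Lee", "TITLE": "Engineer", "CITY": "Paris"},
]


def horizontal_fragment(rel: List[Row], pred: Callable[[Row], bool]) -> List[Row]:
    """Selection: keep whole rows that satisfy the predicate."""
    return [dict(row) for row in rel if pred(row)]


def vertical_fragment(rel: List[Row], attrs: List[str]) -> List[Row]:
    """Projection: keep only the listed attributes (the key must be included)."""
    return [{a: row[a] for a in attrs} for row in rel]


def join_on_key(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Rebuild full rows from two vertical fragments that share the key."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left]


# Horizontal fragments: disjoint selections whose union is the relation.
emp_paris = horizontal_fragment(EMP, lambda r: r["CITY"] == "Paris")
emp_other = horizontal_fragment(EMP, lambda r: r["CITY"] != "Paris")

# Vertical fragments: projections that both carry the key ENO.
emp_names = vertical_fragment(EMP, ["ENO", "ENAME"])
emp_jobs = vertical_fragment(EMP, ["ENO", "TITLE", "CITY"])


def by_eno(row: Row) -> str:
    return row["ENO"]


rebuilt_h = sorted(emp_paris + emp_other, key=by_eno)                    # union
rebuilt_v = sorted(join_on_key(emp_names, emp_jobs, "ENO"), key=by_eno)  # join
```

Fragmentation transparency means the user queries `EMP` as a whole while the system performs the union or join behind the scenes.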
Example
SELECT ENAME, SAL
FROM   EMP, ASG, PAY
WHERE  DUR > 12
AND    EMP.ENO = ASG.ENO
AND    PAY.TITLE = EMP.TITLE
Transparent Access
[Figure: the query from the previous slide is issued without reference to data location. The sites Boston, Montreal, Paris, New York, and Tokyo are connected by a communication network; the sites store fragments and replicas of the employee, assignment, and project data (for example, Paris employees, Montreal assignments, New York projects with budget > 200000).]
Distributed Database - User View
[Figure: a single logical database labeled "Distributed Database".]
Distributed DBMS - Reality
[Figure: several sites, each running DBMS software, connected by a communication subsystem; user queries and user applications enter at individual sites.]
Types of Transparency
• Data independence {the immunity of user applications to changes in the definition and organization of data, and vice versa}
➡ Logical data independence and physical data independence
• Network transparency (or distribution transparency)
➡ Location transparency
➡ Fragmentation transparency
• Replication transparency
• Fragmentation transparency {translating global queries into fragment queries}
Who Should Provide Transparency?
• The level of transparency is inevitably a compromise between ease of use and the difficulty and overhead cost of providing high levels of transparency.
• Gray argues that full transparency makes the management of distributed data very difficult and claims that “applications coded with transparent access to geographically distributed databases have: poor manageability, poor modularity, and poor message performance”.
• He proposes a remote procedure call mechanism between the requestor users and the server DBMSs whereby the users would direct their queries to a specific DBMS.
• Transparency can be provided at three layers:
➡ Application level {in the application code; little transparency}
➡ Operating system {for example, device drivers within the operating system}
➡ DBMS
Reliability Through Transactions
• Replicated components and data should make distributed DBMS more reliable. {eliminate single points of failure}
• Distributed transactions provide
➡ Concurrency transparency {a transaction is a sequence of database operations executed as an atomic action; it transforms a consistent database state into another consistent database state}
➡ Failure atomicity {e.g., update salary by 10%}
• Distributed transaction support requires implementation of
➡ Distributed concurrency control protocols
➡ Commit protocols
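As one illustration of a commit protocol, the sketch below applies the "update salary by 10%" example across several sites using a simplified two-phase commit. The slide does not name a specific protocol, so two-phase commit is an assumption here, and the `Participant` class and `two_phase_commit` function are hypothetical. The sketch ignores logging, timeouts, and coordinator failure.

```python
from typing import Dict, List, Optional


class Participant:
    """One site holding part of the salary data (hypothetical model)."""

    def __init__(self, name: str, salaries: Dict[str, float],
                 can_commit: bool = True) -> None:
        self.name = name
        self.salaries = dict(salaries)
        self.can_commit = can_commit  # simulates a site that must vote "no"
        self.pending: Optional[Dict[str, float]] = None

    def prepare(self, raise_pct: float) -> bool:
        # Phase 1: compute the update tentatively and vote.
        if not self.can_commit:
            return False
        self.pending = {eno: round(sal * (1 + raise_pct / 100), 2)
                        for eno, sal in self.salaries.items()}
        return True

    def commit(self) -> None:
        # Phase 2 (commit): make the tentative update permanent.
        if self.pending is not None:
            self.salaries = self.pending
            self.pending = None

    def abort(self) -> None:
        # Phase 2 (abort): discard the tentative update.
        self.pending = None


def two_phase_commit(participants: List[Participant], raise_pct: float) -> bool:
    """Commit everywhere only if every participant votes yes."""
    votes = [p.prepare(raise_pct) for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False


paris = Participant("Paris", {"E1": 1000.0})
boston = Participant("Boston", {"E2": 2000.0})
first_ok = two_phase_commit([paris, boston], 10)

tokyo = Participant("Tokyo", {"E3": 3000.0}, can_commit=False)
second_ok = two_phase_commit([paris, tokyo], 10)
```

The second transaction aborts at every site because one participant votes no, so Paris keeps the salaries left by the first transaction: the update is applied at all sites or at none.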
Potentially Improved Performance
• Proximity of data to its points of use {data localization}
➡ Requires some support for fragmentation and replication
1. Since each site handles only a portion of the database, contention for
CPU and I/O services is not as severe as for centralized databases.
2. Localization reduces remote access delays that are usually involved in
wide area networks (for example, the minimum round-trip message
propagation delay in satellite-based systems is about 1 second).
• Parallelism in execution
➡ Inter-query parallelism
➡ Intra-query parallelism
Parallelism Requirements
• Read-only queries
➡ Have as much of the data required by each application at the site where the application executes
➡ Full replication
• How about updates?
➡ Mutual consistency
➡ Freshness of copies
System Expansion
• Issue is database scaling
• Emergence of microprocessor and workstation technologies
➡ Demise of Grosch's law
➡ Client-server model of computing
Complications Introduced by Distribution
• Because data may be replicated, the distributed database system is responsible for
(1) choosing one of the stored copies of the requested data for access in case of retrievals, and
(2) making sure that the effect of an update is reflected on each and every copy of that data item.
• If some sites or communication links fail while an update is being executed, the system must make sure that the effects are reflected on the data at the failed or unreachable sites as soon as the system can recover from the failure.
• The synchronization of transactions on multiple sites is considerably harder than for a centralized system.
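The first two responsibilities above can be sketched as a toy replica manager: reads pick any available copy, writes go to every available copy, and a failed site catches up when it recovers. This "write-all-available with catch-up" scheme is an illustrative assumption, not a protocol named by the slides, and the `ReplicatedItem` class is hypothetical.

```python
from typing import Dict, List


class ReplicatedItem:
    """One data item with a copy at each site (toy model, no concurrency)."""

    def __init__(self, sites: List[str], value: int) -> None:
        self.copies: Dict[str, int] = {s: value for s in sites}
        self.up: Dict[str, bool] = {s: True for s in sites}
        self.missed: Dict[str, List[int]] = {s: [] for s in sites}

    def read(self) -> int:
        # Retrieval: choose one of the stored copies (first available site).
        for site, alive in self.up.items():
            if alive:
                return self.copies[site]
        raise RuntimeError("no copy available")

    def write(self, value: int) -> None:
        # Update: reflect the change on every copy; queue it for failed sites.
        for site in self.copies:
            if self.up[site]:
                self.copies[site] = value
            else:
                self.missed[site].append(value)

    def fail(self, site: str) -> None:
        self.up[site] = False

    def recover(self, site: str) -> None:
        # Apply the updates the site missed, in order, then bring it back.
        for value in self.missed[site]:
            self.copies[site] = value
        self.missed[site].clear()
        self.up[site] = True


item = ReplicatedItem(["Paris", "Boston", "Tokyo"], 100)
item.fail("Tokyo")
item.write(110)
stale_copy = item.copies["Tokyo"]  # Tokyo missed the update while down
value_read = item.read()           # served from an available copy
item.recover("Tokyo")
```

After recovery all three copies agree again, which is the mutual consistency requirement in miniature.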
Distributed DBMS Issues
• Distributed Database Design {chapter 3}
➡ How to distribute the database {partitioned and replicated}
➡ Replicated (partially or fully duplicated) & non-replicated database distribution
➡ Fragmentation
➡ {research area: minimizing the cost of storing the data, processing transactions, and communication is NP-hard, so proposed solutions are based on heuristics}
• Distributed directory management {chapter 3}
➡ Contains information, such as descriptions and locations, about the data items in the database.
➡ A directory may be global to the entire DDBS or local to each site, centralized or distributed, single copy or multiple copies
Distributed DBMS Issues
• Query Processing {chapter 6-8}
➡ Convert user transactions to data manipulation instructions
➡ Optimization problem
✦ min{cost = data transmission + local processing}
➡ General formulation is NP-hard
• Concurrency Control {chapter 11}
➡ Synchronization of concurrent accesses
➡ The condition that requires all the values of multiple copies of every data item
to converge to the same value is called mutual consistency.
➡ Consistency and isolation of transactions' effects
➡ Deadlock management
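The optimization objective min{cost = data transmission + local processing} can be made concrete with a small Python sketch that scores a few candidate strategies and picks the cheapest. The strategies, byte counts, tuple counts, and unit costs are all made-up values for illustration; a real optimizer would derive them from database statistics.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Strategy:
    """A candidate execution strategy with hypothetical resource estimates."""
    name: str
    bytes_shipped: int      # data transmission between sites
    tuples_processed: int   # local processing at the executing site


def cost(s: Strategy, per_byte: float = 1.0, per_tuple: float = 0.1) -> float:
    # cost = data transmission + local processing (unit costs are assumed)
    return s.bytes_shipped * per_byte + s.tuples_processed * per_tuple


strategies: List[Strategy] = [
    Strategy("ship ASG to the EMP site", bytes_shipped=10_000, tuples_processed=500),
    Strategy("ship EMP to the ASG site", bytes_shipped=2_000, tuples_processed=500),
    Strategy("ship both to a third site", bytes_shipped=12_000, tuples_processed=500),
]

best = min(strategies, key=cost)
```

Enumerating every strategy is feasible only for tiny examples like this one; because the general formulation is NP-hard, practical optimizers search heuristically.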
• Distributed deadlock management
➡ The deadlock problem in DDBSs is similar in nature to that encountered in operating systems. The competition among users for access to a set of resources (data, in this case) can result in a deadlock if the synchronization mechanism is based on locking.
➡ The well-known alternatives of prevention, avoidance, and detection/recovery also apply to DDBSs.
• Reliability and availability
➡ How to make the system resilient to failures
➡ Atomicity and durability
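Deadlock detection is commonly explained with a wait-for graph, where an edge T1 → T2 means that T1 waits for a lock held by T2 and a cycle means deadlock. The Python sketch below is an illustration under that assumption; the transaction names, the per-site graphs, and the function names are hypothetical. It shows a case where no single site sees a cycle but the merged global graph does.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]


def merge(*graphs: Graph) -> Graph:
    """Union of local wait-for graphs into a global one."""
    merged: Graph = {}
    for graph in graphs:
        for txn, waits_for in graph.items():
            merged.setdefault(txn, set()).update(waits_for)
    return merged


def find_deadlock(wait_for: Graph) -> List[str]:
    """Return the transactions on one cycle, or [] if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(txn: str) -> List[str]:
        color[txn] = GREY
        path.append(txn)
        for other in sorted(wait_for.get(txn, ())):
            state = color.get(other, WHITE)
            if state == GREY:             # back edge: cycle found
                return path[path.index(other):]
            if state == WHITE:
                cycle = visit(other)
                if cycle:
                    return cycle
        path.pop()
        color[txn] = BLACK
        return []

    for txn in sorted(wait_for):
        if color.get(txn, WHITE) == WHITE:
            cycle = visit(txn)
            if cycle:
                return cycle
    return []


# Hypothetical local wait-for graphs at two sites.
site_paris: Graph = {"T1": {"T2"}}
site_boston: Graph = {"T2": {"T3"}, "T3": {"T1"}}

local_paris = find_deadlock(site_paris)
local_boston = find_deadlock(site_boston)
global_cycle = find_deadlock(merge(site_paris, site_boston))
```

Recovery would then abort one transaction on the cycle; choosing the victim is a separate policy decision not covered here.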
Relationship Between Issues
[Figure: relationships among directory management, query processing, distribution design, concurrency control, deadlock management, and reliability.]
The design of distributed databases affects many areas. It affects directory management, because the definition of fragments and their placement determine the contents of the directory (or directories) as well as the strategies that may be employed to manage them. The same information (i.e., fragment structure and placement) is used by the query processor to determine the query evaluation strategy. On the other hand, the access and usage patterns that are determined by the query processor are used as inputs to the data distribution and fragmentation algorithms. Similarly, directory placement and contents influence the processing of queries.
The replication of fragments when they are distributed affects the concurrency control strategies that might be employed.
There is a strong relationship among the concurrency control problem, the deadlock management problem, and reliability issues. This is to be expected, since together they are usually called the transaction management problem. The concurrency control algorithm that is employed will determine whether or not a separate deadlock management facility is required. If a locking-based algorithm is used, deadlocks can occur, whereas they will not if timestamping is the chosen alternative.
Finally, the need for replication protocols arises if data distribution involves replicas. As indicated above, there is a strong relationship between replication protocols and concurrency control techniques, since both deal with the consistency of data, but from different perspectives.
Architecture
• Defines the structure of the system
➡ components identified
➡ functions of each component defined
➡ interrelationships and interactions between components defined
ANSI/SPARC Architecture
[Figure: users access external views defined by external schemas; these map to the conceptual view (conceptual schema), which maps to the internal view (internal schema).]
Differences between Three Levels of ANSI-SPARC Architecture
[Figure: the three levels of the ANSI-SPARC architecture compared.]
© Pearson Education Limited 1995, 2005
Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/33
Data Independence
• Logical Data Independence
➡ Refers to immunity of external schemas to changes in conceptual
schema.
➡ Conceptual schema changes (e.g. addition/removal of entities).
➡ Should not require changes to external schema or rewrites of
application programs.
Data Independence
• Physical Data Independence
➡ Refers to immunity of conceptual schema to changes in the internal
schema.
➡ Internal schema changes (e.g. using different file organizations, storage
structures/devices).
➡ Should not require change to conceptual or external schemas.
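Both kinds of data independence can be tried out with Python's built-in `sqlite3` module. In this sketch, a view plays the role of an external schema, the base table stands for the conceptual schema, and an index stands for an internal-schema choice. The table contents, the `CITY` column, and the view and index names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema: the base relation.
conn.execute("CREATE TABLE EMP (ENO TEXT PRIMARY KEY, ENAME TEXT, TITLE TEXT)")
conn.executemany(
    "INSERT INTO EMP VALUES (?, ?, ?)",
    [("E1", "Doe", "Engineer"), ("E2", "Smith", "Analyst")],
)

# External schema: the view that an application program is written against.
conn.execute("CREATE VIEW EMP_NAMES AS SELECT ENO, ENAME FROM EMP")

app_query = "SELECT ENO, ENAME FROM EMP_NAMES ORDER BY ENO"
before = conn.execute(app_query).fetchall()

# Logical data independence: the conceptual schema gains an attribute,
# yet the view and the application query stay unchanged.
conn.execute("ALTER TABLE EMP ADD COLUMN CITY TEXT")
after_logical = conn.execute(app_query).fetchall()

# Physical data independence: a new access structure changes the internal
# level only; the conceptual and external levels are untouched.
conn.execute("CREATE INDEX EMP_ENAME_IDX ON EMP (ENAME)")
after_physical = conn.execute(app_query).fetchall()

conn.close()
```

The application query returns the same rows at all three points, even though the conceptual and internal levels both changed underneath it.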
Data Independence and the ANSI-SPARC Three-Level Architecture
[Figure: logical data independence sits between the external and conceptual levels; physical data independence sits between the conceptual and internal levels.]
Generic DBMS Architecture
• The interface layer manages the interface to the applications. View management consists of translating the user query from external data to conceptual data.
• The control layer controls the query by adding semantic integrity predicates and authorization predicates.
• The query processing (or compilation) layer maps the query into an optimized sequence of lower-level operations. It decomposes the query into a tree of algebra operations and tries to find the “optimal” ordering of the operations. The result is stored in an access plan. The output of this layer is a query expressed in lower-level code (algebra operations).
• The execution layer directs the execution of the access plans, including transaction management (commit, restart) and synchronization of algebra operations.
• The data access layer manages the data structures that implement the files, indices, etc. It also manages the buffers by caching the most frequently accessed data.
• Finally, the consistency layer manages concurrency control and logging for update requests. This layer allows transaction, system, and media recovery after failure.
DBMS Implementation Alternatives
[Figure: DBMS implementation alternatives classified along three dimensions: autonomy, distribution, and heterogeneity.]
Autonomy
Autonomy is a function of a number of factors such as whether the
component systems (i.e., individual DBMSs) exchange information,
whether they can independently execute transactions, and whether
one is allowed to modify them.
Degree to which member databases can operate independently
1. The local operations of the individual DBMSs are not affected
by their participation in the distributed system.
2. The manner in which the individual DBMSs process queries and
optimize them should not be affected by the execution of global
queries that access multiple databases.
3. System consistency or operation should not be compromised
when individual DBMSs join or leave the distributed system.
Dimensions of the Problem
• Distribution
➡ Whether the components of the system are located on the same machine or not
• Heterogeneity
➡ Various levels (hardware, communications, operating system)
➡ DBMS important one
✦ data model, query language, transaction management algorithms
• Autonomy
➡ Not well understood and most troublesome
➡ Various versions
✦ Design autonomy: Ability of a component DBMS to decide on issues related to its own
design.
✦ Communication autonomy: Ability of a component DBMS to decide whether and how to
communicate with other DBMSs.
✦ Execution autonomy: Ability of a component DBMS to execute local operations in any
manner it wants to.
In Figure 1.10, we have identified three alternative architectures that are the focus of this book and that we discuss in more detail in the next three subsections: (A0, D1, H0) that corresponds to client/server distributed DBMSs, (A0, D2, H0) that is a peer-to-peer distributed DBMS and (A2, D2, H1) which represents a (peer-to-peer) distributed, heterogeneous multidatabase system. Note that we discuss the heterogeneity issues within the context of one system architecture, although the issue arises in other models as well.
Client/Server Architecture
Advantages of Client-Server Architectures
• More efficient division of labor
• Horizontal and vertical scaling of resources
• Better price/performance on client machines
• Ability to use familiar tools on client machines
• Client access to remote data (via standards)
• Full DBMS functionality provided to client workstations
• Overall better system price/performance
Database Server
Distributed Database Servers
Datalogical Distributed DBMS Architecture
[Figure: external schemas ES1, ES2, ..., ESn are defined over the global conceptual schema (GCS), which maps to local conceptual schemas LCS1, LCS2, ..., LCSn, each with its local internal schema LIS1, LIS2, ..., LISn.]
We first note that the physical data organization on each machine may be, and probably is, different. This means that there needs to be an individual internal schema definition at each site, which we call the local internal schema (LIS). The enterprise view of the data is described by the global conceptual schema (GCS), which is global because it describes the logical structure of the data at all the sites.
To handle data fragmentation and replication, the logical organization of data at each site needs to be described. Therefore, there needs to be a third layer in the architecture, the local conceptual schema (LCS). In the architectural model we have chosen, then, the global conceptual schema is the union of the local conceptual schemas.
Finally, user applications and user access to the database are supported by external schemas (ESs), defined as being above the global conceptual schema.
Data independence is supported since the model is an extension of ANSI/SPARC, which provides such independence naturally. Location and replication transparencies are supported by the definition of the local and global conceptual schemas and the mapping in between. Network transparency, on the other hand, is supported by the definition of the global conceptual schema.
Peer-to-Peer Component Architecture
[Figure: the USER PROCESSOR (user interface handler, semantic data controller, global query optimizer, global execution monitor) works with the external schema, the global conceptual schema, and the GD/D; the DATA PROCESSOR (local query processor, local recovery manager, runtime support processor) works with the local conceptual schema, the system log, and the local internal schema, and accesses the database. User requests come in and system responses go out through the user interface handler.]
The detailed components of a distributed DBMS are shown. One component handles the interaction with users, and another deals with the storage. The first major component, which we call the user processor, consists of four elements:
• The user interface handler is responsible for interpreting user commands as they come in, and formatting the result data as it is sent to the user.
• The semantic data controller uses the integrity constraints and authorizations that are defined as part of the global conceptual schema to check if the user query can be processed.
• The global query optimizer and decomposer determines an execution strategy to minimize a cost function, and translates the global queries into local ones using the global and local conceptual schemas as well as the global directory.
• The distributed execution monitor coordinates the distributed execution of the user request. The execution monitor is also called the distributed transaction manager. In executing queries in a distributed fashion, the execution monitors at various sites may, and usually do, communicate with one another.
The second major component of a distributed DBMS is the data processor and consists of three elements:
• The local query optimizer, which actually acts as the access path selector, is responsible for choosing the best access path to access any data item.
• The local recovery manager is responsible for making sure that the local database remains consistent even when failures occur.
• The run-time support processor physically accesses the database according to the physical commands in the schedule generated by the query optimizer. The run-time support processor is the interface to the operating system and contains the database buffer (or cache) manager, which is responsible for maintaining the main memory buffers and managing the data accesses.
It is important to note, at this point, that our use of the terms “user processor” and “data processor” does not imply a functional division similar to client/server systems. These divisions are merely organizational and there is no suggestion that they should be placed on different machines.
Multidatabase systems (MDBS)
• Multidatabase systems (MDBS) represent the case where individual
DBMSs (whether distributed or not) are fully autonomous and have no
concept of cooperation; they may not even “know” of each other’s
existence or how to talk to each other.
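Datalogical Multi-DBMS Architecture
[Figure: global external schemas GES1, GES2, ..., GESn are defined over the global conceptual schema (GCS); each component DBMS has its own local external schemas (LES11 ... LES1n, ..., LESn1 ... LESnm), local conceptual schema (LCS1, LCS2, ..., LCSn), and local internal schema (LIS1, LIS2, ..., LISn).]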
MDBS Components & Execution
[Figure: a global user request enters the multi-DBMS layer, which sends global subrequests to DBMS1, DBMS2, and DBMS3; local user requests go directly to individual DBMSs.]
Mediator/Wrapper Architecture

Mais conteúdo relacionado

Mais procurados

Distributed Database System
Distributed Database SystemDistributed Database System
Distributed Database SystemSulemang
 
Ddb 1.6-design issues
Ddb 1.6-design issuesDdb 1.6-design issues
Ddb 1.6-design issuesEsar Qasmi
 
Distributed database management system
Distributed database management  systemDistributed database management  system
Distributed database management systemPooja Dixit
 
Distributed design alternatives
Distributed design alternativesDistributed design alternatives
Distributed design alternativesPooja Dixit
 
Distributed file system
Distributed file systemDistributed file system
Distributed file systemAnamika Singh
 
Distributed database system
Distributed database systemDistributed database system
Distributed database systemM. Ahmad Mahmood
 
File models and file accessing models
File models and file accessing modelsFile models and file accessing models
File models and file accessing modelsishmecse13
 
Distributed Query Processing
Distributed Query ProcessingDistributed Query Processing
Distributed Query ProcessingMythili Kannan
 
Distributed Database Management System
Distributed Database Management SystemDistributed Database Management System
Distributed Database Management SystemHardik Patil
 
Introduction to distributed database
Introduction to distributed databaseIntroduction to distributed database
Introduction to distributed databaseSonia Panesar
 
management of distributed transactions
management of distributed transactionsmanagement of distributed transactions
management of distributed transactionsNilu Desai
 
Replication Techniques for Distributed Database Design
Replication Techniques for Distributed Database DesignReplication Techniques for Distributed Database Design
Replication Techniques for Distributed Database DesignMeghaj Mallick
 
8. mutual exclusion in Distributed Operating Systems
8. mutual exclusion in Distributed Operating Systems8. mutual exclusion in Distributed Operating Systems
8. mutual exclusion in Distributed Operating SystemsDr Sandeep Kumar Poonia
 
Foult Tolerence In Distributed System
Foult Tolerence In Distributed SystemFoult Tolerence In Distributed System
Foult Tolerence In Distributed SystemRajan Kumar
 

Mais procurados (20)

Distributed Database System
Distributed Database SystemDistributed Database System
Distributed Database System
 
Ddb 1.6-design issues
Ddb 1.6-design issuesDdb 1.6-design issues
Ddb 1.6-design issues
 
Distributed database management system
Distributed database management  systemDistributed database management  system
Distributed database management system
 
Distributed design alternatives
Distributed design alternativesDistributed design alternatives
Distributed design alternatives
 
Distributed file system
Distributed file systemDistributed file system
Distributed file system
 
Distributed database system
Distributed database systemDistributed database system
Distributed database system
 
File models and file accessing models
File models and file accessing modelsFile models and file accessing models
File models and file accessing models
 
11. dfs
11. dfs11. dfs
11. dfs
 
Distributed Query Processing
Distributed Query ProcessingDistributed Query Processing
Distributed Query Processing
 
Virtual machine security
Virtual machine securityVirtual machine security
Virtual machine security
 
Distributed Database Management System
Distributed Database Management SystemDistributed Database Management System
Distributed Database Management System
 
Distributed deadlock
Distributed deadlockDistributed deadlock
Distributed deadlock
 
Introduction to distributed database
Introduction to distributed databaseIntroduction to distributed database
Introduction to distributed database
 
Unit 6
Unit 6Unit 6
Unit 6
 
Distributed DBMS - Unit 6 - Query Processing
Distributed DBMS - Unit 6 - Query ProcessingDistributed DBMS - Unit 6 - Query Processing
Distributed DBMS - Unit 6 - Query Processing
 
management of distributed transactions
management of distributed transactionsmanagement of distributed transactions
management of distributed transactions
 
Replication Techniques for Distributed Database Design
Replication Techniques for Distributed Database DesignReplication Techniques for Distributed Database Design
Replication Techniques for Distributed Database Design
 
Database fragmentation
Database fragmentationDatabase fragmentation
Database fragmentation
 
8. mutual exclusion in Distributed Operating Systems
8. mutual exclusion in Distributed Operating Systems8. mutual exclusion in Distributed Operating Systems
8. mutual exclusion in Distributed Operating Systems
 
Foult Tolerence In Distributed System
Foult Tolerence In Distributed SystemFoult Tolerence In Distributed System
Foult Tolerence In Distributed System
 

Semelhante a Database , 1 Introduction

Semelhante a Database , 1 Introduction (20)

1 introduction ddbms
1 introduction ddbms1 introduction ddbms
1 introduction ddbms
 
1 introduction
1 introduction1 introduction
1 introduction
 
Distributed dbms (ddbms)
Distributed dbms (ddbms)Distributed dbms (ddbms)
Distributed dbms (ddbms)
 
Database ,14 Parallel DBMS
Database ,14 Parallel DBMSDatabase ,14 Parallel DBMS
Database ,14 Parallel DBMS
 
Distributed database management system
Distributed database management systemDistributed database management system
Distributed database management system
 
lecture-13.pptx
lecture-13.pptxlecture-13.pptx
lecture-13.pptx
 
DDBMS
DDBMSDDBMS
DDBMS
 
nnnn.pptx
nnnn.pptxnnnn.pptx
nnnn.pptx
 
DBMS.pptx
DBMS.pptxDBMS.pptx
DBMS.pptx
 
Distributed database
Distributed databaseDistributed database
Distributed database
 
Intro to Distributed Database Management System
Intro to Distributed Database Management SystemIntro to Distributed Database Management System
Intro to Distributed Database Management System
 
Distributed database
Distributed databaseDistributed database
Distributed database
 
DDBMS.pptx
DDBMS.pptxDDBMS.pptx
DDBMS.pptx
 
Database SystemsDesign, Implementation, and Management
Database SystemsDesign, Implementation, and ManagementDatabase SystemsDesign, Implementation, and Management
Database SystemsDesign, Implementation, and Management
 
Santosh Kumar Meher(2105040008) DISTRIBUTED DATABASE.pptx
Santosh Kumar Meher(2105040008) DISTRIBUTED DATABASE.pptxSantosh Kumar Meher(2105040008) DISTRIBUTED DATABASE.pptx
Santosh Kumar Meher(2105040008) DISTRIBUTED DATABASE.pptx
 
distributed dbms
distributed dbmsdistributed dbms
distributed dbms
 
Distributed Database
Distributed DatabaseDistributed Database
Distributed Database
 
Distributed information system
Distributed information systemDistributed information system
Distributed information system
 
Distributed Database Management System.pptx
Distributed Database Management System.pptxDistributed Database Management System.pptx
Distributed Database Management System.pptx
 
Distributed DBMS - Unit 1 - Introduction
Distributed DBMS - Unit 1 - IntroductionDistributed DBMS - Unit 1 - Introduction
Distributed DBMS - Unit 1 - Introduction
 

Database , 1 Introduction

  • 1. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/1 Outline • Introduction ➡ What is a distributed DBMS ➡ Distributed DBMS Architecture • Background • Distributed Database Design • Database Integration • Semantic Data Control • Distributed Query Processing • Multidatabase query processing • Distributed Transaction Management • Data Replication • Parallel Database Systems • Distributed Object DBMS • Peer-to-Peer Data Management • Web Data Management • Current Issues
  • 2. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/2 File Systems program 1 data description 1 program 2 data description 2 program 3 data description 3 File 1 File 2 File 3
  • 3. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/3 Database Management database DBMS Application program 1 (with data semantics) Application program 2 (with data semantics) Application program 3 (with data semantics) description manipulation control
  • 4. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/4 Motivation Database Technology Computer Networks integration distribution integration integration ≠ centralization Distributed Database Systems
  • 5. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/5 Distributed Computing • A number of autonomous processing elements (not necessarily homogeneous) that are interconnected by a computer network and that cooperate in performing their assigned tasks. • What is being distributed? ➡ Processing logic ➡ Function ➡ Data ➡ Control
  • 6. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/6 What is a Distributed Database System? A distributed database (DDB) is a collection of multiple, logically interrelated databases distributed over a computer network. A distributed database management system (D–DBMS) is the software that manages the DDB and provides an access mechanism that makes this distribution transparent to the users. Distributed database system (DDBS) = DDB + D–DBMS
  • 7. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/7 What is not a DDBS? • A timesharing computer system • A loosely (separate primary memory and shared secondary memory) or tightly coupled (shared memory) multiprocessor system • A database system which resides at one of the nodes of a network of computers - this is a centralized database on a network node
  • 8. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/8 Centralized DBMS on a Network [Figure: Site 1, Site 2, Site 3, Site 4, and Site 5 connected by a Communication Network]
  • 9. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/9 Distributed DBMS Environment [Figure: Site 1, Site 2, Site 3, Site 4, and Site 5 connected by a Communication Network]
  • 10. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/10 Implicit Assumptions • Data stored at a number of sites → each site logically consists of a single processor. • Processors at different sites are interconnected by a computer network → not a multiprocessor system ➡ Parallel database systems • Distributed database is a database, not a collection of files → data logically related as exhibited in the users’ access patterns ➡ Relational data model • D-DBMS is a full-fledged DBMS ➡ Not remote file system, not a TP system
  • 11. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/11 Data Delivery Alternatives • We characterize the data delivery alternatives along three orthogonal dimensions: • Delivery modes • Frequency • Communication Methods • Note: not all combinations make sense
  • 12. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/12 Data delivery • Delivery modes ➡ Pull-only {the transfer of data from servers to clients is initiated by a client pull} ➡ Push-only {the transfer of data from servers to clients is initiated by a server push in the absence of any specific request from clients; pushes may be periodic, irregular, or conditional} ➡ Hybrid (mix of pull and push) • Frequency ➡ Periodic (a client requesting IBM’s stock price every week is an example of a periodic pull) ➡ Conditional (an application that sends out stock prices only when they change is an example of conditional push) ➡ Ad-hoc or irregular • Communication Methods {Unicast, One-to-many}
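The pull and conditional-push delivery modes on this slide can be contrasted with a small sketch. It is only an illustration under stated assumptions: the `StockServer` class, its method names, and the sample prices are hypothetical and do not come from the slides.

```python
from typing import Callable, Dict, List


class StockServer:
    """Hypothetical server that stores one price per ticker symbol."""

    def __init__(self) -> None:
        self._prices: Dict[str, float] = {}
        self._subscribers: List[Callable[[str, float], None]] = []

    def get(self, symbol: str) -> float:
        # Pull: the transfer is initiated by an explicit client request.
        return self._prices[symbol]

    def subscribe(self, callback: Callable[[str, float], None]) -> None:
        # Register a client for server-initiated (push) delivery.
        self._subscribers.append(callback)

    def set_price(self, symbol: str, price: float) -> None:
        # Conditional push: clients are notified only when the value changes.
        changed = self._prices.get(symbol) != price
        self._prices[symbol] = price
        if changed:
            for notify in self._subscribers:
                notify(symbol, price)


received = []
server = StockServer()
server.subscribe(lambda symbol, price: received.append((symbol, price)))
server.set_price("IBM", 100.0)  # new value, pushed to the subscriber
server.set_price("IBM", 100.0)  # unchanged value, nothing is pushed
server.set_price("IBM", 101.5)  # changed value, pushed again
pulled = server.get("IBM")      # client-initiated pull of the current value
```

A periodic pull would simply call `get` on a timer; a hybrid scheme would combine both paths.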
  • 13. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/13 Distributed DBMS Promises Transparent management of distributed, fragmented, and replicated data Improved reliability/availability through distributed transactions Improved performance Easier and more economical system expansion
  • 14. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/14 Transparency • Transparency is the separation of the higher level semantics of a system from the lower level implementation issues. • Fundamental issue is to provide data independence in the distributed environment ➡ Network (distribution) transparency ➡ Replication transparency ➡ Fragmentation transparency ✦ horizontal fragmentation: selection ✦ vertical fragmentation: projection ✦ hybrid
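As a rough illustration of the two fragmentation styles named on this slide, the sketch below derives horizontal fragments by selection and vertical fragments by projection, then reconstructs the relation. The attribute names ENO, ENAME, and TITLE follow the query on the next slide; the sample rows, the `SITE` attribute, and the helper functions are assumptions made for the example.

```python
# Hypothetical sample of the EMP relation; SITE is an assumed attribute.
EMP = [
    {"ENO": "E1", "ENAME": "J. Doe", "TITLE": "Elect. Eng.", "SITE": "Paris"},
    {"ENO": "E2", "ENAME": "M. Smith", "TITLE": "Syst. Anal.", "SITE": "Boston"},
    {"ENO": "E3", "ENAME": "A. Lee", "TITLE": "Mech. Eng.", "SITE": "Paris"},
]


def select(relation, predicate):
    """Horizontal fragmentation: keep the rows that satisfy a predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Vertical fragmentation: keep a subset of the columns."""
    return [{a: row[a] for a in attributes} for row in relation]


def join_on_key(left, right, key):
    """Rebuild rows from two vertical fragments that share the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


# Horizontal fragments: disjoint subsets of rows.
emp_paris = select(EMP, lambda r: r["SITE"] == "Paris")
emp_other = select(EMP, lambda r: r["SITE"] != "Paris")

# Vertical fragments: each keeps the key ENO so the rows can be rejoined.
emp_names = project(EMP, ["ENO", "ENAME"])
emp_jobs = project(EMP, ["ENO", "TITLE", "SITE"])

# Reconstruction: union for horizontal fragments, key join for vertical ones.
rebuilt_h = sorted(emp_paris + emp_other, key=lambda r: r["ENO"])
rebuilt_v = join_on_key(emp_names, emp_jobs, "ENO")
```

Fragmentation transparency means the user queries EMP as a whole and never sees fragments such as `emp_paris` or `emp_names`.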
  • 15. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/15 Example SELECT ENAME,SAL FROM EMP,ASG,PAY WHERE DUR > 12 AND EMP.ENO = ASG.ENO AND PAY.TITLE = EMP.TITLE
  • 16. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/16 Transparent Access SELECT ENAME,SAL FROM EMP,ASG,PAY WHERE DUR > 12 AND EMP.ENO = ASG.ENO AND PAY.TITLE = EMP.TITLE [Figure: sites Boston, Montreal, Paris, New York, and Tokyo on a Communication Network; data labels at the sites include Paris, Boston, Montreal, and New York employees, assignments, and projects, and New York projects with budget > 200000]
  • 17. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/17 Distributed Database - User View Distributed Database
  • 18. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/18 Distributed DBMS - Reality [Figure: User Queries and User Applications attached to DBMS Software at several sites, linked by a Communication Subsystem]
  • 19. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/19 Types of Transparency • Data independence {the immunity of user applications to changes in the definition and organization of data, and vice versa; comprises logical data independence and physical data independence} • Network transparency (or distribution transparency) ➡ Location transparency ➡ Fragmentation transparency • Replication transparency • Fragmentation transparency {global queries are translated to fragment queries}
  • 20. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/20 Who Should Provide Transparency? • The level of transparency is inevitably a compromise between ease of use and the difficulty and overhead cost of providing high levels of transparency. • Gray argues that full transparency makes the management of distributed data very difficult and claims that “applications coded with transparent access to geographically distributed databases have: poor manageability, poor modularity, and poor message performance”. • He proposes a remote procedure call mechanism between the requestor users and the server DBMSs, whereby the users would direct their queries to a specific DBMS. • Application level {in the application code; little transparency} • Operating system {device drivers within the operating system} • DBMS
  • 21. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/21 Reliability Through Transactions • Replicated components and data should make distributed DBMS more reliable. {eliminate single points of failure} • Distributed transactions provide ➡ Concurrency transparency {a sequence of database operations executed as an atomic action; a consistent database state is transformed to another consistent database state} ➡ Failure atomicity {e.g., update salary by 10%} • Distributed transaction support requires implementation of ➡ Distributed concurrency control protocols ➡ Commit protocols
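The slide names commit protocols without detailing one. As a hedged illustration, the sketch below follows the voting and decision phases of two-phase commit, a widely used commit protocol. The `Participant` class and its state names are assumptions for the example, and timeouts, logging, and recovery from coordinator failure are deliberately left out.

```python
class Participant:
    """Hypothetical site taking part in a distributed transaction."""

    def __init__(self, name: str, can_commit: bool = True) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "INITIAL"

    def prepare(self) -> bool:
        # Phase 1: vote yes (and wait in READY) or vote no (and abort locally).
        self.state = "READY" if self.can_commit else "ABORTED"
        return self.can_commit

    def commit(self) -> None:
        self.state = "COMMITTED"

    def abort(self) -> None:
        self.state = "ABORTED"


def two_phase_commit(participants) -> str:
    """Coordinator logic: commit only if every participant votes yes."""
    # Phase 1 (voting): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): the same outcome is sent to all participants.
    if all(votes):
        for p in participants:
            p.commit()
        return "COMMIT"
    for p in participants:
        p.abort()
    return "ABORT"


ok_sites = [Participant("Paris"), Participant("Boston")]
ok_outcome = two_phase_commit(ok_sites)

mixed_sites = [Participant("Paris"), Participant("Boston", can_commit=False)]
mixed_outcome = two_phase_commit(mixed_sites)
```

Because the decision is all-or-nothing, either every site applies the salary update or none does, which is the failure atomicity the slide refers to.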
  • 22. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/22 Potentially Improved Performance • Proximity of data to its points of use {data localization} ➡ Requires some support for fragmentation and replication 1. Since each site handles only a portion of the database, contention for CPU and I/O services is not as severe as for centralized databases. 2. Localization reduces remote access delays that are usually involved in wide area networks (for example, the minimum round-trip message propagation delay in satellite-based systems is about 1 second). • Parallelism in execution ➡ Inter-query parallelism ➡ Intra-query parallelism
  • 23. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/23 Parallelism Requirements • For read-only queries: have as much of the data required by each application at the site where the application executes ➡ Full replication • How about updates? ➡ Mutual consistency ➡ Freshness of copies
  • 24. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/24 System Expansion • Issue is database scaling • Emergence of microprocessor and workstation technologies ➡ Demise of Grosch's law ➡ Client-server model of computing
  • 25. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/25 Complications Introduced by Distribution • Because data may be replicated, the distributed database system is responsible for (1) choosing one of the stored copies of the requested data for access in case of retrievals, and (2) making sure that the effect of an update is reflected on each and every copy of that data item. • If some sites or communication links fail, the DBMS must ensure that the update is applied at the failed sites as soon as the system recovers. • The synchronization of transactions on multiple sites is considerably harder than for a centralized system.
  • 26. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/26 Distributed DBMS Issues • Distributed Database Design {chapter 3} ➡ How to distribute the database {partitioned and replicated} ➡ Replicated (partially duplicated or fully duplicated) & non-replicated database distribution ➡ Fragmentation ➡ {the general problem of minimizing the cost of storing the data, processing transactions, and communication is NP-hard; proposed solutions are based on heuristics} • Distributed directory management {chapter 3} ➡ Contains information such as the description and location of items in the database. ➡ Global directory for the entire DDBS or local to each site, centralized or distributed, single copy or multiple copies
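The two replication duties listed on this slide (pick one copy for a read, propagate an update to every copy) can be sketched as a read-one/write-all scheme. This is an assumed, simplified policy chosen for illustration: the `ReplicatedItem` class and site names are hypothetical, and site failures and concurrent updates are not handled.

```python
class ReplicatedItem:
    """Hypothetical data item with one copy per site."""

    def __init__(self, sites, value) -> None:
        self.copies = {site: value for site in sites}

    def read(self, preferred_site):
        # Duty (1): choose one stored copy, preferring the requester's site.
        if preferred_site in self.copies:
            site = preferred_site
        else:
            site = next(iter(self.copies))
        return site, self.copies[site]

    def write(self, value) -> None:
        # Duty (2): reflect the update on each and every copy.
        for site in self.copies:
            self.copies[site] = value


salary = ReplicatedItem(["Paris", "Boston", "Montreal"], 50000)
local_read = salary.read("Boston")   # a copy exists at the requesting site
remote_read = salary.read("Tokyo")   # no local copy, so another site is used
salary.write(55000)                  # the update reaches all three copies
after_update = salary.read("Montreal")
```

Writing to all copies keeps reads cheap but makes updates depend on every site being reachable, which is why the slide lists failures as a further complication.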
  • 27. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/27 Distributed DBMS Issues • Query Processing {chapter 6-8} ➡ Convert user transactions to data manipulation instructions ➡ Optimization problem ✦ min{cost = data transmission + local processing} ➡ General formulation is NP-hard • Concurrency Control {chapter 11} ➡ Synchronization of concurrent accesses ➡ The condition that requires all the values of multiple copies of every data item to converge to the same value is called mutual consistency. ➡ Consistency and isolation of transactions' effects ➡ Deadlock management
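The cost expression on this slide, data transmission plus local processing, can be made concrete with a toy comparison of two join strategies. All numbers, unit costs, and strategy names below are hypothetical; a real optimizer would estimate them from statistics, and the general problem remains NP-hard as the slide notes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """Hypothetical execution strategy for a two-site join."""
    name: str
    bytes_shipped: int      # data transmission volume
    tuples_processed: int   # local processing volume


def total_cost(strategy: Strategy, cost_per_byte: int = 10, cost_per_tuple: int = 1) -> int:
    # cost = data transmission + local processing, in assumed cost units.
    return (strategy.bytes_shipped * cost_per_byte
            + strategy.tuples_processed * cost_per_tuple)


# Assumed sizes: EMP has 400 tuples of 100 bytes, ASG has 1000 tuples of 50 bytes.
strategies = [
    Strategy("ship EMP to the ASG site", 400 * 100, 400 + 1000),
    Strategy("ship ASG to the EMP site", 1000 * 50, 400 + 1000),
]

best = min(strategies, key=total_cost)
best_cost = total_cost(best)
```

With only two candidates, exhaustive comparison is trivial; the search space grows quickly with the number of relations and sites, which motivates heuristic optimizers.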
  • 28. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/28 • Distributed deadlock management • The deadlock problem in DDBSs is similar in nature to that encountered in operating systems. The competition among users for access to a set of resources (data, in this case) can result in a deadlock if the synchronization mechanism is based on locking. • The well-known alternatives of prevention, avoidance, and detection/recovery also apply to DDBSs. • Reliability and availability ➡ How to make the system resilient to failures ➡ Atomicity and durability
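For the detection alternative mentioned on this slide, one common approach (not named in the slides) is to look for a cycle in a wait-for graph, where an edge T1 → T2 means transaction T1 waits for a lock held by T2. The sketch below assumes a single global graph has already been assembled; in a distributed system the edges would have to be collected from several sites.

```python
def has_deadlock(wait_for) -> bool:
    """Return True if the wait-for graph contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    nodes = set(wait_for) | {t for targets in wait_for.values() for t in targets}
    color = {node: WHITE for node in nodes}

    def visit(node) -> bool:
        color[node] = GRAY
        for successor in wait_for.get(node, ()):
            if color[successor] == GRAY:
                return True  # back edge: the current path loops on itself
            if color[successor] == WHITE and visit(successor):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in nodes)


# T1 waits for T2 and T2 waits for T1: a deadlock.
cyclic = has_deadlock({"T1": ["T2"], "T2": ["T1"]})
# T1 waits for T2, which waits for T3; T3 can finish, so there is no deadlock.
acyclic = has_deadlock({"T1": ["T2"], "T2": ["T3"]})
```

Recovery would then abort one transaction on the cycle; choosing the victim is a separate policy decision not shown here.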
  • 29. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/29 Relationship Between Issues [Figure: Directory Management, Reliability, Deadlock Management, Query Processing, Concurrency Control, Distribution Design] The design of distributed databases affects many areas. It affects directory management, because the definition of fragments and their placement determine the contents of the directory (or directories) as well as the strategies that may be employed to manage them. The same information (i.e., fragment structure and placement) is used by the query processor to determine the query evaluation strategy. On the other hand, the access and usage patterns that are determined by the query processor are used as inputs to the data distribution and fragmentation algorithms. Similarly, directory placement and contents influence the processing of queries. The replication of fragments when they are distributed affects the concurrency control strategies that might be employed. There is a strong relationship among the concurrency control problem, the deadlock management problem, and reliability issues. This is to be expected, since together they are usually called the transaction management problem. The concurrency control algorithm that is employed will determine whether or not a separate deadlock management facility is required. If a locking-based algorithm is used, deadlocks will occur, whereas they will not if timestamping is the chosen alternative. Finally, the need for replication protocols arises if data distribution involves replicas. As indicated above, there is a strong relationship between replication protocols and concurrency control techniques, since both deal with the consistency of data, but from different perspectives.
  • 30. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/30 Architecture • Defines the structure of the system ➡ components identified ➡ functions of each component defined ➡ interrelationships and interactions between components defined
  • 31. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/31 ANSI/SPARC Architecture [Figure: Users; External views (External Schema); Conceptual view (Conceptual Schema); Internal view (Internal Schema)]
  • 32. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/32 Differences between Three Levels of ANSI-SPARC Architecture © Pearson Education Limited 1995, 2005
  • 33. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/33 Data Independence • Logical Data Independence ➡ Refers to immunity of external schemas to changes in conceptual schema. ➡ Conceptual schema changes (e.g. addition/removal of entities). ➡ Should not require changes to external schema or rewrites of application programs. © Pearson Education Limited 1995, 2005
  • 34. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/34 Data Independence • Physical Data Independence ➡ Refers to immunity of conceptual schema to changes in the internal schema. ➡ Internal schema changes (e.g. using different file organizations, storage structures/devices). ➡ Should not require change to conceptual or external schemas. © Pearson Education Limited 1995, 2005
  • 35. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/35 Data Independence and the ANSI- SPARC Three-Level Architecture © Pearson Education Limited 1995, 2005
  • 36. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/36 Generic DBMS Architecture The interface layer manages the interface to the applications. View management consists of translating the user query from external data to conceptual data. The control layer controls the query by adding semantic integrity predicates and authorization predicates. The query processing (or compilation) layer maps the query into an optimized sequence of lower-level operations: it decomposes the query into a tree of algebra operations and tries to find the “optimal” ordering of the operations. The result is stored in an access plan. The output of this layer is a query expressed in lower-level code (algebra operations). The execution layer directs the execution of the access plans, including transaction management (commit, restart) and synchronization of algebra operations. The data access layer manages the data structures that implement the files, indices, etc. It also manages the buffers by caching the most frequently accessed data. Finally, the consistency layer manages concurrency control and logging for update requests. This layer allows transaction, system, and media recovery after failure.
  • 37. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/37 DBMS Implementation Alternatives
  • 38. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/38 Autonomy Autonomy is a function of a number of factors such as whether the component systems (i.e., individual DBMSs) exchange information, whether they can independently execute transactions, and whether one is allowed to modify them. Degree to which member databases can operate independently 1. The local operations of the individual DBMSs are not affected by their participation in the distributed system. 2. The manner in which the individual DBMSs process queries and optimize them should not be affected by the execution of global queries that access multiple databases. 3. System consistency or operation should not be compromised when individual DBMSs join or leave the distributed system.
  • 39. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/39 Dimensions of the Problem • Distribution ➡ Whether the components of the system are located on the same machine or not • Heterogeneity ➡ Various levels (hardware, communications, operating system) ➡ DBMS important one ✦ data model, query language, transaction management algorithms • Autonomy ➡ Not well understood and most troublesome ➡ Various versions ✦ Design autonomy: Ability of a component DBMS to decide on issues related to its own design. ✦ Communication autonomy: Ability of a component DBMS to decide whether and how to communicate with other DBMSs. ✦ Execution autonomy: Ability of a component DBMS to execute local operations in any manner it wants to.
  • 40. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/40 In Figure 1.10, we have identified three alternative architectures that are the focus of this book and that we discuss in more detail in the next three subsections: (A0, D1, H0), which corresponds to client/server distributed DBMSs; (A0, D2, H0), which is a peer-to-peer distributed DBMS; and (A2, D2, H1), which represents a (peer-to-peer) distributed, heterogeneous multidatabase system. Note that we discuss the heterogeneity issues within the context of one system architecture, although the issue arises in other models as well.
  • 41. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/41 Client/Server Architecture
  • 42. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/42 Advantages of Client-Server Architectures • More efficient division of labor • Horizontal and vertical scaling of resources • Better price/performance on client machines • Ability to use familiar tools on client machines • Client access to remote data (via standards) • Full DBMS functionality provided to client workstations • Overall better system price/performance
  • 43. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/43 Database Server
  • 44. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/44 Distributed Database Servers
  • 45. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/45 Datalogical Distributed DBMS Architecture [Figure: external schemas ES1, ES2, ..., ESn; global conceptual schema GCS; local conceptual schemas LCS1, LCS2, ..., LCSn; local internal schemas LIS1, LIS2, ..., LISn] We first note that the physical data organization on each machine may be, and probably is, different. This means that there needs to be an individual internal schema definition at each site, which we call the local internal schema (LIS). The enterprise view of the data is described by the global conceptual schema (GCS), which is global because it describes the logical structure of the data at all the sites. To handle data fragmentation and replication, the logical organization of data at each site needs to be described. Therefore, there needs to be a third layer in the architecture, the local conceptual schema (LCS). In the architectural model we have chosen, then, the global conceptual schema is the union of the local conceptual schemas. Finally, user applications and user access to the database are supported by external schemas (ESs), defined as being above the global conceptual schema. Data independence is supported since the model is an extension of ANSI/SPARC, which provides such independence naturally. Location and replication transparencies are supported by the definition of the local and global conceptual schemas and the mapping in between. Network transparency, on the other hand, is supported by the definition of the global conceptual schema.
  • 46. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/46 Peer-to-Peer Component Architecture
    [Figure labels: USER; User requests; System responses; USER PROCESSOR (User Interface Handler, Semantic Data Controller, Global Query Optimizer, Global Execution Monitor); DATA PROCESSOR (Local Query Processor, Local Recovery Manager, Runtime Support Processor); External Schema; Global Conceptual Schema; GD/D; Local Conceptual Schema; System Log; Local Internal Schema; Database]
    The detailed components of a distributed DBMS are shown. One component handles the interaction with users, and another deals with the storage.
    The first major component, which we call the user processor, consists of four elements:
    ➡ The user interface handler is responsible for interpreting user commands as they come in, and formatting the result data as it is sent to the user.
    ➡ The semantic data controller uses the integrity constraints and authorizations that are defined as part of the global conceptual schema to check if the user query can be processed.
    ➡ The global query optimizer and decomposer determines an execution strategy to minimize a cost function, and translates the global queries into local ones using the global and local conceptual schemas as well as the global directory.
    ➡ The distributed execution monitor coordinates the distributed execution of the user request. The execution monitor is also called the distributed transaction manager. In executing queries in a distributed fashion, the execution monitors at various sites may, and usually do, communicate with one another.
    The second major component of a distributed DBMS is the data processor, which consists of three elements:
    ➡ The local query optimizer, which actually acts as the access path selector, is responsible for choosing the best access path to access any data item.
    ➡ The local recovery manager is responsible for making sure that the local database remains consistent even when failures occur.
    ➡ The run-time support processor physically accesses the database according to the physical commands in the schedule generated by the query optimizer. The run-time support processor is the interface to the operating system and contains the database buffer (or cache) manager, which is responsible for maintaining the main memory buffers and managing the data accesses.
    It is important to note, at this point, that our use of the terms “user processor” and “data processor” does not imply a functional division similar to client/server systems. These divisions are merely organizational and there is no suggestion that they should be placed on different machines.
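The flow of a request through the user processor and data processor can be sketched roughly as follows. This is a toy under stated assumptions, not an actual DBMS design: the command syntax, the `GLOBAL_DIRECTORY`, `AUTHORIZED`, and `LOCAL_DB` tables, and all function names are hypothetical. Cost-based optimization, integrity checking, and the local recovery manager are left out, and the data processor is collapsed into a single lookup.

```python
# Hypothetical global directory: global relation -> sites holding a fragment.
GLOBAL_DIRECTORY = {"EMP": ["site1", "site2"]}
# Hypothetical authorizations defined with the global conceptual schema.
AUTHORIZED = {("alice", "EMP")}
# Hypothetical local databases, one per site.
LOCAL_DB = {
    "site1": {"EMP": [("E1", "Ann")]},
    "site2": {"EMP": [("E2", "Bob")]},
}


def user_interface_handler(raw):
    """Interpret a user command of the assumed form '<user> SELECT <relation>'."""
    user, _, relation = raw.split()
    return user, relation


def semantic_data_controller(user, relation):
    """Check whether the query may be processed (authorization only here)."""
    if (user, relation) not in AUTHORIZED:
        raise PermissionError(f"{user} may not read {relation}")


def global_query_optimizer(relation):
    """Translate a global query into one local query per site.

    A real optimizer would also pick a strategy that minimizes a cost function.
    """
    return [(site, relation) for site in GLOBAL_DIRECTORY[relation]]


def data_processor(site, relation):
    """Stand-in for the local query optimizer and run-time support processor."""
    return LOCAL_DB[site][relation]


def distributed_execution_monitor(local_queries):
    """Coordinate execution of the local queries and merge their results."""
    rows = []
    for site, relation in local_queries:
        rows.extend(data_processor(site, relation))
    return rows


def user_processor(raw):
    """Run a request through the four user-processor elements in order."""
    user, relation = user_interface_handler(raw)
    semantic_data_controller(user, relation)
    plan = global_query_optimizer(relation)
    return distributed_execution_monitor(plan)


print(user_processor("alice SELECT EMP"))
```

All functions live in one process here, which matches the note that the user processor and data processor are an organizational division and need not run on different machines.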
  • 47. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/47 Multidatabase systems (MDBS) • Multidatabase systems (MDBS) represent the case where individual DBMSs (whether distributed or not) are fully autonomous and have no concept of cooperation; they may not even “know” of each other’s existence or how to talk to each other.
  • 48. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/48 Datalogical Multi-DBMS Architecture [Figure labels: GES1, GES2, ..., GESn; GCS; LES11, ..., LES1n; LESn1, ..., LESnm; LCS1, LCS2, ..., LCSn; LIS1, LIS2, ..., LISn]
  • 49. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/49 MDBS Components & Execution [Figure labels: Global User Request; Multi-DBMS Layer; Global Subrequest (three); DBMS1, DBMS2, DBMS3; Local User Request (two)]
  • 50. Distributed DBMS © M. T. Özsu & P. Valduriez Ch.1/50 Mediator/Wrapper Architecture
