PMSCS-653 (Advance DBMS Final-exam)
Indexing and Hashing
1- Explain the basic concept of indexing. Mention some
factors to be considered for evaluating an indexing
technique.
Basic Concept
An index for a file in a database system works in much the same way as the index of a
textbook. If we want to learn about a particular topic, we can search for the topic in
the index at the back of the book, find the pages where it occurs, and then read the pages
to find the information we are looking for. The words in the index are in sorted order,
making it easy to find the word we are looking for. Moreover, the index is much smaller
than the book, further reducing the effort needed to find the words we are looking for.
Evaluation Factors
 Access types: Access types can include finding records with a specified attribute
value and finding records whose attribute values fall in a specified range.
 Access time: The time it takes to find a particular data item, or set of items, using the
technique in question.
 Insertion time: The time it takes to insert a new data item. This value includes the
time it takes to find the correct place to insert the new data item, as well as the time it
takes to update the index structure.
 Deletion time: The time it takes to delete a data item. This value includes the time it
takes to find the item to be deleted, as well as the time it takes to update the index
structure.
 Space overhead: The additional space occupied by an index structure. Provided that
the amount of additional space is moderate, it is usually worthwhile to sacrifice the
space to achieve improved performance.
2- Explain multilevel indexing with example. When is it
preferable to use a multilevel index rather than a single
level index?
Multi-Level Indexing
 If primary index does not fit in memory, access becomes expensive.
 Solution: treat primary index kept on disk as a sequential file and construct a sparse
index on it.
- Outer index – a sparse index of primary index
- Inner index – the primary index file
 If even outer index is too large to fit in main memory, yet another level of index can
be created, and so on.
 Indices at all levels must be updated on insertion or deletion from the file.
An Example
 Consider 100,000 records, 10 per block, at one index record per block, that's 10,000
index records. Even if we can fit 100 index records per block, this is 100 blocks. If
index is too large to be kept in main memory, a search results in several disk reads.
 For very large files, additional levels of indexing may be required.
 Indices must be updated at all levels when insertions or deletions require it.
 Frequently, each level of index corresponds to a unit of physical storage.
Fig: Multilevel Index
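The two-level lookup described above can be sketched in Python. This is a minimal illustration, not part of the original notes: it assumes a sorted data file, a sparse inner index holding the first key of every data block, and a sparse outer index holding the first key of every inner-index block; the block sizes and keys are made up for the example.

```python
import bisect

# Hypothetical example: 100 sorted data blocks of 10 keys each (keys 0-999).
data_blocks = [[10 * b + i for i in range(10)] for b in range(100)]

# Inner (sparse) index: one entry per data block, holding the block's first key.
inner_index = [blk[0] for blk in data_blocks]

# Outer (sparse) index: one entry per inner-index block of 10 entries.
inner_blocks = [inner_index[i:i + 10] for i in range(0, len(inner_index), 10)]
outer_index = [blk[0] for blk in inner_blocks]

def lookup(key):
    """Two-level lookup: outer index -> inner index -> scan the data block."""
    o = bisect.bisect_right(outer_index, key) - 1      # which inner-index block
    i = bisect.bisect_right(inner_blocks[o], key) - 1  # which data block
    block = data_blocks[o * 10 + i]
    return key if key in block else None

print(lookup(537))   # 537
print(lookup(1001))  # None (key not present)
```

Only the small outer index needs to stay in main memory; each lookup then reads one inner-index block and one data block, which is the point of adding the extra level.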
3- Differentiate between Dense & Sparse indexing. What is
the advantage of sparse index over dense index?
Dense VS Sparse Indices
 It is generally faster to locate a record if we have a dense index rather than a sparse
index.
 However, sparse indices have advantages over dense indices in that they require less
space and they impose less maintenance overhead for insertions and deletions.
 There is a trade-off that the system designer must make between access time and
space overhead.
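The difference in lookup procedure can be shown with a small Python sketch. The sorted file, keys and records below are made up for illustration: a dense index holds one entry per search-key value, while a sparse index holds one entry per block and needs a short scan inside the located block.

```python
import bisect

# Hypothetical sorted file: each block holds a few (search-key, record) pairs.
blocks = [[(1, 'a'), (3, 'b')], [(5, 'c'), (7, 'd')], [(9, 'e'), (11, 'f')]]

# Dense index: one (key, block-number) entry for every search-key value.
dense = [(k, b) for b, blk in enumerate(blocks) for (k, _) in blk]

# Sparse index: one entry per block, pointing at the block's first key.
sparse = [(blk[0][0], b) for b, blk in enumerate(blocks)]

def dense_lookup(key):
    keys = [k for k, _ in dense]
    i = bisect.bisect_left(keys, key)
    if i < len(dense) and dense[i][0] == key:
        return dict(blocks[dense[i][1]])[key]   # index points straight at the block
    return None

def sparse_lookup(key):
    keys = [k for k, _ in sparse]
    b = sparse[max(bisect.bisect_right(keys, key) - 1, 0)][1]
    return dict(blocks[b]).get(key)             # then scan within the located block

print(dense_lookup(7), sparse_lookup(7))   # d d
```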
4- Explain hash file organization with example.
Hash File Organization
 In a hash file organization, we obtain the address of the disk block (also called a
bucket) containing a desired record directly by computing a hash function on the
search-key value of the record.
Hash File Organization: An Example
 Let us choose a hash function for the account file using the search key branch_name.
 Suppose we have 26 buckets and we define a hash function that maps names
beginning with the ith letter of the alphabet to the ith bucket.
 This hash function has the virtue of simplicity, but it fails to provide a uniform
distribution, since we expect more branch names to begin with such letters as B and R
than Q and X.
 Instead, we consider 10 buckets and a hash function that computes the sum of the
binary representations of the characters of a key, then returns the sum modulo the
number of buckets.
 For branch name ‘Perryridge’
 Bucket no=h(Perryridge) = 5
 For branch name ‘Round Hill’
 Bucket no=h(Round Hill) = 3
 For branch name ‘Brighton’
 Bucket no=h(Brighton) = 3
Fig. Hash organization of account file, with branch-name as the key
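A small Python sketch of the hash function just described (sum of the character codes of the key, modulo the number of buckets). This is an illustration only: it assumes ASCII character codes, so the bucket numbers it prints may differ from the textbook values quoted above.

```python
NUM_BUCKETS = 10

def h(branch_name: str) -> int:
    """Sum the character codes of the key, modulo the number of buckets."""
    return sum(ord(c) for c in branch_name) % NUM_BUCKETS

# Insert each record into the bucket computed from its search-key value.
buckets = {b: [] for b in range(NUM_BUCKETS)}
for name in ['Perryridge', 'Round Hill', 'Brighton']:
    buckets[h(name)].append(name)

print({b: names for b, names in buckets.items() if names})
```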
5- Construct a B+-tree for the following set of key values:
(2, 3, 5, 7, 11, 17, 19, 23, 29, 31) for n=4 and n=6.
For n=4
Fig: B+-tree for n=4 with root [19], internal nodes [5, 11] and [29], and leaf nodes [2, 3], [5, 7], [11, 17], [19, 23], [29, 31].
For n=6
Fig: B+-tree for n=6 with root [7, 19] and leaf nodes [2, 3, 5], [7, 11, 17], [19, 23, 29, 31].
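The lookup path through such a tree can be sketched in Python. This is a minimal illustration, not part of the original answer: the nested dict/list encoding is an assumption made for the example, and the node contents are the n=4 tree described above.

```python
import bisect

# The n=4 B+-tree above: internal node = {'keys': [...], 'children': [...]},
# leaf node = sorted list of keys.
tree_n4 = {
    'keys': [19],
    'children': [
        {'keys': [5, 11], 'children': [[2, 3], [5, 7], [11, 17]]},
        {'keys': [29],    'children': [[19, 23], [29, 31]]},
    ],
}

def bplus_search(node, key):
    """Descend from the root to a leaf, then check the leaf for the key."""
    while isinstance(node, dict):
        # Follow the child whose key range contains `key` (keys >= separator go right).
        i = bisect.bisect_right(node['keys'], key)
        node = node['children'][i]
    return key in node

print(bplus_search(tree_n4, 23))  # True
print(bplus_search(tree_n4, 12))  # False
```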
6- Construct a B-tree for the following set of key values: (2,
3, 5, 7, 11, 17, 19, 23, 29, 31) for n=4 and n=6.
For n=4
Fig: B-tree for n=4 with root [17], internal nodes [5] and [29], and leaf nodes [2, 3], [7, 11], [19, 23], [31].
For n=6
Fig: B-tree for n=6 with root [7, 23] and leaf nodes [2, 3, 5], [11, 17, 19], [29, 31].
Functional Dependencies & Normalization
1- Normalize step by step the following database into 1NF,
2NF and 3NF:
PROJECT (Proj-ID, Proj-Name, Proj-Mgr-ID, Emp-ID,
Emp-Name, Emp-Dpt, Emp-Hrly-rate, Total-Hrs)
FIRST NORMAL FORM (1NF)
A relation r(R) is said to be in First Normal Form (1NF) if and only if every entry
of the relation holds at most a single value. Thus, we must decompose the table
PROJECT into two new tables. It is also necessary to identify a PK for each table.
First table with nonrepeating groups:
PROJECT (Proj-ID, Proj-Name, Proj-Mgr-ID)
Second table with table identifier and all repeating groups:
PROJECT-EMPLOYEE (Proj-ID, Emp-ID, Emp-Name, Emp-Dpt, Emp-Hrly-rate, Total-Hrs)
Here Proj-ID is the table identifier.
SECOND NORMAL FORM (2NF)
A relation r (R) is in Second Normal Form (2NF) if and only if the following two
conditions are met simultaneously:
1. r(R) is already in 1NF.
2. No nonprime attribute is partially dependent on any key or, equivalently, each
nonprime attribute in R is fully dependent on every key.
In the PROJECT-EMPLOYEE relation, the attributes Emp-Name, Emp-Dpt and Emp-Hrly-rate are
fully dependent only on Emp-ID. The only attribute that fully depends on the
composite PK (Proj-ID, Emp-ID) is Total-Hrs. This is shown below.
Now break this relation into two new relations so that partial dependency does not exist.
PHOURS-ASSIGNED (Proj-ID, Emp-ID, Total-Hrs)
EMPLOYEE (Emp-ID, Emp-Name, Emp-Dpt, Emp-Hrly-rate)
THIRD NORMAL FORM (3NF)
A relation r(R) is in Third Normal Form (3NF) if and only if the following conditions are
satisfied simultaneously:
1. r(R) is already in 2NF.
2. No nonprime attribute is transitively dependent on the key (i.e. No nonprime
attribute functionally determines any other nonprime attribute).
In EMPLOYEE relation, attribute Emp-Hrly-rate is transitively dependent on the PK Emp-
ID through the functional dependency Emp-Dpt → Emp-Hrly-rate. This is shown below.
Now break this relation into two new relations so that transitive dependency does not
exist.
EMPLOYEE (Emp-ID, Emp-Name, Emp-Dpt)
CHARGES (Emp-Dpt, Emp-Hrly-rate)
The new set of relations that we have obtained through the normalization process does
not exhibit any of the anomalies. That is, we can insert, delete and update tuples without
any of the side effects. The final set of relations is:
PROJECT (Proj-ID, Proj-Name, Proj-Mgr-ID)
PHOURS-ASSIGNED (Proj-ID, Emp-ID, Total-Hrs)
EMPLOYEE (Emp-ID, Emp-Name, Emp-Dpt)
CHARGES (Emp-Dpt, Emp-Hrly-rate)
2- Explain BCNF with example.
Boyce-Codd Normal Form:
A relation r(R) is in Boyce-Codd Normal Form (BCNF) if and only if the following
conditions are met simultaneously:
(1) The relation is in 3NF.
(2) For every functional dependency of the form X → A, we have that either A ⊆ X
or X is a super key of r. In other words, every functional dependency is either a
trivial dependency or, if it is not trivial, its left-hand side X must be a super key.
To transform the relation of the earlier example (a MANUFACTURER relation with attributes
ID, Name, Item-No and Quantity) into BCNF, we can decompose it into the
following set of relational schemes:
Set No. 1
MANUFACTURER (ID, Name)
MANUFACTURER-PART (ID, Item-No, Quantity)
Or
Set No. 2
MANUFACTURER (ID, Name)
MANUFACTURER-PART (Name, Item-No, Quantity)
Notice that the relations in both sets are in BCNF and that the update anomaly is no longer
present.
3- Given the relation r(A, B, C, D) and the set F={AB → C, B
→ D, D → B} of functional dependencies, find the candidate
keys of the relation. How many candidate keys are in this
relation? What are the prime attributes?
AB is a candidate key.
(1) AB → A and AB → B ( Reflexivity axiom)
(2) AB → C (given)
(3) AB → B (step 1) and B → D (given) then AB → D ( Transitivity axiom)
Since AB determines every other attribute of the relation, we have that AB is a candidate
key.
AD is a candidate key.
(1) AD → A and AD → D ( Reflexivity axiom)
(2) AD → D (step 1) and D → B (given) then AD → B
(3) AD → B (step 2) and AB → C (given) then AAD → C (Pseudotransitivity
axiom).
i.e. AD → C
Since AD determines every other attribute of the relation, we have that AD is a candidate
key.
Hence, the relation has exactly two candidate keys: AB and AD.
The prime attributes are: A, B, and D.
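The arguments above can be checked mechanically by computing attribute closures. The following Python sketch is an illustration only; the relation and FDs are the ones given in the question.

```python
def closure(attrs, fds):
    """Return the closure of a set of attributes under a set of FDs (lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

F = [({'A', 'B'}, {'C'}), ({'B'}, {'D'}), ({'D'}, {'B'})]
R = {'A', 'B', 'C', 'D'}

print(closure({'A', 'B'}, F) == R)   # True  -> AB is a candidate key
print(closure({'A', 'D'}, F) == R)   # True  -> AD is a candidate key
print(closure({'A'}, F) == R)        # False -> A alone is not a key
```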
4- What is Canonical Cover? Given the set F of FDs shown
below, find a canonical cover for F.
F= {X → Z, XY → WP, XY → ZWQ, XZ → R}.
Canonical cover
For a given set F of FDs, a canonical cover, denoted by Fc, is a set of FDs where the
following conditions are simultaneously satisfied:
(1) Every FD of Fc is simple. That is, the right-hand side of every functional
dependency of Fc has only one attribute.
(2) Fc is left-reduced.
(3) Fc is nonredundant.
Example: Given the set F of FDs shown below, find a canonical cover for F.
F= {X → Z, XY → WP, XY → ZWQ, XZ → R}.
FC = {X → Z, XY → WP, XY → ZWQ, XZ → R}
= {X → Z, XY → W, XY → P, XY → Z, XY → W, XY → Q, XZ → R}. [Make every FD simple]
= {X → Z, XY → W, XY → P, XY → Z, XY → Q, XZ → R}. [Remove the duplicate XY → W]
= {X → Z, XY → W, XY → P, XY → Q, XZ → R}. [Remove XY → Z, which is redundant because X → Z]
= {X → Z, XY → W, XY → P, XY → Q, X → R}. [Left-reduce XZ → R to X → R, since X → Z]
FC = {X → Z, XY → W, XY → P, XY → Q, X → R}. This resulting set, in which every FD
is simple, left-reduced and nonredundant, is the canonical cover of F.
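Redundancy of an FD can be tested with the same closure idea: X → A is redundant in F if A is already in the closure of X computed without that FD. The Python sketch below is an illustration of the step that drops XY → Z above; the FD encoding is an assumption made for the example.

```python
def closure(attrs, fds):
    """Return the closure of a set of attributes under a set of FDs (lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# F after every FD has been made simple (one attribute on the right-hand side).
F = [({'X'}, {'Z'}), ({'X', 'Y'}, {'W'}), ({'X', 'Y'}, {'P'}),
     ({'X', 'Y'}, {'Z'}), ({'X', 'Y'}, {'Q'}), ({'X', 'Z'}, {'R'})]

# Is XY -> Z redundant?  Remove it and see whether Z is still derived from XY.
without = [fd for fd in F if fd != ({'X', 'Y'}, {'Z'})]
print('Z' in closure({'X', 'Y'}, without))   # True -> XY -> Z is redundant
```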
Distributed Database Management System (DDBMS)
1- Distinguish between local transaction & global
transaction.
 A local transaction is one that accesses data only from sites where the transaction
was initiated.
 A global transaction, on the other hand, is one that either accesses data in a site
different from the one at which the transaction was initiated, or accesses data in
several different sites.
2- Draw & explain the general structure of a
distributed database system. Describe the relative
advantages & disadvantages of DDBMS.
• Consists of a single logical database that is split into a number of fragments. Each
fragment is stored on one or more computers.
• The computers in a distributed system communicate with one another through various
communication media, such as high-speed networks or telephone lines.
• They do not share main memory or disks.
• Each site is capable of independently processing user requests that require access to
local data as well as it is capable of processing user requests that require access to
remote data stored on other computers in the network.
The general structure of a distributed system appears in the following figure.
Advantages of DDBMS
 Sharing data. Users at one site may be able to access the data residing at other sites.
 Increased local autonomy. A global database administrator is responsible for the entire
system, but a part of these responsibilities is delegated to the local database administrator
for each site.
 Increased Availability. If one site fails in a distributed system, the remaining sites
may be able to continue operating.
 Speeding up of query processing. It is possible to process data at several sites
simultaneously. If a query involves data stored at several sites, it may be possible to
split the query into a number of subqueries that can be executed in parallel. Thus,
query processing becomes faster in distributed systems.
 Increased reliability. A single data item may exist at several sites. If one site fails, a
transaction requiring a particular data item from that site may access it from other
sites. Thus, reliability increases.
 Better performance. As each site handles only a part of the entire database, there
may not be the same level of contention for CPU and I/O services that characterizes a
centralized DBMS.
 Processor independence. A distributed system may contain multiple copies of a
particular data item. Thus, users can access any available copy of the data item, and
user requests can be processed by the processor at the data location. Hence, user-
requests do not depend on a specific processor.
 Modular extensibility. Modular extension in a distributed system is much easier.
New sites can be added to the network without affecting the operations of other sites.
Such flexibility allows organizations to extend their system in a relatively rapid and
easier way.
Disadvantages of DDBMS
 Software-development cost. It is more difficult to implement a distributed database
system; thus, it is more costly.
 Greater potential for bugs. Since the sites operate in parallel, it is harder to ensure
the correctness of algorithms, especially operation during failures of part of the
system, and recovery from failures.
 Increased processing overhead. The exchange of messages and the additional
computation required to achieve inter-site coordination are a form of overhead that
does not arise in centralized systems.
 Security. As data in a distributed DBMS are located at multiple sites, the probability
of security lapses increases. Further, all communications between different sites in a
distributed DBMS are conveyed through the network, so the underlying network has
to be made secure to maintain system security.
 Lack of standards. A distributed system can consist of a number of sites that are
heterogeneous in terms of processors, communication networks and DBMSs. This
lack of standards significantly limits the potential of distributed DBMSs.
 Increased storage requirements.
 Maintenance of integrity is very difficult. Database integrity refers to the validity
and consistency of stored data.
 Lack of experience. Distributed DBMSs have not been widely accepted.
Consequently, we do not yet have the same level of experience with them in industry.
 Database design is more complex. The design of a distributed DBMS involves
fragmentation of data, allocation of fragments to specific sites and data replication.
3- Differentiate between Homogeneous and
Heterogeneous Databases.
Homogeneous Distributed Database
 All sites have identical database management system software, are aware of one
another, and agree to cooperate in processing users’ requests.
 Use same DB schemas at all sites.
 Easier to design and manage
 Addition of a new site is much easier.
Heterogeneous distributed database
 Usually constructed over a number of existing sites.
 Each site has its local database. Different sites may use different schemas (relational
model, OO model etc.).
 Use different DBMS software.
 Query processing is more difficult.
 Use gateways (as query translator) which convert the language and data model of
each different DBMS into the language and data model of the relational system.
4- Define data replication in DDBs. Mention some
major advantages and disadvantages of data
replication.
Data Storage in DDBMS
 Replication. The system maintains several identical replicas (copies) of the relation,
and stores each replica at a different site.
 Fragmentation. The system partitions the relation into several fragments, and stores
each fragment at a different site.
 Fragmentation and replication can be combined.
Advantages and disadvantages of Replication
 Availability.
 Increased parallelism.
 Increased overhead on update.
5- What is Data Fragmentation? Discuss different
types of data fragmentation with examples.
Data Fragmentation
 If relation r is fragmented, r is divided into a number of fragments r1, r2, . . . , rn.
 These fragments contain sufficient information to allow reconstruction of the original
relation r.
 There are two different schemes for fragmenting a relation:
- Horizontal fragmentation and
- Vertical fragmentation
Horizontal Fragmentation
 In horizontal fragmentation, a relation r is partitioned into a number of subsets,
r1, r2, . . . , rn.
 Each tuple of relation r must belong to at least one of the fragments, so that the
original relation can be reconstructed, if needed.
As an illustration, consider the account relation:
Account-schema = (account-number, branch-name, balance)
The relation can be divided into several different fragments. If the banking system has
only two branches - Savar and Dhanmondi - then there are two different fragments:
account1 = σbranch-name = “Savar” (account)
account2 = σbranch-name = “Dhanmondi” (account)
 Use a predicate Pi to construct fragment ri:
ri = σPi (r)
 Reconstruct the relation r by taking the union of all fragments. That is,
r = r1 ∪ r2 ∪ · · · ∪ rn
 The fragments are disjoint.
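A toy Python sketch of these operations on the account relation: the fragments are selections by branch-name, and the original relation is recovered by union. The sample tuples are made up for the example.

```python
# account relation as a list of tuples (account-number, branch-name, balance)
account = [
    ('A-101', 'Savar',     500),
    ('A-102', 'Dhanmondi', 700),
    ('A-103', 'Savar',     300),
]

# Horizontal fragments: selection on the branch-name predicate.
account1 = [t for t in account if t[1] == 'Savar']
account2 = [t for t in account if t[1] == 'Dhanmondi']

# Reconstruction: the union of the fragments gives back the original relation.
reconstructed = account1 + account2
print(sorted(reconstructed) == sorted(account))  # True
print(set(account1) & set(account2))             # set() -> fragments are disjoint
```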
Vertical Fragmentation
 Vertical fragmentation of r(R) involves the definition of several subsets of
attributes R1, R2, . . ., Rn of the schema R so that
R = R1 ∪ R2 ∪ · · · ∪ Rn
 Each fragment ri of r is defined by
ri = ΠRi (r)
 We can reconstruct relation r from the fragments by taking the natural join
r = r1 ⋈ r2 ⋈ r3 ⋈ · · · ⋈ rn
 One way of ensuring that the relation r can be reconstructed is to include the
primary-key attributes of R in each of the Ri.
- To illustrate vertical fragmentation, consider the following relation:
employee-info=(employee-id, name, designation, salary)
- For privacy reasons, the relation may be fragmented into a relation employee-
privateinfo containing employee-id and salary, and another relation employee-
public-info containing attributes employee-id, name, and designation.
employee-privateinfo=(employee-id, salary)
employee-publicinfo=(employee-id, name, designation)
- These may be stored at different sites, again for security reasons.
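The same style of sketch for vertical fragmentation of employee-info: each fragment is a projection that retains the primary key employee-id, and the relation is reconstructed with a join on that key. The sample tuples are made up for the example.

```python
# employee-info relation as a list of dicts
employee_info = [
    {'employee-id': 1, 'name': 'Alice', 'designation': 'Manager', 'salary': 900},
    {'employee-id': 2, 'name': 'Bob',   'designation': 'Clerk',   'salary': 400},
]

# Vertical fragments: projections that both retain the primary key.
private_info = [{'employee-id': e['employee-id'], 'salary': e['salary']}
                for e in employee_info]
public_info  = [{'employee-id': e['employee-id'], 'name': e['name'],
                 'designation': e['designation']} for e in employee_info]

# Reconstruction: natural join on the common attribute employee-id.
joined = [{**pub, **priv}
          for pub in public_info
          for priv in private_info
          if pub['employee-id'] == priv['employee-id']]
print(joined == employee_info)  # True
```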
6- Write down the correctness rules for data
fragmentation. Give an example of horizontal &
vertical fragments that satisfy all the correctness
rules of fragmentation.
Correctness Rules for Data Fragmentation
To ensure no loss of information and no redundancy of data, there are three different
rules that must be considered during fragmentation.
 Completeness
If a relation instance R is decomposed into fragments R1, R2, . . . .Rn, each data item in
R must appear in at least one of the fragments. It is necessary in fragmentation to ensure
that there is no loss of data during data fragmentation.
 Reconstruction
If relation R is decomposed into fragments R1, R2, . . . .Rn, it must be possible to define
a relational operation that will reconstruct the relation R from fragments R1, R2, . . . .Rn.
This rule ensures that constraints defined on the data are preserved during data
fragmentation.
 Disjointness
If a relation R is decomposed into fragments R1, R2, . . . .Rn and if a data item is found
in the fragment Ri, then it must not appear in any other fragments. This rule ensures
minimal data redundancy.
In the case of vertical fragmentation, the primary key attribute must be repeated to allow
reconstruction. Therefore, in vertical fragmentation, disjointness is defined only
on the non-primary-key attributes of a relation.
 Example
Let us consider the relational schema Project, where project-type represents whether the
project is an inside project or an abroad project. Assume that P1 and P2 are two horizontal
fragments of the relation Project, obtained by using the predicate on whether the
value of the project-type attribute is ‘inside’ or ‘abroad’.
 Example (Horizontal Fragmentation)
P1: σproject-type = “inside” (Project)
P2: σproject-type = “abroad” (Project)
 Example (Horizontal Fragmentation)
These horizontal fragments satisfy all the correctness rules of fragmentation as
shown below.
Completeness: Each tuple in the relation Project appears either in fragment P1 or
P2. Thus, it satisfies completeness rule for fragmentation.
Reconstruction: The Project relation can be reconstructed from the horizontal
fragments P1 and P2 by using the union operation of relational algebra, which
ensures the reconstruction rule.
Thus, P1 ∪ P2 = Project.
Disjointness: The fragments P1 and P2 are disjoint, since there can be no such
project whose project type is both “inside” and “abroad”.
 Example (Vertical Fragmentation)
Assume that V1 and V2 are two vertical fragments of the relation Project, each obtained
by projecting a subset of the attributes that includes the primary key project-id. These
vertical fragments also satisfy the correctness rules of fragmentation as shown
below.
Completeness: Each tuple in the relation Project appears either in fragment V1 or V2
which satisfies completeness rule for fragmentation.
Reconstruction: The Project relation can be reconstructed from the vertical fragments
V1 and V2 by using the natural join operation of relational algebra, which ensures the
reconstruction rule.
Thus, V1 ⋈ V2 = Project.
Disjointness: The fragments V1 and V2 are disjoint, except for the primary key project-
id, which is repeated in both fragments and is necessary for reconstruction.
Transparencies in DDBMS
1- What is meant by transparency in database
management system? Discuss different types of
distributed transparency.
Transparency
o It refers to the separation of the high-level semantics of a system from lower-level
implementation issues. In a distributed system, it hides the implementation details
from users of the system.
o The user believes that he/she is working with a centralized database system and that
all the complexities of a distributed database system are either hidden or transparent
to the user.
Distribution transparency can be classified into:
 Fragmentation transparency
 Location transparency
 Replication transparency
 Local mapping transparency
 Naming transparency
 Fragmentation transparency
It hides from users the fact that the data are fragmented. Users are not required to
know how a relation has been fragmented. To retrieve data, the user need not
specify the particular fragment names.
 Location transparency
Users are not required to know the physical location of the data. To retrieve data
from a distributed database with location transparency, the user has to specify the
database fragment names but need not specify where the fragments are located in
the system.
 Replication transparency
The user is unaware of the fact that the fragments of relations are replicated and
stored in different sites of the system.
 Local mapping transparency
It refers to the fact that users are aware of both the fragment names and the
location of fragments, taking into account that any replication of the fragments may
exist. In this case, the user has to mention both the fragment names and the location
for data access.
 Naming transparency
In a distributed database system, each database object, such as relations,
fragments and replicas, must have a unique name. Therefore, the DDBMS must
ensure that no two sites create a database object with the same name. One solution to
this problem is to create a central name server, which has the responsibility to ensure
uniqueness of all names in the system. Naming transparency means that the users are
not aware of the actual name of the database object in the system. In this case, the
user will specify the alias names of the database objects for data accessing.
Distributed Deadlock Management
1- Define Deadlock in distributed DBMS. Discuss
how deadlock situation can be characterized in
distributed DBMS.
Deadlock
In a database environment, a deadlock is a situation when transactions are
endlessly waiting for one another. Any lock-based concurrency control algorithm and
some timestamp-based concurrency control algorithms may result in deadlocks, as these
algorithms require transactions to wait for one another.
Deadlock Situation
 Deadlock situations can be characterized by wait-for graphs, directed graphs that
indicate which transactions are waiting for which other transactions.
 In a wait-for graph, nodes of the graph represent transactions and edges of the graph
represent the waiting-for relationships among transactions. An edge is drawn in the
wait-for graph from transaction Ti to transaction Tj, if the transaction Ti is waiting for
a lock on a data item that is currently held by the transaction Tj.
 Using wait-for graphs, it is very easy to detect whether a deadlock situation has
occurred in a database environment or not. There is a deadlock in the system if and
only if the corresponding wait-for graph contains a cycle.
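Detecting a deadlock therefore reduces to detecting a cycle in the wait-for graph. A minimal Python sketch using depth-first search; the example graphs are made up for illustration.

```python
def has_cycle(wait_for):
    """wait_for maps each transaction to the transactions it is waiting for."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {t: WHITE for t in wait_for}

    def visit(t):
        colour[t] = GREY
        for u in wait_for.get(t, []):
            if colour.get(u, WHITE) == GREY:               # back edge -> cycle
                return True
            if colour.get(u, WHITE) == WHITE and visit(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour[t] == WHITE and visit(t) for t in wait_for)

# T1 waits for T2, T2 waits for T3, T3 waits for T1 -> deadlock
print(has_cycle({'T1': ['T2'], 'T2': ['T3'], 'T3': ['T1']}))  # True
print(has_cycle({'T1': ['T2'], 'T2': ['T3'], 'T3': []}))      # False
```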
2 Explain distributed deadlock prevention method.
Distributed Deadlock Prevention Method
 Distributed Deadlock prevention is a cautious scheme in which a transaction is
restarted when the system suspects that a deadlock might occur. Deadlock prevention
is an alternative method to resolve deadlock situations in which a system is designed
in such a way that deadlocks are impossible. In this scheme, the transaction manager
checks a transaction when it is first initiated and does not permit it to proceed if there is
a risk that it may cause a deadlock.
 In the case of lock-based concurrency control, deadlock prevention in a distributed
system is implemented in the following way:
Let us consider that a transaction Ti is initiated at a particular site in a distributed
database system and that it requires a lock on a data item that is currently owned by
another transaction Tj. Here, a deadlock prevention test is done to check whether there is
any possibility of a deadlock occurring in the system. The transaction Ti is not permitted
to enter into a wait state for the transaction Tj, if there is the risk of a deadlock situation.
In this case, one of the two transactions is aborted to prevent a deadlock.
 The deadlock prevention algorithm is called non-preemptive if the transaction Ti is
aborted and restarted. On the other hand, if the transaction Tj is aborted and restarted,
then the deadlock prevention algorithm is called preemptive. The transaction Ti is
permitted to wait for the transaction Tj as usual, if they pass the prevention test. The
prevention test must guarantee that if Ti is allowed to wait for Tj, a deadlock can
never occur.
 A better approach to implement deadlock prevention test is by assigning priorities to
transactions and checking priorities to determine whether one transaction would wait
for the other transaction or not. These priorities can be assigned by using a unique
identifier for each transaction in a distributed system. For instance, consider that i and
j are priorities of two transactions Ti and Tj respectively. The transaction Ti would
wait for the transaction Tj, if Ti has a lower priority than Tj, that is, if i<j. This
approach prevents deadlock, but one problem with this approach is that cyclic restart
is possible. Thus, some transactions could be restarted repeatedly without ever
finishing.
3-Explain the working principal of wait die &
wound wait algorithms for deadlock prevention.
There are two different techniques for deadlock prevention: Wait-die and Wound-wait.
Wait-die is a non-preemptive deadlock prevention technique based on timestamp values
of transactions:
In this technique, when one transaction is about to block and is waiting for a lock
on a data item that is already locked by another transaction, timestamp values of both the
transactions are checked to give priority to the older transaction. If a younger transaction
is holding the lock on the data item, then the older transaction is allowed to wait, but if an
older transaction is holding the lock, the younger transaction is aborted and restarted with
the same timestamp value. This forces the wait-for graph to be directed from the older to
the younger transactions, making cyclic restarts impossible. For example, if the
transaction Ti requests a lock on a data item that is already locked by the transaction Tj,
then Ti is permitted to wait only if Ti has a lower timestamp value than Tj. On the other
hand, if Ti is younger than Tj, then Ti is aborted and restarted with the same timestamp
value.
Wound-Wait is an alternative preemptive deadlock prevention technique by which
cyclic restarts can be avoided.
In this method, if a younger transaction requests a lock on a data item that is
already held by an older transaction, the younger transaction is allowed to wait until the
older transaction releases the corresponding lock. In this case, the wait-for graph flows
from the younger to the older transactions, and cyclic restart is again avoided. For
instance, if the transaction Ti requests a lock on a data item that is already locked by the
transaction Tj, then Ti is permitted to wait only if Ti has a higher timestamp value than
Tj, otherwise, the transaction Tj is aborted and the lock is granted to the transaction Ti.
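The two rules can be summarised in a small Python sketch. This is an illustration of the descriptions above, not a full implementation: ts_i and ts_j stand for the timestamps of the requesting transaction Ti and the lock holder Tj, with a smaller timestamp meaning an older transaction.

```python
def wait_die(ts_i, ts_j):
    """Requesting transaction Ti (timestamp ts_i) vs. holder Tj (timestamp ts_j)."""
    # Older requester may wait; younger requester dies (aborts and later
    # restarts with its original timestamp).
    return 'Ti waits' if ts_i < ts_j else 'abort Ti (restart with same timestamp)'

def wound_wait(ts_i, ts_j):
    # Older requester wounds (aborts) the younger holder; younger requester waits.
    return 'abort Tj (lock granted to Ti)' if ts_i < ts_j else 'Ti waits'

# Ti older than Tj (timestamp 5 vs 9):
print(wait_die(5, 9))    # Ti waits
print(wound_wait(5, 9))  # abort Tj (lock granted to Ti)

# Ti younger than Tj (timestamp 9 vs 5):
print(wait_die(9, 5))    # abort Ti (restart with same timestamp)
print(wound_wait(9, 5))  # Ti waits
```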
4- Discuss distributed deadlock detection technique
with appropriate figure.
Distributed Deadlock detection
 In distributed deadlock detection method, a deadlock detector exists at each site of the
distributed system. In this method, each site has the same amount of responsibility,
and there is no such distinction as local or global deadlock detector. A variety of
approaches have been proposed for distributed deadlock detection algorithms, but the
most well-known and simplified version, which is presented here, was developed by
R. Obermarck in 1982.
 In this approach, a LWFG is constructed for each site by the respective local deadlock
detectors. An additional external node is added to the LWFGs, as each site in the
distributed system receives the potential deadlock cycles from other sites. In the
distributed deadlock detection algorithm, the external node Tex is added to the
LWFGs to indicate whether any transaction from any remote site is waiting for a data
item that is being held by a transaction at the local site or whether any transaction
from the local site is waiting for a data item that is currently being held by any transaction
at any remote site. For instance, an edge from the node Ti to Tex exists in the LWFG,
if the transaction Ti is waiting for a data item that is already held by any transaction at
any remote site. Similarly, an edge from the external node Tex to Ti exists in the
graph, if a transaction from a remote site is waiting to acquire a data item that is
currently being held by the transaction Ti at the local site.
 Thus, the local detector checks for two things to determine a deadlock situation.
 If a LWFG contains a cycle that does not involve the external node Tex, then it
indicates that a deadlock has occurred locally and it can be handled locally.
 On the other hand, a global deadlock potentially exists if the LWFG contains a cycle
involving the external node Tex. However, the existence of such a cycle does not
necessarily imply that there is a global deadlock, as the external node Tex represents
different agents.
 The LWFGs are merged so as to determine global deadlock situations. To avoid sites
transmitting their LWFGs to each other, a simple strategy is followed here. According
to this strategy, one timestamp value is allocated to each transaction and a rule is
imposed such that one site Si transmits its LWFG to the site Sk, if a transaction, say
Tk, at site Sk is waiting for a data item that is currently being held by a transaction Ti
at site Si and ts(Ti) < ts(Tk). In that case, the site Si transmits its LWFG to the site
Sk, and the site Sk adds this information to its LWFG and checks for cycles not
involving the external node Tex in the extended graph. If there is no cycle in the
extended graph, the process continues until a cycle appears and it may happen that the
entire GWFG is constructed and no cycle is detected. In this case, it is decided that
there is no deadlock in the entire distributed system.
 On the other hand, if the GWFG contains a cycle not involving the external node Tex,
it is concluded that a deadlock has occurred in the system. The distributed deadlock
detection method is illustrated below.
Fig. Distributed deadlock detection: local wait-for graphs at Site 1, Site 2 and Site 3, each extended with the external node Tex; edges between a local transaction (Ti, Tj or Tk) and Tex represent waits that cross sites.
5- Define false deadlock & phantom deadlock in
distributed database environment.
False Deadlocks
 To handle the deadlock situation in distributed database systems, a number of
messages are transmitted among the sites. The delay associated with the transmission
of messages that is necessary for deadlock detection can cause the detection of false
deadlocks.
 For instance, consider that at a particular time the deadlock detector has received the
information that the transaction Ti is waiting for the transaction Tj. Further assume
that after some time the transaction Tj releases the data item requested by the
transaction Ti and requests a data item that is currently held by the transaction
Ti. If the deadlock detector receives the information that the transaction Tj has
requested for a data item that is held by the transaction Ti before receiving the
information that the transaction Ti is not blocked by the transaction Tj any more, a
false deadlock situation is detected.
Phantom Deadlocks
 Another problem is that a transaction Ti that blocks another transaction may be
restarted for reasons that are not related to deadlock detection. In this case, until the
restart message of the transaction Ti is transmitted to the deadlock detector, the
deadlock detector can find a cycle in the wait-for graph that includes the transaction
Ti. Hence, a deadlock situation is detected by the deadlock detector and this is called
a phantom deadlock. When the deadlock detector detects a phantom deadlock, it
may unnecessarily restart a transaction other than Ti. To avoid unnecessary restarts
for phantom deadlock, special safety measures are required.
6- Explain Centralized deadlock detection.
Centralized Deadlock detection
In Centralized Deadlock detection method, a single site is chosen as Deadlock
Detection Coordinator (DDC) for the entire distributed system. The DDC is responsible
for constructing the GWFG for the entire system. Each lock manager in the distributed database
transmits its LWFG to the DDC periodically. The DDC constructs the GWFG from these
LWFGs and checks for cycles in it. The occurrence of a global deadlock situation is
detected if there are one or more cycles in the GWFG. The DDC must break each cycle in
the GWFG by selecting the transactions to be rolled back and restarted to recover from a
deadlock situation. The information regarding the transactions that are to be rolled back
and restarted must be transmitted to the corresponding lock managers by the deadlock
detection coordinator.
– The centralized deadlock detection approach is very simple, but it has several
drawbacks.
– This method is less reliable, as the failure of the central site makes the deadlock
detection impossible.
– The communication cost is very high in this case, as other sites in the distributed
system send their LWFGs to the central site.
– Another disadvantage of centralized deadlock detection technique is that false
detection of deadlocks can occur, for which the deadlock recovery procedure may be
initiated, although no deadlock has occurred. In this method, unnecessary rollbacks
and restarts of transactions may also result owing to phantom deadlocks.

Mais conteúdo relacionado

Mais procurados

Chapter 4 record storage and primary file organization
Chapter 4 record storage and primary file organizationChapter 4 record storage and primary file organization
Chapter 4 record storage and primary file organization
Jafar Nesargi
 
Chapter 2 database environment
Chapter 2 database environmentChapter 2 database environment
Chapter 2 database environment
>. &lt;
 
15. Transactions in DBMS
15. Transactions in DBMS15. Transactions in DBMS
15. Transactions in DBMS
koolkampus
 
Data base management system
Data base management systemData base management system
Data base management system
Navneet Jingar
 

Mais procurados (20)

Database management systems
Database management systemsDatabase management systems
Database management systems
 
Chapter 4 record storage and primary file organization
Chapter 4 record storage and primary file organizationChapter 4 record storage and primary file organization
Chapter 4 record storage and primary file organization
 
Introduction to distributed database
Introduction to distributed databaseIntroduction to distributed database
Introduction to distributed database
 
Data models
Data modelsData models
Data models
 
Distributed Database Management System
Distributed Database Management SystemDistributed Database Management System
Distributed Database Management System
 
DDBMS
DDBMSDDBMS
DDBMS
 
Distributed database management systems
Distributed database management systemsDistributed database management systems
Distributed database management systems
 
Query processing and optimization (updated)
Query processing and optimization (updated)Query processing and optimization (updated)
Query processing and optimization (updated)
 
Database security
Database securityDatabase security
Database security
 
Active database
Active databaseActive database
Active database
 
Chapter 2 database environment
Chapter 2 database environmentChapter 2 database environment
Chapter 2 database environment
 
Data models
Data modelsData models
Data models
 
15. Transactions in DBMS
15. Transactions in DBMS15. Transactions in DBMS
15. Transactions in DBMS
 
Ddb 1.6-design issues
Ddb 1.6-design issuesDdb 1.6-design issues
Ddb 1.6-design issues
 
Distributed DBMS - Unit 5 - Semantic Data Control
Distributed DBMS - Unit 5 - Semantic Data ControlDistributed DBMS - Unit 5 - Semantic Data Control
Distributed DBMS - Unit 5 - Semantic Data Control
 
Data base management system
Data base management systemData base management system
Data base management system
 
Concurrency control
Concurrency  controlConcurrency  control
Concurrency control
 
Distributed database
Distributed databaseDistributed database
Distributed database
 
Introduction to transaction management
Introduction to transaction managementIntroduction to transaction management
Introduction to transaction management
 
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
Distributed DBMS - Unit 8 - Distributed Transaction Management & Concurrency ...
 

Destaque

Transaction concurrency control
Transaction concurrency controlTransaction concurrency control
Transaction concurrency control
Anand Grewal
 
Transaction management
Transaction managementTransaction management
Transaction management
renuka_a
 
Databases: Concurrency Control
Databases: Concurrency ControlDatabases: Concurrency Control
Databases: Concurrency Control
Damian T. Gordon
 
16. Concurrency Control in DBMS
16. Concurrency Control in DBMS16. Concurrency Control in DBMS
16. Concurrency Control in DBMS
koolkampus
 

Destaque (14)

Semi join
Semi joinSemi join
Semi join
 
Dbms acid
Dbms acidDbms acid
Dbms acid
 
Transaction Management - Lecture 11 - Introduction to Databases (1007156ANR)
Transaction Management - Lecture 11 - Introduction to Databases (1007156ANR)Transaction Management - Lecture 11 - Introduction to Databases (1007156ANR)
Transaction Management - Lecture 11 - Introduction to Databases (1007156ANR)
 
Dbms Final Examination Answer Key
Dbms Final Examination Answer KeyDbms Final Examination Answer Key
Dbms Final Examination Answer Key
 
Acid properties
Acid propertiesAcid properties
Acid properties
 
Fragmentation and types of fragmentation in Distributed Database
Fragmentation and types of fragmentation in Distributed DatabaseFragmentation and types of fragmentation in Distributed Database
Fragmentation and types of fragmentation in Distributed Database
 
Transaction & Concurrency Control
Transaction & Concurrency ControlTransaction & Concurrency Control
Transaction & Concurrency Control
 
Transaction concurrency control
Transaction concurrency controlTransaction concurrency control
Transaction concurrency control
 
Chapter 5 Database Transaction Management
Chapter 5 Database Transaction ManagementChapter 5 Database Transaction Management
Chapter 5 Database Transaction Management
 
Transaction management
Transaction managementTransaction management
Transaction management
 
8 drived horizontal fragmentation
8  drived horizontal fragmentation8  drived horizontal fragmentation
8 drived horizontal fragmentation
 
Databases: Concurrency Control
Databases: Concurrency ControlDatabases: Concurrency Control
Databases: Concurrency Control
 
16. Concurrency Control in DBMS
16. Concurrency Control in DBMS16. Concurrency Control in DBMS
16. Concurrency Control in DBMS
 
Concurrency control
Concurrency controlConcurrency control
Concurrency control
 

Semelhante a Final exam in advance dbms

CSC388 Online Programming Languages Homework 3 (due b.docx
CSC388 Online Programming Languages  Homework 3 (due b.docxCSC388 Online Programming Languages  Homework 3 (due b.docx
CSC388 Online Programming Languages Homework 3 (due b.docx
annettsparrow
 
Designing A Syntax Based Retrieval System03
Designing A Syntax Based Retrieval System03Designing A Syntax Based Retrieval System03
Designing A Syntax Based Retrieval System03
Avelin Huo
 

Semelhante a Final exam in advance dbms (20)

Relational Database Management System
Relational Database Management SystemRelational Database Management System
Relational Database Management System
 
Introduction to database-Normalisation
Introduction to database-NormalisationIntroduction to database-Normalisation
Introduction to database-Normalisation
 
Unit05 dbms
Unit05 dbmsUnit05 dbms
Unit05 dbms
 
RDBMS
RDBMSRDBMS
RDBMS
 
CSC388 Online Programming Languages Homework 3 (due b.docx
CSC388 Online Programming Languages  Homework 3 (due b.docxCSC388 Online Programming Languages  Homework 3 (due b.docx
CSC388 Online Programming Languages Homework 3 (due b.docx
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
ORDBMS.pptx
ORDBMS.pptxORDBMS.pptx
ORDBMS.pptx
 
Bt0066 dbms
Bt0066 dbmsBt0066 dbms
Bt0066 dbms
 
Designing A Syntax Based Retrieval System03
Designing A Syntax Based Retrieval System03Designing A Syntax Based Retrieval System03
Designing A Syntax Based Retrieval System03
 
Query Optimization - Brandon Latronica
Query Optimization - Brandon LatronicaQuery Optimization - Brandon Latronica
Query Optimization - Brandon Latronica
 
Relational Database Design
Relational Database DesignRelational Database Design
Relational Database Design
 
Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...
 Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ... Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...
Large-scale Reasoning with a Complex Cultural Heritage Ontology (CIDOC CRM) ...
 
Ir 02
Ir   02Ir   02
Ir 02
 
Mit202 data base management system(dbms)
Mit202  data base management system(dbms)Mit202  data base management system(dbms)
Mit202 data base management system(dbms)
 
Bt0066
Bt0066Bt0066
Bt0066
 
B T0066
B T0066B T0066
B T0066
 
TEXT CLUSTERING.doc
TEXT CLUSTERING.docTEXT CLUSTERING.doc
TEXT CLUSTERING.doc
 
Normalisation in Database management System (DBMS)
Normalisation in Database management System (DBMS)Normalisation in Database management System (DBMS)
Normalisation in Database management System (DBMS)
 
A-Study_TopicModeling
A-Study_TopicModelingA-Study_TopicModeling
A-Study_TopicModeling
 
Learning from similarity and information extraction from structured documents...
Learning from similarity and information extraction from structured documents...Learning from similarity and information extraction from structured documents...
Learning from similarity and information extraction from structured documents...
 

Mais de Md. Mashiur Rahman

Mais de Md. Mashiur Rahman (20)

Rule for creating power point slide
Rule for creating power point slideRule for creating power point slide
Rule for creating power point slide
 
Answer sheet of switching &amp; routing
Answer sheet of switching &amp; routingAnswer sheet of switching &amp; routing
Answer sheet of switching &amp; routing
 
Routing and switching question1
Routing and switching question1Routing and switching question1
Routing and switching question1
 
Lecture 1 networking &amp; internetworking
Lecture 1 networking &amp; internetworkingLecture 1 networking &amp; internetworking
Lecture 1 networking &amp; internetworking
 
Lec 7 query processing
Lec 7 query processingLec 7 query processing
Lec 7 query processing
 
Lec 1 indexing and hashing
Lec 1 indexing and hashing Lec 1 indexing and hashing
Lec 1 indexing and hashing
 
Cloud computing lecture 7
Cloud computing lecture 7Cloud computing lecture 7
Cloud computing lecture 7
 
Cloud computing lecture 1
Cloud computing lecture 1Cloud computing lecture 1
Cloud computing lecture 1
 
parallel Questions &amp; answers
parallel Questions &amp; answersparallel Questions &amp; answers
parallel Questions &amp; answers
 
Mis 8
Mis 8Mis 8
Mis 8
 
Mis 3
Mis 3Mis 3
Mis 3
 
Mis 1
Mis 1Mis 1
Mis 1
 
Mis 5
Mis 5Mis 5
Mis 5
 
Mis 2
Mis 2Mis 2
Mis 2
 
Mis 9
Mis 9Mis 9
Mis 9
 
Mis 6
Mis 6Mis 6
Mis 6
 
Mis 7
Mis 7Mis 7
Mis 7
 
Mis 4
Mis 4Mis 4
Mis 4
 
Computer network solution
Computer network solutionComputer network solution
Computer network solution
 
Computer network answer
Computer network answerComputer network answer
Computer network answer
 

Último

The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
heathfieldcps1
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
MateoGardella
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
QucHHunhnh
 
An Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfAn Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdf
SanaAli374401
 

Último (20)

SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
An Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfAn Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdf
 

Final exam in advance dbms

  • 1. PMSCS-653 (Advance DBMS Final-exam) Indexing and Hashing 1- Explain the basic concept of indexing. Mention some factors to be considered for evaluating an indexing technique. Basic Concept An index for a file in a database system works in much the same way as the index in this textbook. If we want to learn about a particular topic, we can search for the topic in the index at the back of the book, find the pages where it occurs, and then read the pages to find the information we are looking for. The words in the index are in sorted order, making it easy to find the word we are looking for. Moreover, the index is much smaller than the book, further reducing the effort needed to find the words we are looking for. Evaluation Factors  Access types: Access types can include finding records with a specified attribute value and finding records whose attribute values fall in a specified range.  Access time: The time it takes to find a particular data item, or set of items, using the technique in question.  Insertion time: The time it takes to insert a new data item. This value includes the time it takes to find the correct place to insert the new data item, as well as the time it takes to update the index structure.  Deletion time: The time it takes to delete a data item. This value includes the time it takes to find the item to be deleted, as well as the time it takes to update the index structure.  Space overhead: The additional space occupied by an index structure. Provided that the amount of additional space is moderate, it is usually worthwhile to sacrifice the space to achieve improved performance. 2- Explain multilevel indexing with example. When is it preferable to use a multilevel index rather than a single level index? Multi-Level Indexing  If primary index does not fit in memory, access becomes expensive.  Solution: treat primary index kept on disk as a sequential file and construct a sparse index on it. - Outer index – a sparse index of primary index - Inner index – the primary index file  If even outer index is too large to fit in main memory, yet another level of index can be created, and so on.  Indices at all levels must be updated on insertion or deletion from the file. An Example  Consider 100,000 records, 10 per block, at one index record per block, that's 10,000 index records. Even if we can fit 100 index records per block, this is 100 blocks. If index is too large to be kept in main memory, a search results in several disk reads.  For very large files, additional levels of indexing may be required.  Indices must be updated at all levels when insertions or deletions require it.  Frequently, each level of index corresponds to a unit of physical storage.
  • 2. Fig: Multilevel Index 3- Differentiate between Dense & Sparse indexing. What is the advantage of sparse index over dense index? Dense VS Sparse Indices  It is generally faster to locate a record if we have a dense index rather than a sparse index.  However, sparse indices have advantages over dense indices in that they require less space and they impose less maintenance overhead for insertions and deletions.  There is a trade-off that the system designer must make between access time and space overhead. However, sparse indices have advantages over dense indices in that they require less space and they impose less maintenance overhead for insertions and deletions. 4- Explain hash file organization with example. Hash File Organization  In a hash file organization, we obtain the address of the disk block, also called the bucket containing a desired record directly by computing a function on the search-key value of the record. Hash File Organization: An Example  Let us choose a hash function for the account file using the search key branch_name.  Suppose we have 26 buckets and we define a hash function that maps names beginning with the ith letter of the alphabet to the ith bucket.
  • 3.  This hash function has the virtue of simplicity, but it fails to provide a uniform distribution, since we expect more branch names to begin with such letters as B and R than Q and X.  Instead, we consider 10 buckets and a hash function that computes the sum of the binary representations of the characters of a key, then returns the sum modulo the number of buckets.  For branch name ‘Perryridge’  Bucket no=h(Perryridge) = 5  For branch name ‘Round Hill’  Bucket no=h(Round Hill) = 3  For branch name ‘Brighton’  Bucket no=h(Brighton) = 3 Fig. Hash organization of account file, with branch-name as the key 5- Construct a B+-tree for the following set of key values: (2, 3, 5, 7, 11, 17, 19, 23, 29, 31) for n=4 and n=6. For n=4 19 2 3 5 7 11 17 19 23 3129 5 11 29
  • 4. 7 19 2 3 5 7 11 17 19 23 3129 5 29 2 3 7 11 17 19 23 31 For n=6 6- Construct a B -tree for the following set of key values: (2, 3, 5, 7, 11, 17, 19, 23, 29, 31) for n=4 and n=6. For n=4 For n=6 7 23 2 3 5 11 17 19 29 31 Functional Dependencies & Normalization 1- Normalize step by step the following database into 1NF, 2NF and 3NF: PROJECT (Proj-ID, Proj-Name, Proj-Mgr-ID, Emp-ID, Emp-Name, Emp-Dpt, Emp-Hrly-rate, Total-Hrs) FIRST NORMAL FORM (1NF) A relation r(R) is said to be in First Normal Form (1NF), if and only if every entry of the relation has at most a single value. Thus, it requires to decompose the table PROJECT into two new tables. It is also necessary to identify a PK for each table. First table with nonrepeating groups: PROJECT (Proj-ID, Proj-Name, Proj-Mgr-ID) Second table with table identifier and all repeating groups: PROJECT-EMPLOYEE (Proj-ID, Emp-ID, Emp-Name, Emp-Dpt, Emp-Hrly-rate, Total-Hrs) Here Proj-ID is the table identifier.
SECOND NORMAL FORM (2NF)
A relation r(R) is in Second Normal Form (2NF) if and only if the following two conditions are met simultaneously:
1. r(R) is already in 1NF.
2. No nonprime attribute is partially dependent on any key or, equivalently, each nonprime attribute in R is fully dependent on every key.
In the PROJECT-EMPLOYEE relation the attributes Emp-Name, Emp-Dpt and Emp-Hrly-rate depend only on Emp-ID, i.e. they are partially dependent on the composite PK (Proj-ID, Emp-ID). Only the attribute Total-Hrs fully depends on the composite PK (Proj-ID, Emp-ID). We therefore break this relation into two new relations so that no partial dependency remains:
PHOURS-ASSIGNED (Proj-ID, Emp-ID, Total-Hrs)
EMPLOYEE (Emp-ID, Emp-Name, Emp-Dpt, Emp-Hrly-rate)
THIRD NORMAL FORM (3NF)
A relation r(R) is in Third Normal Form (3NF) if and only if the following conditions are satisfied simultaneously:
1. r(R) is already in 2NF.
2. No nonprime attribute is transitively dependent on the key (i.e. no nonprime attribute functionally determines any other nonprime attribute).
In the EMPLOYEE relation the attribute Emp-Hrly-rate is transitively dependent on the PK Emp-ID through the functional dependency Emp-Dpt → Emp-Hrly-rate. We therefore break this relation into two new relations so that no transitive dependency remains:
EMPLOYEE (Emp-ID, Emp-Name, Emp-Dpt)
CHARGES (Emp-Dpt, Emp-Hrly-rate)
The new set of relations obtained through the normalization process does not exhibit any of the anomalies; that is, we can insert, delete and update tuples without any of the side effects. The final 3NF schema is:
PROJECT (Proj-ID, Proj-Name, Proj-Mgr-ID)
PHOURS-ASSIGNED (Proj-ID, Emp-ID, Total-Hrs)
EMPLOYEE (Emp-ID, Emp-Name, Emp-Dpt)
CHARGES (Emp-Dpt, Emp-Hrly-rate)
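Example (Python sketch). To see concretely why the 3NF schema removes the update anomaly, consider the toy instance below. The relation contents, names and rate values are invented for illustration; only the attribute names follow the schemas above.

    # In 3NF the hourly rate is stored once per department (CHARGES), so changing a
    # department's rate is a single update and cannot leave the database inconsistent.
    charges = {"Design": 55.0, "QA": 40.0}                  # CHARGES(Emp-Dpt, Emp-Hrly-rate)
    employee = [                                            # EMPLOYEE(Emp-ID, Emp-Name, Emp-Dpt)
        {"emp_id": "E1", "emp_name": "Alice", "emp_dpt": "Design"},
        {"emp_id": "E2", "emp_name": "Bob",   "emp_dpt": "Design"},
    ]
    phours_assigned = [                                     # PHOURS-ASSIGNED(Proj-ID, Emp-ID, Total-Hrs)
        {"proj_id": "P1", "emp_id": "E1", "total_hrs": 120},
        {"proj_id": "P1", "emp_id": "E2", "total_hrs": 80},
    ]

    charges["Design"] = 60.0   # one row changes; every Design employee now sees the new rate

    def hourly_rate(emp_id):
        dpt = next(e["emp_dpt"] for e in employee if e["emp_id"] == emp_id)
        return charges[dpt]

    print(hourly_rate("E1"), hourly_rate("E2"))   # 60.0 60.0, both reflect the single update

In the unnormalized PROJECT-EMPLOYEE relation the same change would have to be applied to every tuple of every affected employee, and missing one tuple would leave two different rates recorded for the same department.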
2- Explain BCNF with example.
Boyce-Codd Normal Form: A relation r(R) is in Boyce-Codd Normal Form (BCNF) if and only if the following conditions are met simultaneously:
(1) The relation is in 3NF.
(2) For every functional dependency of the form X → A, either A ∈ X or X is a superkey of r. In other words, every functional dependency is either a trivial dependency or, if it is not trivial, X must be a superkey.
To transform the relation of example 2 into BCNF, we can decompose it into the following set of relational schemes:
Set No. 1
MANUFACTURER (ID, Name)
MANUFACTURER-PART (ID, Item-No, Quantity)
Or
Set No. 2
MANUFACTURER (ID, Name)
MANUFACTURER-PART (Name, Item-No, Quantity)
Notice that both relations are in BCNF and that the update data anomaly is no longer present.

3- Given the relation r(A, B, C, D) and the set F = {AB → C, B → D, D → B} of functional dependencies, find the candidate keys of the relation. How many candidate keys are in this relation? What are the prime attributes?
AB is a candidate key:
(1) AB → A and AB → B (reflexivity axiom)
(2) AB → C (given)
(3) AB → B (step 1) and B → D (given), so AB → D (transitivity axiom)
Since AB determines every other attribute of the relation, AB is a candidate key.
AD is a candidate key:
(1) AD → A and AD → D (reflexivity axiom)
(2) AD → D (step 1) and D → B (given), so AD → B
(3) AD → B (step 2) and AB → C (given), so AAD → C (pseudotransitivity axiom), i.e. AD → C
Since AD determines every other attribute of the relation, AD is a candidate key.
The relation therefore has two candidate keys, AB and AD, and the prime attributes are A, B and D.
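Example (Python sketch). The same reasoning can be checked mechanically with an attribute-closure computation. The code below is ours, not part of the original answer; representing F as a list of (left-hand side, right-hand side) set pairs is an assumption made for the sketch.

    # Attribute closure under a set of FDs, used to verify candidate keys.
    FDS = [({"A", "B"}, {"C"}), ({"B"}, {"D"}), ({"D"}, {"B"})]   # F = {AB->C, B->D, D->B}
    R = {"A", "B", "C", "D"}

    def closure(attrs, fds):
        result = set(attrs)
        changed = True
        while changed:
            changed = False
            for lhs, rhs in fds:
                if lhs <= result and not rhs <= result:
                    result |= rhs
                    changed = True
        return result

    def is_candidate_key(attrs, fds, schema):
        if closure(attrs, fds) != schema:
            return False
        # minimality: no proper subset may determine the whole schema
        return all(closure(attrs - {a}, fds) != schema for a in attrs)

    print(is_candidate_key({"A", "B"}, FDS, R))   # True
    print(is_candidate_key({"A", "D"}, FDS, R))   # True
    print(is_candidate_key({"A", "C"}, FDS, R))   # False (the closure of AC is just AC)

The prime attributes are then exactly the attributes that appear in some candidate key: A, B and D.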
4- What is Canonical Cover? Given the set F of FDs shown below, find a canonical cover for F. F = {X → Z, XY → WP, XY → ZWQ, XZ → R}.
Canonical cover
For a given set F of FDs, a canonical cover, denoted by Fc, is a set of FDs in which the following conditions are simultaneously satisfied:
(1) Every FD of Fc is simple; that is, the right-hand side of every functional dependency of Fc has only one attribute.
(2) Fc is left-reduced.
(3) Fc is nonredundant.
Example: Given the set F of FDs shown below, find a canonical cover for F.
F = {X → Z, XY → WP, XY → ZWQ, XZ → R}
Fc = {X → Z, XY → WP, XY → ZWQ, XZ → R}
   = {X → Z, XY → W, XY → P, XY → Z, XY → W, XY → Q, XZ → R}  [decompose every FD into simple FDs]
   = {X → Z, XY → W, XY → P, XY → Z, XY → Q, XZ → R}  [remove the duplicate XY → W]
   = {X → Z, XY → W, XY → P, XY → Q, XZ → R}  [remove XY → Z, which is redundant because X → Z already holds]
   = {X → Z, XY → W, XY → P, XY → Q, X → R}  [left-reduce XZ → R to X → R, since X → Z makes Z extraneous]
Fc = {X → Z, XY → W, XY → P, XY → Q, X → R}
This resulting set, in which all FDs are simple, left-reduced and nonredundant, is the canonical cover of F.
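Example (Python sketch). The redundancy removal and the left-reduction above can be double-checked with the same closure computation used in the previous sketch; this verification code is ours, not part of the original answer.

    # Verify the canonical-cover steps for F = {X->Z, XY->WP, XY->ZWQ, XZ->R}.
    def closure(attrs, fds):
        result = set(attrs)
        changed = True
        while changed:
            changed = False
            for lhs, rhs in fds:
                if lhs <= result and not rhs <= result:
                    result |= rhs
                    changed = True
        return result

    F = [({"X"}, {"Z"}), ({"X", "Y"}, {"W", "P"}),
         ({"X", "Y"}, {"Z", "W", "Q"}), ({"X", "Z"}, {"R"})]

    # XY -> Z is redundant: Z is already in the closure of XY under the remaining FDs.
    rest = [({"X"}, {"Z"}), ({"X", "Y"}, {"W", "P", "Q"}), ({"X", "Z"}, {"R"})]
    print("Z" in closure({"X", "Y"}, rest))              # True

    # Z is extraneous in XZ -> R: R already follows from X alone, so XZ -> R left-reduces to X -> R.
    print("R" in closure({"X"}, F))                      # True

    # The resulting Fc is equivalent to F: every original FD follows from Fc.
    Fc = [({"X"}, {"Z"}), ({"X", "Y"}, {"W"}), ({"X", "Y"}, {"P"}),
          ({"X", "Y"}, {"Q"}), ({"X"}, {"R"})]
    print(all(rhs <= closure(lhs, Fc) for lhs, rhs in F))   # True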
Distributed Database Management System (DDBMS)
1- Distinguish between local transaction & global transaction.
 A local transaction is one that accesses data only at the site at which the transaction was initiated.
 A global transaction, on the other hand, is one that either accesses data at a site different from the one at which the transaction was initiated, or accesses data at several different sites.
2- Draw & explain the general structure of a distributed database system. Describe the relative advantages & disadvantages of DDBMS.
• A distributed database consists of a single logical database that is split into a number of fragments. Each fragment is stored on one or more computers.
• The computers in a distributed system communicate with one another through various communication media, such as high-speed networks or telephone lines.
• They do not share main memory or disks.
• Each site is capable of independently processing user requests that require access to local data, and is also capable of processing user requests that require access to remote data stored on other computers in the network.
The general structure of a distributed system appears in the following figure.
[Fig. General structure of a distributed database system]
Advantages of DDBMS
 Sharing data. Users at one site may be able to access the data residing at other sites.
 Increased local autonomy. A global database administrator is responsible for the entire system, but a part of these responsibilities is delegated to the local database administrator of each site.
 Increased availability. If one site fails in a distributed system, the remaining sites may be able to continue operating.
 Speeding up of query processing. It is possible to process data at several sites simultaneously. If a query involves data stored at several sites, it may be possible to split the query into a number of subqueries that can be executed in parallel. Thus, query processing becomes faster in distributed systems.
 Increased reliability. A single data item may exist at several sites. If one site fails, a transaction requiring a particular data item from that site may access it from other sites. Thus, reliability increases.
 Better performance. As each site handles only a part of the entire database, there may not be the same level of contention for CPU and I/O services that characterizes a centralized DBMS.
 Processor independence. A distributed system may contain multiple copies of a particular data item. Thus, users can access any available copy of the data item, and user requests can be processed by the processor at the data location. Hence, user requests do not depend on a specific processor.
 Modular extensibility. Modular extension in a distributed system is much easier. New sites can be added to the network without affecting the operations of other sites. Such flexibility allows organizations to extend their system relatively rapidly and easily.
Disadvantages of DDBMS
 Software-development cost. It is more difficult to implement a distributed database system; thus, it is more costly.
 Greater potential for bugs. Since the sites operate in parallel, it is harder to ensure the correctness of algorithms, especially their operation during failures of part of the system and recovery from failures.
 Increased processing overhead. The exchange of messages and the additional computation required to achieve inter-site coordination are a form of overhead that does not arise in centralized systems.
 Security. As the data in a distributed DBMS are located at multiple sites, the probability of security lapses increases. Further, all communications between different sites in a distributed DBMS are conveyed through the network, so the underlying network has to be made secure to maintain system security.
 Lack of standards. A distributed system can consist of a number of sites that are heterogeneous in terms of processors, communication networks and DBMSs. This lack of standards significantly limits the potential of distributed DBMSs.
 Increased storage requirements.
 Maintenance of integrity is very difficult. Database integrity refers to the validity and consistency of stored data.
 Lack of experience. Distributed DBMSs have not been widely accepted, so we do not yet have the same level of experience in industry.
 Database design is more complex. The design of a distributed DBMS involves fragmentation of data, allocation of fragments to specific sites, and data replication.

3- Differentiate between Homogeneous and Heterogeneous Databases.
Homogeneous distributed database
 All sites have identical database management system software, are aware of one another, and agree to cooperate in processing users' requests.
 The same DB schemas are used at all sites.
 Easier to design and manage.
 Addition of a new site is much easier.
Heterogeneous distributed database
 Usually constructed over a number of existing sites.
 Each site has its own local database, and different sites may use different schemas (relational model, OO model, etc.).
 Different DBMS software may be used at different sites.
 Query processing is more difficult.
 Gateways (acting as query translators) are used to convert the language and data model of each different DBMS into the language and data model of the relational system.
4- Define data replication in DDBs. Mention some major advantages and disadvantages of data replication.
Data Storage in DDBMS
 Replication. The system maintains several identical replicas (copies) of the relation, and stores each replica at a different site.
 Fragmentation. The system partitions the relation into several fragments, and stores each fragment at a different site.
 Fragmentation and replication can be combined.
Advantages and disadvantages of Replication
 Availability (advantage): if one of the sites containing a replica of a relation fails, the relation can still be found at another site.
 Increased parallelism (advantage): queries that read the relation can be processed at several sites in parallel.
 Increased overhead on update (disadvantage): all replicas of a relation must be kept consistent, so an update has to be propagated to every site that holds a copy.
5- What is Data Fragmentation? Discuss different types of data fragmentation with examples.
Data Fragmentation
 If relation r is fragmented, r is divided into a number of fragments r1, r2, . . . , rn.
 These fragments contain sufficient information to allow reconstruction of the original relation r.
 There are two different schemes for fragmenting a relation:
- Horizontal fragmentation and
- Vertical fragmentation
Horizontal Fragmentation
 In horizontal fragmentation, a relation r is partitioned into a number of subsets, r1, r2, . . . , rn.
 Each tuple of relation r must belong to at least one of the fragments, so that the original relation can be reconstructed, if needed.
As an illustration, consider the account relation:
Account-schema = (account-number, branch-name, balance)
The relation can be divided into several different fragments. If the banking system has only two branches, Savar and Dhanmondi, then there are two different fragments:
account1 = σ branch-name = "Savar" (account)
account2 = σ branch-name = "Dhanmondi" (account)
 In general, a predicate Pi is used to construct fragment ri: ri = σPi (r).
 The relation r is reconstructed by taking the union of all fragments: r = r1 ∪ r2 ∪ · · · ∪ rn.
 The fragments are disjoint.
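Example (Python sketch). A minimal illustration of horizontal fragmentation and its reconstruction, written for this text rather than taken from the slides; the account tuples are invented, and representing relations as lists of dictionaries is an assumption of the sketch.

    # Horizontal fragmentation: select tuples by a predicate, reconstruct by union.
    account = [
        {"account_number": "A-101", "branch_name": "Savar",     "balance": 500},
        {"account_number": "A-215", "branch_name": "Dhanmondi", "balance": 700},
        {"account_number": "A-305", "branch_name": "Savar",     "balance": 350},
    ]

    def select(rows, predicate):                  # sigma_P(r)
        return [row for row in rows if predicate(row)]

    account1 = select(account, lambda r: r["branch_name"] == "Savar")
    account2 = select(account, lambda r: r["branch_name"] == "Dhanmondi")

    # Reconstruction: r = r1 U r2 (here simply the union of the two tuple lists).
    reconstructed = account1 + account2
    assert sorted(reconstructed, key=lambda r: r["account_number"]) == \
           sorted(account, key=lambda r: r["account_number"])
    print(len(account1), len(account2))           # 2 1: the fragments are disjoint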
Vertical Fragmentation
 Vertical fragmentation of r(R) involves the definition of several subsets of attributes R1, R2, . . . , Rn of the schema R so that R = R1 ∪ R2 ∪ · · · ∪ Rn.
 Each fragment ri of r is defined by ri = ΠRi (r).
 We can reconstruct relation r from the fragments by taking the natural join r = r1 ⋈ r2 ⋈ r3 ⋈ · · · ⋈ rn.
 One way of ensuring that the relation r can be reconstructed is to include the primary-key attributes of R in each of the Ri.
- To illustrate vertical fragmentation, consider the following relation:
employee-info = (employee-id, name, designation, salary)
- For privacy reasons, the relation may be fragmented into a relation employee-private-info containing employee-id and salary, and another relation employee-public-info containing the attributes employee-id, name and designation:
employee-private-info = (employee-id, salary)
employee-public-info = (employee-id, name, designation)
- These may be stored at different sites, again for security reasons.
6- Write down the correctness rules for data fragmentation. Give an example of horizontal & vertical fragments that satisfy all the correctness rules of fragmentation.
Correctness Rules for Data Fragmentation
To ensure no loss of information and no redundancy of data, three rules must be observed during fragmentation.
 Completeness: If a relation instance R is decomposed into fragments R1, R2, . . . , Rn, each data item in R must appear in at least one of the fragments. This rule ensures that there is no loss of data during fragmentation.
 Reconstruction: If a relation R is decomposed into fragments R1, R2, . . . , Rn, it must be possible to define a relational operation that will reconstruct the relation R from the fragments R1, R2, . . . , Rn. This rule ensures that constraints defined on the data are preserved during fragmentation.
 Disjointness: If a relation R is decomposed into fragments R1, R2, . . . , Rn and a data item is found in fragment Ri, then it must not appear in any other fragment. This rule ensures minimal data redundancy.
In the case of vertical fragmentation, the primary-key attributes must be repeated to allow reconstruction. Therefore, for vertical fragmentation, disjointness is defined only on the non-primary-key attributes of a relation.
 Example
Let us consider the relational schema Project, where project-type indicates whether a project is an inside project or an abroad project. Assume that P1 and P2 are two horizontal fragments of the relation Project, obtained by using predicates on the project-type attribute (whether its value is "inside" or "abroad").
 Example (Horizontal Fragmentation)
P1: σ project-type = "inside" (Project)
P2: σ project-type = "abroad" (Project)
 Example (Horizontal Fragmentation)
These horizontal fragments satisfy all the correctness rules of fragmentation, as shown below.
Completeness: Each tuple in the relation Project appears either in fragment P1 or in fragment P2, so the completeness rule is satisfied.
Reconstruction: The Project relation can be reconstructed from the horizontal fragments P1 and P2 by using the union operation of relational algebra, which ensures the reconstruction rule. Thus, P1 ∪ P2 = Project.
Disjointness: The fragments P1 and P2 are disjoint, since there can be no project whose project type is both "inside" and "abroad".
 Example (Vertical Fragmentation)
Assume that V1 and V2 are two vertical fragments of the Project relation, each containing the primary key project-id together with a subset of the remaining attributes. These vertical fragments also satisfy the correctness rules of fragmentation, as shown below.
Completeness: Each data item in the relation Project appears either in fragment V1 or in fragment V2, which satisfies the completeness rule.
Reconstruction: The Project relation can be reconstructed from the vertical fragments V1 and V2 by using the natural join operation of relational algebra, which ensures the reconstruction rule. Thus, V1 ⋈ V2 = Project.
Disjointness: The fragments V1 and V2 are disjoint, except for the primary key project-id, which is repeated in both fragments and is necessary for reconstruction.
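Example (Python sketch). The three correctness rules can also be checked programmatically on a toy instance. The Project tuples below are invented, and because the slide figure defining V1 and V2 is not reproduced here, the attribute split used for the vertical fragments is an assumption (project-id with project-name in V1, project-id with project-type in V2).

    # Check completeness, reconstruction and disjointness for horizontal and vertical fragments.
    project = [
        {"project_id": 1, "project_name": "ERP",  "project_type": "inside"},
        {"project_id": 2, "project_name": "CRM",  "project_type": "abroad"},
        {"project_id": 3, "project_name": "HRIS", "project_type": "inside"},
    ]

    # Horizontal fragments P1, P2
    p1 = [t for t in project if t["project_type"] == "inside"]
    p2 = [t for t in project if t["project_type"] == "abroad"]
    assert all(t in p1 or t in p2 for t in project)            # completeness
    assert sorted(p1 + p2, key=lambda t: t["project_id"]) == \
           sorted(project, key=lambda t: t["project_id"])      # reconstruction (union)
    assert not any(t in p1 for t in p2)                        # disjointness

    # Vertical fragments V1, V2 (primary key repeated in both)
    v1 = [{"project_id": t["project_id"], "project_name": t["project_name"]} for t in project]
    v2 = [{"project_id": t["project_id"], "project_type": t["project_type"]} for t in project]
    joined = [{**a, **b} for a in v1 for b in v2 if a["project_id"] == b["project_id"]]
    assert sorted(joined, key=lambda t: t["project_id"]) == \
           sorted(project, key=lambda t: t["project_id"])      # reconstruction (natural join)
    print("all correctness rules hold")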
Transparencies in DDBMS
1- What is meant by transparency in a database management system? Discuss different types of distributed transparency.
Transparency
o It refers to the separation of the high-level semantics of a system from lower-level implementation issues. In a distributed system, it hides the implementation details from users of the system.
o The user believes that he/she is working with a centralized database system and that all the complexities of a distributed database system are either hidden or transparent to the user.
Distribution transparency can be classified into:
 Fragmentation transparency
 Location transparency
 Replication transparency
 Local mapping transparency
 Naming transparency
 Fragmentation transparency: It hides from users the fact that the data are fragmented. Users are not required to know how a relation has been fragmented; to retrieve data, the user need not specify the particular fragment names.
 Location transparency: Users are not required to know the physical location of the data. To retrieve data from a distributed database with location transparency, the user has to specify the database fragment names but need not specify where the fragments are located in the system.
 Replication transparency: The user is unaware of the fact that the fragments of relations are replicated and stored at different sites of the system.
 Local mapping transparency: Users are aware of both the fragment names and the locations of the fragments (taking into account any replication of the fragments that may exist). In this case, the user has to mention both the fragment names and the locations for data access.
 Naming transparency: In a distributed database system, each database object such as a relation, fragment or replica must have a unique name, so the DDBMS must ensure that no two sites create a database object with the same name. One solution to this problem is to create a central name server, which has the responsibility of ensuring the uniqueness of all names in the system. Naming transparency means that users are not aware of the actual names of the database objects in the system; instead, the user specifies alias names of the database objects for data access.
Distributed Deadlock Management
1- Define Deadlock in distributed DBMS. Discuss how deadlock situation can be characterized in distributed DBMS.
Deadlock
In a database environment, a deadlock is a situation in which transactions wait endlessly for one another. Any lock-based concurrency control algorithm, and some timestamp-based concurrency control algorithms, may result in deadlocks, as these algorithms require transactions to wait for one another.
Deadlock Situation
 Deadlock situations can be characterized by wait-for graphs, directed graphs that indicate which transactions are waiting for which other transactions.
 In a wait-for graph, the nodes represent transactions and the edges represent the waiting-for relationships among transactions. An edge is drawn from transaction Ti to transaction Tj if the transaction Ti is waiting for a lock on a data item that is currently held by the transaction Tj.
 Using wait-for graphs, it is very easy to detect whether a deadlock has occurred in a database environment: there is a deadlock in the system if and only if the corresponding wait-for graph contains a cycle.
2- Explain the distributed deadlock prevention method.
Distributed Deadlock Prevention Method
 Deadlock prevention is a cautious scheme in which a transaction is restarted when the system suspects that a deadlock might occur; the system is designed in such a way that deadlocks are impossible. In this scheme, the transaction manager checks a transaction when it is first initiated and does not permit it to proceed if there is a risk that it may cause a deadlock.
 In the case of lock-based concurrency control, deadlock prevention in a distributed system is implemented in the following way. Consider a transaction Ti, initiated at a particular site of the distributed database system, that requires a lock on a data item currently owned by another transaction Tj. A deadlock prevention test is performed to check whether there is any possibility of a deadlock occurring in the system. The transaction Ti is not permitted to enter a wait state for the transaction Tj if there is a risk of a deadlock; in this case, one of the two transactions is aborted to prevent a deadlock.
 The deadlock prevention algorithm is called non-preemptive if the transaction Ti (the requester) is aborted and restarted, and preemptive if the transaction Tj (the holder) is aborted and restarted. If the prevention test is passed, the transaction Ti is permitted to wait for the transaction Tj as usual. The prevention test must guarantee that if Ti is allowed to wait for Tj, a deadlock can never occur.
 A better way to implement the deadlock prevention test is to assign priorities to transactions and to check these priorities to determine whether one transaction should wait for another. These priorities can be assigned by using a unique identifier for each transaction in a distributed system.
For instance, consider that i and j are the priorities of two transactions Ti and Tj respectively. The transaction Ti would be allowed to wait for the transaction Tj only if Ti has a lower priority than Tj, that is, if i < j. This approach prevents deadlock, but one problem with it is that cyclic restart is possible: some transactions could be restarted repeatedly without ever finishing.
3- Explain the working principle of the wait-die & wound-wait algorithms for deadlock prevention.
There are two different techniques for deadlock prevention: wait-die and wound-wait.
Wait-die is a non-preemptive deadlock prevention technique based on the timestamp values of transactions. In this technique, when one transaction is about to block, waiting for a lock on a data item that is already locked by another transaction, the timestamp values of both transactions are checked to give priority to the older transaction. If a younger transaction is holding the lock, the older transaction is allowed to wait; but if an older transaction is holding the lock, the younger transaction is aborted and restarted with the same timestamp value. Waiting is thus only ever directed from older to younger transactions, so the wait-for graph can never contain a cycle, and because a restarted transaction keeps its original timestamp it eventually becomes the oldest, so cyclic restart is avoided. For example, if the transaction Ti requests a lock on a data item that is already locked by the transaction Tj, then Ti is permitted to wait only if Ti has a lower timestamp value than Tj (i.e. Ti is older). On the other hand, if Ti is younger than Tj, then Ti is aborted and restarted with the same timestamp value.
Wound-wait is an alternative, preemptive deadlock prevention technique by which cyclic restart is also avoided. In this method, if a younger transaction requests a lock on a data item that is already held by an older transaction, the younger transaction is allowed to wait until the older transaction releases the corresponding lock; if an older transaction requests a lock held by a younger transaction, the younger transaction is wounded (aborted) and the lock is granted to the older one. In this case the wait-for graph flows only from younger to older transactions, so again no cycle can form. For instance, if the transaction Ti requests a lock on a data item that is already locked by the transaction Tj, then Ti is permitted to wait only if Ti has a higher timestamp value than Tj (i.e. Ti is younger); otherwise, the transaction Tj is aborted and the lock is granted to the transaction Ti.
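Example (Python sketch). The two rules reduce to a simple comparison of timestamps. The function names and the convention that a smaller timestamp means an older transaction are ours, used only to summarize the rules above.

    # Wait-die (non-preemptive) and wound-wait (preemptive) decisions.
    def wait_die(ts_requester, ts_holder):
        if ts_requester < ts_holder:       # older transaction requests: it is allowed to wait
            return "requester waits"
        return "requester dies (aborted and restarted with its original timestamp)"

    def wound_wait(ts_requester, ts_holder):
        if ts_requester < ts_holder:       # older transaction requests: it wounds the holder
            return "holder is wounded (aborted); the lock is granted to the requester"
        return "requester waits"           # younger transaction requests: it waits

    print(wait_die(5, 9))     # older asks younger: waits
    print(wait_die(9, 5))     # younger asks older: dies
    print(wound_wait(5, 9))   # older asks younger: wounds the younger holder
    print(wound_wait(9, 5))   # younger asks older: waits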
4- Discuss the distributed deadlock detection technique with an appropriate figure.
Distributed Deadlock Detection
 In the distributed deadlock detection method, a deadlock detector exists at each site of the distributed system. Each site has the same amount of responsibility, and there is no distinction between local and global deadlock detectors. A variety of approaches have been proposed for distributed deadlock detection; the most well-known and simplified version, presented here, was developed by R. Obermarck in 1982.
 In this approach, a local wait-for graph (LWFG) is constructed at each site by the respective local deadlock detector, and an additional external node, Tex, is added to each LWFG as each site receives the potential deadlock cycles from other sites. The external node Tex indicates either that a transaction at a remote site is waiting for a data item held by a transaction at the local site, or that a transaction at the local site is waiting for a data item currently held by a transaction at a remote site. For instance, an edge from the node Ti to Tex exists in the LWFG if the transaction Ti is waiting for a data item that is held by a transaction at a remote site. Similarly, an edge from the external node Tex to Ti exists in the graph if a transaction from a remote site is waiting to acquire a data item that is currently held by the transaction Ti at the local site.
 Thus, the local detector checks for two things to determine a deadlock situation:
 If an LWFG contains a cycle that does not involve the external node Tex, a deadlock has occurred locally and it can be handled locally.
 On the other hand, a global deadlock potentially exists if the LWFG contains a cycle involving the external node Tex. The existence of such a cycle does not necessarily imply a global deadlock, because the external node Tex represents different remote transactions.
 To determine global deadlock situations, the LWFGs must be merged. To avoid every site transmitting its LWFG to every other site, a simple strategy is followed: a timestamp value is allocated to each transaction, and a site Si transmits its LWFG to the site Sk only if a transaction Tk at site Sk is waiting for a data item currently held by a transaction Ti at site Si and ts(Ti) < ts(Tk). The site Sk then adds this information to its own LWFG and checks for cycles not involving the external node Tex in the extended graph. If no such cycle is found, the process continues; it may happen that the entire global wait-for graph (GWFG) is constructed without any cycle being detected, in which case it is decided that there is no deadlock in the distributed system.
 On the other hand, if the GWFG contains a cycle not involving the external node Tex, it is concluded that a deadlock has occurred in the system. The distributed deadlock detection method is illustrated below.
[Fig. Distributed deadlock detection: the local wait-for graphs at Site 1, Site 2 and Site 3, each containing the external node Tex]
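Example (Python sketch). The check performed by each local deadlock detector can be sketched as a cycle search over the LWFG. The graph representation and the transaction names below are illustrative, not taken from the figure.

    # Local wait-for graph (LWFG) with the external node "Tex".
    # An edge u -> v means "u waits for v". A cycle that avoids Tex is a local
    # deadlock; a cycle through Tex only signals a potential global deadlock.
    def find_cycle(graph, start):
        path, on_path = [], set()

        def dfs(node):
            path.append(node)
            on_path.add(node)
            for nxt in graph.get(node, []):
                if nxt in on_path:
                    return path[path.index(nxt):] + [nxt]
                found = dfs(nxt)
                if found:
                    return found
            path.pop()
            on_path.discard(node)
            return None

        return dfs(start)

    lwfg = {"T1": ["T2"], "T2": ["Tex"], "Tex": ["T1"]}   # T2 waits for a remote holder
    for node in list(lwfg):
        cycle = find_cycle(lwfg, node)
        if cycle:
            kind = "potential global deadlock" if "Tex" in cycle else "local deadlock"
            print(kind, cycle)
            break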
5- Define false deadlock & phantom deadlock in distributed database environment.
False Deadlocks
 To handle deadlock situations in a distributed database system, a number of messages are transmitted among the sites. The delay associated with transmitting the messages needed for deadlock detection can cause false deadlocks to be detected.
 For instance, suppose the deadlock detector has received the information that the transaction Ti is waiting for the transaction Tj. Further assume that, some time later, Tj releases the data item requested by Ti and requests a data item that is currently held by Ti. If the deadlock detector receives the information that Tj has requested a data item held by Ti before receiving the information that Ti is no longer blocked by Tj, a false deadlock is detected.
Phantom Deadlocks
 Another problem is that a transaction Ti that blocks another transaction may be restarted for reasons unrelated to deadlock detection. Until the restart message for Ti reaches the deadlock detector, the detector can find a cycle in the wait-for graph that includes Ti; a deadlock is then reported even though it no longer exists, and this is called a phantom deadlock. When the deadlock detector detects a phantom deadlock, it may unnecessarily restart a transaction other than Ti. To avoid unnecessary restarts due to phantom deadlocks, special safety measures are required.
6- Explain Centralized deadlock detection.
Centralized Deadlock Detection
In the centralized deadlock detection method, a single site is chosen as the Deadlock Detection Coordinator (DDC) for the entire distributed system. The DDC is responsible for constructing the GWFG for the system. Each lock manager in the distributed database periodically transmits its LWFG to the DDC. The DDC constructs the GWFG from these LWFGs and checks for cycles in it. A global deadlock is detected if there are one or more cycles in the GWFG. The DDC must break each cycle in the GWFG by selecting transactions to be rolled back and restarted in order to recover from the deadlock. The information about which transactions are to be rolled back and restarted must be transmitted to the corresponding lock managers by the deadlock detection coordinator.
– The centralized deadlock detection approach is very simple, but it has several drawbacks.
– The method is less reliable, as a failure of the central site makes deadlock detection impossible.
– The communication cost is very high in this case, as the other sites in the distributed system must send their LWFGs to the central site.
– Another disadvantage of the centralized technique is that deadlocks can be detected falsely, so that the deadlock recovery procedure may be initiated although no deadlock has actually occurred. Unnecessary rollbacks and restarts of transactions may also result from phantom deadlocks.
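Example (Python sketch). For comparison with the distributed scheme above, here is a sketch of the centralized scheme: each site ships its LWFG to the DDC, which merges them into a GWFG and searches it for cycles. The site contents are invented for illustration.

    # Centralized deadlock detection: merge the sites' LWFGs into one GWFG and look for cycles.
    site_lwfgs = {
        "S1": {"T1": {"T2"}},            # at site 1, T1 waits for T2
        "S2": {"T2": {"T3"}},            # at site 2, T2 waits for T3
        "S3": {"T3": {"T1"}},            # at site 3, T3 waits for T1 (global cycle)
    }

    def merge(lwfgs):
        gwfg = {}
        for lwfg in lwfgs.values():
            for node, waits_for in lwfg.items():
                gwfg.setdefault(node, set()).update(waits_for)
        return gwfg

    def has_cycle(graph):
        visited, on_stack = set(), set()

        def dfs(node):
            visited.add(node)
            on_stack.add(node)
            for nxt in graph.get(node, ()):
                if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                    return True
            on_stack.discard(node)
            return False

        return any(dfs(n) for n in list(graph) if n not in visited)

    gwfg = merge(site_lwfgs)
    print("global deadlock detected:", has_cycle(gwfg))   # True: T1 -> T2 -> T3 -> T1

Breaking the detected cycle would then require the DDC to choose a victim transaction, roll it back, and notify the corresponding lock managers, as described above.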