2. Objectives
Why Relational Database?
Why are there so many keys, like Primary Key & Foreign Key?
How to design a database?
Entity-Relationship Model
Conversion of E-R Diagram to Tables
Normalization of Database
Designing of Databases
PCTE, Ludhiana, 7/3/2014
3. Database
Related information, when placed in an organized form, makes a database.
The organization of data/information is necessary because unorganized information has no meaning.
A database is a computer-based record-keeping system whose overall purpose is to record and maintain information.
4. Operations on Database
Insertion
To add new information (e.g. to add the address of a new friend
in your address book)
Retrieval
To view or retrieve the stored information (e.g. you have to find
the address of one of your old friends)
Update
To modify or edit the existing information (e.g. your friend has
shifted to a new place, so his address has to be changed)
Deletion
To remove or delete the unwanted information (e.g. your friend
has changed his/her mobile number, so the old mobile number
has to be removed from the list)
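A minimal sketch of the four operations, using Python's built-in sqlite3 module on an in-memory database. The table name address_book, its columns and the sample values are assumptions made for illustration; they do not come from the slides.

```python
import sqlite3

# In-memory database; table, columns and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE address_book (name TEXT PRIMARY KEY, address TEXT, mobile TEXT)"
)

# Insertion: add the address of a new friend.
conn.execute(
    "INSERT INTO address_book VALUES (?, ?, ?)",
    ("Asha", "12 Mall Road", "555-0101"),
)

# Retrieval: look up a stored address.
address = conn.execute(
    "SELECT address FROM address_book WHERE name = ?", ("Asha",)
).fetchone()[0]

# Update: the friend has shifted to a new place.
conn.execute(
    "UPDATE address_book SET address = ? WHERE name = ?", ("7 Lake View", "Asha")
)
new_address = conn.execute(
    "SELECT address FROM address_book WHERE name = ?", ("Asha",)
).fetchone()[0]

# Deletion: remove the unwanted record.
conn.execute("DELETE FROM address_book WHERE name = ?", ("Asha",))
remaining = conn.execute("SELECT COUNT(*) FROM address_book").fetchone()[0]
```

Each operation maps to one SQL statement: INSERT, SELECT, UPDATE and DELETE.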
5. Database Management System
A database management system is the software
system that allows users to define, create and
maintain a database and provides controlled access to
the data.
Applications
Computerized library systems
Automated teller machines
Flight reservation systems
Computerized parts inventory systems
Software
dBase, FoxPro, IMS, SQL Server, MySQL and Oracle
7. Relational DBMS
The relational model stores data in the form of tables.
A table represents the relation between its rows and columns.
In simple words, for every row-column combination there must be at most a single value.
This concept was proposed by Dr. E. F. Codd, a researcher at IBM, in 1970.
8. Example of a Relation
9. Keys of a Relation
Candidate Key
Primary Key
Super Key
Alternate Key
Foreign Key
Artificial Key
10. Why are there so many keys, like Primary Key & Foreign Key?
11. Candidate Key
A candidate key is an attribute, or set of attributes, K of a relation that has the properties of uniqueness and irreducibility.
Uniqueness: No two tuples of the relation have the same value for K.
Irreducibility: No proper subset of K has the uniqueness property.
12. Super Key
A super key has the uniqueness property but not necessarily the irreducibility property.
For example, if Roll_number is unique in the relation STUDENT, then the set of attributes (Roll_number, Name, Class) is a super key for STUDENT, because this set of attributes is also unique. However, this composite key is not irreducible, because Roll_number, which is a proper subset of it, is unique by itself. Thus, the composite key is a super key: it has the property of uniqueness but not of irreducibility.
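The difference between a super key and a candidate key can be sketched as a check over sample rows. The STUDENT rows below are made up for illustration, and the helper names are hypothetical. Note the limitation: a key is a property of every legal state of a relation, so a check against one sample can only refute a key, never prove it.

```python
from itertools import combinations

# Hypothetical STUDENT rows; the values are invented for illustration.
STUDENT = [
    {"Roll_number": 1, "Name": "Amit", "Class": "BCA"},
    {"Roll_number": 2, "Name": "Amit", "Class": "MCA"},
    {"Roll_number": 3, "Name": "Neha", "Class": "BCA"},
]

def is_unique(rows, attrs):
    """Uniqueness: no two rows agree on all attributes in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def is_super_key(rows, attrs):
    """A super key only needs the uniqueness property."""
    return is_unique(rows, attrs)

def is_candidate_key(rows, attrs):
    """Unique, and no non-empty proper subset is unique (irreducibility)."""
    if not is_unique(rows, attrs):
        return False
    for size in range(1, len(attrs)):
        for subset in combinations(attrs, size):
            if is_unique(rows, subset):
                return False
    return True

full = ("Roll_number", "Name", "Class")
full_is_super = is_super_key(STUDENT, full)              # unique as a whole
full_is_candidate = is_candidate_key(STUDENT, full)      # reducible to Roll_number
roll_is_candidate = is_candidate_key(STUDENT, ("Roll_number",))
```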
13. Primary Key
The primary key is a candidate key chosen by the designer for unique identification of the records of a relation.
A primary key cannot contain any Null value, because we cannot uniquely identify multiple Null values.
14. Alternate Key
The alternate keys of any table are
simply those candidate keys, which are
not currently selected as the primary
key.
16. Foreign Key
Foreign keys are attributes of a table that refer to the primary key of another table. A foreign key permits only those values that appear in the primary key of the table to which it refers, or null. Foreign keys are used to link together two or more tables that have some form of relationship with each other. A foreign key value is a reference to the tuple from which it was taken; this tuple is called the Referenced or Target tuple, and the table containing it is called the Target table.
The matter of integrity of foreign keys is referred to as Referential Integrity.
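Referential integrity can be sketched with Python's built-in sqlite3 module. The department and employee tables and their contents are assumptions made for illustration. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite

# Target table (hypothetical).
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
# Referencing table: dept_id is a foreign key to department.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " emp_name TEXT,"
    " dept_id INTEGER REFERENCES department(dept_id))"
)

conn.execute("INSERT INTO department VALUES (10, 'Accounts')")
conn.execute("INSERT INTO employee VALUES (1, 'Ravi', 10)")      # matches a target tuple
conn.execute("INSERT INTO employee VALUES (2, 'Simran', NULL)")  # null is permitted

try:
    # Department 99 does not exist, so this row has no target tuple.
    conn.execute("INSERT INTO employee VALUES (3, 'Karan', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The third insert is refused because its foreign key value does not appear in the primary key of the target table.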
19. How to design the database?
20. How to design the database?
There are two approaches:
E-R Modeling: Identifying entities and relationships
Normalization: Refinement of the database design
24. E-R Model
The Entity-Relationship (ER) model was originally proposed by Peter Chen in 1976.
The ER model is a conceptual data model that views the real world as entities and relationships.
A basic component of the model is the Entity-Relationship diagram, which is used to visually represent data objects.
25. Basic Constructs of E-R Modeling
A database can be modeled as:
a collection of entities,
relationships among entities.
An entity is an object that exists and is distinguishable from other objects.
Example: specific person, company, event, plant
Entities have attributes.
Example: people have names and addresses
An entity set is a set of entities of the same type that
share the same properties.
Example: set of all persons, companies, trees, holidays
26. Entity Sets customer and loan
28. Attributes
Attributes describe the properties of the entity with which they are associated. We can classify attributes as follows:
Simple
Composite
Single-valued
Multi-valued
Derived
33. Degree of a Relationship
The degree of a relationship is the
number of entities associated with the
relationship. The n-ary relationship is the general form for degree n. Special cases are binary and ternary relationships, where the degree is 2 and 3, respectively.
34. Connectivity and Cardinality
The connectivity of a relationship describes the
mapping of associated entity instances in the
relationship. The values of connectivity are "one" or
"many". The cardinality of a relationship is the actual
number of related occurrences for each of the two
entities.
The basic types of connectivity for relations are:
One to One (1:1)
One to Many (1:M)
Many to One (M:1)
Many to Many (M:M)
39. Direction
The direction of a relationship indicates the originating
entity of a relationship. The entity from which a
relationship originates is the parent entity; the entity
where the relationship terminates is the child entity.
The type of the relation is determined by the direction of the line connecting the relationship component and the entity. To distinguish different types of relation, we draw either a directed line or an undirected line between the relationship set and the entity set. A directed line indicates one occurrence, and an undirected line indicates many occurrences, in a relation.
43. E-R Notation
Entities are represented by labeled rectangles. The label is the name of the entity. Entity names should be singular nouns.
Attributes are represented by ellipses.
A solid line connecting two entities represents a relationship. The name of the relationship is written above the line. Relationship names should be verbs, and a diamond symbol is used to represent relationship sets.
Attributes, when included, are listed inside the entity rectangle. Attributes that are identifiers are underlined. Attribute names should be singular nouns.
Multi-valued attributes are represented by double ellipses.
A directed line is used to indicate one occurrence and an undirected line is used to indicate many occurrences in a relation.
51. Total Participation
Total participation (indicated by a double line): every entity in the entity set participates in at least one relationship in the relationship set.
Example: participation of loan in borrower is total;
every loan must have a customer associated with it via borrower.
Partial participation: some entities may not participate in any relationship in the relationship set.
Example: participation of customer in borrower is partial.
55. Strong and Weak Entity Sets
An entity set that does not have sufficient attributes to form a primary key is called a weak entity set. An entity set that has a primary key is called a strong entity set.
Consider an entity set Payment which has three attributes: payment_number, payment_date and payment_amount. Although each payment entity is distinct, payments for different loans may share the same payment number. Thus, this entity set does not have a primary key and is a weak entity set. Each weak entity set must be part of a one-to-many relationship set.
56. Strong and Weak Entity Sets
A member of a strong entity set is called a dominant entity, and a member of a weak entity set is called a subordinate entity. A weak entity set does not have a primary key, but we need a means of distinguishing among all those entities in the entity set that depend on one particular strong entity. The discriminator of a weak entity set is a set of attributes that allows this distinction to be made. For example, payment_number acts as the discriminator for the payment entity set. It is also called the partial key of the entity set.
The primary key of a weak entity set is formed by the primary key of the strong entity set on which the weak entity set is existence dependent, plus the weak entity set's discriminator. In the above example, {loan_number, payment_number} acts as the primary key for the payment entity set.
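As a rough sketch, the weak entity set payment can be stored as a table whose primary key combines loan_number with the discriminator payment_number. The example uses Python's built-in sqlite3 module. The amount column on loan and all sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Strong entity set; the amount column is hypothetical.
conn.execute("CREATE TABLE loan (loan_number TEXT PRIMARY KEY, amount REAL)")
# Weak entity set: key = owner's primary key + discriminator.
conn.execute(
    "CREATE TABLE payment ("
    " loan_number TEXT NOT NULL REFERENCES loan(loan_number),"
    " payment_number INTEGER NOT NULL,"
    " payment_date TEXT,"
    " payment_amount REAL,"
    " PRIMARY KEY (loan_number, payment_number))"
)

conn.executemany("INSERT INTO loan VALUES (?, ?)", [("L-1", 1000.0), ("L-2", 2500.0)])
# The same payment_number may appear under two different loans.
conn.execute("INSERT INTO payment VALUES ('L-1', 1, '2014-01-05', 100.0)")
conn.execute("INSERT INTO payment VALUES ('L-2', 1, '2014-01-09', 250.0)")

try:
    # Repeating the pair (L-1, 1) violates the composite primary key.
    conn.execute("INSERT INTO payment VALUES ('L-1', 1, '2014-02-05', 100.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

payment_count = conn.execute("SELECT COUNT(*) FROM payment").fetchone()[0]
```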
61. Entity Set to Table
For each entity set and relationship set there is a unique table, which is assigned the name of the corresponding entity set or relationship set. Each table has a number of columns (generally corresponding to attributes), which have unique names. Primary keys allow entity sets and relationship sets to be expressed uniformly as tables, which represent the contents of the database.
63. Composite and Multi-Value Attributes
In order to convert an entity having composite attributes, the composite attributes are flattened out by creating a separate attribute for each component attribute.
70. Normalization
Normalization is the process of decomposing a larger table into smaller tables so that they satisfy a series of tests. If the database satisfies a test, it is considered normalized according to that test, rule or degree. There are five tests that we apply to the database, so there are five degrees or rules of normalization, known as First Normal Form, Second Normal Form and so on.
When a test fails, the relation violating that test must be decomposed into relations that individually meet the normalization tests.
71. Objectives of Normalization
To create a formal framework for analyzing relation schemas
based on their keys and on the functional dependencies among
their attributes.
To obtain powerful relational retrieval algorithms based on a
collection of primitive relational operators.
To free relations from undesirable insertion, update and deletion
anomalies.
To reduce the need for restructuring the relations as new data
types are introduced.
To carry out a series of tests on individual relation schemas so that
the relational database can be normalized to some degree. When
a test fails, the relation violating that test must be decomposed
into relations that individually meet the normalization tests.
73. Functional Dependence (FD)
In a relation R having two attributes X and Y, if for each value of X there is exactly one value of Y, then Y is called functionally dependent on X.
In other words, if X is the determinant and Y is the determined, then we say that X functionally determines Y and write this as X → Y. The notation X → Y can also be read as Y is functionally determined by X.
For each value of the determinant there is associated one and only one value of the determined.
74. Functional Dependence (FD)
The following table illustrates A → B:
A B
1 1
2 4
3 9
4 16
2 4
7 9
75. Functional Dependence (FD)
The following table illustrates that A does not functionally determine B:
A B
1 1
2 4
3 9
4 16
3 10
Since for A = 3 there is associated more than one value of B.
Functional dependency can also be defined as follows:
An attribute in a relational model is said to be functionally dependent on
another attribute in the table if it can take only one value for a given value
of the attribute upon which it is functionally dependent.
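The definition can be sketched as a small check over rows, applied to the two A/B tables from the preceding slides. The function name holds is hypothetical. Such a check can only show that a dependency is violated in the given rows; it cannot prove that it holds for every future state of the relation.

```python
def holds(rows, x, y):
    """Return True if each value of attribute x is paired with exactly one value of y."""
    seen = {}
    for row in rows:
        if row[x] in seen and seen[row[x]] != row[y]:
            return False  # same determinant value, different determined value
        seen[row[x]] = row[y]
    return True

# Rows of the table that illustrates A -> B.
first = [{"A": a, "B": b} for a, b in [(1, 1), (2, 4), (3, 9), (4, 16), (2, 4), (7, 9)]]
# Rows of the table where A does not determine B (A = 3 has B = 9 and B = 10).
second = [{"A": a, "B": b} for a, b in [(1, 1), (2, 4), (3, 9), (4, 16), (3, 10)]]

a_determines_b_first = holds(first, "A", "B")
a_determines_b_second = holds(second, "A", "B")
# In the first table B = 9 appears with A = 3 and A = 7, so B does not determine A.
b_determines_a_first = holds(first, "B", "A")
```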
77. Fully Functional Dependence (FFD)
Fully Functional Dependence (FFD) is defined as follows: attribute Y is FFD on attribute X if it is FD on X and not FD on any proper subset of X.
For a composite X = (X1, X2):
(X1, X2) → Y
X1 ↛ Y
X2 ↛ Y
78. Fully Functional Dependence (FFD)
For example, in relation Supplier, different cities may have the same status. It may be possible that cities like Amritsar and Jalandhar have the same status 10.
So, City is not FD on Status.
But the combination of Sno and Status can give only one corresponding City, because Sno is unique. Thus,
(Sno, Status) → City
It means City is FD on the composite attribute (Sno, Status). However, City is not fully functionally dependent on this composite attribute, because City is already FD on its proper subset Sno.
79. Fully Functional Dependence (FFD)
Consider another case, the SP table:
Here, Qty is FD on the combination of Sno and Pno.
Here, X = (Sno, Pno) has two proper subsets, Sno and Pno.
Qty is not FD on Sno, because one Sno can supply more than one quantity.
Qty is also not FD on Pno, because one Pno may be supplied many times by different suppliers with different or same quantities.
So, Qty is FFD on the composite attribute (Sno, Pno): (Sno, Pno) → Qty.
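The two FFD examples can be sketched as a check over sample rows. The SP and SUPPLIER rows below are invented for illustration, since the original tables are not reproduced here, and the function names are hypothetical. Only non-empty proper subsets of the determinant are tested.

```python
from itertools import combinations

def fd_holds(rows, xs, y):
    """True if each combination of values for attributes xs maps to one value of y."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in xs)
        if seen.setdefault(key, row[y]) != row[y]:
            return False
    return True

def fully_dependent(rows, xs, y):
    """FFD: y is FD on xs and on no non-empty proper subset of xs."""
    if not fd_holds(rows, xs, y):
        return False
    return not any(
        fd_holds(rows, subset, y)
        for size in range(1, len(xs))
        for subset in combinations(xs, size)
    )

# Hypothetical shipment rows.
SP = [
    {"Sno": "S1", "Pno": "P1", "Qty": 300},
    {"Sno": "S1", "Pno": "P2", "Qty": 200},
    {"Sno": "S2", "Pno": "P1", "Qty": 300},
    {"Sno": "S2", "Pno": "P2", "Qty": 400},
]
# Hypothetical supplier rows; two cities share status 10.
SUPPLIER = [
    {"Sno": "S1", "Status": 10, "City": "Amritsar"},
    {"Sno": "S2", "Status": 10, "City": "Jalandhar"},
    {"Sno": "S3", "Status": 20, "City": "Ludhiana"},
]

qty_ffd = fully_dependent(SP, ("Sno", "Pno"), "Qty")
city_fd = fd_holds(SUPPLIER, ("Sno", "Status"), "City")
city_ffd = fully_dependent(SUPPLIER, ("Sno", "Status"), "City")
```

In this sample, City is FD on (Sno, Status) but not FFD, because Sno alone already determines City.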
80. First Normal Form
Definition of First Normal Form
A relation is said to be in First Normal
Form (1NF) if and only if every entry of
the relation (the intersection of a tuple
and a column) has at most a single
value. In other words “a relation is in
First Normal Form if and only if all
underlying domains contain atomic
values or single value only.”
82. First Approach: Flattening the table
The first approach, known as “flattening the table”, removes repeating groups by filling in the “missing” entries of each “incomplete row” of the table with copies of their corresponding non-repeating attributes.
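A minimal sketch of flattening, assuming an unnormalized table in which one column holds a list of values (the repeating group). The rows, the column names and the function name flatten are hypothetical.

```python
# Hypothetical unnormalized rows: Course_Code is a repeating group.
unnormalized = [
    {"Rollno": 1, "Name": "Amit", "Course_Code": ["C1", "C2"]},
    {"Rollno": 2, "Name": "Neha", "Course_Code": ["C1"]},
]

def flatten(rows, repeating):
    """Emit one row per repeated value, copying the non-repeating attributes."""
    flat = []
    for row in rows:
        fixed = {k: v for k, v in row.items() if k != repeating}
        for value in row[repeating]:
            flat.append({**fixed, repeating: value})
    return flat

flat_rows = flatten(unnormalized, "Course_Code")
```

Every entry of flat_rows now holds a single value per column, at the cost of repeating Rollno and Name.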
84. Second Approach: Decomposition of the table
The second approach for normalizing a table requires that the table be decomposed into two new tables that will replace the original table.
However, before decomposing the original table it is necessary to identify an attribute or a set of its attributes that can be used as table identifiers.
85. Rule of decomposition
One of the two tables contains the table
identifier of the original table and all the
non-repeating attributes.
The other table contains a copy of the
table identifier and all the repeating
attributes.
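The rule of decomposition can be sketched as follows, assuming Rollno is the table identifier and Course_Code is the repeating attribute. The rows and the function name decompose are hypothetical.

```python
# Hypothetical unnormalized rows: Course_Code is the repeating attribute.
unnormalized = [
    {"Rollno": 1, "Name": "Amit", "Course_Code": ["C1", "C2"]},
    {"Rollno": 2, "Name": "Neha", "Course_Code": ["C1"]},
]

def decompose(rows, identifier, repeating):
    """Split rows into (identifier + non-repeating) and (identifier + repeating)."""
    base = [{k: v for k, v in row.items() if k != repeating} for row in rows]
    detail = [
        {identifier: row[identifier], repeating: value}
        for row in rows
        for value in row[repeating]
    ]
    return base, detail

student, student_course = decompose(unnormalized, "Rollno", "Course_Code")
```

Unlike flattening, this keeps each Name only once; the copy of the identifier in the second table links the two tables together.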
87. Anomalies in 1NF Relations
(Considering STUDENT table)
88. Second Normal Form
A relation R is in second normal form (2NF) if and only if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key.
The COURSE_CODE relation obtained in first normal form does not satisfy this rule: the non-key attributes Name, System_Used and Hourly_Rate are not fully dependent on the primary key (Course_Code, Rollno), because they are functionally dependent on Rollno alone, and Rollno is a proper subset of the primary key. In order to convert the COURSE_CODE relation into second normal form, the following rule is used.
94. Third Normal Form
A relation R is in Third Normal Form (3NF) if and only if the
following conditions are satisfied simultaneously:
R is already in 2NF
No nonprime attribute is transitively dependent on the key.
Another way of expressing the conditions for Third Normal Form
is as follows:
R is already in 2NF
No nonprime attribute functionally determines any other nonprime
attribute.
These two sets of conditions are equivalent.
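The second formulation can be sketched as a scan over a list of functional dependencies. The relation below, with key Rollno, and in particular the dependency System_Used → Hourly_Rate, are assumptions made for illustration; the sketch also assumes the relation has a single candidate key, so the prime attributes are exactly the key attributes.

```python
def nonprime_dependencies(key, fds):
    """Return FDs whose determinant is entirely nonprime and whose
    dependent side contains at least one nonprime attribute."""
    prime = set(key)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not (set(lhs) & prime) and (set(rhs) - prime)
    ]

key = ("Rollno",)
fds = [
    (("Rollno",), ("Name", "System_Used", "Hourly_Rate")),
    (("System_Used",), ("Hourly_Rate",)),  # assumed for illustration
]

violations = nonprime_dependencies(key, fds)
```

A non-empty result means some nonprime attribute determines another nonprime attribute, so the relation is not in 3NF under this formulation.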
95. Transitive Dependencies
Assume that A, B and C are sets of attributes of a relation R. Further assume that the following functional dependencies are satisfied simultaneously: A → B, B ↛ A (B does not functionally determine A), B → C, A → C and C ↛ A (C does not functionally determine A). Observe that C → B is neither prohibited nor required. If all these conditions are true, we will say that attribute C is transitively dependent on attribute A. It should be clear that these
functional depend
102. Special Case
For example, consider a relation
SSP ( Sno, Sname, Pno, Qty )
103. Boyce/Codd N/F (BCNF)
BCNF states that a relation R is in Boyce/Codd Normal Form (BCNF) if and only if every determinant is a candidate key. Here, a determinant is a simple or composite attribute on which some other attribute is fully functionally dependent.
For example, Qty is FFD on (Sno, Pno):
(Sno, Pno) → Qty
Here (Sno, Pno) is a composite determinant.
Sno → Sname
Here Sno is a simple determinant.
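A rough sketch of a BCNF test for the two dependencies above, using the standard attribute-closure technique (not described in the slides). A determinant passes if its closure covers every attribute of the relation, which is a super key test; for determinants taken from full functional dependencies this coincides with the candidate key condition. The relation names SP and S for the decomposed tables, and the function names, are hypothetical.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Determinants whose closure does not cover every attribute."""
    return [lhs for lhs, _ in fds if closure(lhs, fds) != set(attributes)]

# SSP (Sno, Sname, Pno, Qty) with the two dependencies from the slide.
SSP = ("Sno", "Sname", "Pno", "Qty")
SSP_FDS = [(("Sno", "Pno"), ("Qty",)), (("Sno",), ("Sname",))]
ssp_violations = bcnf_violations(SSP, SSP_FDS)

# One possible decomposition (names are assumptions).
sp_violations = bcnf_violations(("Sno", "Pno", "Qty"), [(("Sno", "Pno"), ("Qty",))])
s_violations = bcnf_violations(("Sno", "Sname"), [(("Sno",), ("Sname",))])
```

With these dependencies, Sno is a determinant but not a key of SSP, so SSP fails the test, while each relation in the decomposition passes.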
104. Overlapping of Candidate keys
In order to show the difference between 3NF and BCNF, relations having overlapping candidate keys are considered in detail.
Two candidate keys overlap if they involve two or more attributes each and have an attribute in common.
(Id_no, Item_No) → Quantity
(Name, Item_No) → Quantity
Id_no → Name
Name → Id_no
105. Another Case
SSP (Sno, Sname, Pno, Qty)
Here, let us suppose that Sname (supplier name) is unique for each Sno (supplier number) as shown below:
Sno Sname Pno Qty
S1 Rahat P1 300
S2 Raju P2 200
S1 Rahat P3 100
S2 Raju P1 200
107. DENORMALIZATION
Denormalization is the process of attempting to
optimize the performance of a database by adding
redundant data or by grouping data. In some cases,
denormalization helps cover up the inefficiencies
inherent in relational database software.
A normalized design will often store different but
related pieces of information in separate logical tables
(called relations). If these relations are stored
physically as separate disk files, completing a
database query that draws information from several
relations (a join operation) can be slow. If many
relations are joined, it may be prohibitively slow.
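The trade-off can be sketched with Python's built-in sqlite3 module: the same answer is obtained either by joining two normalized tables or by reading one denormalized table that stores the supplier name redundantly. The table names supplier, shipment and ssp are assumptions; the rows follow the supplier example used earlier in these slides. The sketch shows the shape of the two queries only and makes no claim about their relative speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Normalized design: supplier names are stored once.
conn.execute("CREATE TABLE supplier (sno TEXT PRIMARY KEY, sname TEXT)")
conn.execute(
    "CREATE TABLE shipment (sno TEXT, pno TEXT, qty INTEGER, PRIMARY KEY (sno, pno))"
)
conn.executemany("INSERT INTO supplier VALUES (?, ?)", [("S1", "Rahat"), ("S2", "Raju")])
conn.executemany(
    "INSERT INTO shipment VALUES (?, ?, ?)",
    [("S1", "P1", 300), ("S2", "P2", 200), ("S1", "P3", 100), ("S2", "P1", 200)],
)

# Query on the normalized design needs a join.
joined = conn.execute(
    "SELECT sh.sno, s.sname, sh.pno, sh.qty"
    " FROM shipment sh JOIN supplier s ON s.sno = sh.sno"
    " ORDER BY sh.sno, sh.pno"
).fetchall()

# Denormalized design: sname is copied into every shipment row.
conn.execute(
    "CREATE TABLE ssp AS"
    " SELECT sh.sno, s.sname, sh.pno, sh.qty"
    " FROM shipment sh JOIN supplier s ON s.sno = sh.sno"
)
flat = conn.execute(
    "SELECT sno, sname, pno, qty FROM ssp ORDER BY sno, pno"
).fetchall()
```

The denormalized table answers the query without a join, but a change of supplier name must now be applied to several rows.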
108. Uses of Denormalization
Databases intended for Online Transaction Processing (OLTP)
are typically more normalized than databases intended for Online
Analytical Processing (OLAP). OLTP Applications are
characterized by a high volume of small transactions such as
updating a sales record at a supermarket checkout counter. The
expectation is that each transaction will leave the database in a
consistent state. By contrast, databases intended for OLAP
operations are primarily "read mostly" databases. OLAP
applications tend to extract historical data that has accumulated
over a long period of time. For such databases, redundant or
"denormalized" data may facilitate Business Intelligence
applications. Specifically, dimensional tables in a star schema
often contain denormalized data.
Denormalization is helpful in retrieval-based applications.