O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.

Dbms ii mca-ch8-db design-2013

...................................................................................................

  • Entre para ver os comentários

  • Seja a primeira pessoa a gostar disto

Dbms ii mca-ch8-db design-2013

  1. 1. Aruna Devi C (DSCASC) 1 Database Design Chapter 8 Unit V Data Base Management System [DBMS]
  2. 2. Aruna Devi C (DSCASC) 2 Database Design • What is relational database design? The grouping of attributes to form "good" relation schemas • Two levels of relation schemas – The logical "user view" level – The storage "base relation" level • Design is concerned mainly with base relations • What are the criteria for "good" base relations? Informal Design Guidelines for Relational Databases:
  3. 3. Aruna Devi C (DSCASC) 3 Informal Design Guidelines for Relational Databases (Cont.,) Informal guidelines that may be used as measures to determine the quality of relation schema design: 1. Making sure that the semantics of the attributes is clear in the schema. 2. Reducing the redundant information in tuples. 3. Reducing the NULL values in tuples. 4. Disallowing the possibility of generating spurious tuples.
  4. 4. Aruna Devi C (DSCASC) 4 Informal Design Guidelines for Relational Databases (Cont.,) GUIDELINE 1: Informally, each tuple in a relation should represent one entity or relationship instance. (Applies to individual relations and their attributes). • Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be mixed in the same relation. • Only foreign keys should be used to refer to other entities. • Entity and relationship attributes should be kept apart as much as possible. Bottom Line: Design a schema that can be explained easily relation by relation. The semantics of attributes should be easy to interpret. 1. Imparting Clear Semantics to Attributes in Relation:
  5. 5. Aruna Devi C (DSCASC) 5 Informal Design Guidelines for Relational Databases (Cont.,) A simplified COMPANY relational database schema
  6. 6. Aruna Devi C (DSCASC) 6 Informal Design Guidelines for Relational Databases (Cont.,)
  7. 7. Aruna Devi C (DSCASC) 7 Informal Design Guidelines for Relational Databases (Cont.,) • Mixing attributes of multiple entities may cause problems. • Information is stored redundantly wasting storage. • Problems with update anomalies » Insertion anomalies » Deletion anomalies » Modification anomalies 2. Redundant Information in Tuples and Update Anomalies
  8. 8. Aruna Devi C (DSCASC) 8 Informal Design Guidelines for Relational Databases (Cont.,) • Insert Anomaly: Cannot insert a project unless an employee is assigned to. (OR) Inversely - Cannot insert an employee unless an he/she is assigned to a project. • Delete Anomaly: When a project is deleted, it will result in deleting all the employees who work on that project. Alternately, if an employee is the sole employee on a project, deleting that employee would result in deleting the corresponding project. • Update Anomaly: Changing the name of project number P1 from “Billing” to “Customer-Accounting” may cause this update to be made for all 100 employees working on project P1. Problems with update anomalies:
  9. 9. Aruna Devi C (DSCASC) 9 Informal Design Guidelines for Relational Databases (Cont.,) Two relation schemas suffering from update anomalies
  10. 10. Aruna Devi C (DSCASC) 10 Informal Design Guidelines for Relational Databases (Cont.,) GUIDELINE 2:  Design base relation schemas so that no update anomalies are present in the relations.  If any anomalies are present:  Note them clearly  Make sure that the programs that update the database will operate correctly Guideline to Redundant Information in Tuples and Update Anomalies
  11. 11. Aruna Devi C (DSCASC) 11 Informal Design Guidelines for Relational Databases (Cont.,)  May group many attributes together into a “fat” relation  Can end up with many NULLs  Problems with NULLs  Wasted storage space  Problems understanding meaning 3. NULL Values in Tuples
  12. 12. Aruna Devi C (DSCASC) 12 Informal Design Guidelines for Relational Databases (Cont.,) GUIDELINE 3:  Avoid placing attributes in a base relation whose values may frequently be NULL.  If NULLs are unavoidable:  Make sure that they apply in exceptional cases only, not to a majority of tuples
  13. 13. Aruna Devi C (DSCASC) 13 Informal Design Guidelines for Relational Databases (Cont.,) • Bad designs for a relational database may result in erroneous results for certain JOIN operations. • The "lossless join" property is used to guarantee meaningful results for join operations  NATURAL JOIN  Result produces many more tuples than the original set of tuples in EMP_PROJ  Called spurious tuples  Represent spurious information that is not valid Generation of Spurious Tuples
  14. 14. Aruna Devi C (DSCASC) 14 Informal Design Guidelines for Relational Databases (Cont.,) GUIDELINE 4:  Design relation schemas to be joined with equality conditions on attributes that are appropriately related  Guarantees that no spurious tuples are generated  Avoid relations that contain matching attributes that are not (foreign key, primary key) combinations
  15. 15. Aruna Devi C (DSCASC) 15 Summary and Discussion of Design Guidelines  Anomalies cause redundant work to be done.  Waste of storage space due to NULLs .  Difficulty of performing operations and joins due to NULL values.  Generation of invalid and spurious data during joins. --------------------------------------------
  16. 16. Aruna Devi C (DSCASC) 16 Functional Dependencies • Functional dependencies (FDs) are used to specify formal measures of the "goodness" of relational designs. • FDs and keys are used to define normal forms for relations. • FDs are constraints that are derived from the meaning and interrelationships of the data attributes. • A set of attributes X functionally determines a set of attributes Y if the value of X determines a unique value for Y.
  17. 17. Aruna Devi C (DSCASC) 17 Functional Dependencies (Cont.,)  Constraint between two sets of attributes from the database Definition of Functional Dependency This means that the values of the Y component of a tuple in r depend on, or are determined by, the values of the X component; alternatively, the values of the X component of a tuple uniquely (or functionally) determine the values of the Y component. We also say that there is a functional dependency from X to Y, or that Y is functionally dependent on X. The abreviation for functional dependency is FD or f.d. The set of attributes X is called the left-hand side of the FD, and Y is called the right-hand side.
  18. 18. Aruna Devi C (DSCASC) 18 Functional Dependencies (Cont.,) • Thus, X functionally determines Y in a relation schema R if, and only if, whenever two tuples of r(R) agree on their X-value, they must necessarily agree on their Y-value. Note the following: ■ If a constraint on R states that there cannot be more than one tuple with a given X- value in any relation instance r(R)—that is, X is a candidate key of R—this implies that X → Y for any subset of attributes Y of R (because the key constraint implies that no two tuples in any legal state r(R) will have the same value of X). If X is a candidate key of R, then X→R. ■ If X→Y in R, this does not say whether or not Y→X in R. Definition of Functional Dependency (Cont.,)
  19. 19. Aruna Devi C (DSCASC) 19 Functional Dependencies (Cont.,)  Property of semantics or meaning of the attributes The database designers will use their understanding of the semantics of the attributes of R—that is, how they relate to one another—to specify the functional dependencies that should hold on all relation states (extensions) r of R.  Legal relation states  Satisfy the functional dependency constraints
  20. 20. Aruna Devi C (DSCASC) 20 Functional Dependencies (Cont.,)  Given a populated relation  Cannot determine which FDs hold and which do not  Unless meaning of and relationships among attributes known  Can state that FD does not hold if there are tuples that show violation of such an FD Definition of Functional Dependency (cont.,)
  21. 21. Aruna Devi C (DSCASC) 21 Functional Dependencies (Cont.,) • Social security number determines employee name SSN -> ENAME • Project number determines project name and location PNUMBER -> {PNAME, PLOCATION} • Employee ssn and project number determines the hours per week that the employee works on the project {SSN, PNUMBER} -> HOURS Examples of FD constraints:
  22. 22. Aruna Devi C (DSCASC) 22 Functional Dependencies (Cont.,) • An FD is a property of the attributes in the schema R • The constraint must hold on every relation instance r(R) • If K is a key of R, then K functionally determines all attributes in R (since we never have two distinct tuples with t1[K]=t2[K]) Examples of FD constraints (Cont.,):
  23. 23. Aruna Devi C (DSCASC) 23 Functional Dependencies (Cont.,) • Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold Armstrong's inference rules: IR1. (Reflexive) If Y subset-of X, then X -> Y IR2. (Augmentation) If X -> Y, then XZ -> YZ (Notation: XZ stands for X U Z) IR3. (Transitive) If X -> Y and Y -> Z, then X -> Z • IR1, IR2, IR3 form a sound and complete set of inference rules Inference Rules for FDs
  24. 24. Aruna Devi C (DSCASC) 24 Functional Dependencies (Cont.,) Some additional inference rules that are useful: (Decomposition) If X -> YZ, then X -> Y and X -> Z (Union) If X -> Y and X -> Z, then X -> YZ (Psuedotransitivity) If X -> Y and WY -> Z, then WX -> Z • The last three inference rules, as well as any other inference rules, can be deduced from IR1, IR2, and IR3 (completeness property) Inference Rules for FDs (Cont.,)
  25. 25. Aruna Devi C (DSCASC) 25 Functional Dependencies (Cont.,) • Closure of a set F of FDs is the set F+ of all FDs that can be inferred from F • Closure of a set of attributes X with respect to F is the set X + of all attributes that are functionally determined by X • X + can be calculated by repeatedly applying IR1, IR2, IR3 using the FDs in F Inference Rules for FDs (Cont.,)
  26. 26. Aruna Devi C (DSCASC) 26 Functional Dependencies (Cont.,) • Two sets of FDs F and G are equivalent if: - every FD in F can be inferred from G, and - every FD in G can be inferred from F • Hence, F and G are equivalent if F + =G + Definition: F covers G if every FD in G can be inferred from F (i.e., if G + subset-of F + ) • F and G are equivalent if F covers G and G covers F • There is an algorithm for checking equivalence of sets of FDs Equivalence of Sets of FDs
  27. 27. Aruna Devi C (DSCASC) 27 Functional Dependencies (Cont.,) A set of FDs is minimal if it satisfies the following conditions: (1) Every dependency in F has a single attribute for its RHS. (2) We cannot remove any dependency from F and have a set of dependencies that is equivalent to F. (3) We cannot replace any dependency X -> A in F with a dependency Y -> A, where Y proper-subset-of X ( Y subset-of X) and still have a set of dependencies that is equivalent to F. Minimal Sets of FDs
  28. 28. Aruna Devi C (DSCASC) 28 Functional Dependencies (Cont.,) • Every set of FDs has an equivalent minimal set • There can be several equivalent minimal sets • There is no simple algorithm for computing a minimal set of FDs that is equivalent to a set F of FDs • To synthesize a set of relations, we assume that we start with a set of dependencies that is a minimal set (e.g., see algorithms 11.2 and 11.4) Minimal Sets of FDs (Cont.,)
  29. 29. Aruna Devi C (DSCASC) 29 Normal Forms Based on Primary Keys Normalization process:  Approaches for relational schema design  Perform a conceptual schema design using a conceptual model then map conceptual design into a set of relations.  Design relations based on external knowledge derived from existing implementation of files or forms or reports.
  30. 30. Aruna Devi C (DSCASC) 30 Normal Forms Based on Primary Keys (Cont.,)  Takes a relation schema through a series of tests  Certify whether it satisfies a certain normal form  Proceeds in a top-down fashion  Normal form tests Normalization of Relations
  31. 31. Aruna Devi C (DSCASC) 31 Normal Forms Based on Primary Keys (Cont.,)  Properties that the relational schemas should have:  Lossless join or No additive join property • Extremely critical (This property, which guarantees that the spurious tuple generation problem does not occur with respect to the relation schemas created after decomposition.)  Dependency preservation property • Desirable but sometimes sacrificed for other factors ( This property, which ensures that each functional dependency is represented in some individual relation resulting after decomposition.) Normalization of Relations (Cont.,)
  32. 32. Aruna Devi C (DSCASC) 32 Normal Forms Based on Primary Keys (Cont.,)  Definition of superkey and key  Candidate key  If more than one key in a relation schema is called candidate key • One is primary key • Others are secondary keys Definitions of Keys and Attributes Participating in Keys
  33. 33. Aruna Devi C (DSCASC) 33 Normal Form • Normalization: The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations. • Normal form: Condition using keys and FDs of a relation to certify whether a relation schema is in a particular normal form.
  34. 34. Aruna Devi C (DSCASC) 34 Normal Form First Normal Form Second Normal Form Third Normal Form Un Ordered Table with Multi valued attributes
  35. 35. Aruna Devi C (DSCASC) 35 Normal Form 1NF • Disallows composite attributes, multi valued attributes, and nested relations; attributes whose values for an individual tuple are non-atomic. • A relation schema R is in First normal form (1NF) iff each attribute value is atomic. 2NF A relation schema R is in second normal form (2NF) iff every non-key attribute in R is fully functionally dependent on the primary key. 3NF A relation schema R is in Third normal form (3NF) iff it is in 2NF and each non-key attribute in R is non-transitively dependent on the primary key.
  36. 36. Aruna Devi C (DSCASC) 36 Normal Forms Based on Primary Keys (Cont.,) • First normal form (1NF) is now considered to be part of the formal definition of a relation in the basic (flat) relational model. • Defined to disallow multivalued attributes, composite attributes, and their combinations. Definition: "A relation R is in INF iff each attribute don't have repeating groups as well as multi valued attributes i.e. intersection of any column & row contains only one value".  Only attribute values permitted are single atomic (or indivisible) values.  Techniques to achieve first normal form:  Remove attribute and place in separate relation.  Expand the key.  Use several atomic attributes. First Normal Form
  37. 37. Aruna Devi C (DSCASC) 37 Normal Forms Based on Primary Keys (Cont.,)  Does not allow nested relations  Each tuple can have a relation within it  To change to 1NF:  Remove nested relation attributes into a new relation  Propagate the primary key into it  Unnest relation into a set of 1NF relations First Normal Form (Cont.,)
  38. 38. Aruna Devi C (DSCASC) 38 Normal Forms Based on Primary Keys (Cont.,) Redundancy Multi valued First Normal Form (Cont.,)
  39. 39. Aruna Devi C (DSCASC) 39 Normal Forms Based on Primary Keys (Cont.,)  Based on concept of full functional dependency  Versus partial dependency Second Normal Form Nonprime attributes are associated only with part of primary key on which they are fully functionally dependent. Definition: A relation R is said to be in 2NF iff it is in 1NF and every non-key attributes are fully functional dependent on the primary key.
  40. 40. Aruna Devi C (DSCASC) 40 Normal Forms Based on Primary Keys (Cont.,) Second Normal Form (Cont.,) decomposing
  41. 41. Aruna Devi C (DSCASC) 41 Normal Forms Based on Primary Keys (Cont.,) • The EMP_PROJ relation in Figure is in 1NF but is not in 2NF. The nonprime attribute Ename violates 2NF because of FD2, as do the nonprime attributes Pname and Plocation because of FD3. The functional dependencies FD2 and FD3 make Ename, Pname, and Plocation partially dependent on the primary key {Ssn, Pnumber} of EMP_PROJ, thus violating the 2NF test. • Therefore, the functional dependencies FD1, FD2, and FD3 in Figure lead to the decomposition of EMP_PROJ into the three relation schemas EP1, EP2, and EP3 shown in Figure, each of which is in 2NF. Second Normal Form (Cont.,)
  42. 42. Aruna Devi C (DSCASC) 42 Normal Forms Based on Primary Keys (Cont.,) Definition: "A relation R is in 3NF iff it is in 2NF and every non-key attribute is on-transitively dependent on the primary key". • The dependency Ssn Dmgr_ssn is→ transitive through Dnumber in EMP_DEPT, because both the dependencies Ssn Dnumber and Dnumber Dmgr_ssn hold→ → and Dnumber is neither a key itself nor a subset of the key of EMP_DEPT. • We can normalize EMP_DEPT by decomposing it into the two 3NF relation schemas ED1 and ED2 Third Normal Form
  43. 43. Aruna Devi C (DSCASC) 43 Normal Forms Based on Primary Keys (Cont.,) Third Normal Form (Cont.,) Normalizing into 3NF (b) Normalizing EMP_DEPT into 3NF relations
  44. 44. Aruna Devi C (DSCASC) 44 Normal Forms Based on Primary Keys (Cont.,)
  45. 45. Aruna Devi C (DSCASC) 45 Boyce - Codd Normal Form • BCNF is Slightly stronger version of the 3NF. • A table is in BCNF if and only if: 1) It is in 3NF. 2) For every of its nontrivial function dependency X -> Y, X is super key. (A super key is a set of one or more attributes that, taken collectively) (OR) A relation schema ‘r’ is in BCNF if whenever a non-trivial function dependency X -> A holds in ‘r’ then X is a super key of ‘r’. • A table is in BCNF if every determinant in that table is a candidate key. If a table contains only one candidate key, 3NF and BCNF are equivalent. • Def: A determinant is any attribute whose value determines other values in a row. • If an attribute of a composite key is dependent on an attribute of the other composite key, BCNF is needed.
  46. 46. Aruna Devi C (DSCASC) 46 BCNF Example Example: • A professor can work in more than one department. • The percentage of time he spend in each department is given. • Each department has only one Head of Department. Relation: Professor Prof_code Department HOD Percent Time P1 Physics Geetha 50 P1 Maths Krishnan 50 P2 Chemistry Rao 25 P2 Physics Geetha 75 P3 Maths Krishnan 100
  47. 47. Aruna Devi C (DSCASC) 47 BCNF Example • The two possible composite key (or) functional dependency are Prof_code, Department -> Percent time Department -> HOD • The names of department and HOD are duplicated. • If P2 resigns row 3 and 4 are deleted. We lose the information that Rao is HOD of chemistry. • Split the relation Professor into two 1) Prof 2) Dept Prof ( Prof-Code, Dept, Percent time) Dept(Dept, HOD)
  48. 48. Aruna Devi C (DSCASC) 48 BCNF Example • Student • There are 2 candidate keys ( Stud-Id,Subject) and Sname,Subject) The FDs are: Sname, Subject -> Grade Stud_Id, Subject -> Grade Stud-Id -> SName Stud-Id SName Subject Grade 10 Vijay Computer A 10 Vijay Physics B 10 Vijay Maths B 20 Gopal Computer A 20 Gopal Physics A 20 Gopal Maths C
  49. 49. Aruna Devi C (DSCASC) 49 BCNF Example • Now BCNF decompose R into R1 and R2 R1 - (Stud-Id, Subject, Grade) R2 - ( Stud-Id, Sname)
  50. 50. Aruna Devi C (DSCASC) 50 EXAMPLE of Normalization
  51. 51. Aruna Devi C (DSCASC) 51 SNoPNo Qty City Status S1 P1 3 B'lore 12 P2 4 B'lore 12 S2 P1 3 Mysore 20 P2 5 Mysore 20 P3 6 Mysore 20 SNo PNo Qty City Status S1 P1 3 B'lore 12 S1 P2 4 B'lore 12 S2 P1 3 Mysore 20 S2 P2 5 Mysore 20 S2 P3 6 Mysore 20 1) Supplier (in ordered form) Supplier table 1st Normal form - Supplier table Normalization
  52. 52. Aruna Devi C (DSCASC) 52 Anomalies: INSERTION: We cannot enter information of supplier unless PNO is entered DELETION: If we delete a particular supplier then entire information of the product is also deleted. UPDATION: For each Items supplied by a particular supplier we will duplicate supplier information also which is redundancy, hence leads, to inconsistency. STATUS CITY SNO PNO First Normal Form Normalization
  53. 53. Aruna Devi C (DSCASC) 53 SECOND NORMAL FORM : A relation R is said to be in 2NF iff it is in INF and every non-key attributes are fully functional dependent on the primary key. These above diagram is decomposed as below, which is in 2NF. Supplier (SNO, STATUS, CITY) product (SNO, PNO, QTY) SNO SNO PNO SNO PNO STATUS CITY STATUS CITY Second Normal Form First Normal Form Normalization
  54. 54. Aruna Devi C (DSCASC) 54 Anomalies: INSERTION: We cannot enter the status of particular city until unless a supplier exists in that city. DELETION: If we delete a record in the supplier table, we destroy information of not only supplier, but also the information of city. UPDATION: The city information in the relation may appear more than once, thus redundancy of data may cause inconsistency of information. SNO Second Normal Form CITYSNO STATUSCITY STATUS CITY Third Normal Form SNO PNO Normalization
  55. 55. Aruna Devi C (DSCASC) 55 "A relation R is in 3NF iff it is in 2NF and every non-key attribute is non- transitively dependent on the primary key". A table is said to be in 3NF iff it is in 2NF and every non-key attribute is functionally dependent on primary key only i.e. there should not be any dependency between non-key attributes. Third Normal form: supplier-info (SNO, CITY) city-info (CITY, STATUS) product (SNO, PNO, QTY) Normalization
  56. 56. Aruna Devi C (DSCASC) 56 SNoPNo Qty City Status S1 P1 3 B'lore 12 P2 4 B'lore 12 S2 P1 3 Mysore 20 P2 5 Mysore 20 P3 6 Mysore 20 SNo PNo Qty City Status S1 P1 3 B'lore 12 S1 P2 4 B'lore 12 S2 P1 3 Mysore 20 S2 P2 5 Mysore 20 S2 P3 6 Mysore 20 1) Supplier (in ordered form) Supplier table 1st Normal form - Supplier table Normalization
  57. 57. Aruna Devi C (DSCASC) 57 SNo City Status S1 B'lore 12 S2 Mysore 20 2nd Normal form Supplier (SNO, STATUS, CITY) product (SNO, PNO, QTY) SNo PNo Qty S1 P1 3 S1 P2 4 S2 P1 3 S2 P2 5 S2 P3 6 3rd Normal form supplier-info (SNO, CITY) city-info (CITY, STATUS) product (SNO, PNO, QTY) SNo City S1 B'lore S2 Mysore City Status B'lore 12 Mysore 20 SNo PNo Qty S1 P1 3 S1 P2 4 S2 P1 3 S2 P2 5 S2 P3 6
  58. 58. Aruna Devi C (DSCASC) 58 Example Normalization case

×