Topics Covered:
=================================================
The Three-Level ANSI-SPARC Architecture
Database Languages
Data Models and Conceptual Modeling
CS3270 - DATABASE SYSTEM - Lecture (2)
1. CS3270
DATABASE – I
Mr. Dilawar
Lecturer,
Computer Science Faculty,
Bakhtar University
Kabul, Afghanistan.
2. Previous Lecture Outline
• Introduction
• Traditional File-Based Systems
• Database Approach
• Roles in the Database Environment
• History of DBMS
• Advantages and Disadvantages of DBMSs
4. Chapter Outline
• The Three-Level ANSI-SPARC Architecture
• Database Languages
• Data Models and Conceptual Modeling
• Functions of DBMS
• Components of DBMS
• Multi-User DBMS Architecture
5. Lecture Outline
• The Three-Level ANSI-SPARC Architecture
• Database Languages
• Data Models and Conceptual Modeling
6. The Three-Level ANSI-SPARC Architecture
• Identifies three levels of abstraction.
• Three distinct levels at which data items can be
described.
• The three-level architecture comprises an
external, a conceptual, and an internal level as
shown in the figure.
• The way users perceive the data is called the external
level.
• The way the DBMS and OS perceive the data is called the
internal level.
• The internal level is where the data is actually stored, using
data structures and file organizations.
• Conceptual level provides both the mapping and the
desired independence between external and internal
levels.
7. The Three-Level ANSI-SPARC Architecture
• The objective of three-level architecture is to separate each user’s view of the
database from the way the database is physically represented.
• There are several reasons why this is desirable.
• All users should be able to access same data but have a different customized view.
• A user’s view is immune to changes made in other views.
• Users should not need to know physical database storage details.
• DBA should be able to change database storage structures without affecting the users’ views.
• Internal structure of database should be unaffected by changes to physical aspects of storage
such as changeover to a new storage device.
• DBA should be able to change conceptual structure of database without affecting all users.
8. The Three-Level ANSI-SPARC Architecture
• External Level
• The users’ view of the database. It describes the part of the database that is
relevant to each user.
• The external level consists of a number of different external views of the
database.
• Each user has a view of the ‘real world’ represented in a form that is familiar to that
user.
• Different views may have different representation of same data (e.g. different
date formats, age derived from DOB etc.).
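The idea of several external views over the same stored data can be sketched with Python's built-in sqlite3 module. This is only an illustration: the staff table, the view names staff_hr and staff_age, and the fixed reference date 2024-01-01 are assumptions, not part of the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One stored representation of the data (hypothetical table).
conn.execute(
    "CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, dob TEXT, salary REAL)"
)
conn.execute("INSERT INTO staff VALUES ('S1', 'Ann', '1990-03-15', 30000)")

# External view 1: DOB shown as DD/MM/YYYY, salary hidden from this user.
conn.execute("""
    CREATE VIEW staff_hr AS
    SELECT staff_no, name, strftime('%d/%m/%Y', dob) AS dob
    FROM staff
""")

# External view 2: age derived from DOB, relative to an assumed fixed date.
conn.execute("""
    CREATE VIEW staff_age AS
    SELECT staff_no, name,
           CAST(strftime('%Y', '2024-01-01') AS INTEGER)
           - CAST(strftime('%Y', dob) AS INTEGER)
           - (strftime('%m-%d', '2024-01-01') < strftime('%m-%d', dob)) AS age
    FROM staff
""")

hr_row = conn.execute("SELECT * FROM staff_hr").fetchone()
age_row = conn.execute("SELECT * FROM staff_age").fetchone()
print(hr_row)
print(age_row)
```

Both views read the same stored row, but each presents it in the form its user expects.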
9. The Three-Level ANSI-SPARC Architecture
• Conceptual Level
• Community view of the database. It describes what data is stored in database
and the relationships among the data.
• It is a complete view of the data requirements of the organization that is
independent of any storage considerations.
• It represents
• All entities, their attributes, and their relationships.
• The constraints on the data.
• Semantic information about the data.
• Security and integrity information.
10. The Three-Level ANSI-SPARC Architecture
• Internal Level
• Physical representation of the database on the computer. This level describes
how the data is stored in the database.
• It covers the physical implementation of the database to achieve optimal
runtime performance and storage space utilization.
• It covers data structures and file organizations used to store data on storage
devices.
• Interfaces with the operating system access methods to place the data on the
storage devices, build the indexes, retrieve the data, and so on.
12. The Three-Level ANSI-SPARC Architecture
Schemas
• The overall description of the database is called the database schema.
• There are three schemas as per three-level architecture.
• External Schema
• Conceptual Schema
• Internal Schema
13. The Three-Level ANSI-SPARC Architecture
• External Schemas
• Also called subschemas.
• Correspond to different views of data.
• Multiple schemas per database.
• Conceptual Schema
• Describes all the entities, attributes, and relationships together with integrity
constraints
• Only one schema per database.
14. The Three-Level ANSI-SPARC Architecture
• Internal Schema
• A complete description of the internal model, containing the definitions of
stored records, the methods of representation, the data fields, and the
indexes and storage structures used.
• Only one schema per database.
15. The Three-Level ANSI-SPARC Architecture
Mapping
• The DBMS is responsible for mapping between these three types of
schema:
• The DBMS must check that each external schema is derivable from the
conceptual schema, and it must use the information in the conceptual
schema to map between each external schema and the internal schema.
• Types of mappings
• Conceptual/Internal mapping
• External/Conceptual mapping
16. The Three-Level ANSI-SPARC Architecture
• Conceptual/Internal Mapping
• Enables the DBMS to find the actual record or combination of records in
physical storage that establishes a logical record in the conceptual schema,
together with any constraints to be enforced on the operations for that logical
record.
• It also allows any differences in entity names, attribute names, attribute
order, data types, and so on, to be resolved.
• External/Conceptual Mapping
• Enables the DBMS to map names in the user’s view on to the relevant part of
the conceptual schema.
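The two mappings can be pictured as two lookup steps. The sketch below is a deliberately simplified model, not how a real DBMS stores its mappings: the dictionaries, the file name staff.dat, and the function resolve are all hypothetical.

```python
# External/conceptual mapping: names in a user's view -> conceptual attributes.
external_to_conceptual = {"Staff Number": "staff_no", "Full Name": "name"}

# Conceptual/internal mapping: conceptual attributes -> (stored file, field position).
conceptual_to_internal = {
    "staff_no": ("staff.dat", 0),
    "name": ("staff.dat", 1),
}

# A stored record as it might sit in physical storage (assumed layout).
stored_files = {"staff.dat": [("S1", "Ann", "1990-03-15")]}


def resolve(view_name, record_index=0):
    """Follow both mappings to fetch the stored value for a name used in a view."""
    conceptual_name = external_to_conceptual[view_name]
    file_name, position = conceptual_to_internal[conceptual_name]
    return stored_files[file_name][record_index][position]


full_name = resolve("Full Name")
print(full_name)
```

Renaming an attribute in the user's view only touches the first dictionary, and moving a field in storage only touches the second.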
18. The Three-Level ANSI-SPARC Architecture
• Database Schema
• Description of database (also called intension).
• Specified during design phase.
• Remains almost static.
• Database Instance
• Data in the database at any particular point in time.
• Dynamic (changes with the time).
• Also called an extension (or state) of database.
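The difference between schema (intension) and instance (extension) can be observed directly in a small SQLite database. This is a minimal sketch; the branch table and the helper names schema and instance are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE branch (branch_no TEXT PRIMARY KEY, city TEXT)")


def schema(conn):
    """Return the stored table definitions (the description of the database)."""
    rows = conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]


def instance(conn):
    """Return the data held in the branch table at this moment."""
    return conn.execute("SELECT * FROM branch ORDER BY branch_no").fetchall()


schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO branch VALUES ('B1', 'Kabul')")
schema_after, instance_after = schema(conn), instance(conn)

print(schema_before == schema_after)    # the schema did not change
print(instance_before, instance_after)  # the instance did
```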
19. The Three-Level ANSI-SPARC Architecture
Data Independence
• A major objective of the three-level architecture is to provide data
independence.
• Upper levels are unaffected by changes to lower levels.
• There are two kinds of data independence:
• Logical Data Independence
• Physical Data Independence
20. The Three-Level ANSI-SPARC Architecture
• Logical Data Independence
• Refers to immunity of external schemas to changes in the conceptual schema.
• Conceptual schema changes (e.g. addition/removal of entities).
• Should not require changes to external schema or rewrites of application
programs.
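Logical data independence can be tried out on a small scale with sqlite3. In this sketch a view stands in for an external schema; the table, view, and function names are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO staff VALUES ('S1', 'Ann')")

# The view plays the role of an external schema used by an application.
conn.execute("CREATE VIEW staff_names AS SELECT staff_no, name FROM staff")


def app_query(conn):
    """Application code written against the view only."""
    return conn.execute("SELECT staff_no, name FROM staff_names").fetchall()


before = app_query(conn)

# Conceptual schema changes: a new attribute and a new entity.
conn.execute("ALTER TABLE staff ADD COLUMN email TEXT")
conn.execute("CREATE TABLE branch (branch_no TEXT PRIMARY KEY, city TEXT)")

after = app_query(conn)
print(before == after)
```

The application query and the view definition are untouched, yet both keep working after the conceptual change.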
21. The Three-Level ANSI-SPARC Architecture
• Physical Data Independence
• Refers to immunity of conceptual schema to changes in the internal schema.
• Internal schema changes (e.g. using different file organizations, storage
structures, storage devices etc.).
• Should not require change to conceptual or external schemas.
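Physical data independence can be illustrated by changing a storage structure underneath an unchanged query. The sketch below adds an index in SQLite; the table, the index name idx_staff_name, and the helper functions are assumptions, and the exact wording of the query plan depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO staff VALUES (?, ?)", [("S1", "Ann"), ("S2", "Bob")])

QUERY = "SELECT staff_no FROM staff WHERE name = 'Ann'"


def run(conn):
    """The application's query, identical before and after the change."""
    return conn.execute(QUERY).fetchall()


def plan(conn):
    """How SQLite intends to execute the query (text varies by version)."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]


result_before, plan_before = run(conn), plan(conn)

# Internal schema change: a new access path is added.
conn.execute("CREATE INDEX idx_staff_name ON staff (name)")

result_after, plan_after = run(conn), plan(conn)
print(result_before == result_after)
print(plan_before, plan_after)
```

The query text and its result stay the same; only the access path chosen by the DBMS may differ.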
23. Database Languages
• Data sublanguage consists of two parts:
• DDL (Data Definition Language)
• DML (Data Manipulation Language)
• Data sublanguage
• Does not include constructs for all computing needs such as iterations or
conditional statements, which are provided by HLL.
• Many DBMSs support embedding the sublanguage in a high-level
programming language, e.g. C, C++, Java.
• In this case, these high-level languages are called host languages.
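A small sketch of the host-language idea follows. The slide names C, C++, and Java as host languages; Python with its built-in sqlite3 module is used here only as an assumption to keep the example short. The SQL retrieves the data, while the loop and the conditional come from the host language.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, salary REAL)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?)",
    [("S1", 30000.0), ("S2", 12000.0), ("S3", 18000.0)],
)

bands = {}
# Embedded data sublanguage: the SELECT statement.
for staff_no, salary in conn.execute(
    "SELECT staff_no, salary FROM staff ORDER BY staff_no"
):
    # Host language constructs: iteration and a conditional statement.
    if salary >= 20000:
        bands[staff_no] = "high"
    else:
        bands[staff_no] = "standard"

print(bands)
```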
24. Database Languages
• Data Definition Language
• Allows the DBA or user to describe and name entities, attributes, and
relationships required for the application
• Plus any associated integrity and security constraints.
• The result is a set of tables stored in special files collectively called the System
catalog (data dictionary, data directory).
• Metadata (data about data, data description, data definitions).
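The link between DDL and the system catalog can be seen in SQLite, where table definitions are recorded in the sqlite_master catalog table. This is a minimal sketch; the staff table and its constraints are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: name an entity, its attributes, and some integrity constraints.
conn.execute("""
    CREATE TABLE staff (
        staff_no TEXT PRIMARY KEY,
        name     TEXT NOT NULL,
        salary   REAL CHECK (salary >= 0)
    )
""")

# The catalog holds metadata: which tables exist ...
catalog_tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# ... and which columns and declared types each table has.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(staff)")]

print(catalog_tables)
print(columns)
```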
25. Database Languages
• Data Manipulation Language
• Provides basic data manipulation operations on data held in the database.
• Procedural DML
• Non-Procedural DML
26. Database Languages
• Data Manipulation Language
• Procedural DML allows the user to tell the system exactly how to manipulate data.
• Operate on records individually.
• Typically, embedded in a high level language.
• More work is done by user (programmer).
• Network or hierarchical DMLs.
27. Database Languages
• Data Manipulation Language
• Non-Procedural DML allows the user to state what data is needed rather than
how it is to be retrieved.
• Operates on sets of records.
• Relational DBMSs usually include a non-procedural language, e.g. SQL or QBE.
• Easier to understand and learn than procedural DML.
• More work is done by DBMS than user.
• Also called declarative languages.
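The contrast between the two DML styles can be sketched as follows. The record-at-a-time loop only imitates the procedural style (it is not a real network or hierarchical DML), and the table and function names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, branch TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("S1", "B1", 30000.0), ("S2", "B1", 12000.0), ("S3", "B2", 18000.0)],
)


def total_procedural(conn, branch):
    """Procedural style: the programmer says HOW, one record at a time."""
    total = 0.0
    cursor = conn.execute("SELECT branch, salary FROM staff")
    row = cursor.fetchone()
    while row is not None:
        if row[0] == branch:
            total += row[1]
        row = cursor.fetchone()
    return total


def total_declarative(conn, branch):
    """Non-procedural style: state WHAT is needed; the DBMS decides how."""
    query = "SELECT COALESCE(SUM(salary), 0.0) FROM staff WHERE branch = ?"
    return conn.execute(query, (branch,)).fetchone()[0]


print(total_procedural(conn, "B1"), total_declarative(conn, "B1"))
```

Both functions give the same total, but in the second one the filtering and summing are left to the DBMS.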
28. Database Languages
• Fourth Generation Languages
• No clear agreement on what constitutes a fourth generation language.
• Forms generators
• Report generators
• Graphics generators
• Application generators
• Examples: SQL and QBE
29. Data Models
• A set of concepts to describe the structure of a database, the
operations for manipulating these structures, and certain constraints.
• Represents the organization itself.
• To represent data in an understandable way.
• Should provide the basic concepts and notations that will allow
database designers and end-users to accurately communicate their
understanding of the organizational data.
30. Data Models
• A data model comprises:
• A structural part
• Consisting of a set of rules according to which
databases can be constructed.
• A manipulative part
• Operations that are allowed on the data.
• Possibly a set of integrity rules
• Which ensures that the data is accurate.
[Figure: a data model describes structure, operations, and constraints.]
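The three parts of a data model can be seen together in a small relational example: a table definition (structural part), an INSERT (manipulative part), and a CHECK constraint (integrity rule). This is a sketch only; the table and the rule that salaries cannot be negative are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Structural part: how the data is organized, plus integrity rules.
conn.execute("""
    CREATE TABLE staff (
        staff_no TEXT PRIMARY KEY,
        name     TEXT NOT NULL,
        salary   REAL CHECK (salary >= 0)
    )
""")

# Manipulative part: an allowed operation on the data.
conn.execute("INSERT INTO staff VALUES ('S1', 'Ann', 30000)")

# Integrity rules: an operation that would make the data inaccurate is refused.
try:
    conn.execute("INSERT INTO staff VALUES ('S2', 'Bob', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
print(rejected, row_count)
```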
31. Data Models
• ANSI-SPARC architecture related models
• External data model (Universe of Discourse)
• Represent each user’s view of the organization.
• Conceptual data model (DBMS independent)
• Represent the community view that is DBMS independent.
• Internal data model
• Represent the conceptual schema in such a way that it can be understood by the DBMS.
32. Data Models
• Categories of data models include:
• Object-based
• Entity-Relationship
• Semantic
• Functional
• Object-Oriented
• Record-based
• Relational Data Model
• Network Data Model
• Hierarchical Data Model
• Physical
Relational Systems adopt a declarative
approach to database processing (that is,
they specify what data is to be retrieved).
Network and hierarchical systems adopt a
navigational approach (that is, they
specify how the data is to be retrieved).
36. Conceptual Modeling
• Conceptual modeling is the process of developing a model of information
use in an enterprise that is independent of implementation details.
• Should be complete and accurate representation of an organization’s data
requirements.
• Conceptual schema is the core of a system supporting all user views.
• Conceptual vs. logical data model
37. Summary
• The Three-Level ANSI-SPARC Architecture
• Database Languages
• Data Models and Conceptual Modeling
An early proposal for a standard terminology and general architecture for database systems was produced in 1971 by the DBTG (Data Base Task Group) appointed by the Conference on Data Systems Languages (CODASYL, 1971).
The DBTG recognized the need for a two-level approach with a system view called the schema and user views called subschemas.
Entities, attributes, and relationships that are of interest to the user.
The conceptual level supports each external view, in that any data available to a user must be contained in, or derivable from, the conceptual level. However, this level must not contain any storage-dependent details.
Description of an entity should contain only datatypes of attributes and their lengths but not any storage consideration such as the number of bytes occupied.
Below the internal level there is a physical level that may be managed by the operating system under the direction of the DBMS.
The physical level below the DBMS consists of items only the operating system knows, such as exactly how the sequencing is implemented and whether the fields of internal records are stored as contiguous bytes on the disk.
Mapping – charting
Clearly, the users for whom the changes have been made need to be aware of them, but what is important is that other users should not be.
Unambiguous – unmistakable
We mentioned earlier that a schema is written using a data definition language. In fact, it is written in the data definition language of a particular DBMS. Unfortunately, this type of language is too low level to describe the data requirements of an organization in a way that is readily understandable by a variety of users. What we require is a higher-level description of the schema: that is, a data model.
The purpose of a data model is to represent data and to make the data understandable. If it does this, then it can be easily used to design a database.
Provide concepts that are close to the way many users identify data.
A logical data model describes the data in as much detail as possible, in a form that can be implemented in a specified DBMS.
Provide concepts that describe the details of how data is stored in the computer.
Physical data model represents how the model will be built in the database.
Object-based and record-based data models describe data at the conceptual and external levels.
Physical model describes data at the internal level.
Object-based data models use concepts such as entities, attributes, and relationships.
The object-oriented data model extends the definition of an entity to include not only the attributes that describe the state of the object but also the actions that are associated with the object, that is, its behavior. The object is said to encapsulate both state and behavior.
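Encapsulation of state and behavior can be sketched with a small class. The class Staff, its attributes, and the give_raise method are hypothetical and serve only to illustrate the idea.

```python
class Staff:
    """Hypothetical object: attributes hold the state, methods define the behavior."""

    def __init__(self, staff_no, name, salary):
        self.staff_no = staff_no
        self.name = name
        self._salary = salary  # state kept behind the object's interface

    @property
    def salary(self):
        return self._salary

    def give_raise(self, percent):
        """Behavior associated with the object: change salary in a controlled way."""
        if percent < 0:
            raise ValueError("percent must be non-negative")
        self._salary = round(self._salary * (1 + percent / 100), 2)


ann = Staff("S1", "Ann", 30000.0)
ann.give_raise(10)
print(ann.salary)
```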
In a record-based model, the database consists of a number of fixed-format records possibly of differing types. Each record type defines a fixed number of fields, each typically of a fixed length.
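A fixed-format record can be sketched with Python's struct module. The record layout below (5-byte staff number, 20-byte name, 4-byte integer salary) is an assumption made for the example, not a layout taken from the lecture.

```python
import struct

# Hypothetical record type: a fixed number of fields, each of fixed length.
# "<" means little-endian with no padding: 5 + 20 + 4 = 29 bytes per record.
STAFF_RECORD = struct.Struct("<5s20si")


def pack_staff(staff_no, name, salary):
    """Build one stored record; short strings are padded with zero bytes."""
    return STAFF_RECORD.pack(staff_no.encode("ascii"), name.encode("ascii"), salary)


def unpack_staff(raw):
    """Read the fields back and strip the padding."""
    staff_no, name, salary = STAFF_RECORD.unpack(raw)
    return (
        staff_no.rstrip(b"\x00").decode("ascii"),
        name.rstrip(b"\x00").decode("ascii"),
        salary,
    )


record_a = pack_staff("S1", "Ann", 30000)
record_b = pack_staff("S2", "Christopher", 12000)

print(len(record_a), len(record_b))
print(unpack_staff(record_b))
```

Every record occupies the same number of bytes regardless of how long the name actually is.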
Physical data models describe how data is stored in the computer, representing information such as record structures, record orderings, and access paths. There are not as many physical data models as logical data models, the most common ones being the unifying model and the frame memory.
The relational data model is based on the concept of mathematical relations. In the relational model, data and relationships are represented as tables, each of which has a number of columns with a unique name.
In the network model, data is represented as collections of records, and relationships are represented by sets. Compared with the relational model, relationships are explicitly modeled by the sets, which become pointers in the implementation. The records are organized as generalized graph structures with records appearing as nodes (also called segments) and sets as edges in the graph. Figure 2.5 illustrates an instance of a network schema for the same data set presented in Figure 2.4. The most popular network DBMS is Computer Associates’ IDMS/R. We discuss the network data model in more detail on the Web site for this book (see Preface for the URL).
The hierarchical model is a restricted type of network model. Again, data is represented as collections of records and relationships are represented by sets. However, the hierarchical model allows a node to have only one parent. A hierarchical model can be represented as a tree graph, with records appearing as nodes (also called segments) and sets as edges.
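The one-parent restriction can be checked with a few lines of code. In this sketch a schema is given as (owner, member) pairs; the record type names and the function is_hierarchical are assumptions. The function tests only the one-parent rule and does not check for cycles.

```python
from collections import defaultdict


def is_hierarchical(edges):
    """Return True if no record type has more than one parent (owner)."""
    parents = defaultdict(set)
    for owner, member in edges:
        parents[member].add(owner)
    return all(len(owners) <= 1 for owners in parents.values())


# Network-style schema: Property is a member of two sets, so it has two parents.
network_edges = [("Branch", "Staff"), ("Branch", "Property"), ("Staff", "Property")]

# Hierarchical schema: every record type has at most one parent.
hierarchy_edges = [("Branch", "Staff"), ("Staff", "Property")]

print(is_hierarchical(network_edges), is_hierarchical(hierarchy_edges))
```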
The conceptual model is independent of all implementation details, whereas the logical model assumes knowledge of the underlying data model of the target DBMS.