Denormalization

‫الرحيم‬‫الرحمن‬‫هللا‬ ‫بسم‬
By:-
1- Amna Magzoub
2- Selma Yahyia
3- Ala Eltayeb
Denormalization
Objectives:-
 Introduction
 Definition
Why and when to denormalize data
Method of denormalization
Manage denormalization data
Advantages and disadvantages of
denormalization
 References
Introduction
• Result of normalization is a design that is
structurally consistent with minimal
redundancy.
• However, sometimes a normalized database
does not provide maximum processing
efficiency.
• May be necessary to accept loss of some
benefits of a fully normalized design in favor
of performance.
Definition :-
Denormalization is a process of combine two
relation into one new relation.
 Denormalization is the process of taking a
normalized database and modifying table
structures to allow controlled redundancy for
increased database performance.
Cont..
 The argument in favor of denormalization is
basically that it makes retrievals esaier to
express and makes the perform better.
It sometimes claimed to make the database
easier to understand.
When and why to denormailize
Some issues need to be considered before
denormalization:
1- Is the system’s performance unacceptable with fully
normalized data? Meet a client and do some testing.
2- If the performance is unacceptable, will
denormalizing make it acceptable?
3- If you denormalize to clear those bottlenecks, will the
system and its data still be reliable?
Cont…
 Speed up retrievals.
 A strict performance is required.
 It is not heavily updated.
- So, denormalize only when there is a very clear
advantage to doing.
Balancing denormalization issues
Method of denormalization:-
1) Adding Redundant Columns.
2) Adding Derived Columns
3) Combining Tables
4) Repeating Groups
5) Creating extract tables
6) Partitioning Relations
Adding Redundant Columns
• You can add redundant columns to eliminate
frequent joins. For example, if frequent joins
are performed on the titleauthor and authors
tables in order to retrieve the author's last
name, you can add the au_lname column to
titleauthor.
Denormalization
• Adding redundant columns eliminates joins for
many queries. The problems with this solution
are that it:
• Requires maintenance of new column. All
changes must be made to two tables, and
possibly to many rows in one of the tables.
• Requires more disk space, since au_lname is
duplicated.
Adding Derived Columns
• Adding derived columns can help eliminate
joins and reduce the time needed to produce
aggregate values.
• The example shows both benefits. Frequent
joins are needed between the titleauthor and
titles tables to provide the total advance for a
particular book title.
Denormalization
• You can create and maintain a derived data
column in the titles table, eliminating both the
join and the aggregate at run time. This
increases storage needs, and requires
maintenance of the derived column whenever
changes are made to the titles table.
Combining Tables
• If most users need to see the full set of joined data
from two tables, collapsing the two tables into one
can improve performance by eliminating the join.
• For example, users frequently need to see the
author name, author ID, and the blurbs copy data
at the same time. The solution is to collapse the
two tables into one. The data from the two tables
must be in a one-to-one relationship to collapse
tables.
Denormalization
More examples:-
Repeating Groups
These repeating groups can be stored as a nested
table within the original table.
example
Denormalization
Creating extract tables
• Reports can access derived data and perform
multi-relation joins on same set of base
relations. However, data the report is based on
may be relatively static or may not have to be
current.
• Possible to create a single, highly
denormalized extract table based on relations
required by reports, and allow users to access
extract table directly instead of base relations.
Partitioning Relations
• Rather than combining relations together,
alternative approach is to decompose them into
a number of smaller and more manageable
partitions.
• Two main types of partitioning:-
horizontal and vertical.
Horizontal :-
Distributing the tuples of relation across a
number of (smaller) partitioning relation.
Vertical :-
Distributing the attributes of a relation a cross a
number of (smaller) partitioning relation(the
primary key duplicated to allow the original
relation to be reconstucted.
Denormalization
Horizanteial:-
•Separate data into partitions so that queries
do not need to examine all data in a table
when WHERE clause filters specify only a
subset of the partitions.
•Horizontal splitting can also be more secure
since file level of security can be used to
prohibit users from seeing certain rows of
data
The example shows how the authors table might
be split to separate active and inactive authors:
Vertical Splitting:
• Vertical splitting can be used when some
columns are rarely accessed rather than other
columns
The example shows how the authors table
can be partitioned.
Vertical Examples
Managing Denormalized Data
Whatever denormalization techniques you use,
you need to develop management techniques
to ensure data integrity. Choices include:
• Triggers, which can update derived or
duplicated data anytime the base data changes.
Denormalization
• Application logic, using transactions in each
application that updates denormalized data to be
sure that changes are atomic.
Cont…
• Batch reconciliation, run at appropriate
intervals to bring the denormalized data back
into agreement.
• If 100-percent consistency is not required at all
times, you can run a batch job or stored
procedure during off hours to reconcile
duplicate or derived data.
• You can run short, frequent batches or longer,
less frequent batches.
Cont…
Advantages vs. disadvantages
Advantages:-
• Precomputing derived data
• Minimizing the need for joins
• Reducing the number of foreign keys in
relations
• Reducing the number of relations.
disadvantages :-
• May speed up retrievals but can slow down
updates.
• Always application-specific and needs to be re-
evaluated in the application changes.
• Can increase the size of relations.
• May simplify implementation in some cases but
may make it more complex in other.
• reduce flexibility.
Summary
 Denormalization aids the process of adding
redundancy to the database to improve performance .
 Denormalize can be done with tables or columns.
 Require aknowledge of how data is being used .
 There are costs of denormalization reduces the
“integrity” of the design ,always slow DML (data
manipulation language) , need more memory space
to store redundant data and required additional
programming to maintain the denormalized data.
References
 Database design & Relational theory, C.J.Date
 Database system .8th edition.
 Data Normalization, Denormalization,and the Forces of Darkness
a white paper by Melissa Hollingsworth.
 www.icard.ru/~nail/sybase/perf/10.88.html
Q & A
Thanks you
1 de 41

Recomendados

Data modelsData models
Data modelsUsman Tariq
14.2K visualizações69 slides
NOSQL- Presentation on NoSQLNOSQL- Presentation on NoSQL
NOSQL- Presentation on NoSQLRamakant Soni
28.2K visualizações18 slides
Deductive databasesDeductive databases
Deductive databasesDabbal Singh Mahara
6K visualizações30 slides
Object oriented databasesObject oriented databases
Object oriented databasesSajith Ekanayaka
9.1K visualizações13 slides
Data Modeling PPTData Modeling PPT
Data Modeling PPTTrinath Raavi
80.9K visualizações16 slides

Mais conteúdo relacionado

Mais procurados

Types of Database ModelsTypes of Database Models
Types of Database ModelsMurassa Gillani
17.1K visualizações46 slides
Oltp vs olapOltp vs olap
Oltp vs olapMr. Fmhyudin
61.6K visualizações19 slides
Data Engineering BasicsData Engineering Basics
Data Engineering BasicsCatherine Kimani
5.6K visualizações25 slides
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modelingvivekjv
101.3K visualizações87 slides

Mais procurados(20)

Data warehouse 21 snowflake schemaData warehouse 21 snowflake schema
Data warehouse 21 snowflake schema
Vaibhav Khanna443 visualizações
Types of Database ModelsTypes of Database Models
Types of Database Models
Murassa Gillani17.1K visualizações
Oltp vs olapOltp vs olap
Oltp vs olap
Mr. Fmhyudin61.6K visualizações
Data Engineering BasicsData Engineering Basics
Data Engineering Basics
Catherine Kimani5.6K visualizações
Business intelligence and data warehousesBusiness intelligence and data warehouses
Business intelligence and data warehouses
Dhani Ahmad3K visualizações
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modeling
vivekjv101.3K visualizações
Introduction to Oracle DatabaseIntroduction to Oracle Database
Introduction to Oracle Database
puja_dhar21.2K visualizações
What Is DATA MINING(INTRODUCTION)What Is DATA MINING(INTRODUCTION)
What Is DATA MINING(INTRODUCTION)
Pratik Tambekar3.5K visualizações
5. stored procedure and functions5. stored procedure and functions
5. stored procedure and functions
Amrit Kaur3.2K visualizações
Fundamentals of Database ppt ch01Fundamentals of Database ppt ch01
Fundamentals of Database ppt ch01
Jotham Gadot75.1K visualizações
Relational databases vs Non-relational databasesRelational databases vs Non-relational databases
Relational databases vs Non-relational databases
James Serra22.6K visualizações
DBMS PPTDBMS PPT
DBMS PPT
Prabhu Goyal6.3K visualizações
Data preprocessingData preprocessing
Data preprocessing
Jason Rodrigues58.4K visualizações
DATA WAREHOUSING AND DATA MININGDATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MINING
Lovely Professional University69.5K visualizações
MYSQL.pptMYSQL.ppt
MYSQL.ppt
webhostingguy40.1K visualizações
Data DictionaryData Dictionary
Data Dictionary
Vishal Anand6.6K visualizações
Data Mining & Data Warehousing Lecture NotesData Mining & Data Warehousing Lecture Notes
Data Mining & Data Warehousing Lecture Notes
FellowBuddy.com4.8K visualizações
Types of keys in database management system by Dr. Kamal GulatiTypes of keys in database management system by Dr. Kamal Gulati
Types of keys in database management system by Dr. Kamal Gulati
Amity University | FMS - DU | IMT | Stratford University | KKMI International Institute | AIMA | DTU10.2K visualizações

Destaque

DenormalizationDenormalization
DenormalizationSohail Haider
18.9K visualizações45 slides
Databases: DenormalisationDatabases: Denormalisation
Databases: DenormalisationDamian T. Gordon
1.3K visualizações9 slides
Hierarchical DenormalizationHierarchical Denormalization
Hierarchical DenormalizationKamaluddin Panhwar
1.9K visualizações12 slides
DBMS - NormalizationDBMS - Normalization
DBMS - NormalizationJitendra Tomar
1.1M visualizações33 slides

Destaque(20)

DenormalizationDenormalization
Denormalization
Sohail Haider18.9K visualizações
Databases: DenormalisationDatabases: Denormalisation
Databases: Denormalisation
Damian T. Gordon1.3K visualizações
Hierarchical DenormalizationHierarchical Denormalization
Hierarchical Denormalization
Kamaluddin Panhwar1.9K visualizações
Database design & Normalization (1NF, 2NF, 3NF)Database design & Normalization (1NF, 2NF, 3NF)
Database design & Normalization (1NF, 2NF, 3NF)
Jargalsaikhan Alyeksandr520.9K visualizações
DBMS - NormalizationDBMS - Normalization
DBMS - Normalization
Jitendra Tomar1.1M visualizações
Database Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NFDatabase Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
Database Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
Oum Saokosal112.2K visualizações
Data baseData base
Data base
maha yasin88 visualizações
Lecture 9Lecture 9
Lecture 9
Shani7291.2K visualizações
Lecture 03 - The Data Warehouse and Design Lecture 03 - The Data Warehouse and Design
Lecture 03 - The Data Warehouse and Design
phanleson2.9K visualizações
Sql  joinSql  join
Sql join
Vikas Gupta3.4K visualizações
Napkin folding finalNapkin folding final
Napkin folding final
Dr. Sunil Kumar5.4K visualizações
Data warehouse-dimensional-modeling-and-designData warehouse-dimensional-modeling-and-design
Data warehouse-dimensional-modeling-and-design
Sarita Kataria5.6K visualizações
RdbmsRdbms
Rdbms
Muhammad Adeel Rajput15.8K visualizações
SQL JoinsSQL Joins
SQL Joins
Paul Harkins3.6K visualizações
Compiler vs interpreterCompiler vs interpreter
Compiler vs interpreter
Paras Patel11.2K visualizações
High Level Language (HLL)High Level Language (HLL)
High Level Language (HLL)
Maliha Jahan11.3K visualizações
Relational Database DesignRelational Database Design
Relational Database Design
Archit Saxena15K visualizações
SQL JOINSQL JOIN
SQL JOIN
Ritwik Das25.8K visualizações

Similar a Denormalization

De normalozationDe normalozation
De normalozationKhuram Shahzad
105 visualizações26 slides
Cs437 lecture 7-8Cs437 lecture 7-8
Cs437 lecture 7-8Aneeb_Khawar
365 visualizações18 slides
Rise of Column Oriented DatabaseRise of Column Oriented Database
Rise of Column Oriented DatabaseSuvradeep Rudra
15.2K visualizações15 slides

Similar a Denormalization(20)

De normalozationDe normalozation
De normalozation
Khuram Shahzad105 visualizações
Cs437 lecture 7-8Cs437 lecture 7-8
Cs437 lecture 7-8
Aneeb_Khawar365 visualizações
When & Why\'s of DenormalizationWhen & Why\'s of Denormalization
When & Why\'s of Denormalization
Aliya Saldanha602 visualizações
Rise of Column Oriented DatabaseRise of Column Oriented Database
Rise of Column Oriented Database
Suvradeep Rudra15.2K visualizações
Query Optimization in SQL ServerQuery Optimization in SQL Server
Query Optimization in SQL Server
Rajesh Gunasundaram935 visualizações
Systems and methods for improving database performanceSystems and methods for improving database performance
Systems and methods for improving database performance
Eyjólfur Gislason539 visualizações
Data warehouse physical designData warehouse physical design
Data warehouse physical design
Er. Nawaraj Bhandari4.8K visualizações
very large databasevery large database
very large database
Kazem Taghandiky579 visualizações
Tableau Best Practices.pptxTableau Best Practices.pptx
Tableau Best Practices.pptx
AnitaB3317 visualizações
denormalization.pptdenormalization.ppt
denormalization.ppt
ABUSUFYAN552 visualizações
Dal deckDal deck
Dal deck
Caroline_Rose759 visualizações
Myth busters - performance tuning 102 2008Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008
paulguerin296 visualizações
PostgreSQL Table Partitioning / ShardingPostgreSQL Table Partitioning / Sharding
PostgreSQL Table Partitioning / Sharding
Amir Reza Hashemi262 visualizações
Csld phan tan va song songCsld phan tan va song song
Csld phan tan va song song
Lê Anh Trung588 visualizações
Case Study: A Multi-Source Time Variant Data warehouseCase Study: A Multi-Source Time Variant Data warehouse
Case Study: A Multi-Source Time Variant Data warehouse
tarun kumar sharma142 visualizações
Distributed Database Management SystemDistributed Database Management System
Distributed Database Management System
AAKANKSHA JAIN3.4K visualizações
Dwh   lecture 07-denormalizationDwh   lecture 07-denormalization
Dwh lecture 07-denormalization
Sulman Ahmed228 visualizações

Último(20)

231112 (WR) v1  ChatGPT OEB 2023.pdf231112 (WR) v1  ChatGPT OEB 2023.pdf
231112 (WR) v1 ChatGPT OEB 2023.pdf
WilfredRubens.com118 visualizações
ANATOMY AND PHYSIOLOGY UNIT 1 { PART-1}ANATOMY AND PHYSIOLOGY UNIT 1 { PART-1}
ANATOMY AND PHYSIOLOGY UNIT 1 { PART-1}
DR .PALLAVI PATHANIA190 visualizações
Student Voice Student Voice
Student Voice
Pooky Knightsmith125 visualizações
Class 10 English notes 23-24.pptxClass 10 English notes 23-24.pptx
Class 10 English notes 23-24.pptx
TARIQ KHAN74 visualizações
STERILITY TEST.pptxSTERILITY TEST.pptx
STERILITY TEST.pptx
Anupkumar Sharma107 visualizações
Women from Hackney’s History: Stoke Newington by Sue DoeWomen from Hackney’s History: Stoke Newington by Sue Doe
Women from Hackney’s History: Stoke Newington by Sue Doe
History of Stoke Newington117 visualizações
Universe revised.pdfUniverse revised.pdf
Universe revised.pdf
DrHafizKosar88 visualizações
Classification of crude drugs.pptxClassification of crude drugs.pptx
Classification of crude drugs.pptx
GayatriPatra1460 visualizações
Azure DevOps Pipeline setup for Mule APIs #36Azure DevOps Pipeline setup for Mule APIs #36
Azure DevOps Pipeline setup for Mule APIs #36
MysoreMuleSoftMeetup84 visualizações
ICS3211_lecture 08_2023.pdfICS3211_lecture 08_2023.pdf
ICS3211_lecture 08_2023.pdf
Vanessa Camilleri79 visualizações
AI Tools for Business and StartupsAI Tools for Business and Startups
AI Tools for Business and Startups
Svetlin Nakov74 visualizações
discussion post.pdfdiscussion post.pdf
discussion post.pdf
jessemercerail85 visualizações
Nico Baumbach IMR Media ComponentNico Baumbach IMR Media Component
Nico Baumbach IMR Media Component
InMediaRes1368 visualizações
CWP_23995_2013_17_11_2023_FINAL_ORDER.pdfCWP_23995_2013_17_11_2023_FINAL_ORDER.pdf
CWP_23995_2013_17_11_2023_FINAL_ORDER.pdf
SukhwinderSingh895865480 visualizações

Denormalization

  • 1. ‫الرحيم‬‫الرحمن‬‫هللا‬ ‫بسم‬ By:- 1- Amna Magzoub 2- Selma Yahyia 3- Ala Eltayeb Denormalization
  • 2. Objectives:-  Introduction  Definition Why and when to denormalize data Method of denormalization Manage denormalization data Advantages and disadvantages of denormalization  References
  • 3. Introduction • Result of normalization is a design that is structurally consistent with minimal redundancy. • However, sometimes a normalized database does not provide maximum processing efficiency. • May be necessary to accept loss of some benefits of a fully normalized design in favor of performance.
  • 4. Definition :- Denormalization is a process of combine two relation into one new relation.  Denormalization is the process of taking a normalized database and modifying table structures to allow controlled redundancy for increased database performance.
  • 5. Cont..  The argument in favor of denormalization is basically that it makes retrievals esaier to express and makes the perform better. It sometimes claimed to make the database easier to understand.
  • 6. When and why to denormailize Some issues need to be considered before denormalization: 1- Is the system’s performance unacceptable with fully normalized data? Meet a client and do some testing. 2- If the performance is unacceptable, will denormalizing make it acceptable? 3- If you denormalize to clear those bottlenecks, will the system and its data still be reliable?
  • 7. Cont…  Speed up retrievals.  A strict performance is required.  It is not heavily updated. - So, denormalize only when there is a very clear advantage to doing.
  • 9. Method of denormalization:- 1) Adding Redundant Columns. 2) Adding Derived Columns 3) Combining Tables 4) Repeating Groups 5) Creating extract tables 6) Partitioning Relations
  • 10. Adding Redundant Columns • You can add redundant columns to eliminate frequent joins. For example, if frequent joins are performed on the titleauthor and authors tables in order to retrieve the author's last name, you can add the au_lname column to titleauthor.
  • 12. • Adding redundant columns eliminates joins for many queries. The problems with this solution are that it: • Requires maintenance of new column. All changes must be made to two tables, and possibly to many rows in one of the tables. • Requires more disk space, since au_lname is duplicated.
  • 13. Adding Derived Columns • Adding derived columns can help eliminate joins and reduce the time needed to produce aggregate values. • The example shows both benefits. Frequent joins are needed between the titleauthor and titles tables to provide the total advance for a particular book title.
  • 15. • You can create and maintain a derived data column in the titles table, eliminating both the join and the aggregate at run time. This increases storage needs, and requires maintenance of the derived column whenever changes are made to the titles table.
  • 16. Combining Tables • If most users need to see the full set of joined data from two tables, collapsing the two tables into one can improve performance by eliminating the join. • For example, users frequently need to see the author name, author ID, and the blurbs copy data at the same time. The solution is to collapse the two tables into one. The data from the two tables must be in a one-to-one relationship to collapse tables.
  • 19. Repeating Groups These repeating groups can be stored as a nested table within the original table.
  • 22. Creating extract tables • Reports can access derived data and perform multi-relation joins on same set of base relations. However, data the report is based on may be relatively static or may not have to be current. • Possible to create a single, highly denormalized extract table based on relations required by reports, and allow users to access extract table directly instead of base relations.
  • 23. Partitioning Relations • Rather than combining relations together, alternative approach is to decompose them into a number of smaller and more manageable partitions. • Two main types of partitioning:- horizontal and vertical.
  • 24. Horizontal :- Distributing the tuples of relation across a number of (smaller) partitioning relation. Vertical :- Distributing the attributes of a relation a cross a number of (smaller) partitioning relation(the primary key duplicated to allow the original relation to be reconstucted.
  • 26. Horizanteial:- •Separate data into partitions so that queries do not need to examine all data in a table when WHERE clause filters specify only a subset of the partitions. •Horizontal splitting can also be more secure since file level of security can be used to prohibit users from seeing certain rows of data
  • 27. The example shows how the authors table might be split to separate active and inactive authors:
  • 28. Vertical Splitting: • Vertical splitting can be used when some columns are rarely accessed rather than other columns
  • 29. The example shows how the authors table can be partitioned.
  • 31. Managing Denormalized Data Whatever denormalization techniques you use, you need to develop management techniques to ensure data integrity. Choices include: • Triggers, which can update derived or duplicated data anytime the base data changes.
  • 33. • Application logic, using transactions in each application that updates denormalized data to be sure that changes are atomic.
  • 34. Cont… • Batch reconciliation, run at appropriate intervals to bring the denormalized data back into agreement. • If 100-percent consistency is not required at all times, you can run a batch job or stored procedure during off hours to reconcile duplicate or derived data. • You can run short, frequent batches or longer, less frequent batches.
  • 36. Advantages vs. disadvantages Advantages:- • Precomputing derived data • Minimizing the need for joins • Reducing the number of foreign keys in relations • Reducing the number of relations.
  • 37. disadvantages :- • May speed up retrievals but can slow down updates. • Always application-specific and needs to be re- evaluated in the application changes. • Can increase the size of relations. • May simplify implementation in some cases but may make it more complex in other. • reduce flexibility.
  • 38. Summary  Denormalization aids the process of adding redundancy to the database to improve performance .  Denormalize can be done with tables or columns.  Require aknowledge of how data is being used .  There are costs of denormalization reduces the “integrity” of the design ,always slow DML (data manipulation language) , need more memory space to store redundant data and required additional programming to maintain the denormalized data.
  • 39. References  Database design & Relational theory, C.J.Date  Database system .8th edition.  Data Normalization, Denormalization,and the Forces of Darkness a white paper by Melissa Hollingsworth.  www.icard.ru/~nail/sybase/perf/10.88.html
  • 40. Q & A