Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Agenda
▪ What Is Data Warehousing?
▪ Data Warehousing Concepts:
▪ OLAP (On-Line Analytical Processing)
▪ Types Of OLAP Cubes
▪ Dimensions, Facts & Measures
▪ Schemas
What Is Data Warehousing?
Let’s first understand what Data Warehousing is, why it’s
needed, and what benefits it adds.
What Is A Data Warehouse?
➢ A Data Warehouse is like a relational database designed for analytical needs.
➢ It functions on the basis of OLAP (Online Analytical Processing).
➢ It is a central location where consolidated data from multiple locations (databases) is stored.
[Figure: Data Warehouse and Data Analysis & Visualization]
What Is Data Warehousing?
➢ Data Warehousing is the act of organizing & storing data in a way that makes its retrieval efficient and insightful.
➢ It is also described as the process of transforming data into information.
Data Warehousing Concepts
Now let’s understand the various concepts revolving around
Data Warehousing, such as OLAP, Dimensions, Facts & Schemas.
OLAP (Online Analytical Processing)
➢ OLAP is a flexible way to perform complex analysis of multidimensional data.
➢ A Data Warehouse (DWH) is modeled on the concept of OLAP. Databases (DBs) are modeled on the concept of OLTP (Online Transaction Processing).
➢ OLTP systems use data stored in the form of two-dimensional tables, with rows and columns.
[Figure: OLTP and OLAP]
Advantages Of OLAP Over OLTP
1. Opens up new ways of looking at data.
2. Supports filtering/sorting of data.
3. Data can be refined.
Types Of OLAP Cubes
1. MOLAP
MOLAP (Multidimensional OLAP) is a form of OLAP that processes data and stores it directly in a
multidimensional database.
Advantage:- Excellent performance; can perform complex calculations.
Disadvantage:- Only a limited amount of data can be handled.
2. ROLAP
ROLAP (Relational OLAP) is a form of OLAP that performs dynamic multidimensional analysis of data
stored in a relational database rather than in a multidimensional database.
Advantage:- A greater amount of data can be processed.
Disadvantage:- Requires more processing time/disk space.
3. HOLAP
HOLAP (Hybrid OLAP) combines the advantages of MOLAP and ROLAP.
Advantage:- HOLAP can "drill through" from the cube into the underlying relational data.
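The trade-off between MOLAP and ROLAP can be sketched in a few lines. This is only an illustration under assumptions: the sales rows, the `cube` dictionary, and the `rolap_units_for_city` helper are hypothetical, and a real OLAP server is far more involved. The MOLAP-style part aggregates once into cube cells and answers by lookup; the ROLAP-style part keeps the rows in a relational table and aggregates with SQL at query time.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sales rows: (city, quarter, units sold).
ROWS = [("Pune", "Q1", 10), ("Pune", "Q2", 15), ("Delhi", "Q1", 7), ("Delhi", "Q2", 8)]

# MOLAP-style: aggregate once up front into cube cells, then answer by lookup.
# "ALL" marks a dimension that has been summed away.
cube = defaultdict(int)
for city, quarter, units in ROWS:
    cube[(city, quarter)] += units
    cube[(city, "ALL")] += units
    cube[("ALL", quarter)] += units
    cube[("ALL", "ALL")] += units

# ROLAP-style: keep the rows relational and aggregate with SQL at query time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, quarter TEXT, units INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", ROWS)


def rolap_units_for_city(city):
    """Compute the total for one city on demand from the relational table."""
    row = conn.execute(
        "SELECT SUM(units) FROM sales WHERE city = ?", (city,)
    ).fetchone()
    return row[0]


molap_answer = cube[("Pune", "ALL")]         # precomputed cell lookup
rolap_answer = rolap_units_for_city("Pune")  # aggregated at query time
```

Both paths give the same total; they differ in when the aggregation work is done and in how much precomputed data has to be stored.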
OLAP Operations:- Roll-up
Roll-up performs aggregation on a data cube by either:
1. Climbing up a concept hierarchy for a dimension
2. Dimension reduction
The following diagram illustrates how roll-up works.
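As a minimal sketch of the two roll-up variants, assume a tiny cube keyed by (city, quarter) with units sold as the measure. The data, the `CITY_TO_COUNTRY` hierarchy, and the function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical concept hierarchy for the location dimension: city -> country.
CITY_TO_COUNTRY = {"Pune": "India", "Delhi": "India", "Austin": "USA"}

# Hypothetical cube cells keyed by (city, quarter); the measure is units sold.
CUBE = {
    ("Pune", "Q1"): 10, ("Pune", "Q2"): 15,
    ("Delhi", "Q1"): 7,
    ("Austin", "Q1"): 20, ("Austin", "Q2"): 5,
}


def roll_up_hierarchy(cube):
    """Climb the concept hierarchy: aggregate cities into countries."""
    out = defaultdict(int)
    for (city, quarter), units in cube.items():
        out[(CITY_TO_COUNTRY[city], quarter)] += units
    return dict(out)


def roll_up_drop_quarter(cube):
    """Dimension reduction: sum the quarter dimension away."""
    out = defaultdict(int)
    for (city, _quarter), units in cube.items():
        out[city] += units
    return dict(out)


by_country = roll_up_hierarchy(CUBE)
by_city = roll_up_drop_quarter(CUBE)
```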
OLAP Operations:- Drill-down
Drill-down is the reverse operation of roll-up.
It is performed by either:
1. Stepping down a concept hierarchy for a dimension
2. Introducing a new dimension.
The following diagram illustrates how drill-down works.
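Drill-down needs the finer-grained data to be available. The sketch below assumes hypothetical detail rows with a quarter -> month hierarchy and shows the same measure at the coarse level and then one step down.

```python
from collections import defaultdict

# Hypothetical detail rows: (quarter, month, units sold).
DETAIL = [("Q1", "Jan", 4), ("Q1", "Feb", 6), ("Q1", "Mar", 5), ("Q2", "Apr", 9)]


def aggregate(rows, level):
    """Sum units at the 'quarter' level or at the finer 'month' level."""
    out = defaultdict(int)
    for quarter, month, units in rows:
        key = quarter if level == "quarter" else (quarter, month)
        out[key] += units
    return dict(out)


by_quarter = aggregate(DETAIL, "quarter")  # coarse view
by_month = aggregate(DETAIL, "month")      # drill-down: step down to months
```

Summing the monthly cells of a quarter gives back that quarter's total, which is why drill-down is the reverse of roll-up.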
OLAP Operations:- Slice
The slice operation selects a single value on one particular
dimension of a given cube and provides a new sub-cube.
Consider the following diagram that shows how slice works.
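A slice can be sketched as fixing one dimension to a single value and dropping that dimension from the result. The cube contents, the `DIMS` tuple, and `slice_cube` are illustrative assumptions.

```python
# Hypothetical cube keyed by (city, quarter, product); the measure is units sold.
DIMS = ("city", "quarter", "product")
CUBE = {
    ("Pune", "Q1", "TV"): 3, ("Pune", "Q2", "TV"): 4,
    ("Delhi", "Q1", "AC"): 2, ("Delhi", "Q1", "TV"): 6,
}


def slice_cube(cube, dim, value):
    """Keep cells where `dim` equals `value`, then drop that dimension."""
    i = DIMS.index(dim)
    return {
        key[:i] + key[i + 1:]: units
        for key, units in cube.items()
        if key[i] == value
    }


q1 = slice_cube(CUBE, "quarter", "Q1")  # sub-cube keyed by (city, product)
```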
OLAP Operations:- Dice
The dice operation selects values on two or more dimensions
of a given cube and provides a new sub-cube.
Consider the following diagram that shows the dice operation.
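A dice can be sketched as a filter on several dimensions at once. As before, the cube, the `DIMS` tuple, and `dice_cube` are hypothetical names used only for illustration.

```python
# Hypothetical cube keyed by (city, quarter, product); the measure is units sold.
DIMS = ("city", "quarter", "product")
CUBE = {
    ("Pune", "Q1", "TV"): 3, ("Pune", "Q2", "TV"): 4,
    ("Delhi", "Q1", "AC"): 2, ("Delhi", "Q1", "TV"): 6,
}


def dice_cube(cube, criteria):
    """Keep cells whose value on every listed dimension is in the allowed set."""
    allowed_by_index = {DIMS.index(dim): allowed for dim, allowed in criteria.items()}
    return {
        key: units
        for key, units in cube.items()
        if all(key[i] in allowed for i, allowed in allowed_by_index.items())
    }


sub_cube = dice_cube(CUBE, {"quarter": {"Q1"}, "product": {"TV"}})
```

Unlike the slice, the result keeps all three dimensions; it is simply a smaller cube.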
OLAP Operations:- Pivot
The pivot operation is also known as the rotation operation.
It transposes the axes to provide an alternative
presentation of the data.
Consider the following diagram that shows the pivot operation.
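For a two-dimensional view, pivoting amounts to swapping rows and columns. A minimal sketch, assuming a made-up view with cities as rows and quarters as columns:

```python
# Hypothetical 2-D view: rows are cities, columns are quarters.
VIEW = {
    "Pune": {"Q1": 10, "Q2": 15},
    "Delhi": {"Q1": 7, "Q2": 8},
}


def pivot(view):
    """Rotate the view so that the column keys become the row keys."""
    rotated = {}
    for row_key, columns in view.items():
        for col_key, value in columns.items():
            rotated.setdefault(col_key, {})[row_key] = value
    return rotated


by_quarter = pivot(VIEW)  # rows are now quarters, columns are cities
```

The values are unchanged; only the presentation is rotated, and pivoting twice returns the original view.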
Dimensions
➢ The tables that describe the dimensions involved are called Dimension tables.
➢ Dividing a Data Warehouse project into dimensions provides structured information for analysis & reporting.
Subject: E-commerce Company
Dimensions and their attributes:
Customer: ID, Name, Address
Product: ID, Name, Type
Date: Order date, Shipment date, Delivery date
Dimensions
➢ End users run queries against these dimension tables, which contain descriptive information.
E-commerce Company: query result
Customer              Product            Date
ID  Name  Address     ID   Name  Type    Order date  Shipment date  Delivery date
1   Rita  ABC         001  CD    1A      1/06/14     3/06/14        5/06/14
2   John  XYZ         002  AC    2B      6/06/14     9/06/14        11/06/14
3   Paul  PQR         003  TV    3C      10/06/14    14/06/14       16/06/14
Facts & Measures
➢ A fact is a measure that can be summed, averaged or manipulated.
➢ A Fact table contains 2 kinds of data – a dimension key and a measure.
➢ Every Dimension table is linked to a Fact table.
Dimension: Product
Fact Table:
Product_ID (dimension key)
Number of units sold (measure)
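The link between a dimension key and a measure can be sketched with an in-memory SQLite database. The table names and the units-sold figures are made up; only the product IDs and names come from the example rows shown earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE sales_fact (
    product_id TEXT REFERENCES product(product_id),  -- dimension key
    units_sold INTEGER                               -- measure
);
INSERT INTO product VALUES ('001', 'CD'), ('002', 'AC'), ('003', 'TV');
-- Hypothetical sales figures, for illustration only.
INSERT INTO sales_fact VALUES ('001', 5), ('001', 3), ('003', 4);
""")

# The measure is summed per product; the dimension supplies the readable name.
totals = conn.execute("""
    SELECT p.name, SUM(f.units_sold)
    FROM sales_fact AS f
    JOIN product AS p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
```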
Schemas
➢ A schema gives the logical description of the entire database.
➢ It gives details about the constraints placed on the tables, the key values present & how the key values are linked
between the different tables.
➢ A database uses the relational model, while a data warehouse uses the Star, Snowflake or Fact Constellation schema.
Employee
ID    First Name  Last Name  Age  Dept_ID
1234  Rita        Joe        25   0674
4321  John        Smith      35   0825
5678  Paul        Brady      45   0752
7890  Rose        Michael    65   0825
Department
Dept_ID  Dept_Name
0674     Sales
0752     HR
0825     Production
The two tables are linked through Dept_ID.
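The Employee and Department tables above can be written down as a small relational schema. This is a sketch using SQLite; the column types and the choice to enforce the link as a foreign key are assumptions, not something the slide specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE department (
    dept_id   TEXT PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE employee (
    id         INTEGER PRIMARY KEY,
    first_name TEXT,
    last_name  TEXT,
    age        INTEGER,
    dept_id    TEXT NOT NULL REFERENCES department(dept_id)  -- the link
);
INSERT INTO department VALUES ('0674', 'Sales'), ('0752', 'HR'), ('0825', 'Production');
INSERT INTO employee VALUES
    (1234, 'Rita', 'Joe', 25, '0674'),
    (4321, 'John', 'Smith', 35, '0825'),
    (5678, 'Paul', 'Brady', 45, '0752'),
    (7890, 'Rose', 'Michael', 65, '0825');
""")

# Follow the key from employee to department.
production = [row[0] for row in conn.execute("""
    SELECT e.first_name
    FROM employee AS e
    JOIN department AS d ON d.dept_id = e.dept_id
    WHERE d.dept_name = 'Production'
    ORDER BY e.id
""")]

# The constraint rejects an employee whose department does not exist.
try:
    conn.execute("INSERT INTO employee VALUES (1, 'No', 'Dept', 30, '9999')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```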
Types Of Schemas:- Star Schema
➢ Each dimension in a star schema is represented by a single dimension table, which contains a set of attributes.
➢ The Fact table is at the center. It contains keys to every dimension table & measures such as units sold and revenue.
Revenue (Fact Table): Dealer_ID, Model_ID, Branch_ID, Date_ID, Units_Sold, Revenue
Dealer (Dimension Table): Dealer_ID, Location_ID, Country_ID, Dealer_NM, Dealer_CNTCT
Branch Dim (Dimension Table): Branch_ID, Name, Address, Country
Date Dim (Dimension Table): Date_ID, Year, Month, Quarter, Date
Product (Dimension Table): Product_ID, Product_Name, Model_ID, Variant_ID
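A cut-down version of this star schema can be sketched in SQLite. Only two of the four dimensions are included to keep it short, and the dealer names, dates, and figures are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one table per dimension.
CREATE TABLE dealer   (Dealer_ID INTEGER PRIMARY KEY, Dealer_NM TEXT);
CREATE TABLE date_dim (Date_ID INTEGER PRIMARY KEY, Year INTEGER, Quarter TEXT);

-- Fact table at the center: dimension keys plus the measures.
CREATE TABLE revenue (
    Dealer_ID  INTEGER REFERENCES dealer(Dealer_ID),
    Date_ID    INTEGER REFERENCES date_dim(Date_ID),
    Units_Sold INTEGER,
    Revenue    REAL
);

-- Hypothetical rows.
INSERT INTO dealer   VALUES (1, 'North Motors'), (2, 'South Motors');
INSERT INTO date_dim VALUES (10, 2016, 'Q4'), (11, 2017, 'Q1');
INSERT INTO revenue  VALUES (1, 10, 2, 200.0), (1, 11, 3, 300.0), (2, 11, 1, 100.0);
""")

# A typical star query: join the fact table to each dimension, then aggregate.
rows = conn.execute("""
    SELECT d.Dealer_NM, t.Year, SUM(r.Units_Sold), SUM(r.Revenue)
    FROM revenue AS r
    JOIN dealer   AS d ON d.Dealer_ID = r.Dealer_ID
    JOIN date_dim AS t ON t.Date_ID = r.Date_ID
    GROUP BY d.Dealer_NM, t.Year
    ORDER BY d.Dealer_NM, t.Year
""").fetchall()
```

Every dimension is exactly one join away from the fact table, which is what gives the schema its star shape.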
Types Of Schemas:- Snowflake Schema
➢ Dimension tables in the Snowflake schema are normalized (split into additional tables).
➢ The Dealer dimension table is split off into Location & Country tables. The Product dimension table is split into Product & Variant.
Revenue (Fact Table): Dealer_ID, Model_ID, Branch_ID, Date_ID, Units_Sold, Revenue
Dealer (Dimension Table): Dealer_ID, Location_ID, Country_ID, Dealer_NM, Dealer_CNTCT
Branch Dim (Dimension Table): Branch_ID, Name, Address, Country
Date Dim (Dimension Table): Date_ID, Year, Month, Quarter, Date
Product (Dimension Table): Product_ID, Product_Name, Model_ID, Variant_ID
Location (Dimension Table): Location_ID, Region
Country (Dimension Table): Country_ID, Country_Name
Variant (Dimension Table): Variant_ID, Variant_Name, Fuel type
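The effect of normalizing a dimension can be sketched by moving the country out of the Dealer table into its own table. The rows are invented, and only the Dealer -> Country branch of the snowflake is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized dimension: the country lives in its own table.
CREATE TABLE country (Country_ID INTEGER PRIMARY KEY, Country_Name TEXT);
CREATE TABLE dealer (
    Dealer_ID  INTEGER PRIMARY KEY,
    Dealer_NM  TEXT,
    Country_ID INTEGER REFERENCES country(Country_ID)
);
CREATE TABLE revenue (
    Dealer_ID  INTEGER REFERENCES dealer(Dealer_ID),
    Units_Sold INTEGER
);

-- Hypothetical rows.
INSERT INTO country VALUES (1, 'India'), (2, 'USA');
INSERT INTO dealer  VALUES (1, 'North Motors', 1), (2, 'South Motors', 1), (3, 'West Motors', 2);
INSERT INTO revenue VALUES (1, 2), (2, 3), (3, 4), (3, 1);
""")

# Reaching the country name now takes two joins: fact -> dealer -> country.
by_country = conn.execute("""
    SELECT c.Country_Name, SUM(r.Units_Sold)
    FROM revenue AS r
    JOIN dealer  AS d ON d.Dealer_ID = r.Dealer_ID
    JOIN country AS c ON c.Country_ID = d.Country_ID
    GROUP BY c.Country_Name
    ORDER BY c.Country_Name
""").fetchall()
```

Each country name is stored once instead of being repeated per dealer, at the cost of an extra join in queries.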
Types Of Schemas:- Galaxy Schema
➢ Also known as the Fact Constellation schema. It contains more than one Fact table.
➢ Below, there are two fact tables: Revenue and Product.
➢ Dimensions that are shared between fact tables are called Conformed Dimensions.
Revenue (Fact Table): Dealer_ID, Branch_ID, Date_ID, Units_Sold, Revenue
Dealer (Dimension Table): Dealer_ID, Location_ID, Country_ID, Dealer_NM, Dealer_CNTCT
Branch Dim (Dimension Table): Branch_ID, Name, Address, Country
Date Dim (Dimension Table): Date_ID, Year, Month, Quarter, Date
Product (Dimension Table): Product_ID, Product_Name, Model_ID, Variant_ID
Product (Fact Table): Product_ID, Product_Name, Variant_ID
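The idea of two fact tables sharing a conformed dimension can be sketched as follows. Note the assumption: the second fact table here, `stock`, is a hypothetical stand-in with its own measure, not the Product fact table from the slide, and all rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Shared (conformed) dimension.
CREATE TABLE dealer (Dealer_ID INTEGER PRIMARY KEY, Dealer_NM TEXT);

-- Two fact tables that both reference the same dimension.
CREATE TABLE revenue (Dealer_ID INTEGER REFERENCES dealer(Dealer_ID), Units_Sold INTEGER);
CREATE TABLE stock   (Dealer_ID INTEGER REFERENCES dealer(Dealer_ID), Units_In_Stock INTEGER);

-- Hypothetical rows.
INSERT INTO dealer  VALUES (1, 'North Motors'), (2, 'South Motors');
INSERT INTO revenue VALUES (1, 2), (1, 3), (2, 1);
INSERT INTO stock   VALUES (1, 10), (2, 4), (2, 6);
""")

# Aggregate each fact table separately, then line the results up by dealer.
# Joining the two fact tables directly would multiply rows and inflate the sums.
report = conn.execute("""
    SELECT d.Dealer_NM,
           (SELECT COALESCE(SUM(r.Units_Sold), 0)
              FROM revenue AS r WHERE r.Dealer_ID = d.Dealer_ID),
           (SELECT COALESCE(SUM(s.Units_In_Stock), 0)
              FROM stock AS s WHERE s.Dealer_ID = d.Dealer_ID)
    FROM dealer AS d
    ORDER BY d.Dealer_NM
""").fetchall()
```

Because both fact tables use the same dealer keys, their measures can be compared side by side for each dealer.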
Session In A Minute
What Is Data Warehousing?
Dimensions, Facts & Measures
OLAP
Schemas
Data Warehouse Concepts | Data Warehouse Tutorial | Data Warehousing | Edureka
