Data Warehousing
Lecture # 13
Process of Dimensional Modeling
The Process of Dimensional Modeling
Four Step Method from ER to DM
1. Choose the Business Process
2. Choose the Grain
3. Choose the Facts
4. Choose the Dimensions
Step-1: Choose the Business Process
• A business process is a major operational
process in an organization.
• Typically supported by a legacy system
(database) or an OLTP.
– Examples: Orders, Invoices, Inventory etc.
• Business processes are often termed Data Marts,
which is why many people criticize DM as being
data-mart oriented.
Step-1: Separating the Process
[Animated figure: Snow-flake, separated into Star-1 and Star-2]
Step-2: Choosing the Grain
• Grain is the fundamental, atomic level of data to be
represented.
• Grain is also termed the unit of analysis.
• Example grain statements
• Typical grains
– Individual Transactions
– Daily aggregates (snapshots)
– Monthly aggregates
• Relationship between grain and expressiveness.
• Grain vs. hardware trade-off.
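The typical grains listed above can be illustrated with a small sketch. It assumes a hypothetical list of transaction rows (date, product, amount in Rs.); the names `transactions` and `roll_up` are made up for illustration and do not come from the lecture. The same rows are summed at a daily and at a monthly grain, and the row count shrinks as the grain gets coarser.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction-grain rows: (sale_date, product, amount_rs).
transactions = [
    (date(2024, 1, 1), "soap", 120),
    (date(2024, 1, 1), "soap", 80),
    (date(2024, 1, 2), "tea", 200),
    (date(2024, 1, 3), "soap", 30),
    (date(2024, 2, 5), "soap", 50),
]


def roll_up(rows, key):
    """Sum amounts per key(sale_date, product); a coarser key gives a coarser grain."""
    totals = defaultdict(int)
    for sale_date, product, amount in rows:
        totals[key(sale_date, product)] += amount
    return dict(totals)


# Daily grain: one row per (day, product).
daily = roll_up(transactions, lambda d, p: (d, p))
# Monthly grain: one row per (year, month, product).
monthly = roll_up(transactions, lambda d, p: (d.year, d.month, p))

# Fewer rows at each coarser grain, but the grand total is unchanged.
print(len(transactions), len(daily), len(monthly))
```

Fewer rows mean less storage and cheaper queries, which is the hardware side of the trade-off; the detail given up is the expressiveness side.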
The case FOR data aggregation
• Works well for repetitive queries.
• Follows the known thought process.
• Justifiable if used by the maximum number of queries.
• Provides a “big picture” or macroscopic view.
• Application dependent, usually inflexible to
business changes (remember lack of absoluteness
of conventions).
The case AGAINST data aggregation
• Aggregation is irreversible.
– Can create monthly sales data from weekly sales data,
but the reverse is not possible.
• Aggregation limits the questions that can be
answered.
– What, when, why, where, what-else, what-next
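A minimal sketch of why aggregation is irreversible. The two weekly series below are the Zone-1 and Zone-4 rows of the example table in this lecture; the variable names are illustrative only. Both series roll up to the same monthly total, so the monthly figure alone cannot tell us which weekly pattern produced it.

```python
# Weekly sales for two zones (values from the lecture's example table).
weekly_zone_1 = [100, 100, 100, 100]
weekly_zone_4 = [200, 100, 50, 50]

# Weekly -> monthly is always possible: just add the weeks up.
monthly_zone_1 = sum(weekly_zone_1)
monthly_zone_4 = sum(weekly_zone_4)

# Monthly -> weekly is not: two different weekly series give one total.
print(monthly_zone_1, monthly_zone_4)
print(weekly_zone_1 == weekly_zone_4)
```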
The case AGAINST data aggregation
• Aggregation can hide crucial facts.
– The average of 100 & 100 is the same as the average of 150 & 50
Aggregation hides crucial facts Example

          Week-1  Week-2  Week-3  Week-4  Average
Zone-1       100     100     100     100      100
Zone-2        50     100     150     100      100
Zone-3        50     100     100     150      100
Zone-4       200     100      50      50      100
Average      100     100     100     100

Just looking at the averages, i.e. the aggregates
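The table can be checked with a short sketch. The figures are those of the table; the dictionary layout and the "spread" measure (maximum minus minimum per zone) are assumptions added here for illustration. Every zone average and every week average comes out as 100, while the spread shows how different the zones really are.

```python
from statistics import mean

# Weekly sales per zone, Week-1 to Week-4 (figures from the table).
sales = {
    "Zone-1": [100, 100, 100, 100],
    "Zone-2": [50, 100, 150, 100],
    "Zone-3": [50, 100, 100, 150],
    "Zone-4": [200, 100, 50, 50],
}

# Row averages: one per zone.
zone_avg = {zone: mean(values) for zone, values in sales.items()}
# Column averages: one per week (zip transposes rows into columns).
week_avg = [mean(column) for column in zip(*sales.values())]
# Spread per zone: the detail that the averages hide.
zone_spread = {zone: max(v) - min(v) for zone, v in sales.items()}

print(zone_avg)
print(week_avg)
print(zone_spread)
```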
Aggregation hides crucial facts chart
[Chart: sales (axis 0 to 250) for Week-1 to Week-4, series Z1 to Z4]
Z1: Sale is constant (need to work on it)
Z2: Sale went up, then fell (cause for concern)
Z3: Sale is on the rise, why?
Z4: Sale dropped sharply, need to look deeply.
W2: Static sale
Step 3: Choose Facts statement
“We need monthly sales volume and Rs. by week, product and Zone”
Facts: sales volume, Rs.
Dimensions: week, product, Zone
Step 3: Choose Facts
• Choose the facts that will populate each
fact table record.
– Remember that the best Facts are Numeric,
Continuously Valued and Additive.
– Example: Quantity Sold, Amount etc.
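A hedged sketch of what "additive" means in practice. The `SaleFact` record and its values are hypothetical; the discount percentage is included because the editor's notes mention discount as a number that is not a good fact. Quantity and amount can simply be summed, while summing percentages gives a figure that matches no real discount.

```python
from dataclasses import dataclass


@dataclass
class SaleFact:
    quantity_sold: int    # additive
    amount_rs: float      # additive (list-price amount, an assumption here)
    discount_pct: float   # NOT additive: a percentage of this row's amount


# Hypothetical fact rows.
rows = [
    SaleFact(quantity_sold=1, amount_rs=100.0, discount_pct=10.0),
    SaleFact(quantity_sold=9, amount_rs=900.0, discount_pct=0.0),
]

# Additive facts: a plain sum is meaningful at any level of roll-up.
total_qty = sum(r.quantity_sold for r in rows)
total_amount = sum(r.amount_rs for r in rows)

# Summing the percentages is meaningless.
summed_pct = sum(r.discount_pct for r in rows)

# The real overall discount must be rebuilt from additive quantities.
discount_rs = sum(r.amount_rs * r.discount_pct / 100 for r in rows)
true_pct = discount_rs / total_amount * 100

print(total_qty, total_amount, summed_pct, true_pct)
```

Storing the discount as an amount in Rs. instead of a percentage would make it additive as well; that is a design suggestion, not something the slides state.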
Step 4: Choose Dimensions
• Choose the dimensions that apply to each
fact in the fact table.
– Typical dimensions: time, product, geography
etc.
– Identify the descriptive attributes that explain
each dimension.
– Determine hierarchies within each dimension.
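As an illustration of descriptive attributes and hierarchies, here is a small sketch of a geography dimension. The keys, city names and the city-to-zone hierarchy are hypothetical, as are the fact rows; the lecture only names geography as a typical dimension. Rolling up along the hierarchy means grouping the same facts by a higher-level attribute.

```python
from collections import defaultdict

# Hypothetical geography dimension: surrogate key -> descriptive attributes.
# The hierarchy assumed here is city -> zone.
geography_dim = {
    1: {"city": "City-A", "zone": "Zone-1"},
    2: {"city": "City-B", "zone": "Zone-1"},
    3: {"city": "City-C", "zone": "Zone-2"},
}

# Hypothetical fact rows: (geography_key, amount_rs).
facts = [(1, 100), (2, 50), (3, 70), (1, 30)]


def roll_up(level):
    """Total amount_rs grouped by one attribute of the geography dimension."""
    totals = defaultdict(int)
    for geo_key, amount in facts:
        totals[geography_dim[geo_key][level]] += amount
    return dict(totals)


by_city = roll_up("city")
by_zone = roll_up("zone")

print(by_city)
print(by_zone)
```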
Step-4: How to Identify a Dimension?
• Attributes that are single-valued while a transaction is
being recorded are dimensions.
Fact Table:
  Calendar_Date      (Dim)
  Time_of_Day        (Dim)
  Account_No         (Dim)
  ATM_Location       (Dim)
  Transaction_Type   (Dim)
  Transaction_Rs     (Fact)
Time_of_Day: Morning, Mid Morning, Lunch Break etc.
Transaction_Type: Withdrawal, Deposit, Check balance etc.
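The ATM example can be sketched as a record type. The field names follow the slide; the sample rows, account numbers, branch names and the `total_rs` helper are hypothetical. Each row carries one value per dimension attribute plus the single fact, and queries filter on dimensions while summing the fact.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AtmTransaction:
    # Dimension attributes: single-valued for any one transaction.
    calendar_date: date
    time_of_day: str
    account_no: str
    atm_location: str
    transaction_type: str
    # Fact: the numeric, additive measurement.
    transaction_rs: int


# Hypothetical rows for illustration.
rows = [
    AtmTransaction(date(2024, 1, 1), "Morning", "ACC-1", "Branch-A", "Withdrawal", 5000),
    AtmTransaction(date(2024, 1, 1), "Lunch Break", "ACC-2", "Branch-A", "Deposit", 2000),
    AtmTransaction(date(2024, 1, 2), "Morning", "ACC-1", "Branch-B", "Withdrawal", 1000),
]


def total_rs(rows, **filters):
    """Sum the fact over rows whose dimension values match every filter."""
    return sum(
        r.transaction_rs
        for r in rows
        if all(getattr(r, name) == value for name, value in filters.items())
    )


print(total_rs(rows, transaction_type="Withdrawal"))
print(total_rs(rows, atm_location="Branch-A", time_of_day="Morning"))
```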
Step-4: Can Dimensions be Multi-valued?
• Are dimensions ALWAYS single-valued?
– Not really
– What are the problems, and how to handle them?
• Calendar_Date (of inspection)
• Reg_No
• Technician
• Workshop
• Maintenance_Operation
• How many maintenance operations are possible?
– Few
– Maybe more for old cars.
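The slide leaves the handling open. One possible approach, which is an assumption here and not taken from the lecture, is to keep the multi-valued attribute out of the inspection row and store one row per (inspection, operation) pair in a separate bridge list. The sample inspections are hypothetical; the operation names come from the editor's notes.

```python
from collections import Counter

# Hypothetical inspection records: Maintenance_Operation is multi-valued.
inspections = [
    {"calendar_date": "2024-01-01", "reg_no": "REG-1", "technician": "T1",
     "workshop": "W1", "operations": ["Oil filter change", "Plug change"]},
    {"calendar_date": "2024-01-02", "reg_no": "REG-2", "technician": "T2",
     "workshop": "W1", "operations": ["Oil filter change"]},
]

# Assumed handling: a bridge list with one row per (inspection, operation),
# so every column in every row is single-valued again.
bridge = [
    (inspection_id, operation)
    for inspection_id, inspection in enumerate(inspections)
    for operation in inspection["operations"]
]

# Questions about operations can now be answered by plain counting.
operation_counts = Counter(operation for _, operation in bridge)

print(bridge)
print(operation_counts)
```

The cost of this choice is that an inspection with many operations (old cars, per the slide) produces many bridge rows.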
Step-4: Dimensions & Grain
• Several grains are possible as per business
requirement.
– For some aggregations, certain descriptions do not remain
atomic.
– Example: Time_of_Day may change several times within a
daily aggregate, but not within a transaction.
• Choose the dimensions that are applicable within the
selected grain.
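A sketch of how this check could be done mechanically. The rows, the `txn_id` column and the `single_valued` helper are hypothetical. An attribute qualifies as a dimension at a given grain only if it takes exactly one value inside every group defined by that grain.

```python
from collections import defaultdict

# Hypothetical transaction rows.
rows = [
    {"txn_id": 1, "calendar_date": "2024-01-01", "time_of_day": "Morning", "atm_location": "Branch-A"},
    {"txn_id": 2, "calendar_date": "2024-01-01", "time_of_day": "Lunch Break", "atm_location": "Branch-A"},
    {"txn_id": 3, "calendar_date": "2024-01-02", "time_of_day": "Morning", "atm_location": "Branch-A"},
]


def single_valued(rows, grain, attribute):
    """True if `attribute` has exactly one value within every group of `grain`."""
    seen = defaultdict(set)
    for row in rows:
        group = tuple(row[column] for column in grain)
        seen[group].add(row[attribute])
    return all(len(values) == 1 for values in seen.values())


# At transaction grain, Time_of_Day is a valid dimension.
print(single_valued(rows, ["txn_id"], "time_of_day"))
# At daily grain it is not: 2024-01-01 has both Morning and Lunch Break.
print(single_valued(rows, ["calendar_date", "atm_location"], "time_of_day"))
```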

Editor's Notes

  1. Before going through the four steps, the difference between ER and DM should be clear. An ER model covers the whole business, which comprises a number of processes such as accounting, inventory and invoicing, including processes that do not run simultaneously. Because ER by definition covers the whole business, it covers all of these processes, and the resulting model is hard to understand. (A process may also include what should be done under exceptional conditions.) In DM we instead isolate each process, which is one reason to learn DM: the processes are identified first, and a dimensional model is then developed for each process. The second step is to choose the grain of the process. The finer the grain, the more detail is kept and the more records go into the fact table. The grain of the source system and the grain of the fact table can differ (details later). The third step is to identify and isolate the facts. Facts are numeric values, but not key values, so a fact is a non-key value. Choosing the facts defines the fact table. Finally, you choose the dimension tables, which are many in number and small in size. They are joined to the fact table, and you have to identify them.
  2. This is not an automated process that a wizard could carry out; it needs domain knowledge, human experience and effort. A business process is a major process running in the business. These are the core business processes, which are controlled by OLTP systems.
  3. This slide is an animation. It shows a snowflake schema containing a number of processes, each represented as a star that needs dimensional modeling. The systems developed for these stars are known as data marts.
  4. Grain is the atomic level, i.e. the level that cannot be broken down further. A grain corresponds to a transaction that goes into the table. The level of detail captured by the business and the level of detail stored in the fact table may differ. E.g. in a store, a transaction could cover a single item or multiple items, or the data could be kept as sales per hour, per week etc. Grain brings in the factor of expressiveness: the more detail you keep, the more you know about your business; less data is less expressive. On the other hand, with high aggregation you do not need to invest in hardware for performance. Detailed data may require running queries in parallel, which needs more processing power, whereas with high aggregation you only display the precomputed aggregate that answers the query.
  5. Building aggregates at a chosen grain has two costs. First, building them takes effort: e.g. the DWH has to be taken down, so the work is done outside office hours (say at 2 am), which requires support staff and therefore money. Second, once the aggregates are built, they have to be stored. They are therefore beneficial only when they are used again and again, i.e. when the queries repeatedly hit the same aggregates. Aggregates are business-process dependent: you build them with the business in front of you, which is what makes them beneficial.
  6. From weekly aggregates I can build monthly aggregates, but from monthly aggregates I cannot get back to weekly ones. E.g. given 3 and 5 we can compute 8, but given only 8 we cannot tell which two numbers produced it: 3+2+1 = 6, 2+4 = 6 and 5+1 = 6, and over the reals there are infinitely many ways of adding numbers to get the same result. Aggregation (by which we mean summarization) is one-way: you can create aggregates, but you cannot dissolve them to recover the original data. As already discussed, several “W”s are involved in aggregation, and every aggregation removes one of them. If you summarize on geography you remove the WHERE: you can no longer tell whether a sale belongs to zone 1, zone 2 etc., because the zones are the components of geography. If you summarize on time you remove the WHEN: you can no longer tell when an event occurred. If you think about the “5 W’s of journalism”, these are the “6 W’s of data analysis”. They highlight the types of questions that end users want to ask and that summary data cannot answer. By definition, a summarization treats at least one of these points as irrelevant: a summary across the company takes out the dimension of WHERE, and a summary by quarters takes out the element of WHEN. So although summary data has a purpose, one can take any summary and ask a question that the system cannot answer.
  7. This point is not about irreversibility: it means that the same aggregate can result from adding different numbers, which can lead to a lot of pitfalls.
  8. In this example part of the data is hidden: the table shows the average for every zone and for every week. From the aggregates alone, the average sale looks equal in all zones. Suppose the company has launched a sales campaign and wants to know its impact in all zones. The aggregation has hidden that information, and if we look at its graph in the next sl…
  9. The question is: what are the facts and what are the dimensions? In this statement, when we say we need sales volume and Rs., these are the facts. The dimensions are mentioned as well. It is your job to identify which are the facts and which are the dimensions.
  10. Not every number is a fact, e.g. a primary key (PK) is not. The rule is that you should be able to add facts. E.g. discount is not a fact, because discount is given as a percentage (discussed later).
  11. Dimensions are descriptions: they have hierarchies, or they are what you run a WHERE clause on. You have to identify the dimensions yourself; they will not announce themselves loudly.
  12. This means that an attribute whose value does not change while a transaction is being recorded is a dimension. See the table above. Suppose you go to an ATM and withdraw some money: the Transaction_Rs attribute will not hold the amount that was there previously, but attributes such as ATM_Location and Account_No do not change. Time does change, but here we are concerned with Time_of_Day, i.e. whether it is morning, evening etc., and that stays constant because a transaction does not take that long. So everything except Transaction_Rs is a dimension.
  13. Are dimensions always single-valued? Consider a car-selling company that also has a workshop. When a car comes in, different types of data are recorded against it: oil filter change, plug change etc., so maintenance is multi-valued. Multiple values are therefore possible in a dimension table.
  14. Several grains are possible, but select the most appropriate one. Lower (finer) grains have more dimensions: e.g. if you aggregate by day, Time_of_Day still has several values such as morning, evening etc., but at a weekly grain there is no such dimension. So lower grains are more expressive.