Chapter 9 1
© 2013 Pearson Education, Inc. Publishing as Prentice Hall
Chapter 9:
Data Warehousing
Modern Database Management
11th Edition
Jeffrey A. Hoffer, V. Ramesh, Heikki Topi
Objectives
•Define terms
•Explore reasons for the information gap between information needs and availability
•Understand reasons for the need for data warehousing
•Describe three levels of data warehouse architectures
•Describe two components of a star schema
•Estimate fact table size
•Design a data mart
•Develop requirements for a data mart
Definitions
•Data Warehouse
• A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes
• Subject-oriented: e.g., customers, patients, students, products
• Integrated: consistent naming conventions, formats, encoding structures; from multiple data sources
• Time-variant: can study trends and changes
• Non-updatable: read-only, periodically refreshed
•Data Mart
• A data warehouse that is limited in scope
History Leading to Data Warehousing
•Improvement in database technologies, especially
relational DBMSs
•Advances in computer hardware, including mass
storage and parallel architectures
•Emergence of end-user computing with powerful
interfaces and tools
•Advances in middleware, enabling heterogeneous
database connectivity
•Recognition of difference between operational and
informational systems
Need for Data Warehousing
• Integrated, company-wide view of high-quality information (from
disparate databases)
• Separation of operational and informational systems and data (for
improved performance)
Issues with Company-Wide View
Inconsistent key structures
Synonyms
Free-form vs. structured fields
Inconsistent data values
Missing data
See figure 9-1 for example
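The issues listed above can be illustrated with a small sketch. Figure 9-1 is not reproduced in this transcript, so the two source records, their field names, and the `integrate` helper below are hypothetical, chosen only to show inconsistent keys, synonyms, differing formats, and a missing value being reconciled into one record.

```python
# Hypothetical records for the same student held in two source systems.
# Field names differ (synonyms), key formats differ, and one value is missing.
registrar = {"student_no": "123-45-6789", "name": "Jane Smith", "gender": "F"}
housing = {"student_id": 123456789, "student_name": "SMITH, JANE", "sex": None}


def normalize_key(raw):
    """Reduce inconsistent key formats to one canonical digit string."""
    return "".join(ch for ch in str(raw) if ch.isdigit())


def normalize_name(raw):
    """Turn 'LAST, FIRST' or 'First Last' into 'First Last'."""
    if "," in raw:
        last, first = [part.strip() for part in raw.split(",", 1)]
        raw = f"{first} {last}"
    return raw.title()


def integrate(reg, hou):
    """Merge both sources into one record with consistent names and formats."""
    key_reg = normalize_key(reg["student_no"])
    key_hou = normalize_key(hou["student_id"])
    if key_reg != key_hou:
        raise ValueError("records describe different students")
    return {
        "student_key": key_reg,
        "name": normalize_name(reg["name"]),
        # Fall back to the other source when a value is missing.
        "gender": reg["gender"] or hou["sex"],
    }


merged = integrate(registrar, housing)
print(merged)
```

In a real warehouse these rules would live in the transform step of ETL; the sketch only shows the kind of reconciliation that step has to perform.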
Figure 9-1 Examples of heterogeneous data
Organizational Trends Motivating Data Warehouses
•No single system of record
•Multiple systems not synchronized
•Organizational need to analyze activities in a balanced way
•Customer relationship management
•Supplier relationship management
Separating Operational and
Informational Systems
• Operational system – a system that is used to run a business in
real time, based on current data; also called a system of record
• Informational system – a system designed to support decision
making based on historical point-in-time and prediction data for
complex queries or data-mining applications
Data Warehouse Architectures
•Independent Data Mart
•Dependent Data Mart and
Operational Data Store
•Logical Data Mart and Real-Time Data
Warehouse
•Three-Layer architecture
All involve some form of extract, transform, and load (ETL)
Figure 9-2 Independent data mart data warehousing architecture
Data marts: mini-warehouses, limited in scope
Separate ETL for each independent data mart
Data access complexity due to multiple data marts
Figure 9-3 Dependent data mart with operational data store: a three-level architecture
Single ETL for enterprise data warehouse (EDW)
Simpler data access
ODS provides option for obtaining current data
Dependent data marts loaded from EDW
Figure 9-4 Logical data mart and real-time warehouse architecture
Near real-time ETL for data warehouse
ODS and data warehouse are one and the same
Data marts are NOT separate databases, but logical views of the data warehouse
→ Easier to create new data marts
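The point that a logical data mart is a view rather than a separate database can be sketched with Python's built-in `sqlite3` module. The table `warehouse_sales`, the view `east_sales_mart`, and the sample rows are assumptions made up for this illustration; a production warehouse would use its own DBMS and schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE warehouse_sales (
        region TEXT, product TEXT, sale_date TEXT, amount REAL
    );
    INSERT INTO warehouse_sales VALUES
        ('East', 'Widget', '2013-01-15', 100.0),
        ('West', 'Widget', '2013-01-16', 250.0),
        ('East', 'Gadget', '2013-02-01', 75.0);

    -- The "data mart" is only a view: no data is copied out of the warehouse.
    CREATE VIEW east_sales_mart AS
        SELECT product, sale_date, amount
        FROM warehouse_sales
        WHERE region = 'East';
""")

east_rows = conn.execute(
    "SELECT product, amount FROM east_sales_mart ORDER BY sale_date"
).fetchall()
print(east_rows)
```

Creating another mart is then one more `CREATE VIEW` statement, which is why new marts are easier to add in this architecture.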
Figure 9-5 Three-layer data architecture for a data warehouse
Data Characteristics
Status vs. Event Data
Figure 9-6 Example of DBMS log entry (figure labels: Status, Event, Status)
Event = a database action (create/update/delete) that results from a transaction
Data Characteristics
Status vs. Event Data
With transient data, changes to existing records are written over previous records, thus destroying the previous data content.
Figure 9-7 Transient operational data
Data Characteristics
Status vs. Event Data
Periodic data are never physically altered or deleted once they have been added to the store.
Figure 9-8 Periodic warehouse data
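The contrast between transient and periodic data can be shown in a few lines. This is a minimal sketch with invented customer balances and a hypothetical `record` helper, not the layout used in Figures 9-7 and 9-8.

```python
from datetime import date

# Transient store: one slot per key, so an update overwrites the old content.
transient = {"cust-001": {"balance": 750}}
transient["cust-001"] = {"balance": 1000}   # previous balance is gone

# Periodic store: append-only list of timestamped rows; nothing is altered.
periodic = []


def record(store, key, balance, as_of):
    """Append a new version instead of changing an existing row."""
    store.append({"key": key, "balance": balance, "as_of": as_of})


record(periodic, "cust-001", 750, date(2013, 1, 5))
record(periodic, "cust-001", 1000, date(2013, 2, 5))

history = [row["balance"] for row in periodic if row["key"] == "cust-001"]
print(transient["cust-001"]["balance"], history)
```

Only the periodic store can answer "what was the balance in January?", which is what makes trend analysis possible in the warehouse.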
Other Data Warehouse Changes
•New descriptive attributes
•New business activity attributes
•New classes of descriptive attributes
•Descriptive attributes become more
refined
•Descriptive data are related to one
another
•New source of data
Derived Data
• Objectives
• Ease of use for decision support applications
• Fast response to predefined user queries
• Customized data for particular target audiences
• Ad-hoc query support
• Data mining capabilities
• Characteristics
• Detailed (mostly periodic) data
• Aggregate (for summary)
• Distributed (to departmental servers)
Most common data model = dimensional model (usually implemented as a star schema)
Figure 9-9 Components of a star schema
Fact tables contain factual or quantitative data
Dimension tables contain descriptions about the subjects of the business
1:N relationship between dimension tables and fact tables
Excellent for ad hoc queries, but bad for online transaction processing
Dimension tables are denormalized to maximize performance
Figure 9-10 Star schema example
Fact table provides statistics for sales broken down by product, period, and store dimensions
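A star schema of this shape can be sketched with Python's `sqlite3` module. The table and column names (`product`, `period`, `store`, `sales`, `units_sold`, `dollars_sold`) and the sample rows are assumptions for illustration and are not copied from Figure 9-10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_key INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE period  (period_key  INTEGER PRIMARY KEY, month TEXT);
    CREATE TABLE store   (store_key   INTEGER PRIMARY KEY, city TEXT);

    -- Fact table: one foreign key per dimension plus the numeric measures.
    CREATE TABLE sales (
        product_key  INTEGER REFERENCES product(product_key),
        period_key   INTEGER REFERENCES period(period_key),
        store_key    INTEGER REFERENCES store(store_key),
        units_sold   INTEGER,
        dollars_sold REAL
    );

    INSERT INTO product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO period  VALUES (1, '2013-01'), (2, '2013-02');
    INSERT INTO store   VALUES (1, 'Austin'), (2, 'Boston');
    INSERT INTO sales VALUES
        (1, 1, 1, 10, 100.0),
        (1, 2, 1,  5,  50.0),
        (2, 1, 2,  2,  80.0);
""")

# Ad hoc query: total dollars per product, joining the fact to one dimension.
totals = conn.execute("""
    SELECT p.description, SUM(s.dollars_sold)
    FROM sales AS s JOIN product AS p USING (product_key)
    GROUP BY p.description
    ORDER BY p.description
""").fetchall()
print(totals)
```

Each dimension row relates to many fact rows (1:N), so any combination of dimensions can be used to group or filter the same measures.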
Figure 9-11 Star schema with sample data
Surrogate Keys
•Dimension table keys should be surrogate (non-intelligent and non-business related), because:
•Business keys may change over time
•Helps keep track of nonkey attribute values for a given production key
•Surrogate keys are simpler and shorter
•Surrogate keys can be same length and format for all keys
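One simple way to hand out surrogate keys is a running counter kept by the warehouse load process. The sketch below assumes an in-memory mapping and a hypothetical `surrogate_for` helper; real loaders usually rely on a database sequence or identity column instead.

```python
from itertools import count

_next_key = count(start=1)
surrogate_by_business_key = {}


def surrogate_for(business_key):
    """Return the warehouse-assigned integer key for a source-system key."""
    if business_key not in surrogate_by_business_key:
        surrogate_by_business_key[business_key] = next(_next_key)
    return surrogate_by_business_key[business_key]


k1 = surrogate_for("PROD-AB-0042")
k2 = surrogate_for("7731")             # a differently formatted source key
k1_again = surrogate_for("PROD-AB-0042")
print(k1, k2, k1_again)
```

Whatever the format of the business key, the fact table only ever stores the short integer, which keeps keys uniform across dimensions.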
Grain of the Fact Table
• Granularity of fact table–what level of detail do you want?
•Transactional grain–finest level
•Aggregated grain–more summarized
•Finer grain → better market basket analysis capability
•Finer grain → more dimension tables, more rows in fact table
•In Web-based commerce, finest granularity is a click
Duration of the Database
• Natural duration–13 months or 5 quarters
• Financial institutions may need longer duration
• Older data is more difficult to source and cleanse
Size of Fact Table
• Depends on the number of dimensions and the grain of the fact table
• Number of rows = product of number of possible values for each
dimension associated with the fact table
• Example: Assume the following for Figure 9-11:
• Total rows calculated as follows (assuming only half the products record
sales for a given month):
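The figures for this example did not survive in the transcript, so the numbers below are illustrative assumptions only (1,000 stores, 10,000 products, 24 monthly periods, and six four-byte fields per row). The sketch applies the rule stated above: rows = product of the dimension sizes, scaled by the share of products that actually record sales.

```python
def fact_table_rows(stores, products, periods, active_fraction=1.0):
    """Estimate rows as the product of dimension sizes, scaled by the
    fraction of products that actually record sales in a period."""
    return int(stores * products * active_fraction * periods)


# Assumed inputs, not taken from Figure 9-11.
rows = fact_table_rows(stores=1_000, products=10_000, periods=24,
                       active_fraction=0.5)
bytes_per_row = 6 * 4   # assumed: six fields of four bytes each
total_bytes = rows * bytes_per_row
print(rows, total_bytes)
```

Under these assumptions the estimate is 120 million rows and roughly 2.9 GB, which shows how quickly a fine grain multiplies storage.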
Figure 9-12 Modeling dates
Fact tables contain time-period data
→ Date dimensions are important
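A date dimension is usually generated rather than loaded from a source system. The sketch below builds one row per calendar day; the attribute names (`date_key`, `quarter`, `is_weekend`, and so on) are assumptions and may differ from those in Figure 9-12.

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def date_dimension(start, days):
    """Build one dimension row per calendar day, keyed by a surrogate integer."""
    rows = []
    for offset in range(days):
        d = start + timedelta(days=offset)
        rows.append({
            "date_key": offset + 1,
            "full_date": d.isoformat(),
            "day_of_week": DAY_NAMES[d.weekday()],
            "month": d.month,
            "quarter": (d.month - 1) // 3 + 1,
            "year": d.year,
            "is_weekend": d.weekday() >= 5,
        })
    return rows


dim = date_dimension(date(2013, 1, 1), 7)
print(dim[0])
```

Precomputing attributes such as quarter or weekend flag lets users group facts by them without any date arithmetic in the query.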
Variations of the Star Schema
• Multiple Fact Tables
• Can improve performance
• Often used to store facts for different combinations of dimensions
• Conformed dimensions
• Factless Fact Tables
• No nonkey data, but foreign keys for associated dimensions
• Used for:
• Tracking events
• Inventory coverage
Figure 9-13 Conformed dimensions
Conformed dimension: associated with multiple fact tables
Two fact tables → two (connected) star schemas.
Figure 9-14a Factless fact table showing occurrence of an event
No data in fact table, just keys associating dimension records
Fact table forms an n-ary relationship between dimensions
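A factless fact table can be sketched with `sqlite3`. The attendance scenario, the table names, and the integer `date_key` values are assumptions for illustration and are not taken from Figure 9-14a.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_key  INTEGER PRIMARY KEY, title TEXT);

    -- Factless fact table: only foreign keys, no numeric measures.
    CREATE TABLE attendance (
        student_key INTEGER REFERENCES student(student_key),
        course_key  INTEGER REFERENCES course(course_key),
        date_key    INTEGER
    );

    INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO course  VALUES (1, 'Databases');
    INSERT INTO attendance VALUES (1, 1, 20130107), (2, 1, 20130107),
                                  (1, 1, 20130109);
""")

# With no measures to sum, analysis is done by counting event rows.
per_day = conn.execute("""
    SELECT date_key, COUNT(*) FROM attendance
    GROUP BY date_key ORDER BY date_key
""").fetchall()
print(per_day)
```

Each row records only that an event happened for a particular combination of dimension members.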
Normalizing Dimension Tables
•Multivalued Dimensions
• Facts qualified by a set of values for the same business subject
• Normalization involves creating a table for an associative entity
between dimensions
•Hierarchies
• Sometimes a dimension forms a natural, fixed depth hierarchy
• Design options
• Include all information for each level in a single denormalized table
• Normalize the dimension into a nested set of 1:M table relationships
Figure 9-15 Multivalued dimension
Helper table is an associative entity that implements an M:N relationship between dimension and fact.
Figure 9-16 Fixed product hierarchy
Dimension hierarchies help to provide levels of aggregation for users wanting summary information in a data warehouse.
Slowly Changing Dimensions (SCD)
•How to maintain knowledge of the past
•Kimball’s approaches:
• Type 1: just replace old data with new (lose historical data)
• Type 2: create a new dimension table row each time the dimension object changes, with all dimension characteristics at the time of change. Most common approach
• Type 3: for each changing attribute, create a current value field and several old-valued fields (multivalued)
Figure 9-18 Example of Type 2 SCD Customer dimension table
The dimension table contains several records for the same
customer. The specific customer record to use depends on
the key and the date of the fact, which should be between
start and end dates of the SCD customer record.
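The lookup described in this caption can be sketched as follows. The customer rows, column names, and the far-future end date used to mark the current row are assumptions for illustration, not the contents of Figure 9-18.

```python
from datetime import date

# Hypothetical Type 2 rows: same business key, different surrogate keys.
customer_dim = [
    {"customer_key": 1, "customer_id": "C100", "city": "Austin",
     "start": date(2010, 1, 1), "end": date(2012, 6, 30)},
    {"customer_key": 2, "customer_id": "C100", "city": "Boston",
     "start": date(2012, 7, 1), "end": date(9999, 12, 31)},
]


def row_for(customer_id, fact_date):
    """Pick the dimension row whose validity interval contains fact_date."""
    for row in customer_dim:
        if (row["customer_id"] == customer_id
                and row["start"] <= fact_date <= row["end"]):
            return row
    return None


old = row_for("C100", date(2011, 3, 15))
new = row_for("C100", date(2013, 3, 15))
print(old["city"], new["city"])
```

A fact dated 2011 is tied to the Austin version of the customer and a fact dated 2013 to the Boston version, so history is preserved without overwriting anything.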
Figure 9-19 Dimension segmentation
For rapidly changing attributes (hot attributes), Type 2 SCD
approach creates too many rows and too much redundant
data. Use segmentation instead.
10 Essential Rules for Dimensional Modeling
• Use atomic facts
• Create single-process fact tables
• Include a date dimension for each fact table
• Enforce consistent grain
• Disallow null keys in fact tables
• Honor hierarchies
• Decode dimension tables
• Use surrogate keys
• Conform dimensions
• Balance requirements with actual data
Other Data Warehouse Advances
•Columnar databases
• Issue of Big Data (huge volume, often unstructured)
• Columnar databases optimize storage for summary data of few columns (different need than OLTP)
• Data compression
• Sybase, Vertica, Infobright
•NoSQL
• “Not only SQL”
• Deals with unstructured data
• MongoDB, CouchDB, Apache Cassandra
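Why a columnar layout suits summaries over a few columns can be shown with plain Python lists. This is a toy sketch with invented order data; it does not reflect how any of the products named above store data internally, and run-length encoding is used only as one familiar example of column compression.

```python
# Row-oriented layout: each record is stored together (suits OLTP).
rows = [
    {"order_id": 1, "customer": "C100", "amount": 20.0},
    {"order_id": 2, "customer": "C200", "amount": 35.5},
    {"order_id": 3, "customer": "C100", "amount": 12.5},
]

# Column-oriented layout: each attribute is stored contiguously.
columns = {name: [r[name] for r in rows] for name in rows[0]}

# A summary over one column touches only that column's values.
total_amount = sum(columns["amount"])


def run_length_encode(values):
    """Tiny compression example: collapse runs of repeated values."""
    encoded = []
    for v in values:
        if encoded and encoded[-1][0] == v:
            encoded[-1][1] += 1
        else:
            encoded.append([v, 1])
    return encoded


compressed = run_length_encode(sorted(columns["customer"]))
print(total_amount, compressed)
```

Values within one column tend to repeat, which is why column stores compress well.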
The User Interface
Metadata (data catalog)
•Identify subjects of the data mart
•Identify dimensions and facts
•Indicate how data is derived from enterprise data
warehouses, including derivation rules
•Indicate how data is derived from operational data store,
including derivation rules
•Identify available reports and predefined queries
•Identify data analysis techniques (e.g. drill-down)
•Identify responsible people
Online Analytical Processing (OLAP) Tools
• The use of a set of graphical tools that provides users with
multidimensional views of their data and allows them to analyze
the data using simple windowing techniques
•Relational OLAP (ROLAP)
• Traditional relational representation
•Multidimensional OLAP (MOLAP)
• Cube structure
•OLAP Operations
• Cube slicing–come up with 2-D view of data
• Drill-down–going from summary to more detailed views
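Slicing and drill-down can be sketched on a tiny in-memory cube. The cube contents and the helper names `slice_cube`, `summarize`, and `drill_down` are assumptions for illustration; OLAP tools perform the same operations through a graphical interface.

```python
from collections import defaultdict

# Hypothetical cube cells: (product, region, month) -> units sold.
cube = {
    ("Shoes", "East", "Jan"): 10, ("Shoes", "East", "Feb"): 12,
    ("Shoes", "West", "Jan"): 7,  ("Boots", "East", "Jan"): 4,
    ("Boots", "West", "Feb"): 9,
}


def slice_cube(cells, product):
    """Fix one dimension (product) to get a 2-D region x month view."""
    return {(r, m): v for (p, r, m), v in cells.items() if p == product}


def summarize(cells):
    """Summary level: total units per product."""
    totals = defaultdict(int)
    for (product, _region, _month), value in cells.items():
        totals[product] += value
    return dict(totals)


def drill_down(cells, product):
    """Break one product's summary total out by region."""
    totals = defaultdict(int)
    for (p, region, _month), value in cells.items():
        if p == product:
            totals[region] += value
    return dict(totals)


by_product = summarize(cube)
shoes_slice = slice_cube(cube, "Shoes")
shoes_by_region = drill_down(cube, "Shoes")
print(by_product, shoes_slice, shoes_by_region)
```

The drill-down totals add back up to the summary figure for the same product, which is the consistency a user expects when moving between levels.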
Figure 9-21 Slicing a data cube
Figure 9-22 Example of drill-down
Summary report
Drill-down with color added
Starting with summary data, users can obtain details for particular cells.
Business Performance Management (BPM)
Figure 9-25 Sample Dashboard
BPM systems allow managers to measure, monitor, and manage key activities and processes to achieve organizational goals.
Dashboards are often used to provide an information system in support of BPM.
Charts like these are examples of data visualization, the representation
of data in graphical and multimedia formats for human analysis.
Data Mining
Knowledge discovery using a blend of statistical, AI, and computer graphics techniques
Goals:
Explain observed events or conditions
Confirm hypotheses
Explore data for new or unexpected relationships
Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall

Measures of Dispersion and Variability: Range, QD, AD and SD
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 

dataware house

  • 1. Chapter 9: Data Warehousing. Modern Database Management, 11th Edition. Jeffrey A. Hoffer, V. Ramesh, Heikki Topi. © 2013 Pearson Education, Inc. Publishing as Prentice Hall
  • 2. Objectives • Define terms • Explore reasons for the information gap between information needs and availability • Understand the reasons for needing data warehousing • Describe three levels of data warehouse architectures • Describe the two components of a star schema • Estimate fact table size • Design a data mart • Develop requirements for a data mart
  • 3. Definitions • Data Warehouse: a subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes. Subject-oriented: e.g. customers, patients, students, products. Integrated: consistent naming conventions, formats, encoding structures; from multiple data sources. Time-variant: can study trends and changes. Non-updatable: read-only, periodically refreshed • Data Mart: a data warehouse that is limited in scope
  • 4. History Leading to Data Warehousing • Improvement in database technologies, especially relational DBMSs • Advances in computer hardware, including mass storage and parallel architectures • Emergence of end-user computing with powerful interfaces and tools • Advances in middleware, enabling heterogeneous database connectivity • Recognition of the difference between operational and informational systems
  • 5. Need for Data Warehousing • Integrated, company-wide view of high-quality information (from disparate databases) • Separation of operational and informational systems and data (for improved performance)
  • 6. Issues with Company-Wide View • Inconsistent key structures • Synonyms • Free-form vs. structured fields • Inconsistent data values • Missing data • See Figure 9-1 for an example
  • 7. Figure 9-1 Examples of heterogeneous data
  • 8. Organizational Trends Motivating Data Warehouses • No single system of record • Multiple systems not synchronized • Organizational need to analyze activities in a balanced way • Customer relationship management • Supplier relationship management
  • 9. Separating Operational and Informational Systems • Operational system: a system that is used to run a business in real time, based on current data; also called a system of record • Informational system: a system designed to support decision making based on historical point-in-time and prediction data for complex queries or data-mining applications
  • 11. Data Warehouse Architectures • Independent Data Mart • Dependent Data Mart and Operational Data Store • Logical Data Mart and Real-Time Data Warehouse • Three-Layer Architecture • All involve some form of extract, transform and load (ETL)
  • 12. Figure 9-2 Independent data mart data warehousing architecture. Data marts: mini-warehouses, limited in scope. Separate ETL for each independent data mart. Data access complexity due to multiple data marts
  • 13. Figure 9-3 Dependent data mart with operational data store: a three-level architecture. Single ETL for the enterprise data warehouse (EDW). Simpler data access. ODS provides an option for obtaining current data. Dependent data marts are loaded from the EDW
  • 14. Figure 9-4 Logical data mart and real-time warehouse architecture. Near real-time ETL for the data warehouse. ODS and data warehouse are one and the same. Data marts are NOT separate databases, but logical views of the data warehouse → easier to create new data marts
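The idea that a logical data mart is a view rather than a separate database can be sketched with Python's built-in sqlite3 module. The table name warehouse_sales, the view name finance_mart, and all rows are hypothetical, chosen only for illustration; this is a minimal sketch, not the architecture of Figure 9-4 itself.

```python
import sqlite3

# One in-memory database stands in for the data warehouse (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE warehouse_sales (region TEXT, department TEXT, amount REAL);
INSERT INTO warehouse_sales VALUES
    ('East', 'Marketing', 100.0),
    ('West', 'Finance',   250.0),
    ('East', 'Finance',    75.0);

-- The "data mart" is only a view: no data is copied out of the warehouse.
CREATE VIEW finance_mart AS
    SELECT region, amount FROM warehouse_sales WHERE department = 'Finance';
""")

# A new warehouse row is visible through the view immediately, with no reload step.
conn.execute("INSERT INTO warehouse_sales VALUES ('West', 'Finance', 25.0)")

mart_rows = conn.execute(
    "SELECT region, SUM(amount) FROM finance_mart GROUP BY region ORDER BY region"
).fetchall()
print(mart_rows)
```

Creating another mart under this approach means defining another view, which is one way to read the slide's point that new data marts are easier to create.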
  • 16. Figure 9-5 Three-layer data architecture for a data warehouse
  • 17. Data Characteristics: Status vs. Event Data. Event = a database action (create/update/delete) that results from a transaction. Figure 9-6 Example of DBMS log entry (figure labels: Status, Event, Status)
  • 18. Data Characteristics: Status vs. Event Data. Figure 9-7 Transient operational data. With transient data, changes to existing records are written over previous records, thus destroying the previous data content.
  • 19. Data Characteristics: Status vs. Event Data. Figure 9-8 Periodic warehouse data. Periodic data are never physically altered or deleted once they have been added to the store.
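The contrast between transient and periodic data can be illustrated with a small Python sketch. The record key "001", the balances, the dates, and the action labels are all hypothetical; the point is only that the transient store overwrites in place while the periodic store appends.

```python
from datetime import date

# Transient store: one slot per key, so an update overwrites the previous value.
transient = {}

def transient_update(key, balance):
    transient[key] = balance

# Periodic store: append-only rows of (key, as_of_date, balance, action).
periodic = []

def periodic_update(key, as_of, balance, action):
    periodic.append((key, as_of, balance, action))

# The same two changes applied to both stores (hypothetical values).
transient_update("001", 1000)
transient_update("001", 750)

periodic_update("001", date(2024, 1, 5), 1000, "create")
periodic_update("001", date(2024, 1, 9), 750, "update")

# The transient store only knows the latest balance; the periodic store keeps both.
history = [row for row in periodic if row[0] == "001"]
print(transient["001"], len(history))
```

Because the periodic store keeps every row, a trend query over history is possible there and impossible against the transient store.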
  • 20. Other Data Warehouse Changes • New descriptive attributes • New business activity attributes • New classes of descriptive attributes • Descriptive attributes become more refined • Descriptive data are related to one another • New source of data
  • 21. Derived Data • Objectives: ease of use for decision support applications; fast response to predefined user queries; customized data for particular target audiences; ad-hoc query support; data mining capabilities • Characteristics: detailed (mostly periodic) data; aggregate (for summary); distributed (to departmental servers) • Most common data model = dimensional model (usually implemented as a star schema)
  • 22. Figure 9-9 Components of a star schema. Fact tables contain factual or quantitative data. Dimension tables contain descriptions about the subjects of the business. 1:N relationship between dimension tables and fact tables. Excellent for ad-hoc queries, but bad for online transaction processing. Dimension tables are denormalized to maximize performance
  • 23. Figure 9-10 Star schema example. Fact table provides statistics for sales broken down by product, period and store dimensions
  • 24. Figure 9-11 Star schema with sample data
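A star schema of the kind described above (a sales fact table with product, period and store dimensions) might be sketched as follows using Python's built-in sqlite3 module. The column names and sample rows are assumptions made for illustration and are not the contents of Figures 9-10 or 9-11.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per member (hypothetical columns).
CREATE TABLE product (product_key INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE period  (period_key  INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE store   (store_key   INTEGER PRIMARY KEY, city TEXT);

-- Fact table: quantitative measures plus one foreign key per dimension (1:N each).
CREATE TABLE sales (
    product_key  INTEGER REFERENCES product(product_key),
    period_key   INTEGER REFERENCES period(period_key),
    store_key    INTEGER REFERENCES store(store_key),
    units_sold   INTEGER,
    dollars_sold REAL,
    PRIMARY KEY (product_key, period_key, store_key)
);
""")

conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, "Widget"), (2, "Gadget")])
conn.executemany("INSERT INTO period VALUES (?, ?, ?)", [(1, 2024, 1), (2, 2024, 2)])
conn.executemany("INSERT INTO store VALUES (?, ?)", [(1, "Austin"), (2, "Boston")])
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 1, 10, 100.0),
    (1, 2, 1, 5, 50.0),
    (2, 1, 2, 3, 90.0),
    (1, 1, 2, 4, 40.0),
])

# A typical ad-hoc query: join the fact table to one dimension and aggregate.
rows = conn.execute("""
    SELECT s.city, SUM(f.units_sold), SUM(f.dollars_sold)
    FROM sales f
    JOIN store s ON s.store_key = f.store_key
    GROUP BY s.city
    ORDER BY s.city
""").fetchall()
print(rows)
```

The same pattern, joining the fact table to whichever dimensions the question involves and grouping by their attributes, covers most ad-hoc queries against a star schema.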
  • 25. Surrogate Keys • Dimension table keys should be surrogate (non-intelligent and non-business related), because: • Business keys may change over time • Helps keep track of nonkey attribute values for a given production key • Surrogate keys are simpler and shorter • Surrogate keys can be the same length and format for all keys
  • 26. Grain of the Fact Table • Granularity of fact table: what level of detail do you want? • Transactional grain: finest level • Aggregated grain: more summarized • Finer grain → better market basket analysis capability • Finer grain → more dimension tables, more rows in fact table • In Web-based commerce, the finest granularity is a click
  • 27. Duration of the Database • Natural duration: 13 months or 5 quarters • Financial institutions may need longer duration • Older data is more difficult to source and cleanse
  • 28. Size of Fact Table • Depends on the number of dimensions and the grain of the fact table • Number of rows = product of the number of possible values for each dimension associated with the fact table • Example: Assume the following for Figure 9-11: • Total rows calculated as follows (assuming only half the products record sales for a given month):
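The row-count rule above can be worked through in a few lines of Python. Every number here (store, product and month counts, fields per row, bytes per field) is a placeholder chosen for illustration and is not taken from the source; only the rule itself, the product of dimension values with half the products active in a given month, comes from the slide.

```python
from math import prod

def estimate_fact_rows(dimension_values):
    """Rows = product of the number of possible values for each dimension."""
    return prod(dimension_values.values())

# Hypothetical dimension sizes (assumptions, not values from the source).
stores = 1_000
products = 10_000
months = 24

# Slide's adjustment: only half the products record sales in a given month.
active_products = products // 2

rows = estimate_fact_rows({
    "store": stores,
    "product": active_products,
    "period": months,
})

# Hypothetical row layout used to turn the row count into a size estimate.
fields_per_row = 6
bytes_per_field = 4
size_bytes = rows * fields_per_row * bytes_per_field

print(f"{rows:,} rows, about {size_bytes / 1e9:.2f} GB")
```

With these placeholder inputs the estimate is 1,000 × 5,000 × 24 = 120,000,000 rows; changing the grain (for example daily instead of monthly periods) multiplies the result accordingly.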
  • 29. Figure 9-12 Modeling dates. Fact tables contain time-period data → date dimensions are important
  • 30. Variations of the Star Schema • Multiple Fact Tables: can improve performance; often used to store facts for different combinations of dimensions; conformed dimensions • Factless Fact Tables: no nonkey data, but foreign keys for associated dimensions; used for tracking events and inventory coverage
  • 31. Figure 9-13 Conformed dimensions. A conformed dimension is associated with multiple fact tables. Two fact tables → two (connected) star schemas.
  • 32. Figure 9-14a Factless fact table showing occurrence of an event. No data in fact table, just keys associating dimension records. Fact table forms an n-ary relationship between dimensions
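A factless fact table can be pictured as rows that hold nothing but dimension keys, so the only possible measure is a count of occurrences. The sketch below assumes a hypothetical attendance event with date, student and course keys; these names are not taken from Figure 9-14a.

```python
from collections import Counter

# Each row is only a tuple of dimension keys: (date_key, student_key, course_key).
# There are no measures; the existence of the row records that the event occurred.
attendance = [
    (20240105, 1, 10),
    (20240105, 2, 10),
    (20240106, 1, 11),
]

# The only "fact" available is how many times a key combination occurs.
events_per_course = Counter(course_key for _date, _student, course_key in attendance)
events_per_student = Counter(student_key for _date, student_key, _course in attendance)

print(events_per_course[10], events_per_student[1])
```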
  • 33. Normalizing Dimension Tables • Multivalued Dimensions: facts qualified by a set of values for the same business subject; normalization involves creating a table for an associative entity between dimensions • Hierarchies: sometimes a dimension forms a natural, fixed-depth hierarchy • Design options: include all information for each level in a single denormalized table, or normalize the dimension into a nested set of 1:M table relationships
  • 34. Figure 9-15 Multivalued dimension. Helper table is an associative entity that implements an M:N relationship between dimension and fact.
  • 35. Figure 9-16 Fixed product hierarchy. Dimension hierarchies help to provide levels of aggregation for users wanting summary information in a data warehouse.
  • 36. Slowly Changing Dimensions (SCD) • How to maintain knowledge of the past • Kimball's approaches: • Type 1: just replace old data with new (lose historical data) • Type 3: for each changing attribute, create a current value field and several old-valued fields (multivalued) • Type 2: create a new dimension table row each time the dimension object changes, with all dimension characteristics at the time of change. Most common approach
  • 37. Figure 9-18 Example of Type 2 SCD customer dimension table. The dimension table contains several records for the same customer. The specific customer record to use depends on the key and the date of the fact, which should be between the start and end dates of the SCD customer record.
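The lookup rule for a Type 2 dimension (match the customer, then pick the row whose start and end dates bracket the fact date) can be sketched in Python. The field names, the customer ID "C100", the cities, and the far-future end date used to mark the current row are all assumptions for illustration, not the contents of Figure 9-18.

```python
from datetime import date

# Two rows for the same customer: a new row was added when the city changed.
# customer_key is the surrogate key; customer_id is the business key.
customer_dim = [
    {"customer_key": 1, "customer_id": "C100", "city": "Austin",
     "start": date(2022, 1, 1), "end": date(2023, 6, 30)},
    {"customer_key": 2, "customer_id": "C100", "city": "Boston",
     "start": date(2023, 7, 1), "end": date(9999, 12, 31)},  # open-ended current row
]

def find_customer_row(dim, customer_id, fact_date):
    """Return the dimension row in effect for this customer on fact_date, or None."""
    for row in dim:
        if row["customer_id"] == customer_id and row["start"] <= fact_date <= row["end"]:
            return row
    return None

old_sale = find_customer_row(customer_dim, "C100", date(2023, 3, 15))
new_sale = find_customer_row(customer_dim, "C100", date(2024, 1, 1))
print(old_sale["city"], new_sale["city"])
```

In a warehouse load, the surrogate customer_key of the matching row is what gets stored in the fact table, so each fact stays tied to the customer's attributes as they were at the time.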
  • 38. Figure 9-19 Dimension segmentation. For rapidly changing attributes (hot attributes), the Type 2 SCD approach creates too many rows and too much redundant data. Use segmentation instead.
  • 39. 10 Essential Rules for Dimensional Modeling • Use atomic facts • Create single-process fact tables • Include a date dimension for each fact table • Enforce consistent grain • Disallow null keys in fact tables • Honor hierarchies • Decode dimension tables • Use surrogate keys • Conform dimensions • Balance requirements with actual data
  • 40. Other Data Warehouse Advances • Columnar databases: issue of Big Data (huge volume, often unstructured); columnar databases optimize storage for summary data of few columns (different need than OLTP); data compression; Sybase, Vertica, Infobright • NoSQL: "Not only SQL"; deals with unstructured data; MongoDB, CouchDB, Apache Cassandra
  • 41. The User Interface: Metadata (data catalog) • Identify subjects of the data mart • Identify dimensions and facts • Indicate how data is derived from enterprise data warehouses, including derivation rules • Indicate how data is derived from operational data store, including derivation rules • Identify available reports and predefined queries • Identify data analysis techniques (e.g. drill-down) • Identify responsible people
  • 42. Online Analytical Processing (OLAP) Tools • The use of a set of graphical tools that provides users with multidimensional views of their data and allows them to analyze the data using simple windowing techniques • Relational OLAP (ROLAP): traditional relational representation • Multidimensional OLAP (MOLAP): cube structure • OLAP operations: cube slicing (come up with a 2-D view of data); drill-down (going from summary to more detailed views)
  • 43. Figure 9-21 Slicing a data cube
  • 44. Figure 9-22 Example of drill-down. Summary report; drill-down with color added. Starting with summary data, users can obtain details for particular cells.
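Slicing and drill-down can be approximated on a tiny in-memory cube in Python. The cube below is a plain dictionary keyed by (product, period, store); the products, periods, stores and unit counts are hypothetical, and real OLAP tools use far more elaborate storage. This is only a sketch of the two operations named above.

```python
from collections import defaultdict

# A three-dimensional cube: (product, period, store) -> units sold (hypothetical data).
cube = {
    ("Widget", "2024-01", "Austin"): 10,
    ("Widget", "2024-02", "Austin"): 5,
    ("Gadget", "2024-01", "Boston"): 3,
    ("Widget", "2024-01", "Boston"): 4,
}

def slice_cube(cube, product):
    """Slice: fix one product and return the 2-D (period, store) view."""
    return {(period, store): units
            for (prod, period, store), units in cube.items()
            if prod == product}

def summarize_by_product(cube):
    """Summary report: total units per product."""
    totals = defaultdict(int)
    for (prod, _period, _store), units in cube.items():
        totals[prod] += units
    return dict(totals)

def drill_down(cube, product):
    """Drill-down: break one summary cell (a product total) out by period."""
    totals = defaultdict(int)
    for (prod, period, _store), units in cube.items():
        if prod == product:
            totals[period] += units
    return dict(totals)

summary = summarize_by_product(cube)
widget_slice = slice_cube(cube, "Widget")
widget_detail = drill_down(cube, "Widget")
print(summary, widget_detail)
```

The drill-down totals for a product add back up to that product's cell in the summary, which is a useful sanity check on any drill-down report.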
  • 45. Business Performance Mgmt (BPM). Figure 9-25 Sample dashboard. BPM systems allow managers to measure, monitor, and manage key activities and processes to achieve organizational goals. Dashboards are often used to provide an information system in support of BPM. Charts like these are examples of data visualization, the representation of data in graphical and multimedia formats for human analysis.
  • 46. Data Mining • Knowledge discovery using a blend of statistical, AI, and computer graphics techniques • Goals: explain observed events or conditions; confirm hypotheses; explore data for new or unexpected relationships
  • 48. Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall