
[db tech showcase Tokyo 2015] DATA WAREHOUSE BASICS by William Inmon


Bill Inmon, the “father of data warehouse”, has written 53 books published in nine languages. Bill’s latest adventure is the building of technology known as textual disambiguation: technology that reads raw text in a narrative format and allows the text to be placed in a conventional database so that it can be analyzed by standard analytical technology, thereby creating unique business value for Big Data/unstructured data. Bill was named by ComputerWorld as one of the ten most influential people in the history of the computer profession. Bill lives in Castle Rock, Colorado. For more information about textual disambiguation, refer to www.forestrimtech.com.

Published in: Technology

  1. Forest Rim Technology. Copyright Inmon Consulting Services, 2008. DATA WAREHOUSE BASICS, a presentation by W H Inmon.
  2. The data warehouse, a definition: a subject oriented, non volatile, integrated, time variant collection of data for the support of management’s decisions.
  3. Granular, detailed data and lots of it. Data that can be shaped and reshaped. A foundation of reconcilability. A basis for new, unknown analysis.
  4. What a typical record of the data warehouse looks like: key, time, primary data, secondary data.
  5. Key: an identifier; unique or non unique; often a compound key; may be natural or blind.
  6. Time: time variancy; continuous; from date/to date; periodic discrete.
  7. A continuous time span record: name, address, phone, zip, email, ..., with a from date and a to date.
  8. A sequence of time span records: each record holds name, address, phone, zip, email, ..., with its own from date and to date.
  9. Continuous time span data: no overlap; discontinuity is a possibility. 999000 From the beginning of time to the end of time.
  10. Periodic discrete structure: snapshots taken on Jan 1, Feb 1, Mar 1 and Apr 1, each holding expenses, revenues, no of employees, stock price, price per share, .... The notion of taking a snapshot as of some one moment in time.
  11. Periodic discrete structure (the same Jan 1, Feb 1, Mar 1 and Apr 1 snapshots): the structure says nothing about values as of any other date.
  12. Periodic discrete structure: for few variables; for slow changing variables. Continuous time span data: for many variables; for quickly changing variables.
  13. Primary data: primary data relates directly to the key. Example: key: ssno; primary data: name, date of birth.
  14. Secondary data: secondary data relates directly to the primary data. Example: key: ssno; primary data: name, date of birth; secondary data: address, zip, phone.
  15. The granular data in the data warehouse: serves as a basis for many other forms of DSS; is instantly available; forms a foundation of reconcilability.
  16. Relational structures; star joins; requirements. The data warehouse is shaped by the data model; the star join world is shaped by requirements.
  17. Often called multi dimensional data; often called atomic data.
  18. Applications; legacy data; operational data; transactional data; atomic data; data warehouse. The source of data warehouse data is the operational environment.
  19. Integration of data in the data warehouse: the source encodings m/f, 1/0, x/y and male/female become a single gender encoding, m/f.
  20. Units of measurement need to be integrated: inches, cms, feet and miles become a single unit of measure, cms.
  21. ETL: extract/transform/load. The integration and conversion of data is the most difficult part of the data warehouse process.
  22. Transformation code can be generated manually or automatically. Automatically is always preferred.
  23. The functions performed by the ETL process are not trivial: convert; reformat; add time element; restructure; new key; add default values; change dbms; change operating system; summarize; break into multiple records; convert key structure; merge records; collect metadata; conform to data model; select data/reject data; add indexes; change encoding; change hardware environments; resequence data; ascii to ebcdic, ebcdic to ascii; partition data.
  24. ETL processing can be performed in different places: ETL performed in host environment; ETL performed in source environment.
  25. Data warehouse: at the center of the decision making of the corporation.
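
Slides 7 to 9 describe continuous time span records: each record carries a from date and a to date, records for one key must not overlap, and a discontinuity between records is allowed. The following is a minimal Python sketch of that idea, not code from the presentation. The `SpanRecord` class, the `check_spans` and `value_as_of` helpers, the inclusive treatment of the to date, and the use of `date.max` as an "end of time" marker are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class SpanRecord:
    """Hypothetical time span record: key, time (from/to), and some data."""
    key: str
    from_date: date
    to_date: date      # assumed inclusive; date.max stands in for "end of time"
    address: str


def check_spans(records):
    """Return (overlaps, gaps) for the time span records of one key."""
    ordered = sorted(records, key=lambda r: r.from_date)
    overlaps, gaps = [], []
    one_day = timedelta(days=1)
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.from_date <= prev.to_date:
            # Two records claim the same day: not allowed.
            overlaps.append((prev, cur))
        elif cur.from_date > prev.to_date + one_day:
            # Days covered by no record: allowed, but worth reporting.
            gaps.append((prev.to_date + one_day, cur.from_date - one_day))
    return overlaps, gaps


def value_as_of(records, when):
    """Return the record in effect on `when`, or None inside a discontinuity."""
    for r in records:
        if r.from_date <= when <= r.to_date:
            return r
    return None
```

Under these assumptions, `check_spans` reports overlaps as errors and gaps as information only, and `value_as_of` returns `None` for any date that falls in a discontinuity.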
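
Slides 10 and 11 describe the periodic discrete structure: a snapshot is taken as of one moment in time, and the structure says nothing about values as of any other date. A minimal Python sketch of that behaviour follows. The field names, the figures and the `snapshot_as_of` helper are invented for illustration and do not come from the presentation.

```python
from datetime import date

# Hypothetical monthly snapshots; every figure here is made up.
SNAPSHOTS = {
    date(2015, 1, 1): {"expenses": 100.0, "revenues": 150.0, "employees": 40},
    date(2015, 2, 1): {"expenses": 110.0, "revenues": 155.0, "employees": 42},
    date(2015, 3, 1): {"expenses": 105.0, "revenues": 160.0, "employees": 42},
}


def snapshot_as_of(snapshots, when):
    """Return the snapshot taken exactly on `when`, or None.

    No interpolation is attempted: a periodic discrete structure
    only answers questions about its snapshot dates.
    """
    return snapshots.get(when)
```

A lookup for `date(2015, 2, 1)` returns the February snapshot, while a lookup for `date(2015, 2, 15)` returns `None` instead of guessing from the neighbouring snapshots.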
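
Slides 19 and 20 show integration during ETL: several source encodings of gender become one m/f encoding, and several units of length become centimetres. The sketch below illustrates one way this could look in Python. The source system names (`app_a` to `app_d`) are hypothetical, and the decision that `1` and `x` mean male is an assumption, since the slides do not say which source value maps to which. The length conversion factors are the standard ones.

```python
# Hypothetical per-source code tables. Which raw value means "m" or "f"
# in the 1/0 and x/y sources is an assumption made for this sketch.
GENDER_MAPS = {
    "app_a": {"m": "m", "f": "f"},
    "app_b": {"1": "m", "0": "f"},
    "app_c": {"x": "m", "y": "f"},
    "app_d": {"male": "m", "female": "f"},
}

# Standard conversion factors to centimetres.
CM_PER_UNIT = {
    "cms": 1.0,
    "inches": 2.54,
    "feet": 30.48,
    "miles": 160934.4,
}


def integrate_gender(source, raw):
    """Map a source-specific gender code to the warehouse encoding m/f."""
    return GENDER_MAPS[source][str(raw).strip().lower()]


def to_cm(value, unit):
    """Convert a length in `unit` to centimetres, the warehouse unit."""
    return value * CM_PER_UNIT[unit]
```

An unknown source or code raises `KeyError` in this sketch, which fails loudly instead of loading unintegrated values. A real ETL process might route such rows to a reject file instead.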
