SlideShare uma empresa Scribd logo
1 de 46
An efficient way of data analysis.
Data warehousing
Data Warehouse (DW) is a Subject oriented,
 integrated, time variant, non-volatile collection of
 data in support of management’s system.
        -W.H.INMON

It is a collection of data designed to support
 management decision making by presenting a
 coherent picture of business conditions at a single
 point of time.
Features of Data Warehouse:-
    Subject-oriented

    Integrated

    Time-variant
                                    Data
    Nonvolatile
                                  Warehouse




                        Subject             Time-     Non
                        -       Integrated
                                           variant   volatile
                        oriente
                        d
Subject-oriented:
    Data are organized according to the subject instead of
     application.
     It mainly focuses on modeling and analysis of data for
     decision makers, not on daily operations or transaction
     processing.

Integrated:
     Constructed by integrating multiple, heterogeneous
      data sources like relational databases, flat files, on-line
      transaction records.
     Ensure consistency in naming conventions, encoding
      structures, attribute measures, etc. among different
      data sources.
Time-variant:
      The time horizon for the data warehouse is significantly
       longer than that of operational systems. i.e. provide
       information from a historical perspective (e.g., past 5-10
       years).


Non-volatile:
    No updates are allowed. Once the data entered into the
     data warehouse, they are never removed.
    The data in warehouse represent company’s history, the
     operational data representing near term history are
     always added to it..
Need for Decision Support
System(DSS) in business
In a typical business environment with an increasing
  competitive market, we cannot ponder over decisions for
  too long.
   Hence the business needs managers with quick decision
     making.
   Therefore, to succeed in business today, any company
     needs information systems that can support the diverse
     information and decision-making needs.
How many businesses appear and disappear
each year?
Decision Support Systems helps us to assess and
 resolve everyday business questions.
It works by compiling useful information from a
 combination of raw data, documents, personal
 knowledge, or business models.
Structured and Unstructured components
of DSS

Every task has a structured component as well as
 an unstructured component.
A Structured component is one which directly
 helps us to proceed towards decision.
An Unstructured component is that which is to
 be still processed and requires human interaction
 with the DSS.
For example, choosing a stock portfolio.
   Here the   Structured components are the amount
    of risk and the performance of stocks, but
   The choice of stocks for a portfolio requires human
    intuition. Hence it is Unstructured.


                                   risk
                                               Structured
          Stock
                                performance
        Portfolio
                                 Choice of
                                 Stock for     Unstructured
                                 portfolio
In order to implement DSS……
DSS architectural styles
 OLTP (Online Transaction Processing)
  -used by traditional operational systems (RDBMS).

 OLAP (Online Analytical Processing)
  -used by Data Warehouse.
The Operational Database is one which is
 accessed by an Operational System to carry out
 regular operations of an organization.
Operational Databases usually use an OLTP
 architecture which is optimized for faster
 transaction processing.
OLTP Databases access the data in the form of
 operations like- Inserting, Deleting, and Updating
 data.
OLTP
OLTP is a methodology to provide end users with
 access to large amounts of data
It works in an intuitive and rapid manner to assist
 with deductions based on investigative reasoning.
OLTP refers to a class of systems that facilitate
 and manage transaction-oriented applications,
 typically for data entry and retrieval transaction
 processing.
A typical OLTP Architecture


An Automatic Teller Machine  (ATM)
 for a bank is an example of a commercial
 transaction processing application.
Benefits of OLTP
1.   Simplicity & Efficiency:
             Reduced paper trails and the faster &
     more accurate forecasts for revenues and
     expenses are both examples of how OLTP
     makes things simpler for businesses.
2.   OLTP systems maintain data integrity and they
     also provide fast query processing in
     environments having multiple access.
Pitfalls of OLTP:-
 OLTP requires instant update.
 The data what we get from OLTP is not suitable
  for data analysis.
 To perform one simple transaction even with
  the normalized structure, we need to query
  multiple tables by using joins.
We usually refer to information as
 attributes related to entities, objects or
 classes, like product price, invoice amount
 or client name. Mapping can be with a
 simple, one argument function:


               product » price
               invoice » amount
               client » name

       each attribute is mapped to one
       column.
What is gross margin by product category
in Europe and Asia?
What's our current inventory by product
and warehouse?
Which was the evolution of return rate of
different products acquired by different
suppliers?
 functions of multiple arguments



   Product category × Region » Gross margin
   Product × Warehouse » Inventory
   Supplier × Time × Product » Return rate



  More than one attribute is mapped to one
  column.
Hence for data reporting and analysis we
 must have a new system.
       DATA WAREHOUSING
DATA WAREHOUSE
Data Warehouse is a database used for data
 reporting and analysis.
The data stored in the warehouse
 are uploaded from the operational systems (such
 as marketing, sales etc.)
The data store contains two main types of data.
    -Business Data
    -Business Data Model
                             Operational
The business data are          data
                                               Business
 extracted from                                  data

 operational database         External
                                data
  and from external
 sources which are relevant to the business (stock
 prices, market indicators, marketing information
 and competitor’s data).
DSS data vs Operational data
The DSS data which is extracted from
 multiple sources differ from operational data
 in three main areas:
         -time span,
             -granularity and
                - dimensionality.
Business data represent a
 snapshot of the company’s
 situation.
 A typical ETL (Extract
   Transform Load)-based
   data warehouse uses
  a) staging,
  b) integration and
  c) access layers (Data
     Marts)
     to house its key
      functions.
The data that arrived at data
 warehouse are first passed to
 Operational Data Store
 (ODS).
Data is integrated from
 multiple sources for
 additional operations on the
 data.
This integrated data is
 passed back to operational
 systems for decision-making.
DSS components
The DSS is usually composed of five main
 components:-
     Data store component
     Data extraction component

     Data filtering component

     End user query tool

     End user presentation tool
Data Sources   Data Storage   OLAP engine Front end tools
The data in the data warehouse is stored in the form
 of Data marts.
It allows the user to access the data in terms of a
 specific business line or team.
Data mart
The data mart is a subset of the data warehouse
 that is usually oriented to a specific business line
 or team.                        Multi-Tier Data
                                                Warehouse
               Distributed Data
               Marts


                                                     Enterprise
        Data              Data
                                                     Data
        Mart              Mart
                                                     Warehouse

           Model refinement       Model refinement

         Define a high-level corporate data model
OLAP(Online Analytical Processing)
OLAP is an approach to answer multi-dimensional
 analytical queries which also encompasses relational
 reporting and data mining.

An OLAP cube is an array of data that is understood
 in terms of its 0 or more dimensions which enables
 the users to gain insight into their data in a fast,
 interactive, easy-to-use manner.
For example, a company might
  wish to summarize financial
  data by product (Maruthi), by
  time-period (May 2006), by
  city (Mumbai) to compare
  actual budget expenses.

 Product, time, city and
  scenario (actual and budget)
  are the data's dimensions.

In this example, the cell
  contains the values of all      OLAP Cube
  Maruthi sales in Mumbai for
  May 2006.
The data in Data Warehouse is arranged in the form
 of hierarchical groups often called dimensions and
 into facts tables and aggregate facts.

OLAP data is typically stored in a Star Schema.
          -which is a combination of dimensions and
 fact tables.
STAR SCHEMA
time
Time_key                                             item
day                                                item_key
day_of_the_week               Sales Fact Table     item_name
month                                              brand
quarter                                time_key    type
year                                               supplier_type
                                       item_key
                                     branch_key
       branch                                      location
                                    location_key
       branch_key                                  location_key
       branch_name                    units_sold   street
       branch_type                                 city
                                    dollars_sold   state_or_province
                                                   country
                                      avg_sales
                  Measures
Fact table consists of the measurements metrics or
 facts of a business process. It is located at the center
 of a star schema surrounded by dimension tables.
In the above example, it holding sales by item, by
 time, by branch and by location.
 Dimension tables contain descriptive attributes (or
 fields) which are typically textual fields or discrete
 numbers that behave like text.
These attributes are designed to serve two critical
 purposes:
         -query constraining/filtering
         -query result set labeling.
OLAP Architecture
OLAP Server
OLAP Server receives the data from data warehouse
 by which it represents the data in a user
 understandable format which actually supply
 analytical functionality for the DSS system.
OLAP Server generally performs data analysis in two
 forms.
              -ROLAP(Relational OLAP)
              -MOLAP(Multi-dimensional OLAP)
ROLAP
 It is a form of OLAP that performs dynamic multi-
 dimensional analysis of data stored in a relational
 database rather than in a multi-dimensional
 database (which is usually considered the OLAP
 standard).
Data processing may take place within the database
 system, a mid-tier server, or the client.
In two-tier architecture, the user submits a
 Structured Query Language (SQL) query to the
 database and receives back the requested data.
Two-Tier Architecture of ROLAP
MOLAP (Multi-dimensional OLAP)

It is a form of OLAP that helps the user to “slice and
 dice” information, providing multi-dimensional
 analysis of data by putting data in a cube structure.

Most MOLAP products use a multi-cube approach in
 which a series of small, dense, pre-calculated cubes
 make a hypercube.
Slicing(above) and Dicing(below) a Cube
It is an ideal tool that
  helps to query data which
  involves time-line (day,
  week, month, year),
  geographical areas (city,
  state, country), various
  product lines or categories
  and channels (sales
  people, stores, etc).
Three kinds of data warehouse applications
 Information processing
     supports querying, basic statistical analysis, and reporting using
      crosstabs, tables, charts and graphs.
 Analytical processing
     multidimensional analysis of data warehouse data.
     supports basic OLAP operations, slice-dice, drilling, pivoting.
 Data mining
     knowledge discovery from hidden patterns .
     supports associations, constructing analytical models,
      performing classification and prediction, and presenting the
      mining results using visualization tools.
Our business in life is not to get ahead of others,
 but to get ahead of ourselves—to break our own
 records, to outstrip our yesterday by our today.
                -Stewart B.Johnson
Queries ??

Mais conteúdo relacionado

Mais procurados

Data Warehouse Architectures
Data Warehouse ArchitecturesData Warehouse Architectures
Data Warehouse Architectures
Theju Paul
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecture
pcherukumalla
 

Mais procurados (20)

3 tier data warehouse
3 tier data warehouse3 tier data warehouse
3 tier data warehouse
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
OLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEOLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSE
 
Data warehouse architecture
Data warehouse architecture Data warehouse architecture
Data warehouse architecture
 
DATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALA
DATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALADATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALA
DATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALA
 
Data Warehousing
Data WarehousingData Warehousing
Data Warehousing
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
Basic Introduction of Data Warehousing from Adiva Consulting
Basic Introduction of  Data Warehousing from Adiva ConsultingBasic Introduction of  Data Warehousing from Adiva Consulting
Basic Introduction of Data Warehousing from Adiva Consulting
 
Data Warehouse Architectures
Data Warehouse ArchitecturesData Warehouse Architectures
Data Warehouse Architectures
 
Warehousing dimension star-snowflake_schemas
Warehousing dimension star-snowflake_schemasWarehousing dimension star-snowflake_schemas
Warehousing dimension star-snowflake_schemas
 
Project Presentation on Data WareHouse
Project Presentation on Data WareHouseProject Presentation on Data WareHouse
Project Presentation on Data WareHouse
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecture
 
Datawarehousing & DSS
Datawarehousing & DSSDatawarehousing & DSS
Datawarehousing & DSS
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data Warehouse 101
Data Warehouse 101Data Warehouse 101
Data Warehouse 101
 
Data warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika KotechaData warehousing - Dr. Radhika Kotecha
Data warehousing - Dr. Radhika Kotecha
 
Data warehouse
Data warehouse Data warehouse
Data warehouse
 

Semelhante a Introduction to Data Warehouse

UNIT-5 DATA WAREHOUSING.docx
UNIT-5 DATA WAREHOUSING.docxUNIT-5 DATA WAREHOUSING.docx
UNIT-5 DATA WAREHOUSING.docx
DURGADEVIL
 
Dataware housing
Dataware housingDataware housing
Dataware housing
work
 
Data warehouse
Data warehouseData warehouse
Data warehouse
MR Z
 
Data warehouse
Data warehouseData warehouse
Data warehouse
_123_
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
work
 

Semelhante a Introduction to Data Warehouse (20)

Data warehousing
Data warehousingData warehousing
Data warehousing
 
Datawarehouse and OLAP
Datawarehouse and OLAPDatawarehouse and OLAP
Datawarehouse and OLAP
 
Introduction to Data Warehouse
Introduction to Data WarehouseIntroduction to Data Warehouse
Introduction to Data Warehouse
 
UNIT-5 DATA WAREHOUSING.docx
UNIT-5 DATA WAREHOUSING.docxUNIT-5 DATA WAREHOUSING.docx
UNIT-5 DATA WAREHOUSING.docx
 
Chapter 2
Chapter 2Chapter 2
Chapter 2
 
SAP BODS -quick guide.docx
SAP BODS -quick guide.docxSAP BODS -quick guide.docx
SAP BODS -quick guide.docx
 
11666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect311666 Bitt I 2008 Lect3
11666 Bitt I 2008 Lect3
 
Dataware housing
Dataware housingDataware housing
Dataware housing
 
DW 101
DW 101DW 101
DW 101
 
Become BI Architect with 1KEY Agile BI Suite - OLAP
Become BI Architect with 1KEY Agile BI Suite - OLAPBecome BI Architect with 1KEY Agile BI Suite - OLAP
Become BI Architect with 1KEY Agile BI Suite - OLAP
 
11667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect411667 Bitt I 2008 Lect4
11667 Bitt I 2008 Lect4
 
dw_concepts_2_day_course.ppt
dw_concepts_2_day_course.pptdw_concepts_2_day_course.ppt
dw_concepts_2_day_course.ppt
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.pptChapter 4. Data Warehousing and On-Line Analytical Processing.ppt
Chapter 4. Data Warehousing and On-Line Analytical Processing.ppt
 
Dw & etl concepts
Dw & etl conceptsDw & etl concepts
Dw & etl concepts
 
DATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining forDATAWAREHOUSE MAIn under data mining for
DATAWAREHOUSE MAIn under data mining for
 
Datawarehousing
DatawarehousingDatawarehousing
Datawarehousing
 
DataWarehousingandAbInitioConcepts.ppt
DataWarehousingandAbInitioConcepts.pptDataWarehousingandAbInitioConcepts.ppt
DataWarehousingandAbInitioConcepts.ppt
 

Último

Último (20)

2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
NO1 Top Black Magic Specialist In Lahore Black magic In Pakistan Kala Ilam Ex...
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Plant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptxPlant propagation: Sexual and Asexual propapagation.pptx
Plant propagation: Sexual and Asexual propapagation.pptx
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptxOn_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
On_Translating_a_Tamil_Poem_by_A_K_Ramanujan.pptx
 
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptxExploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
Exploring_the_Narrative_Style_of_Amitav_Ghoshs_Gun_Island.pptx
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 

Introduction to Data Warehouse

  • 1. An efficient way of data analysis.
  • 2. Data warehousing Data Warehouse (DW) is a Subject oriented, integrated, time variant, non-volatile collection of data in support of management’s system. -W.H.INMON It is a collection of data designed to support management decision making by presenting a coherent picture of business conditions at a single point of time.
  • 3. Features of Data Warehouse:-  Subject-oriented  Integrated  Time-variant Data  Nonvolatile Warehouse Subject Time- Non - Integrated variant volatile oriente d
  • 4. Subject-oriented: Data are organized according to the subject instead of application.  It mainly focuses on modeling and analysis of data for decision makers, not on daily operations or transaction processing. Integrated:  Constructed by integrating multiple, heterogeneous data sources like relational databases, flat files, on-line transaction records.  Ensure consistency in naming conventions, encoding structures, attribute measures, etc. among different data sources.
  • 5. Time-variant:  The time horizon for the data warehouse is significantly longer than that of operational systems. i.e. provide information from a historical perspective (e.g., past 5-10 years). Non-volatile:  No updates are allowed. Once the data entered into the data warehouse, they are never removed.  The data in warehouse represent company’s history, the operational data representing near term history are always added to it..
  • 6. Need for Decision Support System(DSS) in business In a typical business environment with an increasing competitive market, we cannot ponder over decisions for too long. Hence the business needs managers with quick decision making. Therefore, to succeed in business today, any company needs information systems that can support the diverse information and decision-making needs.
  • 7. How many businesses appear and disappear each year?
  • 8. Decision Support Systems helps us to assess and resolve everyday business questions. It works by compiling useful information from a combination of raw data, documents, personal knowledge, or business models.
  • 9. Structured and Unstructured components of DSS Every task has a structured component as well as an unstructured component. A Structured component is one which directly helps us to proceed towards decision. An Unstructured component is that which is to be still processed and requires human interaction with the DSS.
  • 10. For example, choosing a stock portfolio.  Here the Structured components are the amount of risk and the performance of stocks, but  The choice of stocks for a portfolio requires human intuition. Hence it is Unstructured. risk Structured Stock performance Portfolio Choice of Stock for Unstructured portfolio
  • 11. In order to implement DSS……
  • 12. DSS architectural styles  OLTP (Online Transaction Processing) -used by traditional operational systems (RDBMS).  OLAP (Online Analytical Processing) -used by Data Warehouse.
  • 13. The Operational Database is one which is accessed by an Operational System to carry out regular operations of an organization. Operational Databases usually use an OLTP architecture which is optimized for faster transaction processing. OLTP Databases access the data in the form of operations like- Inserting, Deleting, and Updating data.
  • 14. OLTP OLTP is a methodology to provide end users with access to large amounts of data It works in an intuitive and rapid manner to assist with deductions based on investigative reasoning. OLTP refers to a class of systems that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transaction processing.
  • 15. A typical OLTP Architecture An Automatic Teller Machine  (ATM) for a bank is an example of a commercial transaction processing application.
  • 16. Benefits of OLTP 1. Simplicity & Efficiency: Reduced paper trails and the faster & more accurate forecasts for revenues and expenses are both examples of how OLTP makes things simpler for businesses. 2. OLTP systems maintain data integrity and they also provide fast query processing in environments having multiple access.
  • 17. Pitfalls of OLTP:-  OLTP requires instant update.  The data what we get from OLTP is not suitable for data analysis.  To perform one simple transaction even with the normalized structure, we need to query multiple tables by using joins.
  • 18. We usually refer to information as attributes related to entities, objects or classes, like product price, invoice amount or client name. Mapping can be with a simple, one argument function: product » price invoice » amount client » name each attribute is mapped to one column.
  • 19. What is gross margin by product category in Europe and Asia? What's our current inventory by product and warehouse? Which was the evolution of return rate of different products acquired by different suppliers?
  • 20.  functions of multiple arguments Product category × Region » Gross margin Product × Warehouse » Inventory Supplier × Time × Product » Return rate More than one attribute is mapped to one column.
  • 21. Hence for data reporting and analysis we must have a new system. DATA WAREHOUSING
  • 22. DATA WAREHOUSE Data Warehouse is a database used for data reporting and analysis. The data stored in the warehouse are uploaded from the operational systems (such as marketing, sales etc.)
  • 23. The data store contains two main types of data. -Business Data -Business Data Model Operational The business data are data Business extracted from data operational database External data and from external sources which are relevant to the business (stock prices, market indicators, marketing information and competitor’s data).
  • 24. DSS data vs Operational data The DSS data which is extracted from multiple sources differ from operational data in three main areas: -time span, -granularity and - dimensionality.
  • 25. Business data represent a snapshot of the company’s situation.
  • 26.  A typical ETL (Extract Transform Load)-based data warehouse uses a) staging, b) integration and c) access layers (Data Marts) to house its key functions.
  • 27. The data that arrived at data warehouse are first passed to Operational Data Store (ODS). Data is integrated from multiple sources for additional operations on the data. This integrated data is passed back to operational systems for decision-making.
  • 28. DSS components The DSS is usually composed of five main components:-  Data store component  Data extraction component  Data filtering component  End user query tool  End user presentation tool
  • 29. Data Sources Data Storage OLAP engine Front end tools
  • 30. The data in the data warehouse is stored in the form of Data marts. It allows the user to access the data in terms of a specific business line or team.
  • 31. Data mart The data mart is a subset of the data warehouse that is usually oriented to a specific business line or team. Multi-Tier Data Warehouse Distributed Data Marts Enterprise Data Data Data Mart Mart Warehouse Model refinement Model refinement Define a high-level corporate data model
  • 32.
  • 33. OLAP(Online Analytical Processing) OLAP is an approach to answer multi-dimensional analytical queries which also encompasses relational reporting and data mining. An OLAP cube is an array of data that is understood in terms of its 0 or more dimensions which enables the users to gain insight into their data in a fast, interactive, easy-to-use manner.
  • 34. For example, a company might wish to summarize financial data by product (Maruthi), by time-period (May 2006), by city (Mumbai) to compare actual budget expenses.  Product, time, city and scenario (actual and budget) are the data's dimensions. In this example, the cell contains the values of all OLAP Cube Maruthi sales in Mumbai for May 2006.
  • 35. The data in Data Warehouse is arranged in the form of hierarchical groups often called dimensions and into facts tables and aggregate facts. OLAP data is typically stored in a Star Schema. -which is a combination of dimensions and fact tables.
  • 36. STAR SCHEMA time Time_key item day item_key day_of_the_week Sales Fact Table item_name month brand quarter time_key type year supplier_type item_key branch_key branch location location_key branch_key location_key branch_name units_sold street branch_type city dollars_sold state_or_province country avg_sales Measures
  • 37. Fact table consists of the measurements metrics or facts of a business process. It is located at the center of a star schema surrounded by dimension tables. In the above example, it holding sales by item, by time, by branch and by location.  Dimension tables contain descriptive attributes (or fields) which are typically textual fields or discrete numbers that behave like text. These attributes are designed to serve two critical purposes: -query constraining/filtering -query result set labeling.
  • 39. OLAP Server OLAP Server receives the data from data warehouse by which it represents the data in a user understandable format which actually supply analytical functionality for the DSS system. OLAP Server generally performs data analysis in two forms. -ROLAP(Relational OLAP) -MOLAP(Multi-dimensional OLAP)
  • 40. ROLAP  It is a form of OLAP that performs dynamic multi- dimensional analysis of data stored in a relational database rather than in a multi-dimensional database (which is usually considered the OLAP standard). Data processing may take place within the database system, a mid-tier server, or the client. In two-tier architecture, the user submits a Structured Query Language (SQL) query to the database and receives back the requested data.
  • 42. MOLAP (Multi-dimensional OLAP) It is a form of OLAP that helps the user to “slice and dice” information, providing multi-dimensional analysis of data by putting data in a cube structure. Most MOLAP products use a multi-cube approach in which a series of small, dense, pre-calculated cubes make a hypercube.
  • 43. Slicing(above) and Dicing(below) a Cube It is an ideal tool that helps to query data which involves time-line (day, week, month, year), geographical areas (city, state, country), various product lines or categories and channels (sales people, stores, etc).
  • 44. Three kinds of data warehouse applications  Information processing  supports querying, basic statistical analysis, and reporting using crosstabs, tables, charts and graphs.  Analytical processing  multidimensional analysis of data warehouse data.  supports basic OLAP operations, slice-dice, drilling, pivoting.  Data mining  knowledge discovery from hidden patterns .  supports associations, constructing analytical models, performing classification and prediction, and presenting the mining results using visualization tools.
  • 45. Our business in life is not to get ahead of others, but to get ahead of ourselves—to break our own records, to outstrip our yesterday by our today. -Stewart B.Johnson