What is a Data Warehouse?
A data warehouse (or mart) is a way of storing data for later retrieval. This retrieval is almost always used to support decision-making in the organization. That is why many data warehouses are considered to be DSS (Decision-Support Systems).
Both a data warehouse and a data mart are storage mechanisms for read-only, historical, aggregated data.
A data warehouse stores current and historical data.
OLTP:
This is a standard, normalized database structure.

Star schema

  • 1. What is a Data Warehouse?
  • 2. A data warehouse (or mart) is a way of storing data for later retrieval. This retrieval is almost always used to support decision-making in the organization. That is why many data warehouses are considered to be DSS (Decision-Support Systems).
  • 3. Both a data warehouse and a data mart are storage mechanisms for read-only, historical, aggregated data.
  • 5. A data warehouse stores current and historical data. OLTP:
  • 6. This is a standard, normalized database structure.
  • 7.
  • 8.
  • 9. Therefore, with each transaction, these indexes must be updated along with the table. This overhead can significantly decrease our performance.
  • 10. There are some disadvantages to an OLTP structure, especially when we go to retrieve the data for analysis.
  • 11. For one, we now must utilize joins and query multiple tables to get all the data we want. Joins tend to be slower than reading from a single table, so we want to minimize the number of tables in any single query.
  • 12. One of the advantages of OLTP is also a disadvantage: fewer indexes per table.
  • 13. In general terms, the fewer indexes we have, the faster inserts, updates, and deletes will be.
  • 14. However, again in general terms, the fewer indexes we have, the slower select queries will run.
  • 15. Since one of our design goals to speed transactions is to minimize the number of indexes, we are limiting ourselves when it comes to doing data retrieval.
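The join cost described above can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`category`, `product`, `order_line`) and the sample rows are assumptions made up for illustration; they do not come from the slides.

```python
import sqlite3

# Hypothetical normalized (OLTP-style) tables; all names and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category   (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE product    (product_id INTEGER PRIMARY KEY, name TEXT,
                         category_id INTEGER REFERENCES category(category_id));
CREATE TABLE order_line (order_id INTEGER,
                         product_id INTEGER REFERENCES product(product_id),
                         amount REAL);
INSERT INTO category   VALUES (1, 'Dairy Products'), (2, 'Beverages');
INSERT INTO product    VALUES (10, 'Milk', 1), (11, 'Cheese', 1), (12, 'Tea', 2);
INSERT INTO order_line VALUES (100, 10, 4.0), (100, 12, 3.0), (101, 11, 6.5);
""")

# Total sales per category needs two joins, because the category name
# lives in its own table in the normalized design.
oltp_rows = conn.execute("""
SELECT c.name, SUM(ol.amount)
FROM order_line AS ol
JOIN product  AS p ON p.product_id  = ol.product_id
JOIN category AS c ON c.category_id = p.category_id
GROUP BY c.name
ORDER BY c.name
""").fetchall()
print(oltp_rows)
```

Even this small analytical question touches three tables, which is the retrieval overhead a star schema tries to reduce.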
  • 16.
  • 17. It is called a star schema because the entity-relationship diagram between dimensions and fact tables resembles a star where one fact table is connected to multiple dimensions.
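As a rough sketch of that shape, the following uses Python's built-in `sqlite3` module to declare one fact table that points at three dimension tables. The names `fact_sales`, `dim_product`, `dim_time`, and `dim_store` are hypothetical, chosen to match the product, time, and store dimensions mentioned in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical dimension tables (the points of the star).
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE dim_time    (time_key INTEGER PRIMARY KEY, day TEXT, month INTEGER,
                          quarter INTEGER, year INTEGER);
CREATE TABLE dim_store   (store_key INTEGER PRIMARY KEY, store_name TEXT, region TEXT);

-- Hypothetical fact table (the center of the star).
CREATE TABLE fact_sales (
    product_key   INTEGER REFERENCES dim_product(product_key),
    time_key      INTEGER REFERENCES dim_time(time_key),
    store_key     INTEGER REFERENCES dim_store(store_key),
    sales_dollars REAL,
    PRIMARY KEY (product_key, time_key, store_key)
);
""")

# PRAGMA foreign_key_list yields one row per foreign key; index 2 is the
# name of the referenced table.
referenced = sorted(row[2] for row in conn.execute("PRAGMA foreign_key_list(fact_sales)"))
print(referenced)
```

Every foreign key of the fact table leads directly to a dimension table, and no dimension table references another table.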
  • 18.
  • 19. Identify measures or facts (sales dollar).
  • 20. Identify dimensions for facts (product dimension, location dimension, time dimension, organization dimension).
  • 21. List the columns that describe each dimension (region name, branch name).
  • 22.
  • 23. In a star schema, a dimension table will not have any parent table.
  • 24. In a snowflake schema, by contrast, a dimension table will have one or more parent tables.
  • 25. Hierarchies for the dimensions are stored in the dimensional table itself in star schema.
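The difference can be sketched side by side with Python's built-in `sqlite3` module. All table names and the single sample product are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star: the hierarchy (subcategory, category) is stored in the dimension itself.
CREATE TABLE star_dim_product (
    product_key INTEGER PRIMARY KEY,
    product_name TEXT, subcategory TEXT, category TEXT);
INSERT INTO star_dim_product VALUES (1, 'Milk', 'Fresh Milk', 'Dairy Products');

-- Snowflake: the dimension points to a parent table that holds the category.
CREATE TABLE snow_dim_category (category_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE snow_dim_product (
    product_key INTEGER PRIMARY KEY,
    product_name TEXT,
    category_key INTEGER REFERENCES snow_dim_category(category_key));
INSERT INTO snow_dim_category VALUES (1, 'Dairy Products');
INSERT INTO snow_dim_product  VALUES (1, 'Milk', 1);
""")

# Star: the category is read straight from the dimension row.
star_category = conn.execute(
    "SELECT category FROM star_dim_product WHERE product_name = 'Milk'"
).fetchone()[0]

# Snowflake: the same answer requires a join to the parent table.
snow_category = conn.execute("""
SELECT c.category
FROM snow_dim_product AS p
JOIN snow_dim_category AS c ON c.category_key = p.category_key
WHERE p.product_name = 'Milk'
""").fetchone()[0]

print(star_category, snow_category)
```

Both designs return the same category; the star version trades some repeated text in the dimension for one less join per query.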
  • 26.
  • 27. When I talk about “by” conditions, I am referring to looking at data by certain conditions.
  • 28. For example, if we take the question “On a quarterly and then monthly basis, are Dairy Product sales cyclical?” we can break this down into this: “We want to see total sales by category (just Dairy Products in this case), by quarter or by month.”
  • 29. Here we are looking at an aggregated value, the sum of sales, by specific criteria.
  • 30. When we talk about the way we want to look at data, we usually want to see some sort of aggregated data. These data are called measures.
  • 31. These measures are numeric values that are measurable and additive.
  • 32. We need to look at our measures using those “by” conditions. These “by” conditions are called dimensions.
  • 33. When we say we want to know our sales dollars, we almost always mean by day, or by quarter, or by year.
  • 34. These “by” conditions will map into dimensions: there is almost always a time dimension, and product and geographic dimensions are very common as well.
  • 35. Therefore, in designing a star schema, our first order of business is usually to determine
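The Dairy Products question above maps naturally onto an aggregate (the measure) grouped by a “by” condition (the dimension). The sketch below uses Python's built-in `sqlite3` module with a single flattened `sales` table to keep the focus on that mapping; the table, the helper `total_sales_by`, and the figures are all made up for illustration and cover a single year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical flattened sales data for one year.
CREATE TABLE sales (category TEXT, year INTEGER, quarter INTEGER,
                    month INTEGER, sales_dollars REAL);
INSERT INTO sales VALUES
    ('Dairy Products', 2023, 1, 1, 100.0),
    ('Dairy Products', 2023, 1, 2, 120.0),
    ('Dairy Products', 2023, 2, 4,  90.0),
    ('Beverages',      2023, 1, 1,  50.0);
""")

def total_sales_by(level):
    """Sum the measure (sales_dollars) for Dairy Products by one time level."""
    # Only known column names are accepted, so building the SQL text is safe here.
    if level not in ("quarter", "month"):
        raise ValueError(f"unsupported level: {level}")
    query = (
        f"SELECT {level}, SUM(sales_dollars) FROM sales "
        f"WHERE category = ? GROUP BY {level} ORDER BY {level}"
    )
    return conn.execute(query, ("Dairy Products",)).fetchall()

by_quarter = total_sales_by("quarter")
by_month = total_sales_by("month")
print(by_quarter)
print(by_month)
```

The measure stays the same in both calls; only the “by” condition changes, which is why measures and dimensions are identified separately during design.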
  • 36.
  • 37. This key is often just an identity column, consisting of an automatically incrementing number.
  • 38. (The value of the primary key is meaningless; our information is stored in the other fields.)
  • 39. These other fields contain the full descriptions of what we are after.
  • 40. For example, if we have a Product dimension (which is common) we have fields in it that contain the description, the category name, the sub-category name, etc.
  • 41. These fields do not contain codes that link us to other tables. Because the fields are the full descriptions, the dimension tables are often fat; they contain many large fields.
  • 42. Dimension tables are often short, however. We may have many products, but even so, the dimension table cannot compare in size to a normal fact table.
  • 44. Our dimension table might look something like this:
  • 45. Notice that both Category and Subcategory are stored in the table and not linked in through joined tables that store the hierarchy information.
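The figure for this dimension table is not part of the transcript, so the following is only a guess at its general shape, written with Python's built-in `sqlite3` module. The column names and the two products are hypothetical; the points being illustrated are the auto-incrementing key and the full-text Category and Subcategory columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY AUTOINCREMENT,  -- meaningless identity value
    product_name TEXT NOT NULL,
    description  TEXT,
    subcategory  TEXT NOT NULL,  -- full description, not a code into another table
    category     TEXT NOT NULL   -- full description, not a code into another table
)
""")

# Illustrative rows only; the key is assigned by the database.
products = [
    ("Whole Milk 1L", "Pasteurized whole milk, one litre carton", "Milk", "Dairy Products"),
    ("Cheddar 200g", "Mature cheddar, 200 gram block", "Cheese", "Dairy Products"),
]
conn.executemany(
    "INSERT INTO dim_product (product_name, description, subcategory, category) "
    "VALUES (?, ?, ?, ?)",
    products,
)

dim_rows = conn.execute(
    "SELECT product_key, product_name, subcategory, category "
    "FROM dim_product ORDER BY product_key"
).fetchall()
print(dim_rows)
```

The rows are wide (several text columns each) but few, which matches the description of dimension tables as fat and short.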
  • 46.
  • 47. A fact table typically has two types of columns: those that contain facts and those that are foreign keys to dimension tables.
  • 48. The primary key of a fact table is usually a composite key that is made up of all of its foreign keys.
  • 49. A fact table might contain either detail level facts or facts that have been aggregated (fact tables that contain aggregated facts are often instead called summary tables).
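One consequence of a composite primary key built from the foreign keys is that each combination of dimension members can appear only once. A minimal sketch with Python's built-in `sqlite3` module follows; the table and column names are hypothetical, and the foreign key declarations to the dimension tables are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE fact_sales (
    product_key   INTEGER NOT NULL,  -- would reference a product dimension
    time_key      INTEGER NOT NULL,  -- would reference a time dimension
    store_key     INTEGER NOT NULL,  -- would reference a store dimension
    sales_dollars REAL    NOT NULL,  -- the fact (measure)
    PRIMARY KEY (product_key, time_key, store_key)
)
""")

conn.execute("INSERT INTO fact_sales VALUES (1, 20240101, 7, 12.5)")

# A second row for the same product, day, and store violates the composite key.
try:
    conn.execute("INSERT INTO fact_sales VALUES (1, 20240101, 7, 99.0)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

fact_count = conn.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
print(duplicate_rejected, fact_count)
```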
  • 50.
  • 51. Identify measures or facts (sales dollar).
  • 52. Identify dimensions for facts (product dimension, location dimension, time dimension, organization dimension).
  • 53. List the columns that describe each dimension (region name, branch name).
  • 54.
  • 55. The measures are numeric and additive across some or all of the dimensions.
  • 56. For example, sales are numeric and we can look at total sales for a product, or category, and we can look at total sales by any time period.
  • 57. While the dimension tables are short and fat, the fact tables are generally long and skinny.
  • 58. They are long because they can hold the number of records represented by the product of the counts in all the dimension tables.
  • 59. In this schema, we have product, time and store dimensions. If we assume we have ten years of daily data, 200 stores, and we sell 500 products, we have a potential of 365,000,000 records (3650 days * 200 stores * 500 products). As you can see, this makes the fact table long.
  • 60. The fact table is skinny because of the fields it holds. The primary key is made up of foreign keys that have migrated from the dimension tables.
  • 61. These fields are just some sort of numeric value. In addition, our measures are also numeric. Therefore, the size of each record is generally much smaller than those in our dimension tables.
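The upper bound on fact table rows quoted above is simply the product of the dimension counts. A small Python check, using the figures from the example (the dictionary keys are just labels chosen here):

```python
from math import prod

# Counts taken from the example: ten years of daily data, 200 stores, 500 products.
dimension_counts = {"days": 3650, "stores": 200, "products": 500}

# Every combination of one day, one store, and one product could hold one fact row.
max_fact_rows = prod(dimension_counts.values())
print(f"{max_fact_rows:,}")
```

This is a ceiling rather than a forecast: a row exists only for combinations that actually occurred, so a real fact table is often much smaller than this product.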
  • 62.
  • 63. Non-additive: measures that cannot be added across any dimension.
  • 64. Semi-additive: measures that can be added across some dimensions but not across others.
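A stock level is a common textbook example of a semi-additive measure; it is not taken from the slides and is assumed here for illustration. The sketch below, using Python's built-in `sqlite3` module and a hypothetical `fact_inventory` table, shows that summing units on hand across stores is meaningful, while summing the same column across days is not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical daily inventory snapshot: units on hand per store per day.
CREATE TABLE fact_inventory (day TEXT, store TEXT, units_on_hand INTEGER);
INSERT INTO fact_inventory VALUES
    ('2024-01-01', 'A', 10), ('2024-01-01', 'B', 5),
    ('2024-01-02', 'A',  8), ('2024-01-02', 'B', 7);
""")

# Across the store dimension: total stock held on one day is a meaningful number.
total_on_day = conn.execute(
    "SELECT SUM(units_on_hand) FROM fact_inventory WHERE day = '2024-01-02'"
).fetchone()[0]

# Across the time dimension: the sum counts the same stock on every day it
# was held, so the result does not describe any real quantity.
naive_sum_over_time = conn.execute(
    "SELECT SUM(units_on_hand) FROM fact_inventory WHERE store = 'A'"
).fetchone()[0]

# One usual alternative over time is the closing (latest) balance.
closing_balance = conn.execute(
    "SELECT units_on_hand FROM fact_inventory WHERE store = 'A' "
    "ORDER BY day DESC LIMIT 1"
).fetchone()[0]

print(total_on_day, naive_sum_over_time, closing_balance)
```

A ratio such as a margin percentage would be a non-additive measure under the same reasoning, since adding percentages across any dimension gives no meaningful total.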