Data Warehousing – Dimensions | Star and Snowflake Schemas

Eric Matthews - DataWithUs
Defining Some Key Terms
 Dimension
    • Data element
    • Categorizes each item in a data set
    • Provides structured labeling/tagging
    • Dimensions can consist of hierarchies. For example, a Date
      dimension can roll up to Month, Quarter, and Year
    • Dimension tables carry the keys (typically a primary key) that
      fact tables reference as foreign keys
 Dimension – Primary Role
    • Data filtering
    • Data grouping
    • Data labeling

 Fact
    • A measured, counted, or aggregated event. For example:
      Sales, Admissions, Blood Pressure, and Inventory can all be
      construed as “facts”
    • Fact tables contain the foreign keys needed to join to their
      dimension tables
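
The three dimension roles above (filtering, grouping, labeling) can be sketched with a small in-memory SQLite database. This is only an illustration: the table names dim_product and fact_sales, their columns, and the sample rows are assumptions and do not come from the slides.

```python
import sqlite3

# Hypothetical tables and rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT,
        category     TEXT
    );
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        units       INTEGER,
        amount      REAL
    );
    INSERT INTO dim_product VALUES
        (1, 'Aspirin',   'Pharmacy'),
        (2, 'Bandage',   'Supplies'),
        (3, 'Ibuprofen', 'Pharmacy');
    INSERT INTO fact_sales VALUES
        (1, 10, 50.0), (2, 5, 10.0), (3, 4, 32.0), (1, 2, 10.0);
""")


def sales_by_product(conn, category):
    """Filter on one dimension attribute, group and label by another."""
    return conn.execute(
        """
        SELECT d.product_name, SUM(f.units), SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS d ON d.product_key = f.product_key
        WHERE d.category = ?          -- filtering
        GROUP BY d.product_name       -- grouping (and the row label)
        ORDER BY d.product_name
        """,
        (category,),
    ).fetchall()


pharmacy_rows = sales_by_product(conn, "Pharmacy")
```

The fact table holds only keys and measures; every human-readable label in the result comes from the dimension table.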
Defining Some Key Terms (continued)
 Conformed Dimension
    • A common set of data structures/attributes
    • Can cut across many facts, but…
    • The row headers in answers drawn from different fact tables
      must match exactly, or…
    • One dimension can be an exact subset of the other

 These definitions will become clearer as we look at some
 examples.
Star Schema
   • Simplest form of dimensional modeling
   • Consists of dimension table(s) modeled around a fact table
   • Optimized for querying large data sets
Star Schema (Logical)
   [Diagram: a central Fact Table (keys, facts) joined to four dimension tables]
   • Dimension Table: Date/Time
   • Dimension Table: Patient Demographics
   • Dimension Table: Referring Physician
   • Dimension Table: Insurance Carrier
Star Schema – Talking Points for Next Diagram
Note: Have original table schema as point of reference.


  • Discuss aggregation from source table to fact table, rolling
    up totals (how this needed to be done).
  • Discuss the notion of rolling up fact tables to create other
    fact tables (use account type, financial class, and service
    code columns in the fact table for basis of discussion)
  • Discuss some of the pitfalls of dimension tables by using
    the physician dimension as an example (example:
    Physicians can change jobs)
  • Discuss the Date Dimension from the perspective of the
    data in the table… which transitions us to a key point…

  …which is similar to how one needs to resolve foreign keys in
  reporting: the dimension table is a table form of the same
  concept.

  Additionally, if one has well-defined master data, the dimension
  tables can be populated from a columnar subset of the source
  master data table.
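
Two of the talking points above, rolling source detail up into a fact table and populating a dimension from a columnar subset of master data, might look roughly like the sketch below. The slides do not show the original source schema, so src_transactions, src_physician_master, their columns, the sample rows, and the balance formula are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source tables; the original schema is not in the slides.
    CREATE TABLE src_transactions (
        acct_num TEXT,
        txn_type TEXT,   -- 'CHARGE', 'PAYMENT', or 'ADJUSTMENT'
        amount   REAL
    );
    CREATE TABLE src_physician_master (
        physician_id   TEXT PRIMARY KEY,
        physician_name TEXT,
        affiliation    TEXT,
        license_no     TEXT,
        pager          TEXT
    );
    INSERT INTO src_transactions VALUES
        ('A1', 'CHARGE',     500.0),
        ('A1', 'CHARGE',     250.0),
        ('A1', 'PAYMENT',    300.0),
        ('A1', 'ADJUSTMENT',  50.0),
        ('A2', 'CHARGE',     120.0);
    INSERT INTO src_physician_master VALUES
        ('MD7', 'Dr. Example', 'General Hospital', 'L-001', '555-0100');

    -- Roll transaction detail up to one fact row per account.
    CREATE TABLE fact_acct_rollup AS
    SELECT
        acct_num,
        SUM(CASE WHEN txn_type = 'CHARGE'     THEN amount ELSE 0.0 END) AS tot_charges,
        SUM(CASE WHEN txn_type = 'PAYMENT'    THEN amount ELSE 0.0 END) AS tot_payments,
        SUM(CASE WHEN txn_type = 'ADJUSTMENT' THEN amount ELSE 0.0 END) AS tot_adjustments
    FROM src_transactions
    GROUP BY acct_num;

    -- Populate the dimension from a columnar subset of the master table.
    CREATE TABLE dim_physician AS
    SELECT physician_id, physician_name, affiliation
    FROM src_physician_master;
""")


def balance_by_account(conn):
    """Assumed formula: balance = charges - payments - adjustments."""
    rows = conn.execute("""
        SELECT acct_num, tot_charges - tot_payments - tot_adjustments
        FROM fact_acct_rollup
        ORDER BY acct_num
    """).fetchall()
    return dict(rows)


balances = balance_by_account(conn)
dim_columns = [row[1] for row in conn.execute("PRAGMA table_info(dim_physician)")]
```

Only the columns useful for filtering, grouping, and labeling are carried into dim_physician; operational fields such as the license number stay in the master table.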
Fact Table: Acct Fin Rollup
    ACCT_NUM
    ACCT_PTPTR
    ACCT_GUARANTOR_ID
    ACCT_REFERRING_MD
    ACCT_START_DATE
    ACCT_END_DATE
    PLAN_SEQ1
    ACCT_TYPE
    FC
    HOSPITAL_SERVICE_CODE
    TOT_TOTAL_CHARGES
    TOT_TOTAL_PAYMENTS
    TOT_TOTAL_ADJUSTMENTS
    TOT_BALANCE

Dimension Table: Date
    WEEK
    YEAR
    QUARTER
    MONTH

Dimension Table: Patient
    ACCT_PTPTR
    PATIENT_NAME
    CITY
    STATE
    ZIP

Dimension Table: Insurance Plan/Carrier
    PLAN_SEQ1
    PLAN_NAME
    CARRIER
    CITY
    STATE
    ZIP

Dimension Table: Referring Physician
    ACCT_REFERRING_MD
    PHYSICIAN_NAME
    AFFILIATION
    AFFILIATION_CITY
    AFFILIATION_STATE
    AFFILIATION_ZIP
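
As a rough sketch of how this fact table could be queried and rolled up into a coarser fact table, the example below uses a subset of the column names from the slide. The column types, the table names fact_acct_fin_rollup, dim_insurance_plan, and fact_fin_by_type_fc, the ACCT_COUNT column, and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column names follow the slide; types and sample rows are assumed.
    CREATE TABLE dim_insurance_plan (
        PLAN_SEQ1 INTEGER PRIMARY KEY,
        PLAN_NAME TEXT,
        CARRIER   TEXT
    );
    CREATE TABLE fact_acct_fin_rollup (
        ACCT_NUM              TEXT PRIMARY KEY,
        PLAN_SEQ1             INTEGER REFERENCES dim_insurance_plan(PLAN_SEQ1),
        ACCT_TYPE             TEXT,
        FC                    TEXT,
        HOSPITAL_SERVICE_CODE TEXT,
        TOT_TOTAL_CHARGES     REAL,
        TOT_TOTAL_PAYMENTS    REAL
    );
    INSERT INTO dim_insurance_plan VALUES
        (1, 'Plan A', 'Carrier X'),
        (2, 'Plan B', 'Carrier Y');
    INSERT INTO fact_acct_fin_rollup VALUES
        ('A1', 1, 'INPATIENT',  'COMMERCIAL', 'MED', 1000.0, 400.0),
        ('A2', 1, 'INPATIENT',  'COMMERCIAL', 'SUR', 2500.0, 900.0),
        ('A3', 2, 'OUTPATIENT', 'SELF_PAY',   'MED',  300.0,  50.0);

    -- A coarser fact table derived from the account-level one.
    CREATE TABLE fact_fin_by_type_fc AS
    SELECT ACCT_TYPE,
           FC,
           COUNT(*)                AS ACCT_COUNT,
           SUM(TOT_TOTAL_CHARGES)  AS TOT_TOTAL_CHARGES,
           SUM(TOT_TOTAL_PAYMENTS) AS TOT_TOTAL_PAYMENTS
    FROM fact_acct_fin_rollup
    GROUP BY ACCT_TYPE, FC;
""")

# Rolled-up fact rows, one per (account type, financial class).
rollup = conn.execute(
    "SELECT * FROM fact_fin_by_type_fc ORDER BY ACCT_TYPE, FC"
).fetchall()

# Star join: the carrier label comes from the dimension table.
charges_by_carrier = conn.execute("""
    SELECT d.CARRIER, SUM(f.TOT_TOTAL_CHARGES)
    FROM fact_acct_fin_rollup AS f
    JOIN dim_insurance_plan AS d ON d.PLAN_SEQ1 = f.PLAN_SEQ1
    GROUP BY d.CARRIER
    ORDER BY d.CARRIER
""").fetchall()
```

The derived table trades account-level detail for fewer rows, which is the usual reason to build one fact table from another.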
Snowflake Schema
    • Think Star Schema where the dimension tables are
      normalized

    • Can be used to segregate rows in dimension tables that
      have a high percentage of null data (for faster lookup:
      some database engines do not index null values)
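Snowflake Schema
   [Diagram: the fact table joins to Product Info, which in turn joins to Supplier Info]
   • Fact Table: product_key, Units, Cost Per Unit
   • Dimension Table (Product Info): product_key, supplier_key
   • Dimension Table (Supplier Info): supplier_key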
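
A minimal sketch of the snowflake layout in this diagram follows. The key columns (product_key, supplier_key) and the measures (units, cost per unit) come from the slide; the table names, the descriptive columns, and the sample rows are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Key columns follow the slide; other columns and rows are assumed.
    CREATE TABLE dim_supplier (
        supplier_key  INTEGER PRIMARY KEY,
        supplier_name TEXT
    );
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        supplier_key INTEGER REFERENCES dim_supplier(supplier_key),
        product_name TEXT
    );
    CREATE TABLE fact_sales (
        product_key   INTEGER REFERENCES dim_product(product_key),
        units         INTEGER,
        cost_per_unit REAL
    );
    INSERT INTO dim_supplier VALUES (10, 'Acme'), (20, 'Globex');
    INSERT INTO dim_product VALUES
        (1, 10, 'Widget'), (2, 10, 'Gadget'), (3, 20, 'Gizmo');
    INSERT INTO fact_sales VALUES
        (1, 4, 2.5), (2, 1, 10.0), (3, 3, 4.0), (1, 2, 2.5);
""")

# Reaching a supplier attribute takes two joins: fact -> product -> supplier.
cost_by_supplier = conn.execute("""
    SELECT s.supplier_name, SUM(f.units * f.cost_per_unit) AS total_cost
    FROM fact_sales AS f
    JOIN dim_product  AS p ON p.product_key  = f.product_key
    JOIN dim_supplier AS s ON s.supplier_key = p.supplier_key
    GROUP BY s.supplier_name
    ORDER BY s.supplier_name
""").fetchall()
```

In a star schema the supplier attributes would be repeated on each product row and the second join would disappear; the snowflake form stores each supplier once at the cost of that extra join.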
Conformed Dimension
  A conformed dimension is a set of data attributes that has been
  physically implemented in multiple tables using the same structure. A
  conformed dimension can be applied to different fact tables. For
  example:

  [Diagram: one dimension table shared by three fact tables]
   • Dimension Table: Patient Demographics (Gender, Age)
   • Fact Table: Hypertension Studies
   • Fact Table: Lab Results
   • Fact Table: Diabetes Assessment

  Note: The classic example of a conformed dimension is date. I wanted
  to offer a different example.
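
One way to picture why matching row headers matter is a "drill across" query: two fact tables are summarized separately against the same dimension and the answers are merged on the shared labels. In the sketch below the dimension attributes (gender, age) and the fact table subjects come from the slide; the table names, the measures systolic and glucose, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Measures and rows are invented for illustration.
    CREATE TABLE dim_patient_demographics (
        patient_key INTEGER PRIMARY KEY,
        gender      TEXT,
        age         INTEGER
    );
    CREATE TABLE fact_hypertension_studies (
        patient_key INTEGER REFERENCES dim_patient_demographics(patient_key),
        systolic    INTEGER
    );
    CREATE TABLE fact_lab_results (
        patient_key INTEGER REFERENCES dim_patient_demographics(patient_key),
        glucose     INTEGER
    );
    INSERT INTO dim_patient_demographics VALUES
        (1, 'F', 54), (2, 'M', 61), (3, 'F', 47);
    INSERT INTO fact_hypertension_studies VALUES (1, 140), (2, 150), (3, 120);
    INSERT INTO fact_lab_results VALUES (1, 100), (2, 130), (3, 90);
""")


def avg_by_gender(conn, fact_table, measure):
    """Average one measure per gender using the shared dimension.

    fact_table and measure are trusted identifiers in this sketch,
    never user input, so string formatting is acceptable here.
    """
    sql = f"""
        SELECT d.gender, AVG(f.{measure})
        FROM {fact_table} AS f
        JOIN dim_patient_demographics AS d ON d.patient_key = f.patient_key
        GROUP BY d.gender
    """
    return dict(conn.execute(sql).fetchall())


systolic = avg_by_gender(conn, "fact_hypertension_studies", "systolic")
glucose = avg_by_gender(conn, "fact_lab_results", "glucose")

# Drill across: identical row headers let the two answers be merged.
combined = {g: (systolic[g], glucose[g]) for g in sorted(systolic)}
```

Because both answers are labeled by the same dimension, the gender values line up exactly and the merge needs no translation step.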
Transition to Next Point of Discussion

  Star and Snowflake schemas are optimized for
  querying large data sets.

  They should support:
      • OLAP cubes
      • Business Intelligence and Analytic Applications
      • Ad hoc queries
The End
