USING THE RIGHT DATA MODEL IN A DATA MART
DAVID M WALKER
DATA MANAGEMENT & WAREHOUSING
INTRODUCTION

 •  The concept of a Data Mart as the data access
    interface layer for Business Intelligence has been
    around for over 25 years
 •  Kimball style Dimensional Modelling and Star
    Schemas have become the de facto data
    modelling technique for data marts
 •  These have been and continue to be hugely
    successful with relational databases and reporting
    tools – but are they the right tool for today's
    technologies?


March 2012         © 2012 Data Management & Warehousing   2
WHY IS A STAR SCHEMA SO
                   SUCCESSFUL?
 •  There are three main reasons for creating star
    schemas and for their wide acceptance as a technique

     •  Simpler for users to understand

     •  Highly performant user queries

     •  Optimal disk storage usage




WHAT IS A STAR SCHEMA?
 •  A star schema consists of two parts
     •  Facts:
        Measurable numeric and/or time data about an event
     •  Dimensions:
        Descriptive attributes about the event that give the facts a
        context
 •  Facts are stored at a uniform level of detail known as the
    grain of the data
 •  A star schema consists of a fact table and a number of
    associated dimension tables

 DATE DIMENSION
     •  Date Surrogate Key
     •  Date
     •  Day
     •  Month
     •  Year
     •  Public Holiday Flag

 STORE DIMENSION
     •  Store Surrogate Key
     •  Store Name
     •  Store Number
     •  Store Postcode
     •  Store Town
     •  Store Region

 SALES FACTS
     •  Date Surrogate Key
     •  Store Surrogate Key
     •  Customer Surrogate Key
     •  Product Surrogate Key
     •  Sale Time
     •  Sale Quantity
     •  Sale Unit Price

 CUSTOMER DIMENSION
     •  Customer Surrogate Key
     •  Customer Loyalty Number
     •  Customer Gender
     •  Customer Postcode
     •  Customer Town
     •  Customer Region

 PRODUCT DIMENSION
     •  Product Surrogate Key
     •  Product SKU
     •  Product Name
     •  Product Category
     •  Product Group
     •  Temperature Group
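
As a rough illustration, the example star schema could be declared along the following lines. This is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The column names (for example DATE_SKEY and SALES_QUANTITY) follow the example query used elsewhere in the deck where possible; the remaining names and all data types are assumptions derived from the attribute names on the slide.

```python
import sqlite3

# Hypothetical DDL for the example star schema. Column names and types are
# assumptions based on the attribute names listed on the slide.
STAR_SCHEMA_DDL = """
CREATE TABLE DATE_DIMENSION (
    DATE_SKEY           INTEGER PRIMARY KEY,
    DATE                TEXT,
    DAY                 INTEGER,
    MONTH               TEXT,
    YEAR                TEXT,
    PUBLIC_HOLIDAY_FLAG TEXT
);
CREATE TABLE STORE_DIMENSION (
    STORE_SKEY     INTEGER PRIMARY KEY,
    STORE_NAME     TEXT,
    STORE_NUMBER   TEXT,
    STORE_POSTCODE TEXT,
    STORE_TOWN     TEXT,
    STORE_REGION   TEXT
);
CREATE TABLE CUSTOMER_DIMENSION (
    CUSTOMER_SKEY           INTEGER PRIMARY KEY,
    CUSTOMER_LOYALTY_NUMBER TEXT,
    CUSTOMER_GENDER         TEXT,
    CUSTOMER_POSTCODE       TEXT,
    CUSTOMER_TOWN           TEXT,
    CUSTOMER_REGION         TEXT
);
CREATE TABLE PRODUCT_DIMENSION (
    PRODUCT_SKEY      INTEGER PRIMARY KEY,
    PRODUCT_SKU       TEXT,
    PRODUCT_NAME      TEXT,
    PRODUCT_CATEGORY  TEXT,
    PRODUCT_GROUP     TEXT,
    TEMPERATURE_GROUP TEXT
);
-- One row per sale (the grain); each surrogate key points at one dimension row.
CREATE TABLE SALES_FACTS (
    DATE_SKEY       INTEGER NOT NULL REFERENCES DATE_DIMENSION (DATE_SKEY),
    STORE_SKEY      INTEGER NOT NULL REFERENCES STORE_DIMENSION (STORE_SKEY),
    CUSTOMER_SKEY   INTEGER NOT NULL REFERENCES CUSTOMER_DIMENSION (CUSTOMER_SKEY),
    PRODUCT_SKEY    INTEGER NOT NULL REFERENCES PRODUCT_DIMENSION (PRODUCT_SKEY),
    SALE_TIME       TEXT,
    SALES_QUANTITY  INTEGER,
    SALE_UNIT_PRICE REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(STAR_SCHEMA_DDL)

# List the tables that were created and the foreign keys on the fact table.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
fact_keys = conn.execute("PRAGMA foreign_key_list(SALES_FACTS)").fetchall()
print(tables)
print(len(fact_keys), "foreign keys on SALES_FACTS")
```

The fact table carries only surrogate keys and measures, so every descriptive attribute lives in exactly one dimension table.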



STAR SCHEMAS:
  SIMPLER FOR USERS TO UNDERSTAND
 •  Intuitive grouping of information
     •  e.g. All customer data in one dimension, all store data in
        another, etc.
 •  Much easier queries than on a full relational schema
     •  Consequently harder to get the wrong answer because of
        the wrong join
 •  All data is at the same level of granularity
     •  Consequently harder to get the wrong answer because of
        mismatched levels of data

 Example query to get the total sales quantity in each product
 category for March 2012 by female customers in stores in the
 South West region:

     select     P.PRODUCT_CATEGORY,
                sum(SALES_QUANTITY)
     from       SALES_FACTS F,
                DATE_DIMENSION D,
                STORE_DIMENSION S,
                CUSTOMER_DIMENSION C,
                PRODUCT_DIMENSION P
     where      MONTH = 'March'
     and        YEAR = '2012'
     and        CUSTOMER_GENDER = 'Female'
     and        STORE_REGION = 'South West'
     and        F.DATE_SKEY = D.DATE_SKEY
     and        F.STORE_SKEY = S.STORE_SKEY
     and        F.CUSTOMER_SKEY = C.CUSTOMER_SKEY
     and        F.PRODUCT_SKEY = P.PRODUCT_SKEY
     group by   P.PRODUCT_CATEGORY
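
To see the example query run end to end, the sketch below loads a few invented rows into an in-memory SQLite database and executes an equivalent query. It is only an illustration: the sample data is made up, the tables carry just the columns the query needs, the region column is assumed to be called STORE_REGION (matching the Store Region attribute), and the joins are written in explicit JOIN ... ON form with a GROUP BY on the product category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DATE_DIMENSION     (DATE_SKEY INTEGER PRIMARY KEY, MONTH TEXT, YEAR TEXT);
CREATE TABLE STORE_DIMENSION    (STORE_SKEY INTEGER PRIMARY KEY, STORE_REGION TEXT);
CREATE TABLE CUSTOMER_DIMENSION (CUSTOMER_SKEY INTEGER PRIMARY KEY, CUSTOMER_GENDER TEXT);
CREATE TABLE PRODUCT_DIMENSION  (PRODUCT_SKEY INTEGER PRIMARY KEY, PRODUCT_CATEGORY TEXT);
CREATE TABLE SALES_FACTS (
    DATE_SKEY INTEGER, STORE_SKEY INTEGER, CUSTOMER_SKEY INTEGER,
    PRODUCT_SKEY INTEGER, SALES_QUANTITY INTEGER
);

-- Invented sample data, for illustration only.
INSERT INTO DATE_DIMENSION     VALUES (1, 'March', '2012'), (2, 'April', '2012');
INSERT INTO STORE_DIMENSION    VALUES (1, 'South West'), (2, 'North East');
INSERT INTO CUSTOMER_DIMENSION VALUES (1, 'Female'), (2, 'Male');
INSERT INTO PRODUCT_DIMENSION  VALUES (1, 'Dairy'), (2, 'Bakery');
INSERT INTO SALES_FACTS VALUES
    (1, 1, 1, 1, 3),   -- March, South West, female, Dairy: counted
    (1, 1, 1, 1, 2),   -- March, South West, female, Dairy: counted
    (1, 1, 1, 2, 4),   -- March, South West, female, Bakery: counted
    (2, 1, 1, 1, 9),   -- April: filtered out
    (1, 2, 1, 1, 9),   -- North East: filtered out
    (1, 1, 2, 2, 9);   -- male customer: filtered out
""")

QUERY = """
SELECT   P.PRODUCT_CATEGORY, SUM(F.SALES_QUANTITY)
FROM     SALES_FACTS F
JOIN     DATE_DIMENSION     D ON F.DATE_SKEY     = D.DATE_SKEY
JOIN     STORE_DIMENSION    S ON F.STORE_SKEY    = S.STORE_SKEY
JOIN     CUSTOMER_DIMENSION C ON F.CUSTOMER_SKEY = C.CUSTOMER_SKEY
JOIN     PRODUCT_DIMENSION  P ON F.PRODUCT_SKEY  = P.PRODUCT_SKEY
WHERE    D.MONTH = ? AND D.YEAR = ?
AND      C.CUSTOMER_GENDER = ?
AND      S.STORE_REGION = ?
GROUP BY P.PRODUCT_CATEGORY
ORDER BY P.PRODUCT_CATEGORY
"""

result = conn.execute(QUERY, ("March", "2012", "Female", "South West")).fetchall()
print(result)
```

Every join follows the same pattern (fact surrogate key equals dimension surrogate key), which is why it is hard to pick a wrong join path in a star schema.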




STAR SCHEMAS:
   HIGHLY PERFORMANT USER QUERIES
 •  Dimensional data has an enforced one-to-many relationship
    with the fact table
 •  Filtering occurs on the (smaller) dimensions
     •  e.g. where YEAR = '2012'
 •  Aggregation takes place only on the relevant subset of the
    facts
     •  e.g. sum (SALES_QUANTITY)

 [Diagram: the example star schema, with SALES FACTS linked to the
 DATE, STORE, CUSTOMER and PRODUCT dimensions]



STAR SCHEMAS:
             OPTIMAL DISK STORAGE USAGE
 •  Suppose STORE_REGION:
    •  had 10 discrete values
    •  was stored in the example SALES_FACTS table
    •  was on average 10 bytes long
 •  This one field alone would require an additional 1 TB
    of storage
 •  Not storing it in the fact table also improves query
    performance by reducing the disk I/O required to
    retrieve the information

 [Diagram: the example star schema, with Store Region held in the
 STORE DIMENSION rather than in SALES FACTS]
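
The arithmetic behind the 1 TB figure can be sketched as follows. The fact table row count is an assumption: the slide text does not state it, but 10 bytes per row only adds up to 1 TB (taken here as 10^12 bytes) if the fact table holds about 100 billion rows. The number of stores is also an invented figure, used only to show how small the same field is when it is held once per store in the dimension.

```python
def field_storage_bytes(rows: int, avg_field_bytes: int) -> int:
    """Approximate space taken by one field: rows multiplied by average width."""
    return rows * avg_field_bytes


AVG_REGION_BYTES = 10          # from the slide: average length of STORE_REGION
FACT_ROWS = 100_000_000_000    # assumption: row count implied by the 1 TB figure
STORE_ROWS = 1_000             # assumption: number of rows in STORE_DIMENSION

TERABYTE = 10**12              # decimal terabyte

in_fact_table = field_storage_bytes(FACT_ROWS, AVG_REGION_BYTES)
in_dimension = field_storage_bytes(STORE_ROWS, AVG_REGION_BYTES)

print(f"STORE_REGION in the fact table: {in_fact_table / TERABYTE:.1f} TB")
print(f"STORE_REGION in the dimension:  {in_dimension} bytes")
```

Under these assumptions the field costs a terabyte when repeated on every fact row and a few kilobytes when held in the store dimension.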



SCHEMAS:
                        THE ALTERNATIVES
RELATIONAL
    Usually used for data warehouses rather than data marts.
    Favoured solution on MPP technologies due to their power.

SNOWFLAKE
    Favours saving some space in exchange for added user query
    complexity; usually a techie compromise.

STAR
    De facto standard for data mart design based on traditional
    technologies. Also used as source for OLAP cubes.

RESULT SET
    Large single table with the entire result set; optimal in some
    circumstances.

[Figure: Complexity, Speed and Space ratings for each schema type]




STAR SCHEMAS:
             TECHNOLOGY ASSUMPTIONS
 •  There are two major and often unspoken assumptions about
    the technologies used to build this sort of environment:

 •  Firstly: The database used is a row store database and not a
    column store database
 •  Secondly: That users will be running reporting tools and OLAP
    cubes to access the data

 •  Neither of these assumptions is necessarily true. The last 10
    years have seen massive innovation in Business Intelligence
    technologies that will have an impact on the chosen
    architectural solution. Using alternate technologies means
    that you should challenge existing designs and embrace
    appropriate new designs in order to exploit the technology


UNDERSTAND THE DESIGN IMPACT
      OF ALTERNATE TECHNOLOGIES
 •  Column Store Databases:
     •  What is a column store database?
     •  Why are column store databases efficient?
     •  How does this affect data mart design?

 •  The use of alternate reporting mechanisms:
     •  The user requirement gap
     •  How users have filled the gap




WHAT IS A
              COLUMN STORE DATABASE?
 •  Traditionally databases are ‘row-based’, i.e. the fields
    of data in a record are stored next to each other:
           Forename    Surname     Gender
           David       Walker      Male
           Helen       Walker      Female
           Sheila      Jones       Female

 •  Column store databases store the values in columns
    and then hold a mapping to form the record
 •  This is transparent to the user, who queries a table
    with SQL in exactly the same way as they would a
    row-based database
COLUMN STORAGE EXAMPLE

   Note: To the user this appears as a conventional row-based
   table that can be queried by standard SQL; it is only the
   underlying storage that is different.

   First Name Value   F Token
   David              PPP
   Helen              QQQ
   Sheila             RRR

   Surname Value      S Token
   Jones              XXX
   Walker             YYY

   Gender Value       G Token
   Female             AAA
   Male               BBB

   Record mapping:
   F Token   S Token   G Token
   PPP       YYY       BBB
   QQQ       YYY       AAA
   RRR       XXX       AAA
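
The token scheme above is essentially dictionary encoding, which can be sketched in a few lines of Python. This is a toy model and not how any particular column store is implemented: the function names are hypothetical, and integer positions in a sorted value list stand in for the PPP/QQQ style tokens on the slide.

```python
def encode_column(values):
    """Store each distinct value once and replace every row value with a token.

    Returns (dictionary, tokens): `dictionary` is the sorted list of distinct
    values and `tokens` holds, for each row, the position of its value in it.
    """
    dictionary = sorted(set(values))
    token_of = {value: token for token, value in enumerate(dictionary)}
    return dictionary, [token_of[value] for value in values]


def decode_row(columns, row_index):
    """Rebuild one record by looking up each column's token for that row."""
    return tuple(dictionary[tokens[row_index]] for dictionary, tokens in columns)


# The three records from the row-based example.
rows = [
    ("David", "Walker", "Male"),
    ("Helen", "Walker", "Female"),
    ("Sheila", "Jones", "Female"),
]

# zip(*rows) turns the list of records into one tuple of values per column.
columns = [encode_column(column_values) for column_values in zip(*rows)]

for dictionary, tokens in columns:
    print(dictionary, tokens)
print(decode_row(columns, 1))
```

"Walker" and "Female" are each stored once however many records contain them; only the small tokens repeat.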

EFFICIENCIES OF COLUMN STORE
                    DATABASES
 •  Column store databases offer significant storage
    optimisation opportunities because long strings are not
    repeatedly stored
 •  In addition it is possible to compress the data in column
    stores very efficiently
 •  It is possible, in some column store implementations, that
    the column storage holds additional metadata that can
    be used to speed up specific queries (e.g. the number of
    records associated with each value in a column)
 •  Reducing the volume of data stored means reduced I/O
    when querying the database, which in turn improves
    query performance
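
The metadata point can be illustrated with a toy column that keeps a record count per value. This is a sketch under stated assumptions: the class name CountedColumn and its methods are hypothetical, and real column stores maintain such statistics in their own ways.

```python
from collections import Counter


class CountedColumn:
    """Toy column that keeps a per-value record count alongside the data."""

    def __init__(self, values):
        self.values = list(values)
        # Metadata built once at load time: number of records per value.
        self.value_counts = Counter(self.values)

    def count_equal(self, value):
        """Answer 'how many records have this value' from metadata alone."""
        return self.value_counts[value]

    def count_equal_by_scan(self, value):
        """The same answer obtained the slow way, by reading every record."""
        return sum(1 for stored in self.values if stored == value)


gender = CountedColumn(["Male", "Female", "Female"])
print(gender.count_equal("Female"))
print(gender.count_equal_by_scan("Female"))
```

A query such as a count of records per gender can then be answered from the metadata without touching the stored rows.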

COLUMN STORE DATABASES
             AND DATA MART SCHEMAS
 •  A column store database effectively internally
    creates a star schema of every field in a result set
    table.
 •  This minimises the storage and maximises the query
    speed in this type of database
 •  Creating a star schema at the table level effectively
    duplicates (in a less efficient manner) the
    underlying structure that is automatically created
    by the database engine
 •  Consequently a single table result set is more
    efficient in a column store database than a star
    schema

SCHEMAS:
              THE ALTERNATIVES
STAR SCHEMA (ROW DB compared with COLUMN DB)
    Column store databases improve space usage and increase
    speed compared to row-based databases.

RESULT SET SCHEMA (ROW DB compared with COLUMN DB)
    Column store databases will significantly improve space
    usage and speed when compared to row-based databases.

[Figure: Complexity, Speed and Space ratings for each schema and
database combination]
WHO ARE THE COLUMN STORE
                     VENDORS
 •  Many of the major database vendors have bought into this
    concept, mostly by acquisition
             Vendor       Database                   SQL Dialect
             Actian       Vectorwise                 Ingres
             EMC          Greenplum                  Postgres
             HP           Vertica                    Postgres
             InfoBright   InfoBright                 MySQL
             ParAccel     ParAccel                   Postgres
             SAP          HANA (In Memory)
             SAP          Sybase IQ                  Sybase/TSQL
             Teradata     AsterData                  Postgres

 •  There are multiple other players
     •  For more information: Wikipedia & DBMS2

REPORTING TECHNOLOGIES

 •  Historically:

     •  Reporting tools were initially designed to provide a
        ‘simplified’ user interface for reporting against relational
        schemas rather than writing SQL

     •  Schemas were simplified into star schemas and specialist
        tools evolved to query both star schemas and OLAP cubes
        built on top of the star schemas

     •  The focus of the tools was on the ability to report what had
        happened from the data


THE USER REQUIREMENT GAP

 What users had: Historical Reporting
     Insight into what has happened

 What users want: Predictive Analytics
     Understanding what is likely to happen
HOW USERS HAVE FILLED THE GAP

 •  Spreadsheets

     •  Users love them even if IT hate the
        associated data integrity issues
     •  Users have adopted the idea of manipulating a worksheet of
        data equivalent to a result set table.
     •  Spreadsheets can connect to database sources to get data
        often using a ‘join all’ view over a star schema to access data
     •  Desktop based spreadsheets now support large data sets
        (e.g. Excel supports 1M rows, 16K columns)
     •  Emergence of equivalent web-based technologies
        (e.g. Google Docs)
     •  Emergence of low cost, open source equivalents
     •  In-built graphing and charting capabilities

HOW USERS HAVE FILLED THE GAP

 •  Statistical Analysis Tools

     •  Statistical analysis of data to identify future trends
     •  Extracting large result sets to the tools for analysis
     •  Connecting to result sets in the database for direct access
     •  Emergence of low cost, open source equivalents (R)
     •  Emergence of equivalent web-based technologies (e.g.
        Google Prediction, R Studio)
     •  Predictive Model Standards (PMML)
     •  In-built graphing and charting capabilities



HOW USERS HAVE FILLED THE GAP

 •  Data Visualisation/Dashboarding Tools
     •  Multiple maps, charts, graphs, gauges, sparklines, heat
        maps and traffic lights displaying process critical information
     •  Often sourced from a result set table that is drip-fed
        the latest data, which is automatically generated by
        devices (machine-generated data)
     •  Emergence of agile/rapid development style tools
     •  Tools depend on it being easy to load/update the data
        to give near real-time information



SCHEMA TYPE SELECTION BASED ON
      IMPLEMENTATION TECHNOLOGY
                          ROW STORE DATABASE          COLUMN STORE DATABASE

SPREADSHEETS,             Physical Star Schema        Physical Single Table
DASHBOARDS,               with Single Table View
STATISTICAL TOOLS

TRADITIONAL REPORTING     Physical Star Schema        Physical Single Table
AND CUBING TOOLS                                      with Star Schema Views
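
For the row store case, a single table view over a physical star schema is an ordinary 'join all' SQL view. The sketch below shows one possible shape using SQLite in memory; the view name SALES_RESULT_SET and the sample rows are invented, and only two dimensions are included to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STORE_DIMENSION (
    STORE_SKEY INTEGER PRIMARY KEY, STORE_NAME TEXT, STORE_REGION TEXT
);
CREATE TABLE PRODUCT_DIMENSION (
    PRODUCT_SKEY INTEGER PRIMARY KEY, PRODUCT_NAME TEXT, PRODUCT_CATEGORY TEXT
);
CREATE TABLE SALES_FACTS (
    STORE_SKEY INTEGER, PRODUCT_SKEY INTEGER, SALES_QUANTITY INTEGER
);

-- Invented sample data, for illustration only.
INSERT INTO STORE_DIMENSION   VALUES (1, 'Bath', 'South West'), (2, 'Leeds', 'North');
INSERT INTO PRODUCT_DIMENSION VALUES (1, 'Milk', 'Dairy'), (2, 'Bread', 'Bakery');
INSERT INTO SALES_FACTS       VALUES (1, 1, 3), (1, 2, 2), (2, 1, 5);

-- Hypothetical 'join all' view: one wide row per fact, with the surrogate
-- keys resolved so that tools see a single result set table.
CREATE VIEW SALES_RESULT_SET AS
SELECT S.STORE_NAME,
       S.STORE_REGION,
       P.PRODUCT_NAME,
       P.PRODUCT_CATEGORY,
       F.SALES_QUANTITY
FROM   SALES_FACTS F
JOIN   STORE_DIMENSION   S ON F.STORE_SKEY   = S.STORE_SKEY
JOIN   PRODUCT_DIMENSION P ON F.PRODUCT_SKEY = P.PRODUCT_SKEY;
""")

# A spreadsheet or statistical tool can now query one table with no joins.
by_region = conn.execute("""
    SELECT STORE_REGION, SUM(SALES_QUANTITY)
    FROM SALES_RESULT_SET
    GROUP BY STORE_REGION
    ORDER BY STORE_REGION
""").fetchall()
print(by_region)
```

The column store case reverses the roles: the wide table is the physical one, and each star schema view would select the distinct attributes of one dimension from it.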
IN CONCLUSION …

 •  When designing your solution architecture it is
    important that you choose
               The Equivalent Alternate Design
    best suited to the technology you are deploying

 •  Star Schemas are still the best design pattern to use
    when you are using row based databases
 •  Result Set Single Tables are more efficient when
    using column store databases
 •  Consider the users and the tools that they will use
    when choosing the schema design type

CONTACT US

 •  Data Management & Warehousing
     •  Website: http://www.datamgmt.com
     •  Telephone: +44 (0) 118 321 5930
 •  David Walker
     •    E-Mail: davidw@datamgmt.com
     •    Telephone: +44 (0) 7990 594 372
     •    Skype: datamgmt
     •    White Papers: http://scribd.com/davidmwalker




ABOUT US

 Data Management & Warehousing is a UK based consultancy
 that has been delivering successful business intelligence and
 data warehousing solutions since 1995.

 Our consultants have worked with major corporations around the
 world including the US, Europe, Africa and the Middle East.

 We have worked in many industry sectors such as telcos,
 manufacturing, retail, financial and transport. We provide
 governance and project management as well as expertise in the
 leading technologies.




THANK YOU
© 2012 - DATA MANAGEMENT & WAREHOUSING
http://www.datamgmt.com

Mais conteúdo relacionado

Mais procurados

iceberg introduction.pptx
iceberg introduction.pptxiceberg introduction.pptx
iceberg introduction.pptxDori Waldman
 
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...
Presto: Fast SQL-on-Anything (including Delta Lake, Snowflake, Elasticsearch ...Databricks
 
Using S3 Select to Deliver 100X Performance Improvements Versus the Public Cloud
Using S3 Select to Deliver 100X Performance Improvements Versus the Public CloudUsing S3 Select to Deliver 100X Performance Improvements Versus the Public Cloud
Using S3 Select to Deliver 100X Performance Improvements Versus the Public CloudDatabricks
 
Data Governance Best Practices, Assessments, and Roadmaps
Data Governance Best Practices, Assessments, and RoadmapsData Governance Best Practices, Assessments, and Roadmaps
Data Governance Best Practices, Assessments, and RoadmapsDATAVERSITY
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 
Snowflake Company Presentation
Snowflake Company PresentationSnowflake Company Presentation
Snowflake Company PresentationAndrewJiang18
 
Change Data Feed in Delta
Change Data Feed in DeltaChange Data Feed in Delta
Change Data Feed in DeltaDatabricks
 
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...
Designing Apache Hudi for Incremental Processing With Vinoth Chandar and Etha...HostedbyConfluent
 
Enterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshEnterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshSion Smith
 
Snowflake essentials
Snowflake essentialsSnowflake essentials
Snowflake essentialsqureshihamid
 
Introduction to Data Vault Modeling
Introduction to Data Vault ModelingIntroduction to Data Vault Modeling
Introduction to Data Vault ModelingKent Graziano
 
Introducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseIntroducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseSnowflake Computing
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache icebergAlluxio, Inc.
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsAlluxio, Inc.
 
DMBOKをベースにしたデータマネジメント
DMBOKをベースにしたデータマネジメントDMBOKをベースにしたデータマネジメント
DMBOKをベースにしたデータマネジメントKent Ishizawa
 
【ウェブ セミナー】AI 時代のクラウド データ ウェアハウス Azure SQL Data Warehouse [実践編]
【ウェブ セミナー】AI 時代のクラウド データ ウェアハウス Azure SQL Data Warehouse [実践編]【ウェブ セミナー】AI 時代のクラウド データ ウェアハウス Azure SQL Data Warehouse [実践編]
【ウェブ セミナー】AI 時代のクラウド データ ウェアハウス Azure SQL Data Warehouse [実践編]Hideo Takagi
 
Data Lake - Multitenancy Best Practices
Data Lake - Multitenancy Best PracticesData Lake - Multitenancy Best Practices
Data Lake - Multitenancy Best PracticesCitiusTech
 
データ分析基盤について
データ分析基盤についてデータ分析基盤について
データ分析基盤についてYuta Inamura
 

Using the right data model in a data mart

  • 1. USING THE RIGHT DATA MODEL IN A DATA MART
      David M Walker, Data Management & Warehousing
  • 2. INTRODUCTION
      •  The concept of a Data Mart as the data access interface layer for Business Intelligence has been around for over 25 years
      •  Kimball style Dimensional Modelling and Star Schemas have become the de facto data modelling technique for data marts
      •  These have been and continue to be hugely successful with relational databases and reporting tools, but are they the right tool for today's technologies?
      March 2012 © 2012 Data Management & Warehousing
  • 3. WHY IS A STAR SCHEMA SO SUCCESSFUL?
      •  There are three main reasons for creating star schemas and for their wide acceptance as a technique:
          •  Simpler for users to understand
          •  Highly performant user queries
          •  Optimal disk storage usage
  • 4. WHAT IS A STAR SCHEMA?
      •  A star schema consists of two parts
          •  Facts: Measurable numeric and/or time data about an event
          •  Dimensions: Descriptive attributes about the event that give the facts a context
      •  Facts are stored at a uniform level of detail known as the grain of the data
      •  A star schema consists of a fact table and a number of associated dimension tables
      Example schema:
          SALES FACTS: Date Surrogate Key, Store Surrogate Key, Customer Surrogate Key, Product Surrogate Key, Sale Time, Sale Quantity, Sale Unit Price
          DATE DIMENSION: Date Surrogate Key, Date, Day, Month, Year, Public Holiday Flag
          STORE DIMENSION: Store Surrogate Key, Store Name, Store Number, Store Postcode, Store Town, Store Region
          CUSTOMER DIMENSION: Customer Surrogate Key, Customer Loyalty Number, Customer Gender, Customer Postcode, Customer Town, Customer Region
          PRODUCT DIMENSION: Product Surrogate Key, Product SKU, Product Name, Product Category, Product Group, Temperature Group
  • 5. STAR SCHEMAS: SIMPLER FOR USERS TO UNDERSTAND
      •  Intuitive grouping of information
          •  e.g. All customer data in one dimension, all store data in another, etc.
      •  Much easier queries than on a full relational schema
          •  Consequently harder to get the wrong answer because of the wrong join
      •  All data is at the same level of granularity
          •  Consequently harder to get the wrong answer because of mismatched levels of data
      Example query to get the number of sales in each product category for March 2012 by female customers in stores in the South West region:
          select P.PRODUCT_CATEGORY, sum(SALES_QUANTITY)
          from SALES_FACTS F,
               DATE_DIMENSION D,
               STORE_DIMENSION S,
               CUSTOMER_DIMENSION C,
               PRODUCT_DIMENSION P
          where MONTH = 'March'
            and YEAR = '2012'
            and CUSTOMER_GENDER = 'Female'
            and STORE_REGION = 'South West'
            and F.DATE_SKEY = D.DATE_SKEY
            and F.STORE_SKEY = S.STORE_SKEY
            and F.CUSTOMER_SKEY = C.CUSTOMER_SKEY
            and F.PRODUCT_SKEY = P.PRODUCT_SKEY
          group by P.PRODUCT_CATEGORY
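
The slide 5 query can be tried end to end with a small in-memory database. The sketch below uses Python's standard `sqlite3` module. The table layouts are cut down to the columns the query touches, the sample rows are invented, and the region column is assumed to be called `STORE_REGION` (matching the Store Region attribute of the store dimension). None of this comes from the deck beyond the query shape itself.

```python
import sqlite3

# Hypothetical miniature of the slide's star schema; the reduced column
# lists and all sample rows are assumptions made for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DATE_DIMENSION (DATE_SKEY INTEGER PRIMARY KEY, MONTH TEXT, YEAR TEXT);
CREATE TABLE STORE_DIMENSION (STORE_SKEY INTEGER PRIMARY KEY, STORE_REGION TEXT);
CREATE TABLE CUSTOMER_DIMENSION (CUSTOMER_SKEY INTEGER PRIMARY KEY, CUSTOMER_GENDER TEXT);
CREATE TABLE PRODUCT_DIMENSION (PRODUCT_SKEY INTEGER PRIMARY KEY, PRODUCT_CATEGORY TEXT);
CREATE TABLE SALES_FACTS (
    DATE_SKEY INTEGER, STORE_SKEY INTEGER, CUSTOMER_SKEY INTEGER,
    PRODUCT_SKEY INTEGER, SALES_QUANTITY INTEGER);

INSERT INTO DATE_DIMENSION VALUES (1, 'March', '2012'), (2, 'April', '2012');
INSERT INTO STORE_DIMENSION VALUES (1, 'South West'), (2, 'North East');
INSERT INTO CUSTOMER_DIMENSION VALUES (1, 'Female'), (2, 'Male');
INSERT INTO PRODUCT_DIMENSION VALUES (1, 'Dairy'), (2, 'Bakery');

-- Columns: date, store, customer, product, quantity
INSERT INTO SALES_FACTS VALUES
    (1, 1, 1, 1, 3),   -- March, South West, female, Dairy
    (1, 1, 1, 2, 2),   -- March, South West, female, Bakery
    (1, 1, 1, 1, 4),   -- March, South West, female, Dairy
    (1, 1, 2, 1, 5),   -- male customer: filtered out
    (2, 1, 1, 1, 7),   -- April: filtered out
    (1, 2, 1, 2, 9);   -- North East: filtered out
""")

# Filters apply to the small dimension tables; the aggregate runs over
# the matching facts only, grouped by product category.
QUERY = """
SELECT P.PRODUCT_CATEGORY, SUM(F.SALES_QUANTITY)
FROM SALES_FACTS F
JOIN DATE_DIMENSION D     ON F.DATE_SKEY = D.DATE_SKEY
JOIN STORE_DIMENSION S    ON F.STORE_SKEY = S.STORE_SKEY
JOIN CUSTOMER_DIMENSION C ON F.CUSTOMER_SKEY = C.CUSTOMER_SKEY
JOIN PRODUCT_DIMENSION P  ON F.PRODUCT_SKEY = P.PRODUCT_SKEY
WHERE D.MONTH = 'March'
  AND D.YEAR = '2012'
  AND C.CUSTOMER_GENDER = 'Female'
  AND S.STORE_REGION = 'South West'
GROUP BY P.PRODUCT_CATEGORY
ORDER BY P.PRODUCT_CATEGORY
"""

star_rows = conn.execute(QUERY).fetchall()
for category, quantity in star_rows:
    print(category, quantity)
```

Explicit `JOIN ... ON` syntax is used here instead of the comma-separated form on the slide. The two are equivalent for inner joins, and the explicit form keeps the join keys visibly separate from the filters.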
  • 6. STAR SCHEMAS: HIGHLY PERFORMANT USER QUERIES
      •  Dimensional data has an enforced one-to-many relationship with the fact table
      •  Filtering occurs on the (smaller) dimensions
          •  e.g. where YEAR = '2012'
      •  Aggregation takes place only on the relevant subset of the facts
          •  e.g. sum(SALES_QUANTITY)
      (Diagram: the example star schema from slide 4)
  • 7. STAR SCHEMAS: OPTIMAL DISK STORAGE USAGE
      •  If STORE_REGION:
          •  had 10 discrete values
          •  was stored in the example SALES_FACTS table
          •  was on average 10 bytes long
      •  This one field alone would require an additional 1 TB of storage
      •  Not storing it in the fact table also improves query performance by reducing the disk I/O required to retrieve the information
      (Diagram: the example star schema from slide 4)
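
The storage figure on slide 7 is simple arithmetic, sketched below. The transcript does not state how many rows the fact table holds. A count of 100 billion rows is assumed here only because it is the number at which 10 bytes per row comes to 1 TB. The store count is likewise hypothetical.

```python
# Assumption: the row count is not given in the deck; 100 billion rows is
# the value at which 10 bytes per row adds up to 1 TB (10**12 bytes).
FACT_ROWS = 100_000_000_000
AVG_REGION_BYTES = 10        # average length of STORE_REGION, from the slide
STORE_ROWS = 1_000           # hypothetical number of rows in STORE_DIMENSION

# Region text repeated on every fact row.
in_fact_bytes = FACT_ROWS * AVG_REGION_BYTES

# Region text stored once per store in the dimension table instead.
in_dimension_bytes = STORE_ROWS * AVG_REGION_BYTES

print(f"In the fact table: {in_fact_bytes / 10**12:.1f} TB")
print(f"In the dimension:  {in_dimension_bytes / 10**3:.1f} KB")
```

Under these assumptions the dimension copy is smaller by the ratio of fact rows to store rows, which is where the space saving of the star schema comes from.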
  • 8. SCHEMAS: THE ALTERNATIVES
      Each alternative is rated on Complexity, Speed and Space (the rating graphics are not captured in this transcript).
      •  RELATIONAL: Usually used for data warehouses rather than data marts. Favoured solution on MPP technologies due to their power
      •  SNOWFLAKE: Favours saving some space in exchange for added user query complexity; usually a techie compromise
      •  STAR: De facto standard data mart design based on traditional technologies. Also used as source for OLAP cubes
      •  RESULT SET: Large single table with the entire result set; optimal in some circumstances
  • 9. STAR SCHEMAS: TECHNOLOGY ASSUMPTIONS
      •  There are two major and often unspoken assumptions about the technologies used to build this sort of environment:
          •  Firstly: The database used is a row store database and not a column store database
          •  Secondly: That users will be running reporting tools and OLAP cubes to access the data
      •  Neither of these assumptions is necessarily true. The last 10 years have seen massive innovation in Business Intelligence technologies that will have an impact on the chosen architectural solution. Using alternate technologies means that you should challenge existing designs and embrace appropriate new designs in order to exploit the technology
  • 10. UNDERSTAND THE DESIGN IMPACT OF ALTERNATE TECHNOLOGIES
      •  Column Store Databases:
          •  What is a column store database?
          •  Why are column store databases efficient?
          •  How does this affect data mart design?
      •  The use of alternate reporting mechanisms:
          •  The user requirement gap
          •  How users have filled the gap
  • 11. WHAT IS A COLUMN STORE DATABASE?
      •  Traditionally databases are 'row-based', i.e. the fields of data in a record are stored next to each other:
          Forename   Surname   Gender
          David      Walker    Male
          Helen      Walker    Female
          Sheila     Jones     Female
      •  Column store databases store the values in columns and then hold a mapping to form the record
      •  This is transparent to the user, who queries a table with SQL in exactly the same way as they would a row-based database
  • 12. COLUMN STORAGE EXAMPLE
      First Name Value   F Token
      David              PPP
      Helen              QQQ
      Sheila             RRR

      Surname Value      S Token
      Jones              XXX
      Walker             YYY

      Gender Value       G Token
      Female             AAA
      Male               BBB

      Record mapping:
      F Token   S Token   G Token
      PPP       YYY       BBB
      QQQ       YYY       AAA
      RRR       XXX       AAA

      Note: To the user this appears as a conventional row-based table that can be queried by standard SQL; it is only the underlying storage that is different
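
The token tables on slide 12 amount to dictionary encoding of each column. The sketch below illustrates that idea only; it does not describe any particular vendor's storage format. Integer tokens stand in for the slide's PPP/YYY style tokens, and the function names are hypothetical.

```python
def encode_column(values):
    """Dictionary-encode one column.

    Returns (dictionary, tokens): each distinct value is stored once in
    the dictionary, and each row holds only that value's small token.
    """
    dictionary = {}
    tokens = []
    for value in values:
        # A new value gets the next free token; a repeat reuses its token.
        token = dictionary.setdefault(value, len(dictionary))
        tokens.append(token)
    return dictionary, tokens


def decode_row(encoded_columns, row_index):
    """Rebuild one record from the per-column dictionaries and tokens."""
    record = []
    for dictionary, tokens in encoded_columns:
        value_by_token = {token: value for value, token in dictionary.items()}
        record.append(value_by_token[tokens[row_index]])
    return tuple(record)


# The three records from slide 11.
records = [
    ("David", "Walker", "Male"),
    ("Helen", "Walker", "Female"),
    ("Sheila", "Jones", "Female"),
]

# Pivot rows into columns, then encode each column independently.
columns = list(zip(*records))
encoded = [encode_column(column) for column in columns]

for name, (dictionary, tokens) in zip(("Forename", "Surname", "Gender"), encoded):
    print(name, dictionary, tokens)

print(decode_row(encoded, 2))
```

"Walker" and "Female" are each stored once however many rows use them, which is the storage saving described on slide 13.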
  • 13. EFFICIENCIES OF COLUMN STORE DATABASES
      •  Column store databases offer significant storage optimisation opportunities because long strings are not repeatedly stored
      •  In addition, it is possible to compress the data in column stores very efficiently
      •  In some column store implementations, the column storage holds additional metadata that can be used to speed up specific queries (e.g. the number of records associated with each value in a column)
      •  Reducing the data volume stored means reduced I/O when querying the database, which in turn improves query performance
  • 14. COLUMN STORE DATABASES AND DATA MART SCHEMAS
      •  A column store database effectively creates, internally, a star schema for every field in a result set table
      •  This minimises the storage and maximises the query speed in this type of database
      •  Creating a star schema at the table level effectively duplicates (in a less efficient manner) the underlying structure that is automatically created by the database engine
      •  Consequently a single table result set is more efficient in a column store database than a star schema
  • 15. SCHEMAS: THE ALTERNATIVES
      Row DB and Column DB are each rated on Complexity, Speed and Space for both schema types (the rating graphics are not captured in this transcript).
      •  STAR SCHEMA: Column store databases improve space usage and increase speed compared to row-based databases
      •  RESULT SET SCHEMA: Column store databases will significantly improve space usage and speed when compared to row-based databases
  • 16. WHO ARE THE COLUMN STORE VENDORS
      •  Many of the major database vendors have bought into this concept, mostly by acquisition
          Vendor       Database            SQL Dialect
          Actian       Vectorwise          Ingres
          EMC          Greenplum           Postgres
          HP           Vertica             Postgres
          InfoBright   InfoBright          MySQL
          ParAccel     ParAccel            Postgres
          SAP          HANA (In Memory)
          SAP          Sybase IQ           Sybase/TSQL
          Teradata     AsterData           Postgres
      •  There are multiple other players
      •  For more information: Wikipedia & DBMS2
  • 17. REPORTING TECHNOLOGIES
      •  Historically:
          •  Reporting tools were initially designed to provide a 'simplified' user interface for reporting against relational schemas rather than writing SQL
          •  Schemas were simplified into star schemas and specialist tools evolved to query both star schemas and OLAP cubes built on top of the star schemas
          •  The focus of the tools was on the ability to report what had happened from the data
  • 18. THE USER REQUIREMENT GAP
      •  What users had: Historical Reporting (insight into what has happened)
      •  What users want: Predictive Analytics (understanding what is likely to happen)
  • 19. HOW USERS HAVE FILLED THE GAP
      •  Spreadsheets
          •  Users love them even if IT hate the associated data integrity issues
          •  Users have adopted the idea of manipulating a worksheet of data equivalent to a result set table
          •  Spreadsheets can connect to database sources to get data, often using a 'join all' view over a star schema
          •  Desktop based spreadsheets now support large data sets (e.g. Excel supports 1M rows, 16K columns)
          •  Emergence of equivalent web based technologies (e.g. Google Docs)
          •  Emergence of low cost, open source equivalents
          •  In-built graphing and charting capabilities
  • 20. HOW USERS HAVE FILLED THE GAP
      •  Statistical Analysis Tools
          •  Statistical analysis of data to identify future trends
          •  Extracting large result sets to the tools for analysis
          •  Connecting to result sets in the database for direct access
          •  Emergence of low cost, open source equivalents (R)
          •  Emergence of equivalent web based technologies (e.g. Google Prediction, R Studio)
          •  Predictive Model Standards (PMML)
          •  In-built graphing and charting capabilities
  • 21. HOW USERS HAVE FILLED THE GAP
      •  Data Visualisation/Dashboarding Tools
          •  Multiple maps, charts, graphs, gauges, sparklines, heat maps and traffic lights displaying process critical information
          •  Often sourced from a result set table that is drip-fed the latest data, which is generated automatically by devices (machine-generated data)
          •  Emergence of agile/rapid development style tools
          •  Tools depend on it being easy to load/update the data to give near real-time information
  • 22. SCHEMA TYPE SELECTION BASED ON IMPLEMENTATION TECHNOLOGY
      •  Spreadsheets, dashboards, statistical tools
          •  Row store database: Physical Star Schema with Single Table View
          •  Column store database: Physical Single Table
      •  Traditional and cubing reporting tools
          •  Row store database: Physical Star Schema
          •  Column store database: Physical Single Table with Star Schema Views
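
The "Physical Star Schema with Single Table View" option from slide 22 can be sketched as a 'join all' view over the star schema, so that spreadsheet-style tools see one flat result set. The example below uses Python's standard `sqlite3` module as a stand-in for a row store database. The view name `SALES_RESULT_SET`, the reduced two-dimension schema and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Physical star schema (cut down to two dimensions for brevity).
CREATE TABLE STORE_DIMENSION (
    STORE_SKEY INTEGER PRIMARY KEY, STORE_NAME TEXT, STORE_REGION TEXT);
CREATE TABLE PRODUCT_DIMENSION (
    PRODUCT_SKEY INTEGER PRIMARY KEY, PRODUCT_NAME TEXT, PRODUCT_CATEGORY TEXT);
CREATE TABLE SALES_FACTS (
    STORE_SKEY INTEGER, PRODUCT_SKEY INTEGER, SALES_QUANTITY INTEGER);

-- Invented sample data.
INSERT INTO STORE_DIMENSION VALUES (1, 'Bath', 'South West'), (2, 'Leeds', 'North');
INSERT INTO PRODUCT_DIMENSION VALUES (1, 'Milk', 'Dairy'), (2, 'Bread', 'Bakery');
INSERT INTO SALES_FACTS VALUES (1, 1, 3), (1, 2, 2), (2, 1, 5);

-- Hypothetical 'join all' view: the joins are written once here, so
-- consumers see a single wide table with no surrogate keys.
CREATE VIEW SALES_RESULT_SET AS
SELECT S.STORE_NAME, S.STORE_REGION,
       P.PRODUCT_NAME, P.PRODUCT_CATEGORY,
       F.SALES_QUANTITY
FROM SALES_FACTS F
JOIN STORE_DIMENSION S   ON F.STORE_SKEY = S.STORE_SKEY
JOIN PRODUCT_DIMENSION P ON F.PRODUCT_SKEY = P.PRODUCT_SKEY;
""")

# A consumer queries the view as if it were one table.
view_columns = [
    column[0]
    for column in conn.execute("SELECT * FROM SALES_RESULT_SET").description
]
region_totals = conn.execute("""
    SELECT STORE_REGION, SUM(SALES_QUANTITY)
    FROM SALES_RESULT_SET
    GROUP BY STORE_REGION
    ORDER BY STORE_REGION
""").fetchall()

print(view_columns)
print(region_totals)
```

On a column store the arrangement would be reversed: the wide table is the physical object and star-shaped views are layered over it for the traditional reporting tools.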
  • 23. IN CONCLUSION …
      •  When designing your solution architecture it is important that you choose the equivalent alternate design best suited to the technology you are deploying
          •  Star Schemas are still the best design pattern to use when you are using row based databases
          •  Result Set Single Tables are more efficient when using column store databases
      •  Consider the users and the tools that they will use when choosing the schema design type
  • 24. CONTACT US
      •  Data Management & Warehousing
          •  Website: http://www.datamgmt.com
          •  Telephone: +44 (0) 118 321 5930
      •  David Walker
          •  E-Mail: davidw@datamgmt.com
          •  Telephone: +44 (0) 7990 594 372
          •  Skype: datamgmt
          •  White Papers: http://scribd.com/davidmwalker
  • 25. ABOUT US
      Data Management & Warehousing is a UK based consultancy that has been delivering successful business intelligence and data warehousing solutions since 1995. Our consultants have worked with major corporations around the world including the US, Europe, Africa and the Middle East. We have worked in many industry sectors such as telcos, manufacturing, retail, financial and transport. We provide governance and project management as well as expertise in the leading technologies.
  • 26. THANK YOU
      © 2012 - Data Management & Warehousing
      http://www.datamgmt.com