Aggregate Join Indices & Dimensional models delivering extraordinary performance Jose M. Borja – Jborja@Menard-inc.com
Theory vs. Practice
“In theory, there is no difference between theory and practice. In practice there is.” Yogi Berra
The reason we are here today is to help bridge the gap between theory and practice, and to share with you real-life experiences of using Aggregate Join Indices and Dimensional Models to deliver extraordinary performance.
Background (or who is this guy)
What’s the Challenge?
What is the proposed solution?
Common misconceptions about this approach
How can I do it in Teradata?
Example #1
Example #1 - SQL for 3NF Model

SELECT product_id,
       SUM((sold_qty * price_amt) - discount_amt - coupon_amt) AS LY_Sales_Amt
FROM Sale s, Sale_Line sl
WHERE saledate BETWEEN DATE '2005-01-01' AND CURRENT_DATE - INTERVAL '1' YEAR
  AND s.Store_Nbr = sl.Store_Nbr
  AND s.Transaction_Nbr = sl.Transaction_Nbr
GROUP BY product_id

SELECT product_id,
       SUM((sold_qty * price_amt) - discount_amt - coupon_amt) AS TY_Sales_Amt
FROM Sale s, Sale_Line sl
WHERE saledate BETWEEN DATE '2006-01-01' AND CURRENT_DATE
  AND s.Store_Nbr = sl.Store_Nbr
  AND s.Transaction_Nbr = sl.Transaction_Nbr
GROUP BY product_id

The two result sets are combined with a FULL OUTER JOIN on product_id.
Example #1 - Dimensional Model
Example #1 - SQL for Dimensional Model

SELECT product_id,
       SUM(net_sale_amt) AS LY_Sales
FROM store_product_daily_sale
WHERE the_date BETWEEN DATE '2005-01-01' AND CURRENT_DATE - INTERVAL '1' YEAR
GROUP BY product_id

SELECT product_id,
       SUM(net_sale_amt) AS TY_Sales
FROM store_product_daily_sale
WHERE the_date BETWEEN DATE '2006-01-01' AND CURRENT_DATE
GROUP BY product_id

The two result sets are combined with a FULL OUTER JOIN on product_id.

The fact table is 1/3 the size of the 3NF Sale_Line table, and querying it eliminates the join between Sale and Sale_Line.
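The FULL OUTER JOIN step can also be written out as a single statement. The following is a minimal sketch, assuming the two aggregates are wrapped as derived tables and joined on product_id. The aliases ly and ty and the COALESCE on the key are illustrative choices, not taken from the slides.

```sql
-- Sketch: LY vs. TY sales per product from the dimensional fact table.
-- Assumption: both aggregates join on product_id; aliases ly/ty are illustrative.
SELECT COALESCE(ly.product_id, ty.product_id) AS product_id,
       ly.LY_Sales,
       ty.TY_Sales
FROM (
    -- Last year: from 2005-01-01 up to the same day one year ago
    SELECT product_id, SUM(net_sale_amt) AS LY_Sales
    FROM store_product_daily_sale
    WHERE the_date BETWEEN DATE '2005-01-01' AND CURRENT_DATE - INTERVAL '1' YEAR
    GROUP BY product_id
) AS ly
FULL OUTER JOIN (
    -- This year: from 2006-01-01 up to today
    SELECT product_id, SUM(net_sale_amt) AS TY_Sales
    FROM store_product_daily_sale
    WHERE the_date BETWEEN DATE '2006-01-01' AND CURRENT_DATE
    GROUP BY product_id
) AS ty
    ON ly.product_id = ty.product_id;
```

The full outer join keeps products that sold in only one of the two periods; the missing side comes back as NULL.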
Add Aggregate Join Indices to boost performance
A view is added to the Dimensional model to represent a single-table Aggregate Join Index (AJI) at the Corporate level. The AJI removes the Store grain and yields a higher-level aggregate with fewer rows.
Example #1 - SQL for Dimensional Model using the Join Index View

SELECT product_id,
       SUM(net_sale_amt) AS LY_Sales
FROM ji_product_daily_salev
WHERE the_date BETWEEN DATE '2005-01-01' AND CURRENT_DATE - INTERVAL '1' YEAR
GROUP BY product_id

SELECT product_id,
       SUM(net_sale_amt) AS TY_Sales
FROM ji_product_daily_salev
WHERE the_date BETWEEN DATE '2006-01-01' AND CURRENT_DATE
GROUP BY product_id

The two result sets are combined with a FULL OUTER JOIN on product_id.

The Join Index is 1/30 the size of the 3NF Sale_Line table.
A more robust Fact table has more possibilities
Bringing additional dimensions into the Fact table lets the mix of Join Indexes cover different levels of aggregation granularity.
 
Store & Subclass at 3 levels of Time granularity
Product at Daily Level and Store at Daily level
Subclass at 5 levels of Time granularity
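As an illustration of one such granularity level, a Join Index at Subclass by month might look like the sketch below. It assumes the fact table columns product_subclass_id, the_date and net_sale_amt used elsewhere in this deck; the index name and the sale_year/sale_month aliases are hypothetical, and the restrictions on aggregate Join Indexes vary by Teradata release.

```sql
-- Sketch: aggregate join index at Subclass x Month grain.
-- Assumption: JI_SUBCLASS_MONTHLY_SALE, sale_year and sale_month are hypothetical names.
CREATE JOIN INDEX JI_SUBCLASS_MONTHLY_SALE AS
SELECT product_subclass_id,
       EXTRACT(YEAR FROM the_date)  AS sale_year,
       EXTRACT(MONTH FROM the_date) AS sale_month,
       SUM(net_sale_amt)            AS net_sale_amt
FROM STORE_PRODUCT_DAILY_SALE
GROUP BY product_subclass_id, sale_year, sale_month
PRIMARY INDEX (product_subclass_id);
```

Dropping Store and Product from the grain and rolling days up to months leaves far fewer rows than the daily fact table, which is the same effect the Corporate-level AJI has on the Store grain.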
Use the view to gain access to the Join Index

CREATE JOIN INDEX JI_PRODUCT_DAILY_SALE AS
SELECT product_id, the_date, product_subclass_id, Supplier_id,
       SUM(net_sale_amt) AS net_sale_amt
       . . . . . . .
FROM STORE_PRODUCT_DAILY_SALE
GROUP BY product_id, the_date, product_subclass_id, Supplier_id
PRIMARY INDEX (product_id, the_date);

REPLACE VIEW JI_PRODUCT_DAILY_SALEv AS
SELECT product_id, the_date, product_subclass_id, Supplier_id,
       SUM(net_sale_amt) AS net_sale_amt
       . . . . . . . .
FROM STORE_PRODUCT_DAILY_SALE
GROUP BY product_id, the_date, product_subclass_id, Supplier_id;

SELECT product_id, the_date, net_sale_amt
FROM JI_PRODUCT_DAILY_SALEv
WHERE product_id = 198273648;
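One way to check that the optimizer really answers the view query from the Join Index is to prefix the query with EXPLAIN and look for the Join Index name in the plan text. A small sketch, reusing the lookup from this slide:

```sql
-- Sketch: ask Teradata for the query plan instead of running the query.
-- If the rewrite succeeds, the plan text should name the join index
-- rather than a scan of STORE_PRODUCT_DAILY_SALE.
EXPLAIN
SELECT product_id, the_date, net_sale_amt
FROM JI_PRODUCT_DAILY_SALEv
WHERE product_id = 198273648;
```

If the plan still reads the base fact table, the view definition and the Join Index definition have probably drifted apart and should be compared column by column.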
CPU consumption for the LY vs. TY Sales Example (chart labels: 12%, 2%)
Disk I/O Usage for the LY vs. TY Sales Example (chart labels: 21%, 7%)
Elapsed Time for the LY vs. TY Sales Example (chart labels: 10%, 3%)
LY vs. TY for 1 Product Corporate Wide
LY vs. TY for all Product Categories Corporate Wide
Conclusions
Teradata technology makes it possible to sustain a 3NF and a Dimensional Model in a single system and enjoy the benefits of both worlds.
Teradata technology makes it easy to make the Dimensional model available at different levels of granularity using Join Indexes. Sweet performance with low resource usage and auto-magic maintenance!
The expense of maintaining a dozen Join Indexes on a single Fact table is paid back by just one substantial report that would otherwise run against the 3NF model. The Join Indexes are maintained at night, when the DW has less usage, and the users harvest the benefits during the day.
The number of Secondary Indexes can be kept very low in the 3NF model, since the Dimensional Model provides most of the necessary access to large volumes of data. Most access to the 3NF model can be limited to Primary Index (PI) queries for application support, tactical queries, or reports that can afford table scans.
Tips on Join Indexes
• Keep Join Indexes limited to only one table. Maintenance is too high on Join Indexes with two or more tables: if one of the tables is maintained, the Join Index may need to be maintained also.
• Do not drop and recreate Join Indexes for maintenance. It is not necessary and can be (very, very, very) costly to recreate.
• Store the Join Index definitions in macros for reuse and storage in the data dictionary.
• Create a view to provide “direct” access to the Join Index.
• Create a dummy Join Index on any table to prevent accidental DROPs. Seeing the “cannot drop table” message is a life saver!
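The macro and dummy Join Index tips could be sketched as follows. All object names here are hypothetical, the Join Index definition is abridged, and the empty sparse Join Index is only one possible way to build a "dummy"; the slides do not say how the author defined it.

```sql
-- Tip: keep the join index DDL in a macro so the definition lives in the data dictionary.
-- Assumption: names are hypothetical. A DDL statement must be the only statement in the macro.
CREATE MACRO JI_PRODUCT_DAILY_SALE_DDL AS (
    CREATE JOIN INDEX JI_PRODUCT_DAILY_SALE AS
    SELECT product_id, the_date, SUM(net_sale_amt) AS net_sale_amt
    FROM STORE_PRODUCT_DAILY_SALE
    GROUP BY product_id, the_date
    PRIMARY INDEX (product_id, the_date);
);

-- Build the join index from the stored definition when it is needed.
EXEC JI_PRODUCT_DAILY_SALE_DDL;

-- Tip: a "dummy" join index that guards the table against an accidental DROP TABLE.
-- Assumption: a sparse join index filtered on a date that never occurs, so it stays empty.
CREATE JOIN INDEX JI_GUARD_DAILY_SALE AS
SELECT product_id, the_date
FROM STORE_PRODUCT_DAILY_SALE
WHERE the_date = DATE '1900-01-01'
PRIMARY INDEX (product_id);
```

While any Join Index exists on the table, a DROP TABLE request is rejected until the Join Index itself has been dropped, which is the safety net the last tip relies on.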
