SlideShare uma empresa Scribd logo
1 de 13
March 2010 ACS-4904 Ron McFadyen 1
Aggregate management
References:
Lawrence Corr
• Aggregate improvement
www.intelligententerprise.com/011004/415warehouse1_1.jhtml
• Lost, shrunken, and collapsed
www.intelligententerprise.com/020101/501warehouse1_1.jhtml
Ralph Kimball
• Aggregate navigation with (almost) no metadata
www.dbmsmag.com/9608d54.html
March 2010 ACS-4904 Ron McFadyen 2
Aggregate Navigation
From “20 criteria for dimensional friendly systems”
#4. Open Aggregate Navigation.
The system uses physically stored aggregates as a way to
enhance performance of common queries. These
aggregates, like indexes, are chosen silently by the
database if they are physically present. End users and
application developers do not need to know what
aggregates are available at any point in time, and
applications are not required to explicitly code the name
of an aggregate. All query processes accessing the data,
even those from different application vendors, realize the
full benefit of aggregate navigation.
March 2010 ACS-4904 Ron McFadyen 3
Aggregation
• Created for performance reasons
• Using one base star schema, many aggregates can be defined
• Some dimensions could be lost, some shrunken or collapsed –
see Corr’s articles
Customer Date
Product
Store
Transaction
Price
Cost
Tax
Qty
Consider the
Transaction-level schema:
March 2010 ACS-4904 Ron McFadyen 4
Aggregation
Customer Date
Product
Store
Transaction
Price
Cost
Tax
Qty
Month
Product
Store
Price
Cost
Tax
Qty
Transaction-level schema Aggregate schema with:
•lost dimensions (transaction, customer)
•shrunken dimension (month)
•metrics are accumulated over the
transactions, customers, month
March 2010 ACS-4904 Ron McFadyen 5
Aggregation
• Suppose you must create the schema:
• What are the steps to do so?
– Product and Store already exist, but the others must be
created
Month
Product
Store
Price
Cost
Tax
Qty
March 2010 ACS-4904 Ron McFadyen 6
Aggregation
Design Goal 1
• Aggregates must be stored in their own fact tables; each
distinct aggregation level must occupy its own unique fact
table
March 2010 ACS-4904 Ron McFadyen 7
Aggregation
Design Goal 2
• The dimension tables attached to the aggregate fact tables
must be shrunken versions of the dimension tables
associated with the base fact table
Product
ProductSK
SKU
Description
Brand
Category
Department
Category
SK?
Category
Department
Category is a
shrunken version
of Product
Attributes names
from Product
What values does
the PK of Category
assume?
What name should
the PK of Category
have?
March 2010 ACS-4904 Ron McFadyen 8
Aggregation
Design Goal 3
• The base atomic fact table and all of its related aggregate
fact tables must be associated together as a “family of
schemas” so that the aggregate navigator knows which
tables are related to one another.
– Kimball’s paper presents an algorithm/technique for
mapping an SQL statement from a base schema to an
aggregate schema
March 2010 ACS-4904 Ron McFadyen 9
Aggregation
Design Goal 4
• Force all SQL created by any end-user data access tool or
application to refer exclusively to the base fact table and
its associated full-size dimension tables.
March 2010 ACS-4904 Ron McFadyen 10
Kimball’s Aggregate Navigation Algorithm
1. For any given SQL statement presented to the DBMS, find the smallest
fact table that has not yet been examined in the family of schemas
referenced by the query. "Smallest" in this case means the least
number of rows. Finding the smallest unexamined schema is a simple
lookup in the aggregate navigator metadata table. Choose the smallest
schema and proceed to step 2.
2. Compare the table fields in the SQL statement to the table fields in the
particular Fact and Dimension tables being examined.
This is a series of lookups in the DBMS system catalog.
If all of the fields in the SQL statement can be found in the Fact
and Dimension tables being examined, alter the original
SQL by simply substituting destination table names for
original table names. No field names need to change.
If any field in the SQL statement cannot be found in the current
Fact and Dimension tables, then go back to step 1 and find
the next larger Fact table.
3. Run the altered SQL. It is guaranteed to return the correct answer
because all of the fields in the SQL statement are present in the
chosen schema.
March 2010 ACS-4904 Ron McFadyen 11
Aggregate Navigation
www.intelligententerprise.com/000929/feat1.jhtml
“Other data mart capabilities worth mentioning are
aggregation and aggregate navigation. The nature of
multidimensional analysis dictates the extensive use of
aggregation.
…
Both UDB and O8i provide a database-resident means for
navigating user queries to the optimum aggregated tables”
March 2010 ACS-4904 Ron McFadyen 12
Materialized Views
An MV is a view where the result set is pre-computed and
stored in the database
Periodically an MV must be re-computed to account for updates
to its source tables
MVs represent a performance enhancement to a DBMS as the
result set is immediately available when the view is referenced
in a SQL statement
MVs are used in some cases to provide summary or aggregates
for data warehousing transparently to the end-user
Query optimization is a DBMS feature that may re-write an
SQL statement supplied by a user.
March 2010 ACS-4904 Ron McFadyen 13
Materialized Views
Oracle example:
CREATE MATERIALIZED VIEW hr.mview_employees AS
SELECT employees.employee_id, employees.email
FROM employees
UNION ALL
SELECT new_employees.employee_id, new_employees.email
FROM new_employees;
SQL Server has “indexed views” … see
http://www.databasejournal.com/features/mssql/article.php/2119721

Mais conteúdo relacionado

Mais procurados

10. XML in DBMS
10. XML in DBMS10. XML in DBMS
10. XML in DBMS
koolkampus
 
Types of databases
Types of databasesTypes of databases
Types of databases
PAQUIAAIZEL
 
Data structure and its types
Data structure and its typesData structure and its types
Data structure and its types
Navtar Sidhu Brar
 
Lecture 01 introduction to database
Lecture 01 introduction to databaseLecture 01 introduction to database
Lecture 01 introduction to database
emailharmeet
 

Mais procurados (20)

10. XML in DBMS
10. XML in DBMS10. XML in DBMS
10. XML in DBMS
 
ORACLE PL/SQL
ORACLE PL/SQLORACLE PL/SQL
ORACLE PL/SQL
 
Database systems - Chapter 2 (Remaining)
Database systems - Chapter 2 (Remaining)Database systems - Chapter 2 (Remaining)
Database systems - Chapter 2 (Remaining)
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and Uses
 
Types of Database Models
Types of Database ModelsTypes of Database Models
Types of Database Models
 
11 Database Concepts
11 Database Concepts11 Database Concepts
11 Database Concepts
 
Types of databases
Types of databasesTypes of databases
Types of databases
 
joins in database
 joins in database joins in database
joins in database
 
[OOP - Lec 18] Static Data Member
[OOP - Lec 18] Static Data Member[OOP - Lec 18] Static Data Member
[OOP - Lec 18] Static Data Member
 
Basic Concept of Database
Basic Concept of DatabaseBasic Concept of Database
Basic Concept of Database
 
Introduction to Database
Introduction to DatabaseIntroduction to Database
Introduction to Database
 
Files Vs DataBase
Files Vs DataBaseFiles Vs DataBase
Files Vs DataBase
 
Data structure and its types
Data structure and its typesData structure and its types
Data structure and its types
 
Lecture 01 introduction to database
Lecture 01 introduction to databaseLecture 01 introduction to database
Lecture 01 introduction to database
 
set operators.pptx
set operators.pptxset operators.pptx
set operators.pptx
 
Database Management System
Database Management SystemDatabase Management System
Database Management System
 
DBMS: Types of keys
DBMS:  Types of keysDBMS:  Types of keys
DBMS: Types of keys
 
Database Management System Introduction
Database Management System IntroductionDatabase Management System Introduction
Database Management System Introduction
 
Data base management system and Architecture ppt.
Data base management system and Architecture ppt.Data base management system and Architecture ppt.
Data base management system and Architecture ppt.
 
SQL(DDL & DML)
SQL(DDL & DML)SQL(DDL & DML)
SQL(DDL & DML)
 

Destaque (6)

Difference between association, aggregation and composition
Difference between association, aggregation and compositionDifference between association, aggregation and composition
Difference between association, aggregation and composition
 
Aggregate fact tables
Aggregate fact tablesAggregate fact tables
Aggregate fact tables
 
Elasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsElasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & Aggregations
 
Aggregate - Concrete Technology
Aggregate - Concrete TechnologyAggregate - Concrete Technology
Aggregate - Concrete Technology
 
Aggregates
 Aggregates Aggregates
Aggregates
 
Aggregates
AggregatesAggregates
Aggregates
 

Semelhante a Aggregation

World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
Karthik K Iyengar
 
SQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
SQL Analytics for Search Engineers - Timothy Potter, LucidworksngineersSQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
SQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
Lucidworks
 
CaseStudy-MohammedImranAlam-Xcelsius
CaseStudy-MohammedImranAlam-XcelsiusCaseStudy-MohammedImranAlam-Xcelsius
CaseStudy-MohammedImranAlam-Xcelsius
Mohammed Imran Alam
 
Data warehousing features in oracle
Data warehousing features in oracleData warehousing features in oracle
Data warehousing features in oracle
Jinal Shah
 

Semelhante a Aggregation (20)

Database performance tuning and query optimization
Database performance tuning and query optimizationDatabase performance tuning and query optimization
Database performance tuning and query optimization
 
F04302053057
F04302053057F04302053057
F04302053057
 
Cognos framework manager
Cognos framework managerCognos framework manager
Cognos framework manager
 
Process management seminar
Process management seminarProcess management seminar
Process management seminar
 
Solr and ElasticSearch demo and speaker feb 2014
Solr  and ElasticSearch demo and speaker feb 2014Solr  and ElasticSearch demo and speaker feb 2014
Solr and ElasticSearch demo and speaker feb 2014
 
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
PHP UK 2020 Tutorial: MySQL Indexes, Histograms And other ways To Speed Up Yo...
 
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
World2016_T1_S8_How to upgrade your cubes from 9.x to 10 and turn on optimize...
 
dd presentation.pdf
dd presentation.pdfdd presentation.pdf
dd presentation.pdf
 
Best Practices for Oracle Exadata and the Oracle Optimizer
Best Practices for Oracle Exadata and the Oracle OptimizerBest Practices for Oracle Exadata and the Oracle Optimizer
Best Practices for Oracle Exadata and the Oracle Optimizer
 
Workload Management with MicroStrategy Software and IBM DB2 9.5
Workload Management with MicroStrategy Software and IBM DB2 9.5Workload Management with MicroStrategy Software and IBM DB2 9.5
Workload Management with MicroStrategy Software and IBM DB2 9.5
 
SQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
SQL Analytics for Search Engineers - Timothy Potter, LucidworksngineersSQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
SQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
 
Presentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12cPresentación Oracle Database Migración consideraciones 10g/11g/12c
Presentación Oracle Database Migración consideraciones 10g/11g/12c
 
Democratization of NOSQL Document-Database over Relational Database Comparati...
Democratization of NOSQL Document-Database over Relational Database Comparati...Democratization of NOSQL Document-Database over Relational Database Comparati...
Democratization of NOSQL Document-Database over Relational Database Comparati...
 
BigdataConference Europe - BigQuery ML
BigdataConference Europe - BigQuery MLBigdataConference Europe - BigQuery ML
BigdataConference Europe - BigQuery ML
 
SQL Server vs Oracle.pdf
SQL Server vs Oracle.pdfSQL Server vs Oracle.pdf
SQL Server vs Oracle.pdf
 
SQL Server vs Oracle.pdf
SQL Server vs Oracle.pdfSQL Server vs Oracle.pdf
SQL Server vs Oracle.pdf
 
A Practical Guide to Database Design.pdf
A Practical Guide to Database Design.pdfA Practical Guide to Database Design.pdf
A Practical Guide to Database Design.pdf
 
CaseStudy-MohammedImranAlam-Xcelsius
CaseStudy-MohammedImranAlam-XcelsiusCaseStudy-MohammedImranAlam-Xcelsius
CaseStudy-MohammedImranAlam-Xcelsius
 
Data warehousing features in oracle
Data warehousing features in oracleData warehousing features in oracle
Data warehousing features in oracle
 
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICSQUERY OPTIMIZATION FOR BIG DATA ANALYTICS
QUERY OPTIMIZATION FOR BIG DATA ANALYTICS
 

Aggregation

  • 1. March 2010 ACS-4904 Ron McFadyen 1 Aggregate management References: Lawrence Corr • Aggregate improvement www.intelligententerprise.com/011004/415warehouse1_1.jhtml • Lost, shrunken, and collapsed www.intelligententerprise.com/020101/501warehouse1_1.jhtml Ralph Kimball • Aggregate navigation with (almost) no metadata www.dbmsmag.com/9608d54.html
  • 2. March 2010 ACS-4904 Ron McFadyen 2 Aggregate Navigation From “20 criteria for dimensional friendly systems” #4. Open Aggregate Navigation. The system uses physically stored aggregates as a way to enhance performance of common queries. These aggregates, like indexes, are chosen silently by the database if they are physically present. End users and application developers do not need to know what aggregates are available at any point in time, and applications are not required to explicitly code the name of an aggregate. All query processes accessing the data, even those from different application vendors, realize the full benefit of aggregate navigation.
  • 3. March 2010 ACS-4904 Ron McFadyen 3 Aggregation • Created for performance reasons • Using one base star schema, many aggregates can be defined • Some dimensions could be lost, some shrunken or collapsed – see Corr’s articles Customer Date Product Store Transaction Price Cost Tax Qty Consider the Transaction-level schema:
  • 4. March 2010 ACS-4904 Ron McFadyen 4 Aggregation Customer Date Product Store Transaction Price Cost Tax Qty Month Product Store Price Cost Tax Qty Transaction-level schema Aggregate schema with: •lost dimensions (transaction, customer) •shrunken dimension (month) •metrics are accumulated over the transactions, customers, month
  • 5. March 2010 ACS-4904 Ron McFadyen 5 Aggregation • Suppose you must create the schema: • What are the steps to do so? – Product and Store already exist, but the others must be created Month Product Store Price Cost Tax Qty
  • 6. March 2010 ACS-4904 Ron McFadyen 6 Aggregation Design Goal 1 • Aggregates must be stored in their own fact tables; each distinct aggregation level must occupy its own unique fact table
  • 7. March 2010 ACS-4904 Ron McFadyen 7 Aggregation Design Goal 2 • The dimension tables attached to the aggregate fact tables must be shrunken versions of the dimension tables associated with the base fact table Product ProductSK SKU Description Brand Category Department Category SK? Category Department Category is a shrunken version of Product Attributes names from Product What values does the PK of Category assume? What name should the PK of Category have?
  • 8. March 2010 ACS-4904 Ron McFadyen 8 Aggregation Design Goal 3 • The base atomic fact table and all of its related aggregate fact tables must be associated together as a “family of schemas” so that the aggregate navigator knows which tables are related to one another. – Kimball’s paper presents an algorithm/technique for mapping an SQL statement from a base schema to an aggregate schema
  • 9. March 2010 ACS-4904 Ron McFadyen 9 Aggregation Design Goal 4 • Force all SQL created by any end-user data access tool or application to refer exclusively to the base fact table and its associated full-size dimension tables.
  • 10. March 2010 ACS-4904 Ron McFadyen 10 Kimball’s Aggregate Navigation Algorithm 1. For any given SQL statement presented to the DBMS, find the smallest fact table that has not yet been examined in the family of schemas referenced by the query. "Smallest" in this case means the least number of rows. Finding the smallest unexamined schema is a simple lookup in the aggregate navigator metadata table. Choose the smallest schema and proceed to step 2. 2. Compare the table fields in the SQL statement to the table fields in the particular Fact and Dimension tables being examined. This is a series of lookups in the DBMS system catalog. If all of the fields in the SQL statement can be found in the Fact and Dimension tables being examined, alter the original SQL by simply substituting destination table names for original table names. No field names need to change. If any field in the SQL statement cannot be found in the current Fact and Dimension tables, then go back to step 1 and find the next larger Fact table. 3. Run the altered SQL. It is guaranteed to return the correct answer because all of the fields in the SQL statement are present in the chosen schema.
  • 11. March 2010 ACS-4904 Ron McFadyen 11 Aggregate Navigation www.intelligententerprise.com/000929/feat1.jhtml “Other data mart capabilities worth mentioning are aggregation and aggregate navigation. The nature of multidimensional analysis dictates the extensive use of aggregation. … Both UDB and O8i provide a database-resident means for navigating user queries to the optimum aggregated tables”
  • 12. March 2010 ACS-4904 Ron McFadyen 12 Materialized Views An MV is a view where the result set is pre-computed and stored in the database Periodically an MV must be re-computed to account for updates to its source tables MVs represent a performance enhancement to a DBMS as the result set is immediately available when the view is referenced in a SQL statement MVs are used in some cases to provide summary or aggregates for data warehousing transparently to the end-user Query optimization is a DBMS feature that may re-write an SQL statement supplied by a user.
  • 13. March 2010 ACS-4904 Ron McFadyen 13 Materialized Views Oracle example: CREATE MATERIALIZED VIEW hr.mview_employees AS SELECT employees.employee_id, employees.email FROM employees UNION ALL SELECT new_employees.employee_id, new_employees.email FROM new_employees; SQL Server has “indexed views” … see http://www.databasejournal.com/features/mssql/article.php/2119721