SlideShare uma empresa Scribd logo
1 de 23
Rameswara Reddy.K.V
 Traditional OLTP not designed for DWH
 DWH projects  data model, database schema
 Data model  Analysis but poor performance
 Model schema
 The schema methodology that is gaining wide spread
acceptance for dwh is Star Schema.
Data Layout for Best Access
 All industries have developed considerable expertise in
implementing efficient operational systems such as
payroll, inventory tracking, and purchasing.
 The original objectives in developing an abstract
model known as the relational model
 The data warehouse RDBMS typically needs to process
queries that are large complex, ad hoc, and data
intensive. so data warehouse RDBMS are very different
,it use a database schema for maximizing concurrency
and optimizing insert, update, and delete
performance.
 For solving modern business problems such as market
analysis and financial forecasting requires query-
centric database schemas that are array oriented and
multidimensional in nature.
Multi dimensional data model
 How much revenue did the new product generate?
 How much revenue did the new product generate by
month, in the north eastern division, broken down by
user demographic, by sales office, relative to the
previous version of the product, compare with the
plan?
 The major database technology for DWH is RDBMS.
Star Schema
 The multi dimensional view of data that is expressed
using relational database semantics is provided by the
database schema design called star schema.
Two groups: facts and dimensions
Facts are the core data element being analyzed
e.g.. items sold
dimensions are attributes about the facts
e.g. date of purchase
DBA View point
 A star schema is a relational schema organized around
a central table joined to few smaller tables (dimension
tables)using foreign key references.
 The facts are typically additive and are accessed via
dimensions.
 Dimension tables are heavily indexed
 A star schema created for every industry
Potential performance problem
with star schemas
 Indexing
 Level Indicators
 Star schema design
 Star schema join problem
Indexing
 It improve the performance in the star schema design.
 The table in star schema design contain the entire
hierarchy of attributes(PERIOD dimension this
hierarchy could be day->week->month->quarter-
>year),one approach is to create multi part key of day,
week, month ,quarter ,year .it presents some problems
in the star schema model because it should be in
normalized
problem
 it require multiple metadata definitions
 Since the fact table must carry all key components as
part of its primary key, addition or deletion of levels in
the physical modification of the affected table.
 Carrying all the segments of the compound
dimensional key in the fact table increases the size of
the index, thus impacting both performance and
scalability.
Level Indicator
 Another potential problem with the star schema
design is that in order to navigate the dimensions
successfully.
 The dimensional table design includes a level of
hierarchy indicator for every record.
 Every query that is retrieving detail records from a
table that stores details & aggregates must use this
indicator as an additional constraint to obtain a correct
result.
Pairwise problem:
 The traditional OLAP RDBMS engines are not designed for the set of complex
queries that can be issued against a star schema.
 We need to retrieve related information from several tables in a single query
has a limitation
 Many OLTP RDBMS can join only two tables, so the complex query must be in
the RDBMS needs to break the query into series of pairwise joins. And also
provide the intermediate result. this process will be do up to end of the result
 Thus intermediate results can be large & very costly to create. It affects the
query performance. The number of ways to pairwise join a set of N tables is N!(
N factorial) for example.
 The query has five table 5!=(6x5x4x3x2x1)=120 combinations.
Star schema join problem:
 Because the number of pairwise join combinations is often too
large to fully evaluate many RDBMS optimizers limit the
selection on the basis of a particular criterion. In data
warehousing environment this strategy is very inefficient for star
schemas.
 In a star schema the only table directly related to most other
tables in the fact table this means that the fact table is natural
candidate for the first pairwise join.
 Unfortunately the fact table is typically the very largest table in
the query. A pairwise join order generates very large intermediate
result set. So it affects query performance.
Solution to performance
problems:
 A common optimization provides some relief for the
star schema join problem.
 The basic idea of this optimization is to get around the
pairwise join strategy of selecting only related tables.
 When two tables are joined and no columns ―link‖
the tables every combination of two tables’ rows are
produced. In terms of relational algebra this is called a
Cartesian product.
Contd..
 Normally the RDBMS optimizer logic would never
consider Cartesian product but star schema
considering these Cartesian products. Can sometimes
improve query.
 Alternatively it is a Cartesian product of all dimension
tables is first generated.
 The key cost is the need to generate the Cartesian
product of the dimension tables as long as the cost of
generating the Cartesian product is less than the cost
of generating intermediate results with the fact table.
Star Join and Star Index
A STARjoin is a high-speed, single-pass, parallelizable multitable join and is
introduced by Red Brick’s RDBMS.
 STARindexes to accelerate the join performance
 STARindexes are created in one or more foreign key columns of a fact
table.
 Traditional multicolumn references a single table where as the
STARindex can reference multiple tables
 With multicolumn indexes, if a query’s WHERE Clause does
not contain on all the columns in the composite index, the
index cannot be fully used unless the specified columns are a
leading subset.
 The STARjoin using STARindex could efficiently join
the dimension tables to the fact table without penalty
of generating the full Cartesian product.
 The STARjoin algorithm is able to generate a
Cartesian product in regions where these are rows of
interest and bypass generating Cartesian products over
region where these are no rows.
 10 to 20 times faster than traditional pairwise join
techniques
Bit mapped Indexing
Data is loaded into SYBASE IQ, it converts all data into
a series of bitmaps; which are them highly
compressed and stored in disk.
 SYBASE IQ indexes do not point to data stored
elsewhere all data is contained in the index structure.
Data Cardinality.
 Bitmap indexes are used to queries against low-cardinality data-that is data
in which the total number of potential values is relatively low.
 For example, state code data cardinality is 50 and gender cardinality is only
2(male and female). 
 For low cardinality data, each distinct value has its own bitmap index
consisting of a bit for every row in the table.
 The bit map index representation is a 10000 bit long vector which has its
bits turned ON (value of 1) for every record that satisfies “gender=”M”
condition.
 Bitmap indexes can become cumbersome and even unsuitable for high
cardinality data where the range of potential value is high.
 SYBASE IQ high cardinality index starts at 1000 distinct values.
Index Types.
The SYBASE IQ provides five index techniques. One is
a default index called the Fast projection index and the other is
either a low-or high-cardinality index.
Performance.
SYBASE IQ technology achieves very good performance
in ad hoc queries for several reasons.
 Bitwise Technology. This allows rapid response to queries
containing various data type, supports data aggregation and
grouping.
 Compression. SYBASE IQ uses sophisticated algorithm to
compress data into bitmapping SYBASE IQ can hold more
data in memory minimizing expensive I/O operations.
 Optimized memory-based processing: SYBASE IQ caches data columns in memory
according to the nature of user’s queries.
 Columnwise processing: SYBASE IQ scans columns not rows. For the low selectivity
queries (those that select only a few attributes from a multi attribute row) the technique of
scanning by columns drastically reduces the amount of data the engine has to search.
 Low Overhead: As an engine optimized for decision support, SYBASE IQ does not carry an
overhead associated with traditional OLTP designed RDBMS performance.
 Large Block I/O: Block size high in SYBASE IQ can be tuned from 512 bytes to 64 Kbytes,
so that the system can read as much information as necessary in a single I/O.
 Operating-system-level parallelism: SYBASE IQ breaks low-level operations like sorts,
bitmap manipulation, load and I/O into non blocking operation’s that the operating systems can
schedule independently and in parallel.
 Prejoin and ad hoc join Capabilities: SYBASE IQ allows users to take advantage of know
join relationships between tables by defining them in advance and building indexes between
tables.
Shortcoming of Indexing.
Some of the tradeoffs of the SYBASE IQ are as follows
 No Updates. SYBASE IQ does not support updates, and therefore is
unsuitable
 Lack of core RDBMS features. SYBASE IQ does not support all the
backup and recovery and also does not support stored procedures, data
integrity checker, data replication, complex data types.
 Less advantage for planned queries. SYBASE IQ advantages are
most obvious when running ad hoc queries.
 High memory Usage. SYBASE IQ takes advantage of available system
memory to avoid expensive I/O operations.
Column Local Storage
Performance in the data warehouse environment can be achieved
by storing data in memory in column wise instead to store one row at a
time and each row can be viewed and accessed as single record.
Emp-id Emp-Name Dept Salary
1004
1005
1006
Suresh
Mani
Sara
CSE
MECH
CIVIL
15000
25000
23000
1004 Suresh CSE
15000
1005 Mani
MECH 25000
1006 Sara CIVIL
23000
1004 1005 1006
Suresh Mani Sara
CSE MECH
CIVIL
15000 25000
23000

Mais conteúdo relacionado

Mais procurados

Query Decomposition and data localization
Query Decomposition and data localization Query Decomposition and data localization
Query Decomposition and data localization Hafiz faiz
 
Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)DheerajPachauri
 
Cost estimation for Query Optimization
Cost estimation for Query OptimizationCost estimation for Query Optimization
Cost estimation for Query OptimizationRavinder Kamboj
 
lazy learners and other classication methods
lazy learners and other classication methodslazy learners and other classication methods
lazy learners and other classication methodsrajshreemuthiah
 
1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalitiesKrish_ver2
 
Relational Data Model Introduction
Relational Data Model IntroductionRelational Data Model Introduction
Relational Data Model IntroductionNishant Munjal
 
3.2 partitioning methods
3.2 partitioning methods3.2 partitioning methods
3.2 partitioning methodsKrish_ver2
 
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...Salah Amean
 
05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data MiningValerii Klymchuk
 
Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data miningSlideshare
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data MiningDHIVYADEVAKI
 
13. Query Processing in DBMS
13. Query Processing in DBMS13. Query Processing in DBMS
13. Query Processing in DBMSkoolkampus
 
04 Classification in Data Mining
04 Classification in Data Mining04 Classification in Data Mining
04 Classification in Data MiningValerii Klymchuk
 

Mais procurados (20)

Query Decomposition and data localization
Query Decomposition and data localization Query Decomposition and data localization
Query Decomposition and data localization
 
Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)Clustering for Stream and Parallelism (DATA ANALYTICS)
Clustering for Stream and Parallelism (DATA ANALYTICS)
 
Database fragmentation
Database fragmentationDatabase fragmentation
Database fragmentation
 
Data mining tasks
Data mining tasksData mining tasks
Data mining tasks
 
Cost estimation for Query Optimization
Cost estimation for Query OptimizationCost estimation for Query Optimization
Cost estimation for Query Optimization
 
lazy learners and other classication methods
lazy learners and other classication methodslazy learners and other classication methods
lazy learners and other classication methods
 
1.2 steps and functionalities
1.2 steps and functionalities1.2 steps and functionalities
1.2 steps and functionalities
 
Relational Data Model Introduction
Relational Data Model IntroductionRelational Data Model Introduction
Relational Data Model Introduction
 
Recognition-of-tokens
Recognition-of-tokensRecognition-of-tokens
Recognition-of-tokens
 
Linked list
Linked listLinked list
Linked list
 
3.2 partitioning methods
3.2 partitioning methods3.2 partitioning methods
3.2 partitioning methods
 
DDBMS Paper with Solution
DDBMS Paper with SolutionDDBMS Paper with Solution
DDBMS Paper with Solution
 
Insertion Sorting
Insertion SortingInsertion Sorting
Insertion Sorting
 
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
Data Mining: Concepts and Techniques chapter 07 : Advanced Frequent Pattern M...
 
05 Clustering in Data Mining
05 Clustering in Data Mining05 Clustering in Data Mining
05 Clustering in Data Mining
 
Major issues in data mining
Major issues in data miningMajor issues in data mining
Major issues in data mining
 
Hashing PPT
Hashing PPTHashing PPT
Hashing PPT
 
Data preprocessing in Data Mining
Data preprocessing in Data MiningData preprocessing in Data Mining
Data preprocessing in Data Mining
 
13. Query Processing in DBMS
13. Query Processing in DBMS13. Query Processing in DBMS
13. Query Processing in DBMS
 
04 Classification in Data Mining
04 Classification in Data Mining04 Classification in Data Mining
04 Classification in Data Mining
 

Semelhante a Star Schema Design and Performance Optimization

Designing high performance datawarehouse
Designing high performance datawarehouseDesigning high performance datawarehouse
Designing high performance datawarehouseUday Kothari
 
A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...
A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...
A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...inscit2006
 
When & Why\'s of Denormalization
When & Why\'s of DenormalizationWhen & Why\'s of Denormalization
When & Why\'s of DenormalizationAliya Saldanha
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09guest9d79e073
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Mark Ginnebaugh
 
Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008paulguerin
 
SPL_ALL_EN.pptx
SPL_ALL_EN.pptxSPL_ALL_EN.pptx
SPL_ALL_EN.pptx政宏 张
 
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...IEEEMEMTECHSTUDENTSPROJECTS
 
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...IEEEMEMTECHSTUDENTPROJECTS
 
Dimensional modelling-mod-3
Dimensional modelling-mod-3Dimensional modelling-mod-3
Dimensional modelling-mod-3Malik Alig
 
Multidimensional Database Design & Architecture
Multidimensional Database Design & ArchitectureMultidimensional Database Design & Architecture
Multidimensional Database Design & Architecturehasanshan
 
Whitepaper Performance Tuning using Upsert and SCD (Task Factory)
Whitepaper  Performance Tuning using Upsert and SCD (Task Factory)Whitepaper  Performance Tuning using Upsert and SCD (Task Factory)
Whitepaper Performance Tuning using Upsert and SCD (Task Factory)MILL5
 
Performance By Design
Performance By DesignPerformance By Design
Performance By DesignGuy Harrison
 
World2016_T5_S7_TeradataFunctionalOverview
World2016_T5_S7_TeradataFunctionalOverviewWorld2016_T5_S7_TeradataFunctionalOverview
World2016_T5_S7_TeradataFunctionalOverviewFarah Omer
 
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...Datavail
 

Semelhante a Star Schema Design and Performance Optimization (20)

Designing high performance datawarehouse
Designing high performance datawarehouseDesigning high performance datawarehouse
Designing high performance datawarehouse
 
A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...
A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...
A Metadata-Driven Approach to Computing Financial Analytics in a Relational D...
 
When & Why\'s of Denormalization
When & Why\'s of DenormalizationWhen & Why\'s of Denormalization
When & Why\'s of Denormalization
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09
 
Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09Brad McGehee Intepreting Execution Plans Mar09
Brad McGehee Intepreting Execution Plans Mar09
 
MULTIMEDIA MODELING
MULTIMEDIA MODELINGMULTIMEDIA MODELING
MULTIMEDIA MODELING
 
Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008Myth busters - performance tuning 102 2008
Myth busters - performance tuning 102 2008
 
Exploring T-SQL Anti-Patterns
Exploring T-SQL Anti-Patterns Exploring T-SQL Anti-Patterns
Exploring T-SQL Anti-Patterns
 
SPL_ALL_EN.pptx
SPL_ALL_EN.pptxSPL_ALL_EN.pptx
SPL_ALL_EN.pptx
 
CS636-olap.ppt
CS636-olap.pptCS636-olap.ppt
CS636-olap.ppt
 
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
2014 IEEE DOTNET DATA MINING PROJECT A novel model for mining association rul...
 
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
IEEE 2014 DOTNET DATA MINING PROJECTS A novel model for mining association ru...
 
Dimensional modelling-mod-3
Dimensional modelling-mod-3Dimensional modelling-mod-3
Dimensional modelling-mod-3
 
Multidimensional Database Design & Architecture
Multidimensional Database Design & ArchitectureMultidimensional Database Design & Architecture
Multidimensional Database Design & Architecture
 
Database aggregation using metadata
Database aggregation using metadataDatabase aggregation using metadata
Database aggregation using metadata
 
Whitepaper Performance Tuning using Upsert and SCD (Task Factory)
Whitepaper  Performance Tuning using Upsert and SCD (Task Factory)Whitepaper  Performance Tuning using Upsert and SCD (Task Factory)
Whitepaper Performance Tuning using Upsert and SCD (Task Factory)
 
Mr bi
Mr biMr bi
Mr bi
 
Performance By Design
Performance By DesignPerformance By Design
Performance By Design
 
World2016_T5_S7_TeradataFunctionalOverview
World2016_T5_S7_TeradataFunctionalOverviewWorld2016_T5_S7_TeradataFunctionalOverview
World2016_T5_S7_TeradataFunctionalOverview
 
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
SQL Pass Summit Presentations from Datavail - Optimize SQL Server: Query Tuni...
 

Último

Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)lakshayb543
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfphamnguyenenglishnb
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfSpandanaRallapalli
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfMr Bounab Samir
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxthorishapillay1
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 

Último (20)

OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdfAMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
AMERICAN LANGUAGE HUB_Level2_Student'sBook_Answerkey.pdf
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
Proudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptxProudly South Africa powerpoint Thorisha.pptx
Proudly South Africa powerpoint Thorisha.pptx
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 

Star Schema Design and Performance Optimization

  • 2.  Traditional OLTP not designed for DWH  DWH projects  data model, database schema  Data model  Analysis but poor performance  Model schema  The schema methodology that is gaining wide spread acceptance for dwh is Star Schema.
  • 3. Data Layout for Best Access  All industries have developed considerable expertise in implementing efficient operational systems such as payroll, inventory tracking, and purchasing.  The original objectives in developing an abstract model known as the relational model
  • 4.  The data warehouse RDBMS typically needs to process queries that are large complex, ad hoc, and data intensive. so data warehouse RDBMS are very different ,it use a database schema for maximizing concurrency and optimizing insert, update, and delete performance.  For solving modern business problems such as market analysis and financial forecasting requires query- centric database schemas that are array oriented and multidimensional in nature.
  • 5. Multi dimensional data model  How much revenue did the new product generate?  How much revenue did the new product generate by month, in the north eastern division, broken down by user demographic, by sales office, relative to the previous version of the product, compare with the plan?  The major database technology for DWH is RDBMS.
  • 6. Star Schema  The multi dimensional view of data that is expressed using relational database semantics is provided by the database schema design called star schema. Two groups: facts and dimensions Facts are the core data element being analyzed e.g.. items sold dimensions are attributes about the facts e.g. date of purchase
  • 7. DBA View point  A star schema is a relational schema organized around a central table joined to few smaller tables (dimension tables)using foreign key references.  The facts are typically additive and are accessed via dimensions.  Dimension tables are heavily indexed  A star schema created for every industry
  • 8. Potential performance problem with star schemas  Indexing  Level Indicators  Star schema design  Star schema join problem
  • 9. Indexing  It improve the performance in the star schema design.  The table in star schema design contain the entire hierarchy of attributes(PERIOD dimension this hierarchy could be day->week->month->quarter- >year),one approach is to create multi part key of day, week, month ,quarter ,year .it presents some problems in the star schema model because it should be in normalized
  • 10. problem  it require multiple metadata definitions  Since the fact table must carry all key components as part of its primary key, addition or deletion of levels in the physical modification of the affected table.  Carrying all the segments of the compound dimensional key in the fact table increases the size of the index, thus impacting both performance and scalability.
  • 11. Level Indicator  Another potential problem with the star schema design is that in order to navigate the dimensions successfully.  The dimensional table design includes a level of hierarchy indicator for every record.  Every query that is retrieving detail records from a table that stores details & aggregates must use this indicator as an additional constraint to obtain a correct result.
  • 12. Pairwise problem:  The traditional OLAP RDBMS engines are not designed for the set of complex queries that can be issued against a star schema.  We need to retrieve related information from several tables in a single query has a limitation  Many OLTP RDBMS can join only two tables, so the complex query must be in the RDBMS needs to break the query into series of pairwise joins. And also provide the intermediate result. this process will be do up to end of the result  Thus intermediate results can be large & very costly to create. It affects the query performance. The number of ways to pairwise join a set of N tables is N!( N factorial) for example.  The query has five table 5!=(6x5x4x3x2x1)=120 combinations.
  • 13. Star schema join problem:  Because the number of pairwise join combinations is often too large to fully evaluate many RDBMS optimizers limit the selection on the basis of a particular criterion. In data warehousing environment this strategy is very inefficient for star schemas.  In a star schema the only table directly related to most other tables in the fact table this means that the fact table is natural candidate for the first pairwise join.  Unfortunately the fact table is typically the very largest table in the query. A pairwise join order generates very large intermediate result set. So it affects query performance.
  • 14. Solution to performance problems:  A common optimization provides some relief for the star schema join problem.  The basic idea of this optimization is to get around the pairwise join strategy of selecting only related tables.  When two tables are joined and no columns ―link‖ the tables every combination of two tables’ rows are produced. In terms of relational algebra this is called a Cartesian product.
  • 15. Contd..  Normally the RDBMS optimizer logic would never consider Cartesian product but star schema considering these Cartesian products. Can sometimes improve query.  Alternatively it is a Cartesian product of all dimension tables is first generated.  The key cost is the need to generate the Cartesian product of the dimension tables as long as the cost of generating the Cartesian product is less than the cost of generating intermediate results with the fact table.
  • 16. Star Join and Star Index A STARjoin is a high-speed, single-pass, parallelizable multitable join and is introduced by Red Brick’s RDBMS.  STARindexes to accelerate the join performance  STARindexes are created in one or more foreign key columns of a fact table.  Traditional multicolumn references a single table where as the STARindex can reference multiple tables  With multicolumn indexes, if a query’s WHERE Clause does not contain on all the columns in the composite index, the index cannot be fully used unless the specified columns are a leading subset.
  • 17.  The STARjoin using STARindex could efficiently join the dimension tables to the fact table without penalty of generating the full Cartesian product.  The STARjoin algorithm is able to generate a Cartesian product in regions where these are rows of interest and bypass generating Cartesian products over region where these are no rows.  10 to 20 times faster than traditional pairwise join techniques
  • 18. Bit mapped Indexing Data is loaded into SYBASE IQ, it converts all data into a series of bitmaps; which are them highly compressed and stored in disk.  SYBASE IQ indexes do not point to data stored elsewhere all data is contained in the index structure.
  • 19. Data Cardinality.  Bitmap indexes are used to queries against low-cardinality data-that is data in which the total number of potential values is relatively low.  For example, state code data cardinality is 50 and gender cardinality is only 2(male and female).  For low cardinality data, each distinct value has its own bitmap index consisting of a bit for every row in the table.  The bit map index representation is a 10000 bit long vector which has its bits turned ON (value of 1) for every record that satisfies “gender=”M” condition.  Bitmap indexes can become cumbersome and even unsuitable for high cardinality data where the range of potential value is high.  SYBASE IQ high cardinality index starts at 1000 distinct values.
  • 20. Index Types. The SYBASE IQ provides five index techniques. One is a default index called the Fast projection index and the other is either a low-or high-cardinality index. Performance. SYBASE IQ technology achieves very good performance in ad hoc queries for several reasons.  Bitwise Technology. This allows rapid response to queries containing various data type, supports data aggregation and grouping.  Compression. SYBASE IQ uses sophisticated algorithm to compress data into bitmapping SYBASE IQ can hold more data in memory minimizing expensive I/O operations.
  • 21.  Optimized memory-based processing: SYBASE IQ caches data columns in memory according to the nature of user’s queries.  Columnwise processing: SYBASE IQ scans columns not rows. For the low selectivity queries (those that select only a few attributes from a multi attribute row) the technique of scanning by columns drastically reduces the amount of data the engine has to search.  Low Overhead: As an engine optimized for decision support, SYBASE IQ does not carry an overhead associated with traditional OLTP designed RDBMS performance.  Large Block I/O: Block size high in SYBASE IQ can be tuned from 512 bytes to 64 Kbytes, so that the system can read as much information as necessary in a single I/O.  Operating-system-level parallelism: SYBASE IQ breaks low-level operations like sorts, bitmap manipulation, load and I/O into non blocking operation’s that the operating systems can schedule independently and in parallel.  Prejoin and ad hoc join Capabilities: SYBASE IQ allows users to take advantage of know join relationships between tables by defining them in advance and building indexes between tables.
  • 22. Shortcoming of Indexing. Some of the tradeoffs of the SYBASE IQ are as follows  No Updates. SYBASE IQ does not support updates, and therefore is unsuitable  Lack of core RDBMS features. SYBASE IQ does not support all the backup and recovery and also does not support stored procedures, data integrity checker, data replication, complex data types.  Less advantage for planned queries. SYBASE IQ advantages are most obvious when running ad hoc queries.  High memory Usage. SYBASE IQ takes advantage of available system memory to avoid expensive I/O operations.
  • 23. Column Local Storage Performance in the data warehouse environment can be achieved by storing data in memory in column wise instead to store one row at a time and each row can be viewed and accessed as single record. Emp-id Emp-Name Dept Salary 1004 1005 1006 Suresh Mani Sara CSE MECH CIVIL 15000 25000 23000 1004 Suresh CSE 15000 1005 Mani MECH 25000 1006 Sara CIVIL 23000 1004 1005 1006 Suresh Mani Sara CSE MECH CIVIL 15000 25000 23000