SlideShare a Scribd company logo
1 of 22
Data Warehouse & OLAP
Kuliah 1 Introduction
Slide banyak mengambil dari acuan-
acuan yang dipakai
DB, DW, DM
DW vs DB
DW vs DM
INTRODUCTION
Increasingly, organizations are analyzing
current and historical data to identify
useful patterns and support business
strategies.
Emphasis is on complex, interactive,
exploratory analysis of very large datasets
created by integrating data from across all
parts of an enterprise; data is fairly static.
INTRODUCTION (..cont)
Contrast such On-Line Analytic
Processing (OLAP) with traditional On-line
Transaction Processing (OLTP): mostly
long queries, instead of short update
transactions.
OLTP: roda penggerak organisasi
OLAP: mengawasi gerak roda
Cat: rencana mata kuliah transaction processing
THREE COMPLEMENTARY TRENDS
 Data Warehousing: Consolidate data from many
sources in one large repository.
 Loading, periodic synchronization of replicas.
 Semantic integration.
 OLAP:
 Complex SQL queries and views.
 Queries based on spreadsheet-style operations and
“multidimensional” view of data.  bahkan memakai
spreadsheet.
 Interactive and “online” queries.
 Data Mining: Exploratory search for interesting
trends and anomalies.
EXTERNAL DATA SOURCES
EXTRACT
TRANSFORM
LOAD
REFRESH
DATA
WAREHOUSEMetadata
Repository
SUPPORTS
OLAP
DATA
MINING
Data Warehouse: A Multi-Tiered ArchitectureData Warehouse: A Multi-Tiered Architecture
Data
Warehouse
Extract
Transform
Load
Refresh
OLAP Engine
Analysis
Query
Reports
Data mining
Monitor
&
Integrator
Metadata
Data Sources Front-End Tools
Serve
Data Marts
Operational
DBs
Other
sources
Data Storage
OLAP Server
DATA WAREHOUSING
Integrated data spanning long time
periods, often augmented with summary
information.
Several gigabytes to terabytes common.
Interactive response times expected for
complex queries.
Multidimensional Data Model
 Collection of numeric measures,
which depend on a set of dimensions.
 E.g., measure Sales, dimensions
Product (key: pid), Location (locid),
and Time (timeid).
8 10 10
30 20 50
25 8 15
1 2 3
timeid
pid
111213
11 1 1 25
11 2 1 8
11 3 1 15
12 1 1 30
12 2 1 20
12 3 1 50
13 1 1 8
13 2 1 10
13 3 1 10
11 1 2 35
pid
timeid
locid
sales
locid
Slice locid=1
is shown:
MOLAP vs ROLAP
 Multidimensional data can be stored physically
in a (disk-resident, persistent) array; called
MOLAP systems. Alternatively, can store as a
relation; called ROLAP systems.
 The main relation, which relates dimensions to
a measure, is called the fact table. Each
dimension can have additional attributes and an
associated dimension table.
 E.g., Products(pid, pname, category, price)
 Fact tables are much larger than dimensional
tables.
Dimension Hierarchies
For each dimension, the set of values
can be organized in a hierarchy:
PRODUCT TIME LOCATION
category week month state
pname date city
year
quarter country
OLAP Queries
Influenced by SQL and by spreadsheets.
A common operation is to aggregate a
measure over one or more dimensions.
 Find total sales.
 Find total sales for each city, or for each state.
 Find top five products ranked by total sales.
Roll-up: Aggregating at different levels
of a dimension hierarchy.
 E.g., Given total sales by city, we can roll-up to
get sales by state.
OLAP Queries (..cont)
 Drill-down: The inverse of roll-up.
 E.g., Given total sales by state, can drill-down to
get total sales by city.
 E.g., Can also drill-down on different dimension
to get total sales by product for each state.
 Pivoting: Aggregation on selected dimensions.
 E.g., Pivoting on Location and Time
yields this cross-tabulation:
 Slicing and Dicing:
Equality and range selections on
one or more dimensions.
63 81 144
38 107 145
75 35 110
WI CA Total
1995
1996
1997
176 223 339Total
Comparison with SQL Queries
 The cross-tabulation obtained by pivoting can also
be computed using a collection of SQLqueries:
SELECT SUM(S.sales)
FROM Sales S, Times T, Locations L
WHERE S.timeid=T.timeid AND S.timeid=L.timeid
GROUP BY T.year, L.state
SELECT SUM(S.sales)
FROM Sales S, Times T
WHERE S.timeid=T.timeid
GROUP BY T.year
SELECT SUM(S.sales)
FROM Sales S, Location L
WHERE S.timeid=L.timeid
GROUP BY L.state
The CUBE Operator
 Generalizing the previous example, if there
are k dimensions, we have 2^k possible SQL
GROUP BY queries that can be generated
through pivoting on a subset of dimensions.
 CUBE pid, locid, timeid BY SUM Sales
 Equivalent to rolling up Sales on all eight
subsets of the set {pid, locid, timeid}; each roll-
up corresponds to an SQL query of the form:
SELECT SUM(S.sales)
FROM Sales S
GROUP BY grouping-list
Lots of recent work on
optimizing the CUBE operator!
Design Issues
 Fact table in BCNF; dimension tables not
normalized.
 Dimension tables are small; updates/inserts/deletes are
rare. So, anomalies less important than good query
performance.
 This kind of schema is very common in OLAP
applications, and is called a star schema; computing
the join of all these relations is called a star join.
pricecategorypnamepid countrystatecitylocid
saleslocidtimeidpid
holiday_flagweekdatetimeid month quarter year
(Fact table)SALES
TIMES
PRODUCTS LOCATIONS
Implementation Issues
 New indexing techniques: Bitmap indexes, Join
indexes, array representations, compression,
precomputation of aggregations, etc.
 E.g., Bitmap index:
10
10
01
10
112 Joe M 3
115 Ram M 5
119 Sue F 5
112 Woo M 4
00100
00001
00001
00010
sex custid name sex rating ratingBit-vector:
1 bit for each
possible value.
Many queries can
be answered using
bit-vector ops!
M
F
Views and Decision Support
 OLAP queries are typically aggregate queries.
 Precomputation is essential for interactive response
times.
 The CUBE is in fact a collection of aggregate queries,
and precomputation is especially important: lots of
work on what is best to precompute given a limited
amount of space to store precomputed results.
 Warehouses can be thought of as a collection of
asynchronously replicated tables and periodically
maintained views.
 Has renewed interest in view maintenance!
View Modification (Evaluate On Demand)
CREATE VIEW RegionalSales(category,sales,state)
AS SELECT P.category, S.sales, L.state
FROM Products P, Sales S, Locations L
WHERE P.pid=S.pid AND S.locid=L.locid
SELECT R.category, R.state, SUM(R.sales)
FROM RegionalSales AS R GROUP BY R.category, R.state
SELECT R.category, R.state, SUM(R.sales)
FROM (SELECT P.category, S.sales, L.state
FROM Products P, Sales S, Locations L
WHERE P.pid=S.pid AND S.locid=L.locid) AS R
GROUP BY R.category, R.state
View
Query
Modified
Query
Interactive Queries
 Top N Queries: If you want to find the 10 (or
so) cheapest cars, it would be nice if the DB
could avoid computing the costs of all cars
before sorting to determine the 10 cheapest.
 Idea: Guess at a cost c such that the 10 cheapest all
cost less than c, and that not too many more cost
less. Then add the selection cost<c and evaluate
the query.
 If the guess is right, great, we avoid computation
for cars that cost more than c.
 If the guess is wrong, need to reset the selection
and recompute the original query.
Top N Queries
SELECT P.pid, P.pname, S.sales
FROM Sales S, Products P
WHERE S.pid=P.pid AND S.locid=1 AND S.timeid=3
ORDER BY S.sales DESC
OPTIMIZE FOR 10 ROWS
 OPTIMIZE FOR construct is not in SQL:1999!
 Cut-off value c is chosen by optimizer.
SELECT P.pid, P.pname, S.sales
FROM Sales S, Products P
WHERE S.pid=P.pid AND S.locid=1 AND S.timeid=3
AND S.sales > c
ORDER BY S.sales DESC
Interactive Queries
Online Aggregation: Consider an
aggregate query, e.g., finding the average
sales by state. Can we provide the user
with some information before the exact
average is computed for all states?

More Related Content

What's hot

Dimensional modelling-mod-3
Dimensional modelling-mod-3Dimensional modelling-mod-3
Dimensional modelling-mod-3Malik Alig
 
Slowly changing dimension
Slowly changing dimension Slowly changing dimension
Slowly changing dimension Sunita Sahu
 
Project report aditi paul1
Project report aditi paul1Project report aditi paul1
Project report aditi paul1guest9529cb
 
DATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALA
DATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALADATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALA
DATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALASaikiran Panjala
 
DW DIMENSN MODELNG
DW DIMENSN MODELNGDW DIMENSN MODELNG
DW DIMENSN MODELNGDivya Tadi
 
IBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARNIBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARNabclearnn
 
Advanced Dimensional Modelling
Advanced Dimensional ModellingAdvanced Dimensional Modelling
Advanced Dimensional ModellingVincent Rainardi
 
Data warehouse
Data warehouseData warehouse
Data warehouse_123_
 
Introduction to Dimesional Modelling
Introduction to Dimesional ModellingIntroduction to Dimesional Modelling
Introduction to Dimesional ModellingAshish Chandwani
 
Fact less fact Tables & Aggregate Tables
Fact less fact Tables & Aggregate Tables Fact less fact Tables & Aggregate Tables
Fact less fact Tables & Aggregate Tables Sunita Sahu
 
Dimensional data modeling
Dimensional data modelingDimensional data modeling
Dimensional data modelingAdam Hutson
 
Data warehouse implementation design for a Retail business
Data warehouse implementation design for a Retail businessData warehouse implementation design for a Retail business
Data warehouse implementation design for a Retail businessArsalan Qadri
 
multi dimensional data model
multi dimensional data modelmulti dimensional data model
multi dimensional data modelmoni sindhu
 

What's hot (20)

Dimensional modelling-mod-3
Dimensional modelling-mod-3Dimensional modelling-mod-3
Dimensional modelling-mod-3
 
BI Suite Overview
BI Suite OverviewBI Suite Overview
BI Suite Overview
 
Slowly changing dimension
Slowly changing dimension Slowly changing dimension
Slowly changing dimension
 
02 Essbase
02 Essbase02 Essbase
02 Essbase
 
Project report aditi paul1
Project report aditi paul1Project report aditi paul1
Project report aditi paul1
 
DATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALA
DATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALADATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALA
DATA WAREHOUSE IMPLEMENTATION BY SAIKIRAN PANJALA
 
Chapter 2
Chapter 2Chapter 2
Chapter 2
 
DW DIMENSN MODELNG
DW DIMENSN MODELNGDW DIMENSN MODELNG
DW DIMENSN MODELNG
 
IBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARNIBM Cognos tutorial - ABC LEARN
IBM Cognos tutorial - ABC LEARN
 
Advanced Dimensional Modelling
Advanced Dimensional ModellingAdvanced Dimensional Modelling
Advanced Dimensional Modelling
 
Dimensional Modeling
Dimensional ModelingDimensional Modeling
Dimensional Modeling
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Introduction to Dimesional Modelling
Introduction to Dimesional ModellingIntroduction to Dimesional Modelling
Introduction to Dimesional Modelling
 
Fact less fact Tables & Aggregate Tables
Fact less fact Tables & Aggregate Tables Fact less fact Tables & Aggregate Tables
Fact less fact Tables & Aggregate Tables
 
Dimensional data modeling
Dimensional data modelingDimensional data modeling
Dimensional data modeling
 
Data warehouse implementation design for a Retail business
Data warehouse implementation design for a Retail businessData warehouse implementation design for a Retail business
Data warehouse implementation design for a Retail business
 
multi dimensional data model
multi dimensional data modelmulti dimensional data model
multi dimensional data model
 
Olap
OlapOlap
Olap
 
05 OLAP v6 weekend
05 OLAP  v6 weekend05 OLAP  v6 weekend
05 OLAP v6 weekend
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 

Viewers also liked

Buku panduan pengelolaan-e-journal
Buku panduan pengelolaan-e-journalBuku panduan pengelolaan-e-journal
Buku panduan pengelolaan-e-journalCoky Fauzi Alfi
 
Ogd indonesia-final-for-publication
Ogd indonesia-final-for-publicationOgd indonesia-final-for-publication
Ogd indonesia-final-for-publicationCoky Fauzi Alfi
 
Pertemuan 4 -_multi_dimensional_model_design_1
Pertemuan 4 -_multi_dimensional_model_design_1Pertemuan 4 -_multi_dimensional_model_design_1
Pertemuan 4 -_multi_dimensional_model_design_1Abrianto Nugraha
 
1 c manusia-akhlak dan etika
1 c manusia-akhlak dan etika1 c manusia-akhlak dan etika
1 c manusia-akhlak dan etikaHM Mitrohardjono
 
Tugas 1 etika sebagai profesi
Tugas 1 etika sebagai profesiTugas 1 etika sebagai profesi
Tugas 1 etika sebagai profesisitimariyah10
 
Pemrograman berorientasi objek_1
Pemrograman berorientasi objek_1Pemrograman berorientasi objek_1
Pemrograman berorientasi objek_1Abrianto Nugraha
 
Interaksi manusia komputer 3
Interaksi manusia komputer 3Interaksi manusia komputer 3
Interaksi manusia komputer 3Abrianto Nugraha
 
Presentation1 desain dan perilaku organisasi
Presentation1 desain dan perilaku organisasi Presentation1 desain dan perilaku organisasi
Presentation1 desain dan perilaku organisasi Habibullah Srg
 
Pancasila (pancasila sebagai sistem etika)
Pancasila (pancasila sebagai sistem etika)Pancasila (pancasila sebagai sistem etika)
Pancasila (pancasila sebagai sistem etika)isni arifa
 
Etika profesi - pertemuan 2
Etika profesi - pertemuan 2Etika profesi - pertemuan 2
Etika profesi - pertemuan 2Abrianto Nugraha
 
6701144244 tegar jagat geni arya perkasa pis-14-06
6701144244 tegar jagat geni arya perkasa pis-14-066701144244 tegar jagat geni arya perkasa pis-14-06
6701144244 tegar jagat geni arya perkasa pis-14-06tegar jgap
 
Etika Bisnis Komersial
Etika Bisnis KomersialEtika Bisnis Komersial
Etika Bisnis KomersialLuthfi Nk
 
Silabus etika profesi
Silabus etika profesiSilabus etika profesi
Silabus etika profesiArya Fitriadi
 
Etika profesi - pertemuan 1
Etika profesi - pertemuan 1Etika profesi - pertemuan 1
Etika profesi - pertemuan 1Abrianto Nugraha
 
Manusia,nilai,moral dan hukum
Manusia,nilai,moral dan hukumManusia,nilai,moral dan hukum
Manusia,nilai,moral dan hukumRaka Ditya
 

Viewers also liked (20)

Buku panduan pengelolaan-e-journal
Buku panduan pengelolaan-e-journalBuku panduan pengelolaan-e-journal
Buku panduan pengelolaan-e-journal
 
Ogd indonesia-final-for-publication
Ogd indonesia-final-for-publicationOgd indonesia-final-for-publication
Ogd indonesia-final-for-publication
 
Pertemuan 4 -_multi_dimensional_model_design_1
Pertemuan 4 -_multi_dimensional_model_design_1Pertemuan 4 -_multi_dimensional_model_design_1
Pertemuan 4 -_multi_dimensional_model_design_1
 
Visual Intelligence
Visual IntelligenceVisual Intelligence
Visual Intelligence
 
DWO - Pertemuan 2 & 3
DWO - Pertemuan 2 & 3DWO - Pertemuan 2 & 3
DWO - Pertemuan 2 & 3
 
1 c manusia-akhlak dan etika
1 c manusia-akhlak dan etika1 c manusia-akhlak dan etika
1 c manusia-akhlak dan etika
 
Tugas 2[1]
Tugas 2[1]Tugas 2[1]
Tugas 2[1]
 
Tugas 1 etika sebagai profesi
Tugas 1 etika sebagai profesiTugas 1 etika sebagai profesi
Tugas 1 etika sebagai profesi
 
Pertemuan 1
Pertemuan 1Pertemuan 1
Pertemuan 1
 
Pemrograman berorientasi objek_1
Pemrograman berorientasi objek_1Pemrograman berorientasi objek_1
Pemrograman berorientasi objek_1
 
Interaksi manusia komputer 3
Interaksi manusia komputer 3Interaksi manusia komputer 3
Interaksi manusia komputer 3
 
Presentation1 desain dan perilaku organisasi
Presentation1 desain dan perilaku organisasi Presentation1 desain dan perilaku organisasi
Presentation1 desain dan perilaku organisasi
 
Pancasila (pancasila sebagai sistem etika)
Pancasila (pancasila sebagai sistem etika)Pancasila (pancasila sebagai sistem etika)
Pancasila (pancasila sebagai sistem etika)
 
Etika profesi - pertemuan 2
Etika profesi - pertemuan 2Etika profesi - pertemuan 2
Etika profesi - pertemuan 2
 
6701144244 tegar jagat geni arya perkasa pis-14-06
6701144244 tegar jagat geni arya perkasa pis-14-066701144244 tegar jagat geni arya perkasa pis-14-06
6701144244 tegar jagat geni arya perkasa pis-14-06
 
Etika Bisnis Komersial
Etika Bisnis KomersialEtika Bisnis Komersial
Etika Bisnis Komersial
 
Silabus etika profesi
Silabus etika profesiSilabus etika profesi
Silabus etika profesi
 
Etika profesi - pertemuan 1
Etika profesi - pertemuan 1Etika profesi - pertemuan 1
Etika profesi - pertemuan 1
 
Manusia,nilai,moral dan hukum
Manusia,nilai,moral dan hukumManusia,nilai,moral dan hukum
Manusia,nilai,moral dan hukum
 
Visual Semiotics
Visual SemioticsVisual Semiotics
Visual Semiotics
 

Similar to DWO -Pertemuan 1

Data Warehouse
Data WarehouseData Warehouse
Data Warehouseganblues
 
Cs437 lecture 10-12
Cs437 lecture 10-12Cs437 lecture 10-12
Cs437 lecture 10-12Aneeb_Khawar
 
Become BI Architect with 1KEY Agile BI Suite - OLAP
Become BI Architect with 1KEY Agile BI Suite - OLAPBecome BI Architect with 1KEY Agile BI Suite - OLAP
Become BI Architect with 1KEY Agile BI Suite - OLAPDhiren Gala
 
Intro to Data warehousing Lecture 06
Intro to Data warehousing   Lecture 06Intro to Data warehousing   Lecture 06
Intro to Data warehousing Lecture 06AnwarrChaudary
 
OLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingOLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingPrithwis Mukerjee
 
Building a semantic/metrics layer using Calcite
Building a semantic/metrics layer using CalciteBuilding a semantic/metrics layer using Calcite
Building a semantic/metrics layer using CalciteJulian Hyde
 
Dwh lecture slides-week 12&13
Dwh lecture slides-week 12&13Dwh lecture slides-week 12&13
Dwh lecture slides-week 12&13Shani729
 
Olap fundamentals
Olap fundamentalsOlap fundamentals
Olap fundamentalsAmit Sharma
 
Multidimensional Data Analysis with Ruby (sample)
Multidimensional Data Analysis with Ruby (sample)Multidimensional Data Analysis with Ruby (sample)
Multidimensional Data Analysis with Ruby (sample)Raimonds Simanovskis
 
Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modelingvivekjv
 
Data Warehousing for students educationpptx
Data Warehousing for students educationpptxData Warehousing for students educationpptx
Data Warehousing for students educationpptxjainyshah20
 
Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processingVijayasankariS
 
IT301-Datawarehousing (1) and its sub topics.pptx
IT301-Datawarehousing (1) and its sub topics.pptxIT301-Datawarehousing (1) and its sub topics.pptx
IT301-Datawarehousing (1) and its sub topics.pptxReneeClintGortifacio
 

Similar to DWO -Pertemuan 1 (20)

Data Warehouse
Data WarehouseData Warehouse
Data Warehouse
 
Cs437 lecture 10-12
Cs437 lecture 10-12Cs437 lecture 10-12
Cs437 lecture 10-12
 
Become BI Architect with 1KEY Agile BI Suite - OLAP
Become BI Architect with 1KEY Agile BI Suite - OLAPBecome BI Architect with 1KEY Agile BI Suite - OLAP
Become BI Architect with 1KEY Agile BI Suite - OLAP
 
Essbase intro
Essbase introEssbase intro
Essbase intro
 
Intro to Data warehousing Lecture 06
Intro to Data warehousing   Lecture 06Intro to Data warehousing   Lecture 06
Intro to Data warehousing Lecture 06
 
OLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingOLAP Cubes in Datawarehousing
OLAP Cubes in Datawarehousing
 
OLAPCUBE.pptx
OLAPCUBE.pptxOLAPCUBE.pptx
OLAPCUBE.pptx
 
OLAP
OLAPOLAP
OLAP
 
Building a semantic/metrics layer using Calcite
Building a semantic/metrics layer using CalciteBuilding a semantic/metrics layer using Calcite
Building a semantic/metrics layer using Calcite
 
Dwh faqs
Dwh faqsDwh faqs
Dwh faqs
 
Dwh lecture slides-week 12&13
Dwh lecture slides-week 12&13Dwh lecture slides-week 12&13
Dwh lecture slides-week 12&13
 
Star schema
Star schemaStar schema
Star schema
 
Olap fundamentals
Olap fundamentalsOlap fundamentals
Olap fundamentals
 
Multidimensional Data Analysis with Ruby (sample)
Multidimensional Data Analysis with Ruby (sample)Multidimensional Data Analysis with Ruby (sample)
Multidimensional Data Analysis with Ruby (sample)
 
CS636-olap.ppt
CS636-olap.pptCS636-olap.ppt
CS636-olap.ppt
 
Data Warehouse Modeling
Data Warehouse ModelingData Warehouse Modeling
Data Warehouse Modeling
 
Data ware housing- Introduction to olap .
Data ware housing- Introduction to  olap .Data ware housing- Introduction to  olap .
Data ware housing- Introduction to olap .
 
Data Warehousing for students educationpptx
Data Warehousing for students educationpptxData Warehousing for students educationpptx
Data Warehousing for students educationpptx
 
Data warehousing and online analytical processing
Data warehousing and online analytical processingData warehousing and online analytical processing
Data warehousing and online analytical processing
 
IT301-Datawarehousing (1) and its sub topics.pptx
IT301-Datawarehousing (1) and its sub topics.pptxIT301-Datawarehousing (1) and its sub topics.pptx
IT301-Datawarehousing (1) and its sub topics.pptx
 

More from Abrianto Nugraha (20)

Ds sn is-02
Ds sn is-02Ds sn is-02
Ds sn is-02
 
Ds sn is-01
Ds sn is-01Ds sn is-01
Ds sn is-01
 
Pertemuan 5 optimasi_dengan_alternatif_terbatas_-_lengkap
Pertemuan 5 optimasi_dengan_alternatif_terbatas_-_lengkapPertemuan 5 optimasi_dengan_alternatif_terbatas_-_lengkap
Pertemuan 5 optimasi_dengan_alternatif_terbatas_-_lengkap
 
04 pemodelan spk
04 pemodelan spk04 pemodelan spk
04 pemodelan spk
 
02 sistem pengambilan-keputusan_revised
02 sistem pengambilan-keputusan_revised02 sistem pengambilan-keputusan_revised
02 sistem pengambilan-keputusan_revised
 
01 pengantar sistem-pendukung_keputusan
01 pengantar sistem-pendukung_keputusan01 pengantar sistem-pendukung_keputusan
01 pengantar sistem-pendukung_keputusan
 
Pertemuan 7
Pertemuan 7Pertemuan 7
Pertemuan 7
 
Pertemuan 7 dan_8
Pertemuan 7 dan_8Pertemuan 7 dan_8
Pertemuan 7 dan_8
 
Pertemuan 5
Pertemuan 5Pertemuan 5
Pertemuan 5
 
Pertemuan 6
Pertemuan 6Pertemuan 6
Pertemuan 6
 
Pertemuan 4
Pertemuan 4Pertemuan 4
Pertemuan 4
 
Pertemuan 3
Pertemuan 3Pertemuan 3
Pertemuan 3
 
Pertemuan 2
Pertemuan 2Pertemuan 2
Pertemuan 2
 
Modul 1 mengambil nilai parameter
Modul 1   mengambil nilai parameterModul 1   mengambil nilai parameter
Modul 1 mengambil nilai parameter
 
Modul 3 object oriented programming dalam php
Modul 3   object oriented programming dalam phpModul 3   object oriented programming dalam php
Modul 3 object oriented programming dalam php
 
Modul 2 menyimpan ke database
Modul 2  menyimpan ke databaseModul 2  menyimpan ke database
Modul 2 menyimpan ke database
 
Pbo 7
Pbo 7Pbo 7
Pbo 7
 
Pbo 6
Pbo 6Pbo 6
Pbo 6
 
Pbo 4
Pbo 4Pbo 4
Pbo 4
 
Pbo 3
Pbo 3Pbo 3
Pbo 3
 

Recently uploaded

microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingTeacherCyreneCayanan
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxVishalSingh1417
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...PsychoTech Services
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 

Recently uploaded (20)

Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 

DWO -Pertemuan 1

  • 1. Data Warehouse & OLAP Kuliah 1 Introduction Slide banyak mengambil dari acuan- acuan yang dipakai
  • 2. DB, DW, DM DW vs DB DW vs DM
  • 3. INTRODUCTION Increasingly, organizations are analyzing current and historical data to identify useful patterns and support business strategies. Emphasis is on complex, interactive, exploratory analysis of very large datasets created by integrating data from across all parts of an enterprise; data is fairly static.
  • 4. INTRODUCTION (..cont) Contrast such On-Line Analytic Processing (OLAP) with traditional On-line Transaction Processing (OLTP): mostly long queries, instead of short update transactions. OLTP: roda penggerak organisasi OLAP: mengawasi gerak roda Cat: rencana mata kuliah transaction processing
  • 5. THREE COMPLEMENTARY TRENDS  Data Warehousing: Consolidate data from many sources in one large repository.  Loading, periodic synchronization of replicas.  Semantic integration.  OLAP:  Complex SQL queries and views.  Queries based on spreadsheet-style operations and “multidimensional” view of data.  bahkan memakai spreadsheet.  Interactive and “online” queries.  Data Mining: Exploratory search for interesting trends and anomalies.
  • 7. Data Warehouse: A Multi-Tiered ArchitectureData Warehouse: A Multi-Tiered Architecture Data Warehouse Extract Transform Load Refresh OLAP Engine Analysis Query Reports Data mining Monitor & Integrator Metadata Data Sources Front-End Tools Serve Data Marts Operational DBs Other sources Data Storage OLAP Server
  • 8. DATA WAREHOUSING Integrated data spanning long time periods, often augmented with summary information. Several gigabytes to terabytes common. Interactive response times expected for complex queries.
  • 9. Multidimensional Data Model  Collection of numeric measures, which depend on a set of dimensions.  E.g., measure Sales, dimensions Product (key: pid), Location (locid), and Time (timeid). 8 10 10 30 20 50 25 8 15 1 2 3 timeid pid 111213 11 1 1 25 11 2 1 8 11 3 1 15 12 1 1 30 12 2 1 20 12 3 1 50 13 1 1 8 13 2 1 10 13 3 1 10 11 1 2 35 pid timeid locid sales locid Slice locid=1 is shown:
  • 10. MOLAP vs ROLAP  Multidimensional data can be stored physically in a (disk-resident, persistent) array; called MOLAP systems. Alternatively, can store as a relation; called ROLAP systems.  The main relation, which relates dimensions to a measure, is called the fact table. Each dimension can have additional attributes and an associated dimension table.  E.g., Products(pid, pname, category, price)  Fact tables are much larger than dimensional tables.
  • 11. Dimension Hierarchies For each dimension, the set of values can be organized in a hierarchy: PRODUCT TIME LOCATION category week month state pname date city year quarter country
  • 12. OLAP Queries Influenced by SQL and by spreadsheets. A common operation is to aggregate a measure over one or more dimensions.  Find total sales.  Find total sales for each city, or for each state.  Find top five products ranked by total sales. Roll-up: Aggregating at different levels of a dimension hierarchy.  E.g., Given total sales by city, we can roll-up to get sales by state.
  • 13. OLAP Queries (..cont)  Drill-down: The inverse of roll-up.  E.g., Given total sales by state, can drill-down to get total sales by city.  E.g., Can also drill-down on different dimension to get total sales by product for each state.  Pivoting: Aggregation on selected dimensions.  E.g., Pivoting on Location and Time yields this cross-tabulation:  Slicing and Dicing: Equality and range selections on one or more dimensions. 63 81 144 38 107 145 75 35 110 WI CA Total 1995 1996 1997 176 223 339Total
  • 14. Comparison with SQL Queries  The cross-tabulation obtained by pivoting can also be computed using a collection of SQLqueries: SELECT SUM(S.sales) FROM Sales S, Times T, Locations L WHERE S.timeid=T.timeid AND S.timeid=L.timeid GROUP BY T.year, L.state SELECT SUM(S.sales) FROM Sales S, Times T WHERE S.timeid=T.timeid GROUP BY T.year SELECT SUM(S.sales) FROM Sales S, Location L WHERE S.timeid=L.timeid GROUP BY L.state
  • 15. The CUBE Operator  Generalizing the previous example, if there are k dimensions, we have 2^k possible SQL GROUP BY queries that can be generated through pivoting on a subset of dimensions.  CUBE pid, locid, timeid BY SUM Sales  Equivalent to rolling up Sales on all eight subsets of the set {pid, locid, timeid}; each roll- up corresponds to an SQL query of the form: SELECT SUM(S.sales) FROM Sales S GROUP BY grouping-list Lots of recent work on optimizing the CUBE operator!
  • 16. Design Issues  Fact table in BCNF; dimension tables not normalized.  Dimension tables are small; updates/inserts/deletes are rare. So, anomalies less important than good query performance.  This kind of schema is very common in OLAP applications, and is called a star schema; computing the join of all these relations is called a star join. pricecategorypnamepid countrystatecitylocid saleslocidtimeidpid holiday_flagweekdatetimeid month quarter year (Fact table)SALES TIMES PRODUCTS LOCATIONS
  • 17. Implementation Issues  New indexing techniques: Bitmap indexes, Join indexes, array representations, compression, precomputation of aggregations, etc.  E.g., Bitmap index: 10 10 01 10 112 Joe M 3 115 Ram M 5 119 Sue F 5 112 Woo M 4 00100 00001 00001 00010 sex custid name sex rating ratingBit-vector: 1 bit for each possible value. Many queries can be answered using bit-vector ops! M F
  • 18. Views and Decision Support  OLAP queries are typically aggregate queries.  Precomputation is essential for interactive response times.  The CUBE is in fact a collection of aggregate queries, and precomputation is especially important: lots of work on what is best to precompute given a limited amount of space to store precomputed results.  Warehouses can be thought of as a collection of asynchronously replicated tables and periodically maintained views.  Has renewed interest in view maintenance!
  • 19. View Modification (Evaluate On Demand) CREATE VIEW RegionalSales(category,sales,state) AS SELECT P.category, S.sales, L.state FROM Products P, Sales S, Locations L WHERE P.pid=S.pid AND S.locid=L.locid SELECT R.category, R.state, SUM(R.sales) FROM RegionalSales AS R GROUP BY R.category, R.state SELECT R.category, R.state, SUM(R.sales) FROM (SELECT P.category, S.sales, L.state FROM Products P, Sales S, Locations L WHERE P.pid=S.pid AND S.locid=L.locid) AS R GROUP BY R.category, R.state View Query Modified Query
  • 20. Interactive Queries  Top N Queries: If you want to find the 10 (or so) cheapest cars, it would be nice if the DB could avoid computing the costs of all cars before sorting to determine the 10 cheapest.  Idea: Guess at a cost c such that the 10 cheapest all cost less than c, and that not too many more cost less. Then add the selection cost<c and evaluate the query.  If the guess is right, great, we avoid computation for cars that cost more than c.  If the guess is wrong, need to reset the selection and recompute the original query.
  • 21. Top N Queries SELECT P.pid, P.pname, S.sales FROM Sales S, Products P WHERE S.pid=P.pid AND S.locid=1 AND S.timeid=3 ORDER BY S.sales DESC OPTIMIZE FOR 10 ROWS  OPTIMIZE FOR construct is not in SQL:1999!  Cut-off value c is chosen by optimizer. SELECT P.pid, P.pname, S.sales FROM Sales S, Products P WHERE S.pid=P.pid AND S.locid=1 AND S.timeid=3 AND S.sales > c ORDER BY S.sales DESC
  • 22. Interactive Queries Online Aggregation: Consider an aggregate query, e.g., finding the average sales by state. Can we provide the user with some information before the exact average is computed for all states?