Oracle: Data Warehouse Design
Characteristics of a Data Warehouse

  A data warehouse is a database designed for
   querying, reporting, and analysis.
  A data warehouse contains historical data
   derived from transaction data.
  Data warehouses separate analysis workload
   from transaction workload.
  A data warehouse is primarily
   an analytical tool.
Comparing OLTP and Data Warehouses
   Property                       OLTP                  Data Warehouse
   Joins                          Many                  Some
   Data accessed by queries       Comparatively lower   Large amount
   Duplicated data                Normalized DBMS       Denormalized DBMS
   Derived data and aggregates    Rare                  Common
Data Warehouse Architectures

[Figure: architecture diagram. Labels, left to right: operational systems and flat files; staging area; metadata, materialized views, and raw data; sales, purchasing, and inventory; analysis, reporting, and data mining.]
Data Warehouse Design
• Key data warehouse design considerations:
  – Identify the specific data content.
  – Recognize the critical relationships within and
    between groups of data.
  – Define the system environment
    supporting your data warehouse.
  – Identify the required data
    transformations.
  – Calculate the frequency at which
    the data must be refreshed.
Logical Design
– A logical design is conceptual and
  abstract.
– Entity-relationship (ER) modeling
  is useful in identifying logical
  information requirements.
   • An entity represents a chunk of data.
   • The properties of entities are known as attributes.
   • The links between entities are known as
     relationships.
– Dimensional modeling is a specialized
  type of ER modeling useful in data warehouse
  design.
Oracle Warehouse Builder
– Oracle Database provides tools to implement
  the ETL process.
   • Oracle Warehouse Builder is a tool to help in this
     process.
– Oracle Warehouse Builder generates the
  following types of code:
   •   SQL data definition language (DDL) scripts
   •   PL/SQL programs
   •   SQL*Loader control files
   •   XML Process Definition Language (XPDL)
   •   ABAP code (used to extract data from SAP systems)
Data Warehousing Schemas
– Objects can be arranged in data warehousing
  schema models in a variety of ways:
   •   Star schema
   •   Snowflake schema
   •   Third normal form (3NF) schema
   •   Hybrid schemas
– The source data model and user
  requirements should steer the data
  warehouse schema.
– Implementation of the logical model may
  require changes to enable you to adapt it to
  your physical system.
Schema Characteristics
– Star schema
   • Characterized by one or more large fact tables and a
     number of much smaller dimension tables
   • Each dimension table joined to the fact table using a
     primary key to foreign key join
– Snowflake schema
   • Dimension data grouped into multiple tables instead
     of one large table
   • Increased number of dimension tables, requiring
     more foreign key joins
– Third normal form (3NF) schema
   • A classical relational-database model that minimizes
     data redundancy through normalization
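As a rough illustration of the star schema described above, the following sketch uses Python's built-in sqlite3 module rather than Oracle. The table and column names echo the SALES, PRODUCTS, and CUSTOMERS objects shown later in the deck, but prod_name, amount_sold, and all sample rows are assumptions made for this example.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory star schema: one fact table, two dimensions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Small dimension tables, each with a primary key.
        CREATE TABLE products  (prod_id INTEGER PRIMARY KEY, prod_name TEXT);
        CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_city TEXT);

        -- Fact table: foreign keys to the dimensions plus a measurement.
        CREATE TABLE sales (
            prod_id     INTEGER REFERENCES products(prod_id),
            cust_id     INTEGER REFERENCES customers(cust_id),
            amount_sold REAL
        );

        -- Hypothetical sample data.
        INSERT INTO products  VALUES (1, 'Widget'), (2, 'Gadget');
        INSERT INTO customers VALUES (10, 'Lisbon'), (20, 'Porto');
        INSERT INTO sales VALUES (1, 10, 5.0), (1, 20, 7.5), (2, 10, 3.0);
    """)
    return conn


def sales_by_product(conn):
    """Join the fact table to one dimension and aggregate the measure."""
    return conn.execute("""
        SELECT p.prod_name, SUM(s.amount_sold)
        FROM sales s
        JOIN products p ON p.prod_id = s.prod_id
        GROUP BY p.prod_name
        ORDER BY p.prod_name
    """).fetchall()
```

A snowflake variant would split a dimension such as customers into further tables (for example, a separate cities table), at the cost of one more join per query.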
Data Warehousing Objects
– Fact tables
   • Fact tables are the large tables that store business
     measurements.
– Dimension tables
   • A dimension is a structure composed of one or more
     hierarchies that categorizes data.
   • A unique identifier is specified for each distinct
     record in a dimension table.
– Relationships
   • Relationships guarantee
     integrity of business
     information.
Fact Tables
– A fact table must be defined for each star schema.
– Fact tables are the large tables that store business
  measurements.
– A fact table contains either detail-level or
  aggregated facts.
– A fact table usually contains facts with the same
  level of aggregation.
– The primary key of the fact table is
  usually a composite key made up
  of all its foreign keys.
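The composite-key point can be sketched with sqlite3 as follows. This is a minimal example under assumed names (times, products, quantity_sold), not Oracle DDL: the fact table's primary key is the combination of its foreign keys, so a second row for the same combination is rejected, and so is a row that references a missing dimension record.

```python
import sqlite3


def make_fact_table():
    """Build a fact table whose primary key is made up of its foreign keys."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE times    (time_id INTEGER PRIMARY KEY);
        CREATE TABLE products (prod_id INTEGER PRIMARY KEY);
        CREATE TABLE sales (
            time_id       INTEGER NOT NULL REFERENCES times(time_id),
            prod_id       INTEGER NOT NULL REFERENCES products(prod_id),
            quantity_sold INTEGER,
            PRIMARY KEY (time_id, prod_id)
        );
        INSERT INTO times    VALUES (1), (2);
        INSERT INTO products VALUES (100);
        INSERT INTO sales    VALUES (1, 100, 4);
    """)
    return conn


def try_insert(conn, time_id, prod_id, quantity):
    """Return True if the fact row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO sales VALUES (?, ?, ?)", (time_id, prod_id, quantity)
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this setup, a row for (time_id=2, prod_id=100) is accepted, a second row for (1, 100) violates the composite key, and a row for time_id=3 fails the foreign key check.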
Dimensions and Hierarchies
– A dimension is a structure composed of one or more
  hierarchies that categorizes data.
– Dimensional attributes help to describe the
  dimensional value.
– Dimension data is collected at the lowest level of
  detail and aggregated into higher level totals.
– Hierarchies are structures that use ordered levels
  to organize data.
– In a hierarchy, each level is connected to the
  levels above and below it.

CUSTOMERS dimension hierarchy (by level):
REGION > SUBREGION > COUNTRY > STATE > CITY > CUSTOMER
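Rolling lowest-level data up a hierarchy can be sketched in a few lines of plain Python. The rows, the three-level hierarchy (country, state, city), and the roll_up helper are all hypothetical; the slide's CUSTOMERS hierarchy has more levels, but the idea is the same.

```python
from collections import defaultdict

# Hypothetical detail rows at the lowest level: (country, state, city, amount).
ROWS = [
    ("US", "CA", "San Jose", 10.0),
    ("US", "CA", "Fresno", 5.0),
    ("US", "NY", "Albany", 2.5),
    ("PT", "Lisboa", "Lisbon", 4.0),
]

# Ordered levels of the hierarchy, highest first.
LEVELS = ("country", "state", "city")


def roll_up(rows, level):
    """Aggregate detail rows into totals at the requested hierarchy level."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(float)
    for row in rows:
        # The key is the path from the top of the hierarchy down to `level`.
        totals[row[:depth]] += row[-1]
    return dict(totals)
```

Calling roll_up(ROWS, "state") groups by (country, state), while roll_up(ROWS, "country") collapses one level further.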
Dimensions and Hierarchies

[Figure: star schema. The SALES fact table (cust_id, prod_id) sits at the center, surrounded by the dimension tables PRODUCTS (#prod_id), CUSTOMERS (#cust_id, cust_last_name, cust_city, cust_state_province), TIMES, CHANNELS, and PROMOTIONS. Callouts mark the unique identifiers, a relationship, and a hierarchy.]
Physical Design
[Figure: logical design elements (entities, relationships, attributes, unique identifiers) mapped to physical structures in tablespaces: tables, columns, indexes, integrity constraints (primary key, foreign key, not null), materialized views, and dimensions.]
Data Warehouse Physical Structures

• Tables and partitioned tables
  – Partitioned tables enable you to split
    large data volumes into smaller,
    more manageable pieces.
  – Expect performance benefits from:
     • Partition pruning
     • Intelligent parallel processing
  – Compressed tables offer scaleup opportunities
    for read-only operations.
  – Table compression saves disk space.
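Partition pruning can be illustrated with a toy model in Python. This is only a sketch of the idea, not how Oracle implements it: the MonthPartitionedTable class and its methods are invented for the example, and rows are partitioned by calendar month so that a date-range query can skip partitions that cannot contain matching rows.

```python
from datetime import date


class MonthPartitionedTable:
    """Toy table that stores rows in one partition per (year, month)."""

    def __init__(self):
        self.partitions = {}  # (year, month) -> list of (sale_date, amount)

    def insert(self, sale_date, amount):
        key = (sale_date.year, sale_date.month)
        self.partitions.setdefault(key, []).append((sale_date, amount))

    def total_between(self, start, end):
        """Sum amounts with start <= sale_date <= end.

        Returns (total, partitions_scanned) so the effect of pruning is visible.
        """
        low, high = (start.year, start.month), (end.year, end.month)
        total, scanned = 0.0, 0
        for key, rows in self.partitions.items():
            if key < low or key > high:
                continue  # pruned: this partition cannot hold matching rows
            scanned += 1
            total += sum(amount for day, amount in rows if start <= day <= end)
        return total, scanned
```

A query restricted to one month then touches a single partition, however many months the table holds.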
Data Warehouse Physical Structures

  – Views:
     • Are tailored presentations of data contained in one
       or more tables or views
     • Do not require any space in the database
  – Materialized views:
     • Are query results that have been stored in advance
     • (Like indexes) are used transparently and improve
       performance
  – Integrity constraints:
     • Are used in data warehouses for query rewrite
  – Dimensions:
     • Are containers of logical relationships and do not
       require any space in the database
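The difference between a view and a materialized view can be sketched with sqlite3. SQLite has views but no materialized views, so the stored summary below is simulated with an ordinary table that is rebuilt on demand; Oracle's own refresh and query rewrite mechanisms are not modeled. All object names are assumptions for this example.

```python
import sqlite3


def refresh_summary(conn):
    """Rebuild the stored summary (a stand-in for a materialized view)."""
    conn.executescript("""
        DROP TABLE IF EXISTS sales_by_prod_mv;
        CREATE TABLE sales_by_prod_mv AS
            SELECT prod_id, SUM(amount_sold) AS total
            FROM sales GROUP BY prod_id;
    """)


def setup():
    """Create a base table, a view over it, and a precomputed summary."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sales (prod_id INTEGER, amount_sold REAL);
        INSERT INTO sales VALUES (1, 5.0), (1, 7.5), (2, 3.0);

        -- A view stores only the query text; it is evaluated when read.
        CREATE VIEW sales_by_prod_v AS
            SELECT prod_id, SUM(amount_sold) AS total
            FROM sales GROUP BY prod_id;
    """)
    refresh_summary(conn)
    return conn


def read(conn, name):
    """Read either the view or the stored summary by object name."""
    query = f"SELECT prod_id, total FROM {name} ORDER BY prod_id"
    return conn.execute(query).fetchall()
```

After a new row is inserted into sales, the view reflects it immediately, while the stored summary stays unchanged until refresh_summary is called again. That trade-off is why refresh frequency is a design consideration.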
Managing Large Volumes of Data
• Work smarter in your data warehouse:
  –   Partitioning
  –   Bitmap indexes/Star transformation
  –   Data compression
  –   Query rewrite
• Work harder in your data warehouse:
  – Parallelism for all operations
       • DBA tasks, such as loading, index creation, table
         creation, data modification, backup and recovery
       • End-user operations, such as queries
       • Unbounded scalability: Real Application Clusters
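To give a feel for why bitmap indexes suit warehouse queries, here is a toy bitmap index in Python. It is a simplified sketch, not Oracle's storage format: each distinct column value gets an integer whose bit i is set when row i holds that value, so combining predicates becomes a bitwise AND. The sample rows and function names are hypothetical.

```python
def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to a bitmap of matching row numbers."""
    index = {}
    for row_number, row in enumerate(rows):
        value = row[column]
        index[value] = index.get(value, 0) | (1 << row_number)
    return index


def matching_rows(bitmap):
    """Expand a bitmap back into a sorted list of row numbers."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical fact rows with two low-cardinality columns.
SALES_ROWS = [
    {"channel": "web", "region": "EU"},
    {"channel": "store", "region": "EU"},
    {"channel": "web", "region": "US"},
    {"channel": "web", "region": "EU"},
]

channel_index = build_bitmap_index(SALES_ROWS, "channel")
region_index = build_bitmap_index(SALES_ROWS, "region")

# channel = 'web' AND region = 'EU' is a single bitwise AND of two bitmaps.
web_and_eu = channel_index["web"] & region_index["EU"]
```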
I/O Performance in Data Warehouses

  – I/O is typically the primary determinant of data
    warehouse performance.
  – Data warehouse storage configurations should be
    chosen by I/O bandwidth, not storage capacity.
  – Every component of the I/O
    subsystem should provide
    enough bandwidth:
     • Disks
     • I/O channels
     • I/O adapters
  – In data warehouses, maximizing
    sequential I/O throughput is critical.
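The point that every component must provide enough bandwidth can be made concrete with a small calculation. This is a deliberately simplified model in which the slowest stage caps the whole path; the component counts and per-unit rates are made-up figures, not measurements or vendor numbers.

```python
def effective_throughput_mb_s(components):
    """Return (throughput, bottleneck_name) for a chain of I/O components.

    `components` maps a component name to (unit_count, mb_per_second_each).
    In this simplified model the stage with the lowest aggregate bandwidth
    limits the throughput of the whole I/O path.
    """
    totals = {name: count * rate for name, (count, rate) in components.items()}
    bottleneck = min(totals, key=totals.get)
    return totals[bottleneck], bottleneck


# Hypothetical configuration: plenty of disks, too few I/O channels.
EXAMPLE = {
    "disks": (20, 100),        # 20 disks at 100 MB/s each
    "io_channels": (4, 400),   # 4 channels at 400 MB/s each
    "io_adapters": (2, 1000),  # 2 adapters at 1000 MB/s each
}
```

In this example the disks and adapters could each sustain 2000 MB/s in aggregate, but the channels cap the path at 1600 MB/s, so adding storage capacity alone would not make scans faster.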
I/O Scalability
Parallel execution:
     – Reduces response time for data-intensive operations on large
       databases
     – Benefits systems with the following characteristics:
          • Multiprocessors, clusters, or massively parallel systems
          • Sufficient I/O bandwidth
          • Sufficient memory to support memory-intensive processes such
            as sorts, hashing, and I/O buffers

[Figure: parallel execution. A coordinator dispatches work to query servers: four scanners read the data on disk and feed four sorters (aggregators), Q1 to Q4.]
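The coordinator and query-server pattern can be sketched with Python's concurrent.futures. This shows only the structure (split, scan and sort in parallel, merge); threads in CPython will not speed up CPU-bound work, and the function names and the `degree` parameter are assumptions made for this example.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def scan_and_sort(chunk, predicate):
    """One query server: scan its chunk, keep matching rows, sort them."""
    return sorted(x for x in chunk if predicate(x))


def parallel_query(data, predicate, degree=4):
    """Coordinator: dispatch chunks to workers, then merge the sorted results."""
    # Ceiling division so that at most `degree` chunks are produced.
    chunk_size = max(1, -(-len(data) // degree))
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=degree) as pool:
        sorted_parts = list(
            pool.map(lambda chunk: scan_and_sort(chunk, predicate), chunks)
        )
    # Each part is already sorted, so a k-way merge yields the final order.
    return list(heapq.merge(*sorted_parts))
```

For example, parallel_query(list_of_numbers, lambda x: x % 2 == 0) returns the even values in ascending order, with each worker having handled roughly a quarter of the input.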
I/O Scalability

• Automatic Storage Management (ASM)
  – Configuring storage for a DB depends on many
    variables:
     •   Which data to put on which disk
     •   Logical unit number (LUN) configurations
     •   DB types and workloads: data warehouse, OLTP, DSS
     •   Trade-offs between available options
  – ASM provides solutions to storage issues
    encountered in data warehouses.
I/O Scalability

• Automatic Storage Management: Overview
  – Portable and high-performance
    cluster file system
  – Manages Oracle database files
  – Data spread across disks
    to balance load
  – Integrated mirroring across
    disks
  – Solves many storage
    management challenges

  [Figure: storage stack. Layers: application; database; file system and volume manager alongside ASM; operating system.]
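Spreading data across disks with mirroring can be pictured with a toy placement function. This is not ASM's actual allocation algorithm; it is a minimal round-robin sketch, with hypothetical disk names, that places each extent's primary copy on one disk and its mirror copy on a different disk.

```python
def place_extents(num_extents, disks):
    """Assign each extent a (primary_disk, mirror_disk) pair, round-robin.

    Toy model only: primaries rotate across all disks to balance load, and
    each mirror goes to the next disk so the two copies never share a disk.
    """
    if len(disks) < 2:
        raise ValueError("mirroring needs at least two disks")
    placement = []
    for extent in range(num_extents):
        primary = disks[extent % len(disks)]
        mirror = disks[(extent + 1) % len(disks)]
        placement.append((primary, mirror))
    return placement
```

With three disks and four extents, the primaries land on d1, d2, d3, d1, so no single disk receives all of the I/O, and losing any one disk still leaves a copy of every extent.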
Visit more self help tutorials

• Pick a tutorial of your choice and browse
  through it at your own pace.
• The tutorials section is free, self-guiding and
  will not involve any additional support.
• Visit us at www.dataminingtools.net

"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 

Oracle: DW Design

  • 2. Characteristics of a Data Warehouse
      – A data warehouse is a database designed for querying, reporting, and analysis.
      – A data warehouse contains historical data derived from transaction data.
      – Data warehouses separate analysis workload from transaction workload.
      – A data warehouse is primarily an analytical tool.
  • 3. Comparing OLTP and Data Warehouses
      Characteristic                 OLTP                   Data Warehouse
      Joins                          Many                   Some
      Data accessed by queries       Comparatively lower    Large amount
      Duplicated data                Normalized DBMS        Denormalized DBMS
      Derived data and aggregates    Rare                   Common
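  • 4. Data Warehouse Architectures
      [Diagram: operational systems and flat files feed a staging area; the warehouse holds metadata, materialized views, and raw data; Sales, Purchasing, and Inventory sit between the warehouse and the users, who perform analysis, reporting, and data mining.]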
  • 5. Data Warehouse Design
      • Key data warehouse design considerations:
          – Identify the specific data content.
          – Recognize the critical relationships within and between groups of data.
          – Define the system environment supporting your data warehouse.
          – Identify the required data transformations.
          – Calculate the frequency at which the data must be refreshed.
  • 6. Logical Design
      – A logical design is conceptual and abstract.
      – Entity-relationship (ER) modeling is useful in identifying logical information requirements.
          • An entity represents a chunk of data.
          • The properties of entities are known as attributes.
          • The links between entities and attributes are known as relationships.
      – Dimensional modeling is a specialized type of ER modeling useful in data warehouse design.
  • 7. Oracle Warehouse Builder
      – Oracle Database provides tools to implement the ETL process.
          • Oracle Warehouse Builder is a tool to help in this process.
      – Oracle Warehouse Builder generates the following types of code:
          • SQL data definition language (DDL) scripts
          • PL/SQL programs
          • SQL*Loader control files
          • XML Processing Description Language (XPDL)
          • ABAP code (used to extract data from SAP systems)
  • 8. Data Warehousing Schemas
      – Objects can be arranged in data warehousing schema models in a variety of ways:
          • Star schema
          • Snowflake schema
          • Third normal form (3NF) schema
          • Hybrid schemas
      – The source data model and user requirements should steer the data warehouse schema.
      – Implementation of the logical model may require changes to enable you to adapt it to your physical system.
  • 9. Schema Characteristics
      – Star schema
          • Characterized by one or more large fact tables and a number of much smaller dimension tables
          • Each dimension table joined to the fact table using a primary key to foreign key join
      – Snowflake schema
          • Dimension data grouped into multiple tables instead of one large table
          • Increased number of dimension tables, requiring more foreign key joins
      – Third normal form (3NF) schema
          • A classical relational-database model that minimizes data redundancy through normalization
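The star schema described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions, not Oracle DDL: the table names SALES, PRODUCTS, and CUSTOMERS and the key columns follow the deck's later diagram, while the columns prod_name and amount_sold and every data row are made up for the example.

```python
import sqlite3


def build_star_schema() -> sqlite3.Connection:
    """Create a tiny in-memory star schema: one fact table, two dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products  (prod_id INTEGER PRIMARY KEY, prod_name TEXT);
        CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, cust_city TEXT);
        -- The fact table points at each dimension through a foreign key.
        CREATE TABLE sales (
            prod_id     INTEGER REFERENCES products(prod_id),
            cust_id     INTEGER REFERENCES customers(cust_id),
            amount_sold REAL  -- hypothetical measure column
        );
        INSERT INTO products  VALUES (1, 'Widget'), (2, 'Gadget');
        INSERT INTO customers VALUES (10, 'Boston'), (20, 'Denver');
        INSERT INTO sales VALUES (1, 10, 5.0), (1, 20, 7.5), (2, 10, 2.5);
    """)
    return conn


def sales_by_city(conn: sqlite3.Connection):
    """Join the fact table to a dimension on its primary key, then aggregate."""
    return conn.execute("""
        SELECT c.cust_city, SUM(s.amount_sold)
        FROM sales s
        JOIN customers c ON s.cust_id = c.cust_id
        GROUP BY c.cust_city
        ORDER BY c.cust_city
    """).fetchall()
```

Calling `sales_by_city(build_star_schema())` returns one total per city. A snowflake variant would split `customers` further (for example, moving the city into its own table), at the cost of one more join per query.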
  • 10. Data Warehousing Objects
      – Fact tables
          • Fact tables are the large tables that store business measurements.
      – Dimension tables
          • A dimension is a structure composed of one or more hierarchies that categorizes data.
          • Unique identifiers are specified for one distinct record in a dimension table.
      – Relationships
          • Relationships guarantee integrity of business information.
  • 11. Fact Tables
      – A fact table must be defined for each star schema.
      – Fact tables are the large tables that store business measurements.
      – A fact table contains either detail-level or aggregated facts.
      – A fact table usually contains facts with the same level of aggregation.
      – The primary key of the fact table is usually a composite key made up of all its foreign keys.
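The composite-key point can be illustrated with a short sqlite3 sketch. It assumes a hypothetical SALES fact table whose key is the combination of three foreign-key columns; time_id and amount_sold are illustrative names, and the foreign-key references themselves are omitted to keep the example short.

```python
import sqlite3


def make_fact_table() -> sqlite3.Connection:
    """Create a fact table whose primary key combines all its foreign-key columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE sales (
            prod_id     INTEGER NOT NULL,
            cust_id     INTEGER NOT NULL,
            time_id     TEXT    NOT NULL,
            amount_sold REAL,
            PRIMARY KEY (prod_id, cust_id, time_id)
        )
    """)
    return conn


def insert_fact(conn, prod_id, cust_id, time_id, amount) -> bool:
    """Return True if the row was stored, False if that key combination already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO sales VALUES (?, ?, ?, ?)",
                (prod_id, cust_id, time_id, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this key, the same product and customer can appear on different dates, but a second row for the same product, customer, and date is rejected.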
  • 12. Dimensions and Hierarchies
      – A dimension is a structure composed of one or more hierarchies that categorizes data.
      – Dimensional attributes help to describe the dimensional value.
      – Dimension data is collected at the lowest level of detail and aggregated into higher level totals.
      – Hierarchies are structures that use ordered levels to organize data.
      – In a hierarchy, each level is connected to the levels above and below it.
      [Diagram: CUSTOMERS dimension hierarchy (by level): REGION, SUBREGION, COUNTRY, STATE, CITY, CUSTOMER]
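As a rough illustration of aggregating lowest-level data into higher level totals, the sketch below rolls customer-level amounts up the CUSTOMERS hierarchy shown on the slide. The function name, the row layout (one dictionary per customer with an "amount" key), and the sample rows are assumptions made for the example.

```python
from collections import defaultdict

# Levels of the CUSTOMERS hierarchy from the slide, highest to lowest.
LEVELS = ["region", "subregion", "country", "state", "city", "customer"]

# Hypothetical customer-level rows (the lowest level of detail).
SAMPLE_ROWS = [
    {"region": "Americas", "subregion": "North America", "country": "USA",
     "state": "MA", "city": "Boston", "customer": "A", "amount": 10.0},
    {"region": "Americas", "subregion": "North America", "country": "USA",
     "state": "CO", "city": "Denver", "customer": "B", "amount": 20.0},
    {"region": "Europe", "subregion": "Western Europe", "country": "France",
     "state": "IDF", "city": "Paris", "customer": "C", "amount": 5.0},
]


def roll_up(rows, level):
    """Aggregate lowest-level amounts into totals at the given hierarchy level.

    The grouping key contains every level from the top down to the target,
    so members with the same name under different parents stay separate.
    """
    depth = LEVELS.index(level) + 1
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[name] for name in LEVELS[:depth])
        totals[key] += row["amount"]
    return dict(totals)
```

For example, `roll_up(SAMPLE_ROWS, "country")` groups by (region, subregion, country), and `roll_up(SAMPLE_ROWS, "region")` produces one total per region.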
  • 13. Dimensions and Hierarchies
      [Diagram: the SALES fact table (columns cust_id, prod_id) is related to the PRODUCTS table (#prod_id), the CUSTOMERS table (#cust_id, cust_last_name, cust_city, cust_state_province), and the TIMES, CHANNELS, and PROMOTIONS dimension tables. Callouts mark the unique identifier, the relationship, and the hierarchy.]
  • 14. Physical Design
      [Diagram: mapping from logical to physical design]
      – Logical: entities, relationships, attributes, unique identifiers
      – Physical (tablespaces): tables, indexes, materialized views, integrity constraints (primary key, foreign key, not null), dimensions, columns
  • 15. Data Warehouse Physical Structures
      • Tables and partitioned tables
          – Partitioned tables enable you to split large data volumes into smaller, more manageable pieces.
          – Expect performance benefits from:
              • Partition pruning
              • Intelligent parallel processing
          – Compressed tables offer scaleup opportunities for read-only operations.
          – Table compression saves disk space.
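Partition pruning can be pictured with a toy model. The class below is a hypothetical in-memory stand-in for a table range-partitioned by month, not Oracle's implementation; it only shows that a query with a predicate on the partitioning key needs to read just the partitions that can contain matching rows.

```python
from collections import defaultdict


class PartitionedTable:
    """Toy table partitioned by month, keyed as 'YYYY-MM'."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, sale_date: str, amount: float) -> None:
        # Route each row to its partition using the month prefix of an ISO date.
        self.partitions[sale_date[:7]].append((sale_date, amount))

    def total_between(self, first_month: str, last_month: str):
        """Sum amounts for a month range; return (total, partitions scanned)."""
        # Pruning: partitions outside the requested range are never read.
        scanned = [m for m in sorted(self.partitions)
                   if first_month <= m <= last_month]
        total = sum(amount for m in scanned for _, amount in self.partitions[m])
        return total, scanned
```

After loading rows for January through March, `total_between("2024-02", "2024-03")` touches only the February and March partitions and skips January entirely.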
  • 16. Data Warehouse Physical Structures
      – Views:
          • Are tailored presentations of data contained in one or more tables or views
          • Do not require any space in the database
      – Materialized views:
          • Are query results that have been stored in advance
          • (Like indexes) are used transparently and improve performance
      – Integrity constraints:
          • Are used in data warehouses for query rewrite
      – Dimensions:
          • Are containers of logical relationships and do not require any space in the database
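The idea of storing query results in advance can be approximated as follows. SQLite has no materialized views, so this sketch emulates one with an ordinary summary table that is rebuilt on demand; the table name sales_by_product_mv, the column names, and the data are assumptions. Unlike the transparent use described on the slide, the code here reads the summary table explicitly.

```python
import sqlite3


def build() -> sqlite3.Connection:
    """Create a small base table with hypothetical sales rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sales (prod_id INTEGER, amount_sold REAL);
        INSERT INTO sales VALUES (1, 5.0), (1, 7.5), (2, 2.5);
    """)
    return conn


def refresh_summary(conn: sqlite3.Connection) -> None:
    """Recompute the aggregate from scratch and store the result."""
    conn.executescript("""
        DROP TABLE IF EXISTS sales_by_product_mv;
        CREATE TABLE sales_by_product_mv AS
            SELECT prod_id, SUM(amount_sold) AS total_sold
            FROM sales
            GROUP BY prod_id;
    """)


def total_for_product(conn: sqlite3.Connection, prod_id: int):
    """Answer the query from the stored result instead of scanning the base table."""
    row = conn.execute(
        "SELECT total_sold FROM sales_by_product_mv WHERE prod_id = ?",
        (prod_id,),
    ).fetchone()
    return row[0] if row else None
```

The trade-off is visible in use: the stored result is cheap to read but goes stale when `sales` changes, until `refresh_summary` runs again.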
  • 17. Managing Large Volumes of Data
      • Work smarter in your data warehouse:
          – Partitioning
          – Bitmap indexes/Star transformation
          – Data compression
          – Query rewrite
      • Work harder in your data warehouse:
          – Parallelism for all operations
              • DBA tasks, such as loading, index creation, table creation, data modification, backup and recovery
              • End-user operations, such as queries
      • Unbounded scalability: Real Application Clusters
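To give a feel for why bitmap indexes suit warehouse queries, here is a minimal sketch that uses Python integers as bitmaps. It is a conceptual model only and says nothing about how Oracle stores or compresses bitmap indexes; the function names and sample columns are hypothetical.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i holds that value."""
    index = {}
    for row_id, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row_id)
    return index


def rows_from_bitmap(bitmap):
    """Decode a bitmap into a sorted list of row ids."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical low-cardinality columns of a four-row table.
region_index = build_bitmap_index(["east", "west", "east", "north"])
channel_index = build_bitmap_index(["web", "web", "store", "web"])

# Combining predicates is a bitwise operation on the bitmaps.
east_and_web = region_index["east"] & channel_index["web"]
east_or_north = region_index["east"] | region_index["north"]
```

Here `rows_from_bitmap(east_and_web)` yields the rows that satisfy both conditions without touching the table rows themselves.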
  • 18. I/O Performance in Data Warehouses
      – I/O is typically the primary determinant of data warehouse performance.
      – Data warehouse storage configurations should be chosen by I/O bandwidth, not storage capacity.
      – Every component of the I/O subsystem should provide enough bandwidth:
          • Disks
          • I/O channels
          • I/O adapters
      – In data warehouses, maximizing sequential I/O throughput is critical.
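A back-of-the-envelope calculation shows why every component matters: sustained throughput is bounded by the slowest stage. The function names and all figures below are hypothetical and serve only to illustrate the arithmetic.

```python
def effective_bandwidth_mb_s(disks_mb_s, channels_mb_s, adapters_mb_s):
    """The I/O path can go no faster than its slowest component."""
    return min(disks_mb_s, channels_mb_s, adapters_mb_s)


def full_scan_seconds(table_mb, bandwidth_mb_s):
    """Time to read a table sequentially at a given sustained bandwidth."""
    return table_mb / bandwidth_mb_s


# Hypothetical figures: disks deliver 1600 MB/s in aggregate, but the
# I/O channels cap the path at 800 MB/s.
bandwidth = effective_bandwidth_mb_s(1600, 800, 1200)
scan_time = full_scan_seconds(400_000, bandwidth)  # a 400 GB table
```

In this made-up configuration, adding more disks would not shorten the scan, because the channels are the bottleneck.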
  • 19. I/O Scalability
      Parallel execution:
      – Reduces response time for data-intensive operations on large databases
      – Benefits systems with the following characteristics:
          • Multiprocessors, clusters, or massively parallel systems
          • Sufficient I/O bandwidth
          • Sufficient memory to support memory-intensive processes such as sorts, hashing, and I/O buffers
      [Diagram: a coordinator dispatches work to query servers; scanners read the data on disk and sorters (aggregators) process it, shown as four scan/sort pairs labeled Q1 to Q4.]
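The coordinator and query-server pattern can be sketched with Python's standard library. This is a structural illustration under simplifying assumptions: each worker both scans and sorts its slice (the slide shows separate scanner and sorter servers), and Python threads are used only to show the dispatch-and-merge shape, not to demonstrate real speedup. The function names are hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def scan_and_sort(chunk, predicate):
    """Work done by one query server: filter its slice of the data, then sort it."""
    return sorted(row for row in chunk if predicate(row))


def parallel_sorted_scan(rows, predicate, degree=4):
    """Coordinator: split the data, dispatch slices to workers, merge sorted results."""
    # Round-robin split so each worker receives a similar share of rows.
    chunks = [rows[i::degree] for i in range(degree)]
    with ThreadPoolExecutor(max_workers=degree) as pool:
        parts = list(pool.map(lambda chunk: scan_and_sort(chunk, predicate), chunks))
    # Each part is already sorted, so a k-way merge gives the final order.
    return list(heapq.merge(*parts))
```

For example, `parallel_sorted_scan([5, 3, 8, 1, 9, 2, 7], lambda x: x > 2)` filters and sorts four slices independently and then merges them into one ordered result.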
  • 20. I/O Scalability
      • Automatic Storage Management (ASM)
          – Configuring storage for a DB depends on many variables:
              • Which data to put on which disk
              • Logical unit number (LUN) configurations
              • DB types and workloads: data warehouse, OLTP, DSS
              • Trade-offs between available options
          – ASM provides solutions to storage issues encountered in data warehouses.
  • 21. I/O Scalability
      • Automatic Storage Management: Overview
          – Portable and high-performance cluster file system
          – Manages Oracle database files
          – Data spread across disks to balance load
          – Integrated mirroring across disks
          – Solves many storage management challenges
      [Diagram labels: Application, Database, File system, Volume manager, ASM, Operating system]
  • 22. Visit more self help tutorials
      • Pick a tutorial of your choice and browse through it at your own pace.
      • The tutorials section is free, self-guiding and will not involve any additional support.
      • Visit us at www.dataminingtools.net