Big Data integration is an excellent feature of the Oracle Data Integration product suite (Oracle Data Integrator, GoldenGate, and Enterprise Data Quality). But not all analytics, such as labor cost, revenue, or expense reporting, require big data technologies. Ralph Kimball, an original architect of the dimensional model in data warehousing, spent much of his career building an enterprise data warehouse methodology that can meet these reporting needs. His book, "The Data Warehouse ETL Toolkit", is a guide for many ETL developers. This session walks through his ETL Subsystem categories (Extracting, Cleaning & Conforming, Delivering, and Managing) and describes how the Oracle Data Integration products are well suited to the Kimball approach.
Presented at Oracle OpenWorld 2015 & BIWA Summit 2016.
3. info@rittmanmead.com www.rittmanmead.com @rittmanmead
About Rittman Mead
• World’s leading specialist partner for technical excellence, solutions delivery and innovation in Oracle Data Integration, Business Intelligence, Analytics and Big Data
• Providing our customers targeted expertise; we are a company that doesn’t try to do everything… only what we excel at
• 70+ consultants worldwide including 1 Oracle ACE Director and 3 Oracle ACEs
• Founded on the values of collaboration, learning, integrity and getting things done
Optimizing your investment in Oracle Data Integration
• Comprehensive service portfolio designed to support the full lifecycle of any analytics solution
• Visual Redesign
• Business User Training
• Ongoing Support
• Engagement Toolkit
Average user adoption for BI platforms is below 25%
Rittman Mead’s User Engagement Service can help
Wait! What are Kimball ETL Subsystems?
Do you all know of Ralph Kimball?
www.kimballgroup.com
Ralph Kimball founded the Kimball Group. Since the mid-1980s, he has been the DW/BI industry’s thought leader on the dimensional approach and trained more than 20,000 students. Prior to working at Metaphor and founding Red Brick Systems, Ralph co-invented the first commercially-available workstation with a graphical user interface at Xerox’s Palo Alto Research Center (PARC). Ralph has his Ph.D. in Electrical Engineering from Stanford University.
The Kimball 34 Subsystems of ETL
• Cleaning and Conforming Data
- Data Cleansing System
- Error Event Schema
- Audit Dimension Assembler
- Deduplication System
- Conforming System
The Kimball 34 Subsystems of ETL
• Delivering Data for Presentation
- Slowly Changing Dimension Manager
- Surrogate Key Generator
- Hierarchy Manager
- Special Dimensions Manager
- Fact Table Builders
- Surrogate Key Pipeline
- Late Arriving Data Handler
- Multi-Valued Dimension Bridge Table Builder
- Dimension Manager System
- Fact Provider System
- Aggregate Builder
- OLAP Cube Builder
- Data Propagation Manager
The Kimball 34 Subsystems of ETL
• Managing the ETL Environment
- Job Scheduler
- Backup System
- Recovery and Restart System
- Version Control System
- Version Migration System
- Workflow Monitor
- Sorting System
- Lineage & Dependency Analyzer
- Problem Escalation System
- Parallelizing / Pipelining System
- Security System
- Compliance Manager
- Metadata Repository Manager
Extracting Data - Oracle Data Integrator
• Extract from many different systems? Yes!
- Multiple technologies OOTB
- Custom technologies can be added
• Data Server - connection to the data source
- Physical Schema
- Logical Schema
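The relationship between the topology objects above can be sketched in a few lines of Python. This is only an illustration, not ODI code: in ODI a logical schema is resolved to a physical schema through a context, and the names used here (`LS_SALES`, `ORCL_DEV`, `ORCL_PROD`, the schema names and the `DEV` context) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalSchema:
    data_server: str  # the connection to the data source (hypothetical name)
    schema: str       # the schema on that data server (hypothetical name)


# (logical schema, context) -> physical schema. All entries are illustrative.
TOPOLOGY = {
    ("LS_SALES", "DEV"): PhysicalSchema("ORCL_DEV", "SALES_STG"),
    ("LS_SALES", "GLOBAL"): PhysicalSchema("ORCL_PROD", "SALES"),
}


def resolve(logical_schema: str, context: str) -> PhysicalSchema:
    """Return the physical schema a logical schema points to in a context."""
    try:
        return TOPOLOGY[(logical_schema, context)]
    except KeyError:
        raise LookupError(
            f"no physical schema for {logical_schema!r} in context {context!r}"
        ) from None


print(resolve("LS_SALES", "GLOBAL"))
```

Because mappings reference only the logical schema, the same design can run against a different data server simply by executing it in another context.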
Extracting Data - Changed Data Only
• Change Data Capture
- Extract only the changed data since the last ETL extract
• Methods
- Audit columns
- Timed extract
- Full “diff compare”
- Database log scraping
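Two of the methods listed above, audit columns and the full "diff compare", are simple enough to sketch in plain Python. This is a minimal illustration under assumed data shapes, not how ODI or GoldenGate implement change data capture; the column name `last_modified` and the sample rows are hypothetical.

```python
from datetime import datetime


def changed_since(rows, last_extract):
    """Audit-column CDC: keep rows modified after the previous extract."""
    return [r for r in rows if r["last_modified"] > last_extract]


def diff_compare(previous, current):
    """Full diff compare: classify keys as inserted, updated, or deleted.

    Both arguments map a business key to a tuple of the row's attributes.
    """
    inserted = sorted(k for k in current if k not in previous)
    deleted = sorted(k for k in previous if k not in current)
    updated = sorted(
        k for k in current if k in previous and current[k] != previous[k]
    )
    return inserted, updated, deleted


# Audit columns: only rows touched after the last extract are picked up.
source_rows = [
    {"id": 1, "name": "Acme", "last_modified": datetime(2015, 10, 1)},
    {"id": 2, "name": "Globex", "last_modified": datetime(2015, 10, 20)},
]
changed = changed_since(source_rows, datetime(2015, 10, 15))

# Diff compare: yesterday's snapshot against today's snapshot.
yesterday = {1: ("Acme",), 2: ("Globex",), 3: ("Initech",)}
today = {1: ("Acme",), 2: ("Globex Corp",), 4: ("Umbrella",)}
delta = diff_compare(yesterday, today)
```

The audit-column approach depends on the source reliably maintaining the timestamp and cannot see deletes, while the diff compare catches every change at the cost of reading both full snapshots.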
Delivering Data
• Slowly Changing Dimension Manager
• Surrogate Key Generator
• Hierarchy Manager
• Special Dimensions Manager
• Fact Table Builders
• Surrogate Key Pipeline
• Late Arriving Data Handler
• Multi-Valued Dimension Bridge Table Builder
• Dimension Manager System
• Fact Provider System
• Aggregate Builder
• OLAP Cube Builder
• Data Propagation Manager
Delivering Data
• Slowly Changing Dimension Manager
- ODI Integration Knowledge Module
- Set SCD behavior type for each target column
• Surrogate Key Generator
- Database Sequence objects and ODI Sequences
• Fact Table Builder
- Lookups in ODI
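How these three subsystems fit together can be shown with a small, in-memory Python sketch. It is not the logic of the ODI Integration Knowledge Module: it assumes Type 2 behavior (add a row on change) for every attribute, whereas the slide notes that the behavior is set per target column, and it uses an `itertools.count` as a stand-in for a database sequence. All table and column names are hypothetical.

```python
from datetime import date
from itertools import count

surrogate_keys = count(1)  # stands in for a database sequence


def apply_scd2(dimension, natural_key, attributes, load_date):
    """Apply one source row to a Type 2 dimension held as a list of dicts."""
    current = next(
        (row for row in dimension
         if row["natural_key"] == natural_key and row["is_current"]),
        None,
    )
    if current is not None:
        if current["attributes"] == attributes:
            return current["sk"]       # nothing changed, keep current row
        current["is_current"] = False  # expire the old version
        current["end_date"] = load_date
    new_row = {
        "sk": next(surrogate_keys),
        "natural_key": natural_key,
        "attributes": attributes,
        "start_date": load_date,
        "end_date": None,
        "is_current": True,
    }
    dimension.append(new_row)
    return new_row["sk"]


def lookup_sk(dimension, natural_key):
    """Fact-load lookup: natural key -> surrogate key of the current row."""
    for row in dimension:
        if row["natural_key"] == natural_key and row["is_current"]:
            return row["sk"]
    return None


customer_dim = []
apply_scd2(customer_dim, "C100", {"city": "Seattle"}, date(2015, 10, 1))
apply_scd2(customer_dim, "C100", {"city": "Portland"}, date(2015, 10, 25))
```

After the second call the dimension holds two versions of customer `C100`, and a fact row arriving with that natural key would be assigned the surrogate key of the current version through `lookup_sk`.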
Managing the ETL Environment
• Job Scheduler
• Backup System
• Recovery and Restart System
• Version Control System
• Version Migration System
• Workflow Monitor
• Sorting System
• Lineage & Dependency Analyzer
• Problem Escalation System
• Parallelizing / Pipelining System
• Security System
• Compliance Manager
• Metadata Repository Manager
Managing the ETL Environment - Job Scheduler
• Alternative to ODI scheduler - external scheduling tool
- ODI Scenarios and Load Plans can be executed via command line script or web service
./startloadplan.sh LOAD_EDW GLOBAL 6 -AGENT_URL=http://localhost:20910/oraclediagent
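An external scheduler typically wraps that command in a small job script. The Python sketch below only assembles the same command shown on the slide, reading its positional arguments as load plan name, context code and log level; the helper name `build_startloadplan_command` is hypothetical, and any additional arguments your agent installation requires are not covered here.

```python
import shlex


def build_startloadplan_command(load_plan, context, log_level, agent_url,
                                script="./startloadplan.sh"):
    """Build the argument list for the load plan start script."""
    return [script, load_plan, context, str(log_level),
            f"-AGENT_URL={agent_url}"]


command = build_startloadplan_command(
    "LOAD_EDW", "GLOBAL", 6, "http://localhost:20910/oraclediagent"
)
print(shlex.join(command))
# A scheduler job could then execute it, for example with
# subprocess.run(command, check=True) from the agent's bin directory,
# so that a non-zero exit code marks the job as failed.
```

Passing the command as a list rather than a single string avoids shell quoting problems if a load plan name ever contains spaces.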
Where did we end up?
• The Kimball ETL Subsystems will guide your data warehouse program
• Oracle Data Integration can help you fully implement the ETL Subsystems
- Extract, Load, Transform with ODI and GoldenGate
- Profile and cleanse data with Enterprise Data Quality
Rittman Mead Sessions
No Big Data Hacking—Time for a Complete ETL Solution with Oracle Data Integrator 12c [UGF5827]
Jérôme Françoisse | Sunday, Oct 25, 8:00am | Moscone South 301
Empowering Users: Oracle Business Intelligence Enterprise Edition 12c Visual Analyzer [UGF5481]
Edelweiss Kammermann | Sunday, Oct 25, 10:00am | Moscone West 3011
A Walk Through the Kimball ETL Subsystems with Oracle Data Integration Solutions [UGF6311]
Michael Rainey | Sunday, Oct 25, 12:00pm | Moscone South 301
Oracle Business Intelligence Cloud Service—Moving Your Complete BI Platform to the Cloud [UGF4906]
Mark Rittman | Sunday, Oct 25, 2:30pm | Moscone South 301
Oracle Data Integration Product Family: a Cornerstone for Big Data [CON9609]
Mark Rittman | Wednesday, Oct 28, 12:15pm | Moscone West 2022
Developer Best Practices for Oracle Data Integrator Lifecycle Management [CON9611]
Jérôme Françoisse | Thursday, Oct 29, 2:30pm | Moscone West 2022