SlideShare uma empresa Scribd logo
1 de 19
1© Cloudera, Inc. All rights reserved.
The Future of Data
Warehousing:
ETL Will Never be the Same
Ralph Kimball| Founder, Kimball Group
Manish Vipani| Vice President and Chief Architect,
Kaiser Permanente
2© Cloudera, Inc. All rights reserved.
Hadoop’s impact on data warehousing
• Traditional DBMS stack exploded into separate layers
• Data layer: HDFS files, not curated relational tables
• Metadata layer: open extensible HCatalog, not vendor system tables
• Query layer: cottage industry of query engines, not vendor specific SQL
• Schema on Read
• Allow the query layer to decide how to consume the data
• Materialize the view later (e.g., into Parquet files) for high performance
Integration goes far beyond relational tables
• Conformed dimensions remain the glue holding together Hadoop applications
(even if you have never heard of conformed dimensions!)
3© Cloudera, Inc. All rights reserved.
The logical architecture hasn’t changed
• Original Sources  ETL Step  Exposed Presentation Data  BI Application
• BUT, the physical architecture of the back room now looks very different
4© Cloudera, Inc. All rights reserved.
Old back room
• Slow transfer from sources
• Physical transformations required
• Cleaning, normalization required
• Mandated RDBMS table targets
• Metadata limited to system tables
• Presentation layer vendor mandated
• Single focus: RDBMS SQL only
New back room
• Purpose built for high transfer rates
• Physical transformations optional
• Cleaning, normalization discouraged
• Table targets optional or deferred
• Extensible metadata via HCatalog
• Presentation layer open ended
• Before or after any transformations
• Analytic client specific
• Multiple simultaneous personalities
The old and new back rooms
5© Cloudera, Inc. All rights reserved.
Old back room
• Off limits except to ETL staff
• “we aren’t ready”
• “the data must be cleaned”
• “data governance trumps”
• “end users not trusted”
• Traditional IT control
New back room
• Doors open to
• Qualified analytic users
• Automated processes
• Experiments, model building
• Clients other than SQL
• Open data marketplace
The biggest change to the back room
6© Cloudera, Inc. All rights reserved.
The Landing Zone
at Kaiser Permanente
Implementing the new ETL approach in the real world.
A unified data repository for secure and trusted data.
7© Cloudera, Inc. All rights reserved.
Landing Zone
Landing Zone – Home to secure and organized data
• A self service data platform hosting both the raw and prepared data sets for quick business
consumption to drive advanced business insights and decisions.
• Allow seamless data access for authorized users across enterprise business functions.
• Data is organized by domains/use cases in Raw and Refined zone.
• Perimeter security with data encrypted at rest.
• Kerberized with integration to identity and Access Management system.
Parts of Landing Zone
• Raw Zone -> Exact replica of source data.
• Refined Zone -> Transformed prepared data sets organized by use cases.
• User Defined Space -> Secure and common access to raw and trusted data.
• Master Data, Metadata, Internal Reference Data, Industry Reference Data, etc…
8© Cloudera, Inc. All rights reserved.
Landing Zone
SQL Java PIGHIVE
Replicate
Data Selection
Python
Source
Data
Exploratory Intelligence
A MRD
Analyze MineRefineDiscover
E
DW/DM
L
Data Extract
Role Based
Access Control
Perimeter
Security
Data Registry (Tags & Catalog)
Internal Reference Data
Meta Data
Industry Reference Data
HDFS
Master Data
Raw
Zone
User
Defined
Space
Refined
Zone
Usage Data
All Data Encrypted @ Rest
Access
Authentication
Data Load
Extract-
Load
Copy
Landing Zone – A Self Service Data Platform
hosting both the raw and prepared data sets for
quick business consumption.
 Data Security –
 Deployed on secured network with
traffic monitoring.
 Data is encrypted at rest.
 Role based access and authorization.
 Data Organization –
 Exact replica of source data organized by
information domains in Raw Zone.
 Data organized by use cases in the
Refined Zone (transformed prepared
data sets).
 Separate area allocated to track master
data, metadata, internal reference data
& industry specific reference data sets.
Impala
9© Cloudera, Inc. All rights reserved.
The ETL Revolution Poses
Significant Challenges
Some old, some new
10© Cloudera, Inc. All rights reserved.
Old challenges we’ve seen before
• Big data world is furiously implementing stovepipes
• Good news is the excitement of new data sources and analyses
• Bad news is ignoring integration, the fix is to start over
• New departments not seen with traditional data warehousing
• Not on anyone’s radar  rolling their own systems
• Unusual business user profiles, latency demands, security lapses
• Big speed bumps when replacing old systems with new
• Users don’t want to switch
• New results don’t match old results
• Legacy hardware and software absurdly expensive, doesn’t scale reasonably
11© Cloudera, Inc. All rights reserved.
New challenges needing inventive approaches
• Traditional BI decision makers joined by
• Data scientists
• Roll their own ETL, hardware, OSs, programming languages
• Take results to senior management directly
• Don’t stick around for documentation, rollout, user support, maintenance
• Predictive models and modelers
• Constantly changing schemas
• Tricky integration, e.g., joining relational tables to HBase
• Automatic daemons
• Enormous, bursty demand for computing resources
12© Cloudera, Inc. All rights reserved.
Kaiser Permanente’s Pragmatic
Response to the Challenges
Pain Points:
• Lack of user transient store and structural flexibility due to slow adaption to changes.
• Lack of ability to do analytics and hypothesis testing of new data from disparate systems.
Successes:
• Over 10+ proven use cases with some early adopters.
13© Cloudera, Inc. All rights reserved.
Landing Zone use cases
Problem
• Lack insight to understand factors influencing members’ adoption and utilization
of online services.
• Lack data integration and co-relation due to disparate systems.
• Lack 3600 member service utilization view and dashboards.
Resolution
• Summarized and aggregated data sets in landing zone helps in improved
decision making.
• Faster and complete access to data at scale for metrics reporting and analytics.
• Reduced data collection & metric reporting time from 3 weeks to 10 hours.
• Ease of building “decision-centric” dashboards (8 in 3 months).
Online Member Services – “kp.org”
14© Cloudera, Inc. All rights reserved.
Landing Zone use cases cont…
Problem
• Commercial large-scale data warehouse (Teradata) repository is expensive at scale, grows
exponentially, and processes large volumes of queries/month.
• Continuing workload tuning efforts are slow to yield expected results.
Resolution
• Replicate data from Teradata into Landing Zone.
• Rewrite and tune queries to eliminate semantically equivalent queries to achieve better
performance.
Moving Traditional Data Warehouse Workload to Landing Zone
Problem
• Lack of platform to collect and correlate structured and unstructured data from consumer facing
health monitoring devices e.g.: Fitbit, Glucometer, etc.
• Clinicians cannot track members’ health or weight goals, and see usage patterns.
Resolution
• Ingest transactional data and device logs into landing zone and create analytics workspace.
• Enable clinicians to generate aggregated data for tracking member adherence and build
dashboards using native tools.
Digital Services Dashboard – “Interchange”
15© Cloudera, Inc. All rights reserved.
Landing Zone use cases cont…
Problem
• Sequential and fragmented processes having limited ability to enrich data sources to increase
accuracy.
• Lack of clinical and analytical views increases lead time to analysis and inconsistent results.
Resolution
• Ingest data from fragmented system into the Landing Zone.
• Created program-wide clinical and analytical views with refresh speed to 7 hours from 18 hours.
Common Clinical and Analytical Views
Problem
• Current Medicare reporting solution does not maintain history and requires significant effort to
recreate prior reports and perform trend analysis.
• Externally hosted CIMP systems are cost-prohibitive and difficult to scale.
Resolution
• Replicate data from 30+ source systems into Landing Zone providing access to data internally.
• Rebuild reports with improved performance that runs within reasonable time at scale.
• Proved versatility of platform to handle data at scale and created equivalent reports.
Consumer Information Management Platform – CIMP 2.0
16© Cloudera, Inc. All rights reserved.
Architectural Wrap-Up
What does all this mean?
17© Cloudera, Inc. All rights reserved.
Kaiser Permanente is a work in progress
with impressive early results,
and insights for moving forward
• Be the single source of all Kaiser’s data as well as external data leveraged by Kaiser applications,
processes, and for Kaiser decision making.
• “Learn and adapt” model provides common capabilities across rich data set, with increased agility in
provisioning new data sets.
• Enabling data profiling / tagging, semantic search, descriptive, predictive and prescriptive analytics to
drive advanced business insights and decisions.
18© Cloudera, Inc. All rights reserved.
The Back Room Landing Zone has become
a Vibrant Marketplace
• Replaces the quiet ETL back room
• Challenging (exciting) new service role for IT
• Open for business
• Data scientists  A/B testing  experimentation  prototyping
• Simultaneous ETL pipelines  aggregates, high-performance Parquet files, uploads to EDW
• Simultaneous SQL and non-SQL clients
• Immediate access
• Don’t wait for physical transformation  schema-on-read
• Purpose built for extreme I/O performance
19© Cloudera, Inc. All rights reserved.
Thank you
Ralph Kimball, ralphcollector@gmail.com
Manish Vipani, manish.x.vipani@kp.org

Mais conteúdo relacionado

Mais procurados

Data Architecture for Solutions.pdf
Data Architecture for Solutions.pdfData Architecture for Solutions.pdf
Data Architecture for Solutions.pdfAlan McSweeney
 
Implementing Effective Data Governance
Implementing Effective Data GovernanceImplementing Effective Data Governance
Implementing Effective Data GovernanceChristopher Bradley
 
Business Intelligence (BI) and Data Management Basics
Business Intelligence (BI) and Data Management  Basics Business Intelligence (BI) and Data Management  Basics
Business Intelligence (BI) and Data Management Basics amorshed
 
How to identify the correct Master Data subject areas & tooling for your MDM...
How to identify the correct Master Data subject areas & tooling for your MDM...How to identify the correct Master Data subject areas & tooling for your MDM...
How to identify the correct Master Data subject areas & tooling for your MDM...Christopher Bradley
 
Business Intelligence & Data Analytics– An Architected Approach
Business Intelligence & Data Analytics– An Architected ApproachBusiness Intelligence & Data Analytics– An Architected Approach
Business Intelligence & Data Analytics– An Architected ApproachDATAVERSITY
 
Modernizing Integration with Data Virtualization
Modernizing Integration with Data VirtualizationModernizing Integration with Data Virtualization
Modernizing Integration with Data VirtualizationDenodo
 
Data Quality Strategies
Data Quality StrategiesData Quality Strategies
Data Quality StrategiesDATAVERSITY
 
Data Governance
Data GovernanceData Governance
Data GovernanceBoris Otto
 
Creating a Data-Driven Organization, Crunchconf, October 2015
Creating a Data-Driven Organization, Crunchconf, October 2015Creating a Data-Driven Organization, Crunchconf, October 2015
Creating a Data-Driven Organization, Crunchconf, October 2015Carl Anderson
 
Creating an Enterprise AI Strategy
Creating an Enterprise AI StrategyCreating an Enterprise AI Strategy
Creating an Enterprise AI StrategyAtScale
 
DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...DATAVERSITY
 
Data Integration, Access, Flow, Exchange, Transfer, Load And Extract Architec...
Data Integration, Access, Flow, Exchange, Transfer, Load And Extract Architec...Data Integration, Access, Flow, Exchange, Transfer, Load And Extract Architec...
Data Integration, Access, Flow, Exchange, Transfer, Load And Extract Architec...Alan McSweeney
 
Best Practices in Metadata Management
Best Practices in Metadata ManagementBest Practices in Metadata Management
Best Practices in Metadata ManagementDATAVERSITY
 
Data Governance by stealth v0.0.2
Data Governance by stealth v0.0.2Data Governance by stealth v0.0.2
Data Governance by stealth v0.0.2Christopher Bradley
 
Data Quality & Data Governance
Data Quality & Data GovernanceData Quality & Data Governance
Data Quality & Data GovernanceTuba Yaman Him
 
Data quality architecture
Data quality architectureData quality architecture
Data quality architectureanicewick
 
The data quality challenge
The data quality challengeThe data quality challenge
The data quality challengeLenia Miltiadous
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?DATAVERSITY
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...DataScienceConferenc1
 

Mais procurados (20)

Data Architecture for Solutions.pdf
Data Architecture for Solutions.pdfData Architecture for Solutions.pdf
Data Architecture for Solutions.pdf
 
Implementing Effective Data Governance
Implementing Effective Data GovernanceImplementing Effective Data Governance
Implementing Effective Data Governance
 
Business Intelligence (BI) and Data Management Basics
Business Intelligence (BI) and Data Management  Basics Business Intelligence (BI) and Data Management  Basics
Business Intelligence (BI) and Data Management Basics
 
How to identify the correct Master Data subject areas & tooling for your MDM...
How to identify the correct Master Data subject areas & tooling for your MDM...How to identify the correct Master Data subject areas & tooling for your MDM...
How to identify the correct Master Data subject areas & tooling for your MDM...
 
Business Intelligence & Data Analytics– An Architected Approach
Business Intelligence & Data Analytics– An Architected ApproachBusiness Intelligence & Data Analytics– An Architected Approach
Business Intelligence & Data Analytics– An Architected Approach
 
Modernizing Integration with Data Virtualization
Modernizing Integration with Data VirtualizationModernizing Integration with Data Virtualization
Modernizing Integration with Data Virtualization
 
Data Quality Strategies
Data Quality StrategiesData Quality Strategies
Data Quality Strategies
 
Data Governance
Data GovernanceData Governance
Data Governance
 
Creating a Data-Driven Organization, Crunchconf, October 2015
Creating a Data-Driven Organization, Crunchconf, October 2015Creating a Data-Driven Organization, Crunchconf, October 2015
Creating a Data-Driven Organization, Crunchconf, October 2015
 
Creating an Enterprise AI Strategy
Creating an Enterprise AI StrategyCreating an Enterprise AI Strategy
Creating an Enterprise AI Strategy
 
DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...
DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Busi...
 
Data Integration, Access, Flow, Exchange, Transfer, Load And Extract Architec...
Data Integration, Access, Flow, Exchange, Transfer, Load And Extract Architec...Data Integration, Access, Flow, Exchange, Transfer, Load And Extract Architec...
Data Integration, Access, Flow, Exchange, Transfer, Load And Extract Architec...
 
Best Practices in Metadata Management
Best Practices in Metadata ManagementBest Practices in Metadata Management
Best Practices in Metadata Management
 
Data Governance by stealth v0.0.2
Data Governance by stealth v0.0.2Data Governance by stealth v0.0.2
Data Governance by stealth v0.0.2
 
Data Quality & Data Governance
Data Quality & Data GovernanceData Quality & Data Governance
Data Quality & Data Governance
 
Data quality architecture
Data quality architectureData quality architecture
Data quality architecture
 
The data quality challenge
The data quality challengeThe data quality challenge
The data quality challenge
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
 

Destaque

Creating a Modern Data Architecture
Creating a Modern Data ArchitectureCreating a Modern Data Architecture
Creating a Modern Data ArchitectureZaloni
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSINGKing Julian
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Miningidnats
 
Achieving Business Value by Fusing Hadoop and Corporate Data
Achieving Business Value by Fusing Hadoop and Corporate DataAchieving Business Value by Fusing Hadoop and Corporate Data
Achieving Business Value by Fusing Hadoop and Corporate DataInside Analysis
 
To Study E T L ( Extract, Transform, Load) Tools Specially S Q L Server I...
To Study  E T L ( Extract, Transform, Load) Tools Specially  S Q L  Server  I...To Study  E T L ( Extract, Transform, Load) Tools Specially  S Q L  Server  I...
To Study E T L ( Extract, Transform, Load) Tools Specially S Q L Server I...Shahzad
 
Ods, edf, eav & global types
Ods, edf, eav & global typesOds, edf, eav & global types
Ods, edf, eav & global typesSTIinnsbruck
 
Computer science __engineering(4)
Computer science __engineering(4)Computer science __engineering(4)
Computer science __engineering(4)vasanthak2k
 
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...Eric Javier Espino Man
 
Data Mining: Concepts and Techniques — Chapter 2 —
Data Mining:  Concepts and Techniques — Chapter 2 —Data Mining:  Concepts and Techniques — Chapter 2 —
Data Mining: Concepts and Techniques — Chapter 2 —Salah Amean
 
Chapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & Kambererror007
 
Data Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional ModelingData Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional ModelingDunn Solutions Group
 
Dw & etl concepts
Dw & etl conceptsDw & etl concepts
Dw & etl conceptsjeshocarme
 
Agile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODSAgile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODSKent Graziano
 

Destaque (20)

Creating a Modern Data Architecture
Creating a Modern Data ArchitectureCreating a Modern Data Architecture
Creating a Modern Data Architecture
 
DATA WAREHOUSING
DATA WAREHOUSINGDATA WAREHOUSING
DATA WAREHOUSING
 
DATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MININGDATA WAREHOUSING AND DATA MINING
DATA WAREHOUSING AND DATA MINING
 
Data Warehousing and Data Mining
Data Warehousing and Data MiningData Warehousing and Data Mining
Data Warehousing and Data Mining
 
Hpc 4 5
Hpc 4 5Hpc 4 5
Hpc 4 5
 
Achieving Business Value by Fusing Hadoop and Corporate Data
Achieving Business Value by Fusing Hadoop and Corporate DataAchieving Business Value by Fusing Hadoop and Corporate Data
Achieving Business Value by Fusing Hadoop and Corporate Data
 
H0114857
H0114857H0114857
H0114857
 
To Study E T L ( Extract, Transform, Load) Tools Specially S Q L Server I...
To Study  E T L ( Extract, Transform, Load) Tools Specially  S Q L  Server  I...To Study  E T L ( Extract, Transform, Load) Tools Specially  S Q L  Server  I...
To Study E T L ( Extract, Transform, Load) Tools Specially S Q L Server I...
 
Ods, edf, eav & global types
Ods, edf, eav & global typesOds, edf, eav & global types
Ods, edf, eav & global types
 
SAP HORTONWORKS
SAP HORTONWORKSSAP HORTONWORKS
SAP HORTONWORKS
 
Computer science __engineering(4)
Computer science __engineering(4)Computer science __engineering(4)
Computer science __engineering(4)
 
Periyar msc
Periyar mscPeriyar msc
Periyar msc
 
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
 
Apresentação ODS
Apresentação ODSApresentação ODS
Apresentação ODS
 
Data Mining: Concepts and Techniques — Chapter 2 —
Data Mining:  Concepts and Techniques — Chapter 2 —Data Mining:  Concepts and Techniques — Chapter 2 —
Data Mining: Concepts and Techniques — Chapter 2 —
 
Chapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 4 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
 
Data Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional ModelingData Warehouse Back to Basics: Dimensional Modeling
Data Warehouse Back to Basics: Dimensional Modeling
 
ETL QA
ETL QAETL QA
ETL QA
 
Dw & etl concepts
Dw & etl conceptsDw & etl concepts
Dw & etl concepts
 
Agile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODSAgile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODS
 

Semelhante a The Future of Data Warehousing: ETL Will Never be the Same

The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationDATAVERSITY
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Precisely
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Cloudera, Inc.
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse OptimizationCloudera, Inc.
 
Harness the power of Data in a Big Data Lake
Harness the power of Data in a Big Data LakeHarness the power of Data in a Big Data Lake
Harness the power of Data in a Big Data LakeSaurabh K. Gupta
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...DATAVERSITY
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...Cloudera, Inc.
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Cloudera, Inc.
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduCloudera, Inc.
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Achieve data democracy in data lake with data integration
Achieve data democracy in data lake with data integration Achieve data democracy in data lake with data integration
Achieve data democracy in data lake with data integration Saurabh K. Gupta
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesCloudera, Inc.
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeDATAVERSITY
 
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...MapR Technologies
 
Big Data/Cloudera from Excelerate Systems
Big Data/Cloudera from Excelerate SystemsBig Data/Cloudera from Excelerate Systems
Big Data/Cloudera from Excelerate SystemsDavid Bennett
 
Oracle Database Appliance X5-2
Oracle Database Appliance X5-2 Oracle Database Appliance X5-2
Oracle Database Appliance X5-2 Yasir El Nimr
 
Chapter 11 Enterprise Resource Planning System
Chapter 11 Enterprise Resource Planning SystemChapter 11 Enterprise Resource Planning System
Chapter 11 Enterprise Resource Planning SystemMuhammad Azmy
 
PayPal Decision Management Architecture
PayPal Decision Management ArchitecturePayPal Decision Management Architecture
PayPal Decision Management ArchitecturePradeep Ballal
 

Semelhante a The Future of Data Warehousing: ETL Will Never be the Same (20)

The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
 
Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?Which Change Data Capture Strategy is Right for You?
Which Change Data Capture Strategy is Right for You?
 
Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18Consolidate your data marts for fast, flexible analytics 5.24.18
Consolidate your data marts for fast, flexible analytics 5.24.18
 
Data Warehouse Optimization
Data Warehouse OptimizationData Warehouse Optimization
Data Warehouse Optimization
 
Harness the power of Data in a Big Data Lake
Harness the power of Data in a Big Data LakeHarness the power of Data in a Big Data Lake
Harness the power of Data in a Big Data Lake
 
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
ADV Slides: Platforming Your Data for Success – Databases, Hadoop, Managed Ha...
 
Data warehouse
Data warehouseData warehouse
Data warehouse
 
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ... Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
Gartner Data and Analytics Summit: Bringing Self-Service BI & SQL Analytics ...
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8
 
Data warehouseold
Data warehouseoldData warehouseold
Data warehouseold
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
Achieve data democracy in data lake with data integration
Achieve data democracy in data lake with data integration Achieve data democracy in data lake with data integration
Achieve data democracy in data lake with data integration
 
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency ObjectivesHadoop Essentials -- The What, Why and How to Meet Agency Objectives
Hadoop Essentials -- The What, Why and How to Meet Agency Objectives
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-Purpose
 
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
Hadoop in 2015: Keys to Achieving Operational Excellence for the Real-Time En...
 
Big Data/Cloudera from Excelerate Systems
Big Data/Cloudera from Excelerate SystemsBig Data/Cloudera from Excelerate Systems
Big Data/Cloudera from Excelerate Systems
 
Oracle Database Appliance X5-2
Oracle Database Appliance X5-2 Oracle Database Appliance X5-2
Oracle Database Appliance X5-2
 
Chapter 11 Enterprise Resource Planning System
Chapter 11 Enterprise Resource Planning SystemChapter 11 Enterprise Resource Planning System
Chapter 11 Enterprise Resource Planning System
 
PayPal Decision Management Architecture
PayPal Decision Management ArchitecturePayPal Decision Management Architecture
PayPal Decision Management Architecture
 

Mais de Cloudera, Inc.

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxCloudera, Inc.
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera, Inc.
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards FinalistsCloudera, Inc.
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Cloudera, Inc.
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Cloudera, Inc.
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Cloudera, Inc.
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Cloudera, Inc.
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Cloudera, Inc.
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Cloudera, Inc.
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Cloudera, Inc.
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Cloudera, Inc.
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Cloudera, Inc.
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformCloudera, Inc.
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Cloudera, Inc.
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Cloudera, Inc.
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Cloudera, Inc.
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Cloudera, Inc.
 

Mais de Cloudera, Inc. (20)

Partner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptxPartner Briefing_January 25 (FINAL).pptx
Partner Briefing_January 25 (FINAL).pptx
 
Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists Cloudera Data Impact Awards 2021 - Finalists
Cloudera Data Impact Awards 2021 - Finalists
 
2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists2020 Cloudera Data Impact Awards Finalists
2020 Cloudera Data Impact Awards Finalists
 
Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019Edc event vienna presentation 1 oct 2019
Edc event vienna presentation 1 oct 2019
 
Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19Machine Learning with Limited Labeled Data 4/3/19
Machine Learning with Limited Labeled Data 4/3/19
 
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19Data Driven With the Cloudera Modern Data Warehouse 3.19.19
Data Driven With the Cloudera Modern Data Warehouse 3.19.19
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19Introducing Cloudera Data Science Workbench for HDP 2.12.19
Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18Leveraging the Cloud for Big Data Analytics 12.11.18
Leveraging the Cloud for Big Data Analytics 12.11.18
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2Modern Data Warehouse Fundamentals Part 2
Modern Data Warehouse Fundamentals Part 2
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 

Último

WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...Shane Coughlan
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension AidPhilip Schwarz
 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...masabamasaba
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxAnnaArtyushina1
 
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benonimasabamasaba
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...chiefasafspells
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnAmarnathKambale
 
WSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security ProgramWSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security ProgramWSO2
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park masabamasaba
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park masabamasaba
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...masabamasaba
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...masabamasaba
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationJuha-Pekka Tolvanen
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyviewmasabamasaba
 

Último (20)

WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
WSO2CON 2024 - Navigating API Complexity: REST, GraphQL, gRPC, Websocket, Web...
 
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
WSO2CON 2024 - WSO2's Digital Transformation Journey with Choreo: A Platforml...
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open SourceWSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
WSO2CON 2024 - Freedom First—Unleashing Developer Potential with Open Source
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
Artyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptxArtyushina_Guest lecture_YorkU CS May 2024.pptx
Artyushina_Guest lecture_YorkU CS May 2024.pptx
 
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
WSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security ProgramWSO2CON 2024 - How to Run a Security Program
WSO2CON 2024 - How to Run a Security Program
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
%+27788225528 love spells in Colorado Springs Psychic Readings, Attraction sp...
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital TransformationWSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
WSO2Con2024 - WSO2's IAM Vision: Identity-Led Digital Transformation
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 

The Future of Data Warehousing: ETL Will Never be the Same

  • 1. 1© Cloudera, Inc. All rights reserved. The Future of Data Warehousing: ETL Will Never be the Same Ralph Kimball| Founder, Kimball Group Manish Vipani| Vice President and Chief Architect, Kaiser Permanente
  • 2. 2© Cloudera, Inc. All rights reserved. Hadoop’s impact on data warehousing • Traditional DBMS stack exploded into separate layers • Data layer: HDFS files, not curated relational tables • Metadata layer: open extensible HCatalog, not vendor system tables • Query layer: cottage industry of query engines, not vendor specific SQL • Schema on Read • Allow the query layer to decide how to consume the data • Materialize the view later (e.g., into Parquet files) for high performance Integration goes far beyond relational tables • Conformed dimensions remain the glue holding together Hadoop applications (even if you have never heard of conformed dimensions!)
  • 3. 3© Cloudera, Inc. All rights reserved. The logical architecture hasn’t changed • Original Sources  ETL Step  Exposed Presentation Data  BI Application • BUT, the physical architecture of the back room now looks very different
  • 4. 4© Cloudera, Inc. All rights reserved. Old back room • Slow transfer from sources • Physical transformations required • Cleaning, normalization required • Mandated RDBMS table targets • Metadata limited to system tables • Presentation layer vendor mandated • Single focus: RDBMS SQL only New back room • Purpose built for high transfer rates • Physical transformations optional • Cleaning, normalization discouraged • Table targets optional or deferred • Extensible metadata via HCatalog • Presentation layer open ended • Before or after any transformations • Analytic client specific • Multiple simultaneous personalities The old and new back rooms
  • 5. 5© Cloudera, Inc. All rights reserved. Old back room • Off limits except to ETL staff • “we aren’t ready” • “the data must be cleaned” • “data governance trumps” • “end users not trusted” • Traditional IT control New back room • Doors open to • Qualified analytic users • Automated processes • Experiments, model building • Clients other than SQL • Open data marketplace The biggest change to the back room
  • 6. 6© Cloudera, Inc. All rights reserved. The Landing Zone at Kaiser Permanente Implementing the new ETL approach in the real world. A unified data repository for secure and trusted data.
  • 7. 7© Cloudera, Inc. All rights reserved. Landing Zone Landing Zone – Home to secure and organized data • A self service data platform hosting both the raw and prepared data sets for quick business consumption to drive advanced business insights and decisions. • Allow seamless data access for authorized users across enterprise business functions. • Data is organized by domains/use cases in Raw and Refined zone. • Perimeter security with data encrypted at rest. • Kerberized with integration to identity and Access Management system. Parts of Landing Zone • Raw Zone -> Exact replica of source data. • Refined Zone -> Transformed prepared data sets organized by use cases. • User Defined Space -> Secure and common access to raw and trusted data. • Master Data, Metadata, Internal Reference Data, Industry Reference Data, etc…
  • 8. 8© Cloudera, Inc. All rights reserved. Landing Zone SQL Java PIGHIVE Replicate Data Selection Python Source Data Exploratory Intelligence A MRD Analyze MineRefineDiscover E DW/DM L Data Extract Role Based Access Control Perimeter Security Data Registry (Tags & Catalog) Internal Reference Data Meta Data Industry Reference Data HDFS Master Data Raw Zone User Defined Space Refined Zone Usage Data All Data Encrypted @ Rest Access Authentication Data Load Extract- Load Copy Landing Zone – A Self Service Data Platform hosting both the raw and prepared data sets for quick business consumption.  Data Security –  Deployed on secured network with traffic monitoring.  Data is encrypted at rest.  Role based access and authorization.  Data Organization –  Exact replica of source data organized by information domains in Raw Zone.  Data organized by use cases in the Refined Zone (transformed prepared data sets).  Separate area allocated to track master data, metadata, internal reference data & industry specific reference data sets. Impala
  • 9. 9© Cloudera, Inc. All rights reserved. The ETL Revolution Poses Significant Challenges Some old, some new
  • 10. 10© Cloudera, Inc. All rights reserved. Old challenges we’ve seen before • Big data world is furiously implementing stovepipes • Good news is the excitement of new data sources and analyses • Bad news is ignoring integration, the fix is to start over • New departments not seen with traditional data warehousing • Not on anyone’s radar  rolling their own systems • Unusual business user profiles, latency demands, security lapses • Big speed bumps when replacing old systems with new • Users don’t want to switch • New results don’t match old results • Legacy hardware and software absurdly expensive, doesn’t scale reasonably
  • 11. 11© Cloudera, Inc. All rights reserved. New challenges needing inventive approaches • Traditional BI decision makers joined by • Data scientists • Roll their own ETL, hardware, OSs, programming languages • Take results to senior management directly • Don’t stick around for documentation, rollout, user support, maintenance • Predictive models and modelers • Constantly changing schemas • Tricky integration, e.g., joining relational tables to HBase • Automatic daemons • Enormous, bursty demand for computing resources
  • 12. 12© Cloudera, Inc. All rights reserved. Kaiser Permanente’s Pragmatic Response to the Challenges Pain Points: • Lack of user transient store and structural flexibility due to slow adaption to changes. • Lack of ability to do analytics and hypothesis testing of new data from disparate systems. Successes: • Over 10+ proven use cases with some early adopters.
  • 13. 13© Cloudera, Inc. All rights reserved. Landing Zone use cases Problem • Lack insight to understand factors influencing members’ adoption and utilization of online services. • Lack data integration and co-relation due to disparate systems. • Lack 3600 member service utilization view and dashboards. Resolution • Summarized and aggregated data sets in landing zone helps in improved decision making. • Faster and complete access to data at scale for metrics reporting and analytics. • Reduced data collection & metric reporting time from 3 weeks to 10 hours. • Ease of building “decision-centric” dashboards (8 in 3 months). Online Member Services – “kp.org”
  • 14. 14© Cloudera, Inc. All rights reserved. Landing Zone use cases cont… Problem • Commercial large-scale data warehouse (Teradata) repository is expensive at scale, grows exponentially, and processes large volumes of queries/month. • Continuing workload tuning efforts are slow to yield expected results. Resolution • Replicate data from Teradata into Landing Zone. • Rewrite and tune queries to eliminate semantically equivalent queries to achieve better performance. Moving Traditional Data Warehouse Workload to Landing Zone Problem • Lack of platform to collect and correlate structured and unstructured data from consumer facing health monitoring devices e.g.: Fitbit, Glucometer, etc. • Clinicians cannot track members’ health or weight goals, and see usage patterns. Resolution • Ingest transactional data and device logs into landing zone and create analytics workspace. • Enable clinicians to generate aggregated data for tracking member adherence and build dashboards using native tools. Digital Services Dashboard – “Interchange”
  • 15. 15© Cloudera, Inc. All rights reserved. Landing Zone use cases cont… Problem • Sequential and fragmented processes having limited ability to enrich data sources to increase accuracy. • Lack of clinical and analytical views increases lead time to analysis and inconsistent results. Resolution • Ingest data from fragmented system into the Landing Zone. • Created program-wide clinical and analytical views with refresh speed to 7 hours from 18 hours. Common Clinical and Analytical Views Problem • Current Medicare reporting solution does not maintain history and requires significant effort to recreate prior reports and perform trend analysis. • Externally hosted CIMP systems are cost-prohibitive and difficult to scale. Resolution • Replicate data from 30+ source systems into Landing Zone providing access to data internally. • Rebuild reports with improved performance that runs within reasonable time at scale. • Proved versatility of platform to handle data at scale and created equivalent reports. Consumer Information Management Platform – CIMP 2.0
  • 16. 16© Cloudera, Inc. All rights reserved. Architectural Wrap-Up What does all this mean?
  • 17. 17© Cloudera, Inc. All rights reserved. Kaiser Permanente is a work in progress with impressive early results, and insights for moving forward • Be the single source of all Kaiser’s data as well as external data leveraged by Kaiser applications, processes, and for Kaiser decision making. • “Learn and adapt” model provides common capabilities across rich data set, with increased agility in provisioning new data sets. • Enabling data profiling / tagging, semantic search, descriptive, predictive and prescriptive analytics to drive advanced business insights and decisions.
  • 18. 18© Cloudera, Inc. All rights reserved. The Back Room Landing Zone has become a Vibrant Marketplace • Replaces the quiet ETL back room • Challenging (exciting) new service role for IT • Open for business • Data scientists  A/B testing  experimentation  prototyping • Simultaneous ETL pipelines  aggregates, high-performance Parquet files, uploads to EDW • Simultaneous SQL and non-SQL clients • Immediate access • Don’t wait for physical transformation  schema-on-read • Purpose built for extreme I/O performance
  • 19. 19© Cloudera, Inc. All rights reserved. Thank you Ralph Kimball, ralphcollector@gmail.com Manish Vipani, manish.x.vipani@kp.org