Why Your Data Science Architecture Should Include a Data Virtualization Tool
Chris Day
Director, APAC Sales Engineering
cday@denodo.com
CDAO NEW ZEALAND
3 - 5 NOVEMBER 2020
Agenda
• Advanced Analytics & Machine Learning
• The Data Challenge
• Tackling the Data Preparation Tasks Problem
• Customer Story
• Q&A
VentureBeat AI, July 2019
87% of data science projects never make it
into production.
Advanced Analytics & Machine Learning Exercises Need Data
Improving Patient
Outcomes
Data includes patient demographics,
family history, patient vitals, lab test
results, claims data etc.
Predictive Maintenance
Maintenance data logs, data coming in
from sensors – including temperature,
running time, power level duration etc.
Predicting Late Payment
Data includes company or individual
demographics, payment history,
customer support logs etc.
Preventing Fraud
Data includes the location where the
claim originated, time of the day,
claimant history and any recent adverse
events.
Reducing Customer Churn
Data includes customer demographics,
products purchased, products used, past
transactions, company size, history,
revenue etc.
The Scale of the Problem
What is Data Virtualization?
1. Connect to disparate data sources
2. Combine related data into views
3. Consume in business applications
Data Virtualization Architecture Diagram
DATA CONSUMERS (Analytical, Operational)
Enterprise Applications, Reporting, BI, Portals, ESB, Mobile, Web, Users, IoT/Streaming Data
▪ Delivery: Multiple Protocols, Formats; Query, Search, Browse; Request/Reply, Event Driven; Secure Delivery
DATA VIRTUALIZATION
▪ Connect: Normalized Views of Disparate Data
▪ Combine: Discover, Transform, Prepare, Improve Quality, Integrate
▪ Consume: Share, Deliver, Publish, Govern, Collaborate
▪ Capabilities: Agile Development, Performance, Data Services, Resource Management, Data Catalog, Governance & Metadata, Security and Data Privacy, Lifecycle Management
▪ Interfaces shown: SQL, MDX; Web Services; Big Data APIs; Web Automation and Indexing
DISPARATE DATA SOURCES (More Structured to Less Structured)
Databases & Warehouses, Cloud/SaaS Applications, Big Data, NoSQL, Web, XML, Excel, PDF, Word...
Tackling the Data Pipeline Problem
Typical Data Science Workflow
A typical workflow for a data scientist is:
1. Gather the requirements for the business problem
2. Identify useful data
▪ Ingest data
3. Cleanse data into a useful format
4. Analyze data
5. Prepare input for your algorithms
6. Execute data science algorithms (ML, AI, etc.)
▪ Iterate steps 2 to 6 until valuable insights are
produced
7. Visualize and share
Source: http://sudeep.co/data-science/Understanding-the-Data-Science-Lifecycle/
Where Does Your Time Go?
• 80% of time – Finding and
preparing the data
• 10% of time – Analysis
• 10% of time – Visualizing data
Source: http://sudeep.co/data-science/Understanding-the-Data-Science-Lifecycle/
Where Does Your Time Go?
A large amount of time and effort goes into tasks not intrinsically related to data science:
• Finding where the right data may be
• Getting access to the data
▪ Bureaucracy
▪ Understanding access methods and technology (NoSQL, REST APIs, etc.)
• Transforming data into a format easy to work with
• Combining data originally available in different sources and formats
• Profiling and cleansing data to eliminate incomplete or inconsistent data points
Data Scientist Workflow
1. Identify useful data
2. Modify data into a useful format
3. Analyze data
4. Prepare for ML algorithm
5. Execute data science algorithms (ML, AI, etc.)
Identify Useful Data
If the company has a virtual layer with good coverage of
data sources, this task is greatly simplified.
▪ A data virtualization tool like Denodo can offer
unified access to all data available in the company.
▪ It abstracts the technologies underneath, offering a
standard SQL interface to query and manipulate.
To further simplify the challenge, Denodo offers a Data
Catalog to search, find and explore your data assets.
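The idea of one standard SQL interface over sources in different formats can be sketched as follows. This is a minimal illustration, not Denodo's API: it uses Python's built-in sqlite3 as a stand-in, the CSV and JSON data, table names and column names are all hypothetical, and unlike a real virtual layer it copies the data into memory instead of querying the sources in place.

```python
import csv
import io
import json
import sqlite3

# Two hypothetical sources in different formats (illustrative data only).
CRM_CSV = "customer_id,name\n1,Acme\n2,Globex\n"
BILLING_JSON = (
    '[{"customer_id": 1, "amount": 120.0},'
    ' {"customer_id": 1, "amount": 80.0},'
    ' {"customer_id": 2, "amount": 50.0}]'
)


def build_sql_layer() -> sqlite3.Connection:
    """Expose both sources through one SQL interface (stand-in for a virtual layer)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (customer_id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE payment (customer_id INTEGER, amount REAL)")
    # CSV source: each row is parsed into a dict keyed by the header names.
    customers = list(csv.DictReader(io.StringIO(CRM_CSV)))
    conn.executemany("INSERT INTO customer VALUES (:customer_id, :name)", customers)
    # JSON source: already a list of dicts after parsing.
    payments = json.loads(BILLING_JSON)
    conn.executemany("INSERT INTO payment VALUES (:customer_id, :amount)", payments)
    return conn


def total_paid_by_customer(conn: sqlite3.Connection) -> list:
    """Join the two sources with ordinary SQL, regardless of their original format."""
    query = """
        SELECT c.name, SUM(p.amount)
        FROM customer AS c
        JOIN payment AS p ON p.customer_id = c.customer_id
        GROUP BY c.name
        ORDER BY c.name
    """
    return conn.execute(query).fetchall()
```

Calling total_paid_by_customer(build_sql_layer()) returns one row per customer with the summed payments, without the caller needing to know which source was CSV and which was JSON.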
Ingestion And Data Manipulation Tasks
Data Virtualization offers the unique opportunity of
using standard SQL (joins, aggregations,
transformations, etc.) to access, manipulate and
analyze any data.
Cleansing and transformation steps can be easily
accomplished in SQL.
Its modeling capabilities enable the definition of views
that embed this logic to foster reusability.
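Embedding cleansing logic in a reusable view can be sketched like this. It is a minimal example under stated assumptions: sqlite3 stands in for the virtual layer, and the raw_reading table, the clean_reading view and the sample sensor values are hypothetical names and data chosen for illustration.

```python
import sqlite3


def build_clean_view() -> sqlite3.Connection:
    """Create a raw table and a view that embeds the cleansing rules."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_reading (sensor TEXT, temperature TEXT)")
    conn.executemany(
        "INSERT INTO raw_reading VALUES (?, ?)",
        [(" s1 ", "21.5"), ("S1", ""), ("s2", "19.0"), ("S2 ", None)],
    )
    # The view normalizes sensor names, casts text to numbers and drops
    # incomplete rows, so every consumer reuses the same logic.
    conn.execute(
        """
        CREATE VIEW clean_reading AS
        SELECT LOWER(TRIM(sensor)) AS sensor,
               CAST(temperature AS REAL) AS temperature
        FROM raw_reading
        WHERE temperature IS NOT NULL AND TRIM(temperature) <> ''
        """
    )
    return conn


def mean_temperature_by_sensor(conn: sqlite3.Connection) -> list:
    """Analysts query the view and never repeat the cleansing steps."""
    query = """
        SELECT sensor, AVG(temperature)
        FROM clean_reading
        GROUP BY sensor
        ORDER BY sensor
    """
    return conn.execute(query).fetchall()
```

Because the rules live in the view definition, a change to the cleansing logic is made once and picked up by every query that reads clean_reading.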
Tackling the Large Dataset Problem
Smart Query Acceleration for Analytics
It enables the persistence of aggregates to accelerate the execution of analytical queries.
▪ Common joins, aggregations and filters can be precomputed (in the cache or in a data source) and used as starting points to accelerate queries
Example tables: Sales (300M rows), Customer (1M rows), Store (400 rows)
Summaries: Sales by Customer, Store, Date; Sales by Customer
Example queries:
▪ Sales by date, with store and customer information?
▪ Sales with customer information?
Smart Query Acceleration for Analytics: Summaries
Summaries: Commonly joined facts & dimensions are precomputed and used to accelerate future queries.

System                      Execution time
Other systems               >500 secs
Denodo (without summary)    ~13 secs
Denodo (with summary)       ~1.4 secs

• Historical Sales – 220M rows
• Trailing Twelve Months Sales – 68M rows
• Date_Dim – 73K rows
• Store – contains StoreID, store_name and City
• “Summ 1” – summary, 300K rows: “Total Sales by StoreID and Day” query, stored in Redshift
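The summary idea can be sketched as a precomputed aggregate that answers a coarser query from far fewer rows. This is an illustration only: sqlite3 stands in for the engine, the table and column names are hypothetical, and the query is pointed at the summary by hand, whereas the slides describe the platform using summaries as starting points automatically (how it selects them is not covered here).

```python
import sqlite3


def build_sales() -> sqlite3.Connection:
    """Create a small fact table and a precomputed summary by store and day."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (store_id INTEGER, day TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            (1, "2020-11-03", 10.0),
            (1, "2020-11-03", 5.0),
            (1, "2020-11-04", 7.0),
            (2, "2020-11-03", 3.0),
        ],
    )
    # Precompute "total sales by store and day" once and persist it.
    conn.execute(
        """
        CREATE TABLE sales_by_store_day AS
        SELECT store_id, day, SUM(amount) AS total
        FROM sales
        GROUP BY store_id, day
        """
    )
    return conn


def total_by_store_from_fact(conn: sqlite3.Connection) -> list:
    """Baseline: aggregate every row of the fact table."""
    query = "SELECT store_id, SUM(amount) FROM sales GROUP BY store_id ORDER BY store_id"
    return conn.execute(query).fetchall()


def total_by_store_from_summary(conn: sqlite3.Connection) -> list:
    """Accelerated: re-aggregate the much smaller summary instead."""
    query = (
        "SELECT store_id, SUM(total) FROM sales_by_store_day "
        "GROUP BY store_id ORDER BY store_id"
    )
    return conn.execute(query).fetchall()
```

Both functions return the same totals per store. Sums can be rolled up from a finer-grained summary this way; other aggregates, such as distinct counts, generally cannot.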
Case Study
McCormick Uses Denodo to Provide Data to Its AI Project
Background
▪ McCormick’s AI and machine learning based project required data
that was stored in internal systems spread across 4 different
continents and in spreadsheets.
▪ Portions of data in the internal systems and spreadsheets that
were shared with McCormick's research partner firms needed to be
masked and at the same time unmasked when shared internally.
▪ McCormick wanted to create a data service that could simplify the
process of data access and data sharing across the organisation
and be used by the analytics teams for their machine learning
projects.
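The requirement to mask data for partners while leaving it unmasked internally can be sketched as an audience-dependent view of the same record. This is a hypothetical illustration, not McCormick's or Denodo's implementation: the field names, the audience labels and the truncated-hash masking are all assumptions made for the example.

```python
import hashlib

# Hypothetical set of fields that must be hidden from external partners.
SENSITIVE_FIELDS = {"supplier", "recipe_code"}


def mask_value(value: str) -> str:
    """Replace a value with a short, deterministic, non-readable token."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:12]


def view_for(record: dict, audience: str) -> dict:
    """Return the record as the given audience is allowed to see it."""
    if audience == "internal":
        # Internal users see the data unmasked.
        return dict(record)
    # Any other audience gets sensitive fields masked, the rest unchanged.
    return {
        key: mask_value(str(value)) if key in SENSITIVE_FIELDS else value
        for key, value in record.items()
    }
```

For a record such as {"region": "APAC", "supplier": "Example Spices Ltd"}, view_for(record, "internal") returns it unchanged, while view_for(record, "partner") keeps the region and replaces the supplier with a 12-character token. The underlying data is stored once and only the delivered view differs.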
• Data Quality
• Multiple Brands
• Which Data to Use?
McCormick – Multi-purpose Data Lake
Solution Highlights
▪ Agile Data Delivery
▪ High Level of Reuse
▪ Single Discovery & Consumption
Platform
Data Virtualization Benefits for McCormick
▪ Machine learning and applications were able to
access refreshed, validated and indexed data in
real time, without replication, from Denodo
enterprise data service.
▪ The Denodo enterprise data service gave the
business users the capability to compare data in
multiple systems.
▪ Spreadsheets are now the exception.
▪ The quality of proposed data and services is ensured.
Data Virtualization Benefits for AI and Machine Learning Projects
✓ Denodo can play a key role in the data science ecosystem to reduce data
exploration and analysis timeframes.
✓ Extends and integrates with the capabilities of notebooks, Python, R, etc.
to improve the toolset of the data scientist.
✓ Provides a modern “SQL-on-Anything” engine.
✓ Can leverage Big Data technologies like Spark (as a data source, an
ingestion tool and for external processing) to efficiently work with large
data volumes.
✓ Facilitates collaboration across the data community as a single platform
for all data requirements.
More Information?
https://denodo.link/34diju2
Virtual Hands-on Lab
Thursday 26 November 2020
8:30am – 12:00pm
https://denodo.link/3kjtQgR
Test Drive
Access Denodo Platform in the Cloud!
Take a Test Drive today!
GET STARTED TODAY
https://denodo.link/3kjtNSd
Thanks!
www.denodo.com info@denodo.com
© Copyright Denodo Technologies. All rights reserved
Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm,
without the prior written authorization from Denodo Technologies.

Mais conteúdo relacionado

Mais procurados

Real-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to ProductionReal-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to Production
Revolution Analytics
 

Mais procurados (20)

How different between Big Data, Business Intelligence and Analytics ?
How different between Big Data, Business Intelligence and Analytics ?How different between Big Data, Business Intelligence and Analytics ?
How different between Big Data, Business Intelligence and Analytics ?
 
Creating an Enterprise AI Strategy
Creating an Enterprise AI StrategyCreating an Enterprise AI Strategy
Creating an Enterprise AI Strategy
 
data warehouse vs data lake
data warehouse vs data lakedata warehouse vs data lake
data warehouse vs data lake
 
Denodo DataFest 2016: Data Science: Operationalizing Analytical Models in Rea...
Denodo DataFest 2016: Data Science: Operationalizing Analytical Models in Rea...Denodo DataFest 2016: Data Science: Operationalizing Analytical Models in Rea...
Denodo DataFest 2016: Data Science: Operationalizing Analytical Models in Rea...
 
Evaluating Big Data Predictive Analytics Platforms
Evaluating Big Data Predictive Analytics PlatformsEvaluating Big Data Predictive Analytics Platforms
Evaluating Big Data Predictive Analytics Platforms
 
Big data and oracle
Big data and oracleBig data and oracle
Big data and oracle
 
Global IT Outsourcing case study
Global IT Outsourcing case studyGlobal IT Outsourcing case study
Global IT Outsourcing case study
 
Discover how Covid-19 is accelerating the need for healthcare interoperabilit...
Discover how Covid-19 is accelerating the need for healthcare interoperabilit...Discover how Covid-19 is accelerating the need for healthcare interoperabilit...
Discover how Covid-19 is accelerating the need for healthcare interoperabilit...
 
Denodo’s Data Catalog: Bridging the Gap between Data and Business
Denodo’s Data Catalog: Bridging the Gap between Data and BusinessDenodo’s Data Catalog: Bridging the Gap between Data and Business
Denodo’s Data Catalog: Bridging the Gap between Data and Business
 
Augmented analytics will push the analytics adoption
Augmented analytics will push the analytics adoptionAugmented analytics will push the analytics adoption
Augmented analytics will push the analytics adoption
 
5 Big Data Use Cases for 2013
5 Big Data Use Cases for 20135 Big Data Use Cases for 2013
5 Big Data Use Cases for 2013
 
Introduction to Big Data & Analytics
Introduction to Big Data & AnalyticsIntroduction to Big Data & Analytics
Introduction to Big Data & Analytics
 
Business case for Big Data Analytics
Business case for Big Data AnalyticsBusiness case for Big Data Analytics
Business case for Big Data Analytics
 
Analyst Keynote: Forrester: Data Fabric Strategy is Vital for Business Innova...
Analyst Keynote: Forrester: Data Fabric Strategy is Vital for Business Innova...Analyst Keynote: Forrester: Data Fabric Strategy is Vital for Business Innova...
Analyst Keynote: Forrester: Data Fabric Strategy is Vital for Business Innova...
 
Real-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to ProductionReal-time Big Data Analytics: From Deployment to Production
Real-time Big Data Analytics: From Deployment to Production
 
The Big Data Importance – Tools and their Usage
The Big Data Importance – Tools and their UsageThe Big Data Importance – Tools and their Usage
The Big Data Importance – Tools and their Usage
 
March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...
March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...
March Towards Big Data - Big Data Implementation, Migration, Ingestion, Manag...
 
Data Analysis in Manufacturing Application to Steel Industry
Data Analysis in Manufacturing Application to Steel IndustryData Analysis in Manufacturing Application to Steel Industry
Data Analysis in Manufacturing Application to Steel Industry
 
elgendy2014.pdf
elgendy2014.pdfelgendy2014.pdf
elgendy2014.pdf
 
Modern Data Discovery and Integration in Retail Banking
Modern Data Discovery and Integration in Retail BankingModern Data Discovery and Integration in Retail Banking
Modern Data Discovery and Integration in Retail Banking
 

Semelhante a Why Your Data Science Architecture Should Include a Data Virtualization Tool (New Zealand)

Essential Prerequisites for Maximizing Success from Big Data
Essential Prerequisites for Maximizing Success from Big DataEssential Prerequisites for Maximizing Success from Big Data
Essential Prerequisites for Maximizing Success from Big Data
Society of Petroleum Engineers
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
Denodo
 

Semelhante a Why Your Data Science Architecture Should Include a Data Virtualization Tool (New Zealand) (20)

Quicker Insights and Sustainable Business Agility Powered By Data Virtualizat...
Quicker Insights and Sustainable Business Agility Powered By Data Virtualizat...Quicker Insights and Sustainable Business Agility Powered By Data Virtualizat...
Quicker Insights and Sustainable Business Agility Powered By Data Virtualizat...
 
How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)How Data Virtualization Puts Machine Learning into Production (APAC)
How Data Virtualization Puts Machine Learning into Production (APAC)
 
Advanced Analytics and Machine Learning with Data Virtualization (India)
Advanced Analytics and Machine Learning with Data Virtualization (India)Advanced Analytics and Machine Learning with Data Virtualization (India)
Advanced Analytics and Machine Learning with Data Virtualization (India)
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data Virtualization
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data Virtualization
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data Architecture
 
Essential Prerequisites for Maximizing Success from Big Data
Essential Prerequisites for Maximizing Success from Big DataEssential Prerequisites for Maximizing Success from Big Data
Essential Prerequisites for Maximizing Success from Big Data
 
Gse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-sharedGse uk-cedrinemadera-2018-shared
Gse uk-cedrinemadera-2018-shared
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
 
Data lake benefits
Data lake benefitsData lake benefits
Data lake benefits
 
How to Capitalize on Big Data with Oracle Analytics Cloud
How to Capitalize on Big Data with Oracle Analytics CloudHow to Capitalize on Big Data with Oracle Analytics Cloud
How to Capitalize on Big Data with Oracle Analytics Cloud
 
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data ArchitectureADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
ADV Slides: When and How Data Lakes Fit into a Modern Data Architecture
 
Big data Analytics
Big data AnalyticsBig data Analytics
Big data Analytics
 
Modern Data Challenges require Modern Graph Technology
Modern Data Challenges require Modern Graph TechnologyModern Data Challenges require Modern Graph Technology
Modern Data Challenges require Modern Graph Technology
 
Webinar: Faster Big Data Analytics with MongoDB
Webinar: Faster Big Data Analytics with MongoDBWebinar: Faster Big Data Analytics with MongoDB
Webinar: Faster Big Data Analytics with MongoDB
 
Big Data Evolution
Big Data EvolutionBig Data Evolution
Big Data Evolution
 
A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)A Key to Real-time Insights in a Post-COVID World (ASEAN)
A Key to Real-time Insights in a Post-COVID World (ASEAN)
 
[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...
[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...
[DSC Europe 23] Milos Solujic - Data Lakehouse Revolutionizing Data Managemen...
 
2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics2022 Trends in Enterprise Analytics
2022 Trends in Enterprise Analytics
 

Mais de Denodo

Mastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business LandscapeMastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business Landscape
Denodo
 
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Denodo
 
Знакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данныхЗнакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данных
Denodo
 
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Denodo
 

Mais de Denodo (20)

Enterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in DenodoEnterprise Monitoring and Auditing in Denodo
Enterprise Monitoring and Auditing in Denodo
 
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps ApproachLunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
Lunch and Learn ANZ: Mastering Cloud Data Cost Control: A FinOps Approach
 
Achieving Self-Service Analytics with a Governed Data Services Layer
Achieving Self-Service Analytics with a Governed Data Services LayerAchieving Self-Service Analytics with a Governed Data Services Layer
Achieving Self-Service Analytics with a Governed Data Services Layer
 
What you need to know about Generative AI and Data Management?
What you need to know about Generative AI and Data Management?What you need to know about Generative AI and Data Management?
What you need to know about Generative AI and Data Management?
 
Mastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business LandscapeMastering Data Compliance in a Dynamic Business Landscape
Mastering Data Compliance in a Dynamic Business Landscape
 
Denodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo LiteDenodo Partner Connect: Business Value Demo with Denodo Demo Lite
Denodo Partner Connect: Business Value Demo with Denodo Demo Lite
 
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
Expert Panel: Overcoming Challenges with Distributed Data to Maximize Busines...
 
Drive Data Privacy Regulatory Compliance
Drive Data Privacy Regulatory ComplianceDrive Data Privacy Regulatory Compliance
Drive Data Privacy Regulatory Compliance
 
Знакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данныхЗнакомство с виртуализацией данных для профессионалов в области данных
Знакомство с виртуализацией данных для профессионалов в области данных
 
Data Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
Data Democratization: A Secret Sauce to Say Goodbye to Data FragmentationData Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
Data Democratization: A Secret Sauce to Say Goodbye to Data Fragmentation
 
Denodo Partner Connect - Technical Webinar - Ask Me Anything
Denodo Partner Connect - Technical Webinar - Ask Me AnythingDenodo Partner Connect - Technical Webinar - Ask Me Anything
Denodo Partner Connect - Technical Webinar - Ask Me Anything
 
Lunch and Learn ANZ: Key Takeaways for 2023!
Lunch and Learn ANZ: Key Takeaways for 2023!Lunch and Learn ANZ: Key Takeaways for 2023!
Lunch and Learn ANZ: Key Takeaways for 2023!
 
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way ForwardIt’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
It’s a Wrap! 2023 – A Groundbreaking Year for AI and The Way Forward
 
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
Quels sont les facteurs-clés de succès pour appliquer au mieux le RGPD à votr...
 
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
Lunch and Learn ANZ: Achieving Self-Service Analytics with a Governed Data Se...
 
How to Build Your Data Marketplace with Data Virtualization?
How to Build Your Data Marketplace with Data Virtualization?How to Build Your Data Marketplace with Data Virtualization?
How to Build Your Data Marketplace with Data Virtualization?
 
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsWebinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
 
Enabling Data Catalog users with advanced usability
Enabling Data Catalog users with advanced usabilityEnabling Data Catalog users with advanced usability
Enabling Data Catalog users with advanced usability
 
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
Denodo Partner Connect: Technical Webinar - Architect Associate Certification...
 
GenAI y el futuro de la gestión de datos: mitos y realidades
GenAI y el futuro de la gestión de datos: mitos y realidadesGenAI y el futuro de la gestión de datos: mitos y realidades
GenAI y el futuro de la gestión de datos: mitos y realidades
 

Último

➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
amitlee9823
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
karishmasinghjnh
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
amitlee9823
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get CytotecAbortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Jeddah | +966572737505 | Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Rabindra Nagar  (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Rabindra Nagar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
gajnagarg
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men  🔝Ongole🔝   Escorts S...
➥🔝 7737669865 🔝▻ Ongole Call-girls in Women Seeking Men 🔝Ongole🔝 Escorts S...
amitlee9823
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
amitlee9823
 
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men  🔝Dindigul🔝   Escor...
➥🔝 7737669865 🔝▻ Dindigul Call-girls in Women Seeking Men 🔝Dindigul🔝 Escor...
amitlee9823
 

Último (20)

SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
SAC 25 Final National, Regional & Local Angel Group Investing Insights 2024 0...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
👉 Amritsar Call Girl 👉📞 6367187148 👉📞 Just📲 Call Ruhi Call Girl Phone No Amri...
 

Why Your Data Science Architecture Should Include a Data Virtualization Tool (New Zealand)

  • 1. Why Your Data Science Architecture Should Include a Data Virtualization Tool. Chris Day, Director, APAC Sales Engineering, cday@denodo.com. CDAO New Zealand, 3-5 November 2020.
  • 2. Agenda: Advanced Analytics & Machine Learning; The Data Challenge; Tackling the Data Preparation Tasks Problem; Customer Story; Q&A.
  • 3. 87% of data science projects never make it into production. Source: VentureBeat AI, July 2019.
  • 4. Advanced Analytics & Machine Learning Exercises Need Data. Improving Patient Outcomes: data includes patient demographics, family history, patient vitals, lab test results, claims data, etc. Predictive Maintenance: maintenance data logs and data coming in from sensors, including temperature, running time, power level duration, etc. Predicting Late Payment: data includes company or individual demographics, payment history, customer support logs, etc. Preventing Fraud: data includes the location where the claim originated, time of day, claimant history and any recent adverse events. Reducing Customer Churn: data includes customer demographics, products purchased, products used, past transactions, company size, history, revenue, etc.
  • 5. The Scale of the Problem
  • 6. What is Data Virtualization?
  • 7. What is Data Virtualization? 1. Connect to disparate data sources. 2. Combine related data into views. 3. Consume in business applications.
  • 8. Data Virtualization Architecture Diagram. Data consumers (analytical and operational): enterprise applications, reporting, BI, portals, ESB, mobile, web, users, IoT/streaming data. Delivery to consumers: multiple protocols and formats; query, search, browse; request/reply, event driven; secure delivery. Data virtualization layer: Connect, Combine, Consume. Functions shown: share, deliver, publish, govern, collaborate; discover, transform, prepare, improve quality, integrate; normalized views of disparate data. Platform capabilities: agile development, performance, data services, resource management, data catalog, governance & metadata, security and data privacy, lifecycle management. Disparate data sources (from more structured to less structured): databases & warehouses, cloud/SaaS applications, big data, NoSQL, web, XML, Excel, PDF, Word... Source access: SQL, MDX; web services; big data APIs; web automation and indexing.
  • 9. Tackling the Data Pipeline Problem
  • 10. Typical Data Science Workflow. A typical workflow for a data scientist is: 1. Gather the requirements for the business problem. 2. Identify useful data (ingest data). 3. Cleanse data into a useful format. 4. Analyze data. 5. Prepare input for your algorithms. 6. Execute data science algorithms (ML, AI, etc.), iterating steps 2 to 6 until valuable insights are produced. 7. Visualize and share. Source: http://sudeep.co/data-science/Understanding-the-Data-Science-Lifecycle/
  • 11. Where Does Your Time Go? 80% of time: finding and preparing the data. 10% of time: analysis. 10% of time: visualizing data. Source: http://sudeep.co/data-science/Understanding-the-Data-Science-Lifecycle/
  • 12. Where Does Your Time Go? A large amount of time and effort goes into tasks not intrinsically related to data science: finding where the right data may be; getting access to the data (bureaucracy; understanding access methods and technology such as NoSQL, REST APIs, etc.); transforming data into a format that is easy to work with; combining data originally available in different sources and formats; profiling and cleansing data to eliminate incomplete or inconsistent data points.
  • 13. Data Scientist Workflow: identify useful data; modify data into a useful format; analyze data; prepare for ML algorithm; execute data science algorithms (ML, AI, etc.).
  • 14. Identify Useful Data. If the company has a virtual layer with good coverage of data sources, this task is greatly simplified. A data virtualization tool like Denodo can offer unified access to all data available in the company. It abstracts the technologies underneath, offering a standard SQL interface to query and manipulate data. To further simplify the challenge, Denodo offers a Data Catalog to search, find and explore your data assets.
  • 15. Data Scientist Workflow: identify useful data; modify data into a useful format; analyze data; prepare for ML algorithm; execute data science algorithms (ML, AI, etc.).
  • 16. Ingestion and Data Manipulation Tasks. Data virtualization offers the unique opportunity of using standard SQL (joins, aggregations, transformations, etc.) to access, manipulate and analyze any data. Cleansing and transformation steps can be easily accomplished in SQL. Its modeling capabilities enable the definition of views that embed this logic to foster reusability.
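
As a rough illustration of slide 16's point that cleansing logic can live in a reusable SQL view, here is a minimal sketch. It uses Python's built-in `sqlite3` module purely as a stand-in SQL engine, not Denodo itself; the table `raw_customers`, the view `clean_customers` and the sample rows are hypothetical and do not come from the presentation.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for a SQL interface.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical raw source with untidy values.
    CREATE TABLE raw_customers (id INTEGER, name TEXT, country TEXT);
    INSERT INTO raw_customers VALUES
        (1, '  Alice ', 'nz'),
        (2, 'Bob', NULL),
        (3, NULL, 'AU');

    -- The cleansing rules are written once, inside the view definition.
    CREATE VIEW clean_customers AS
    SELECT id,
           TRIM(name) AS name,
           UPPER(COALESCE(country, 'unknown')) AS country
    FROM raw_customers
    WHERE name IS NOT NULL;
""")

# Any consumer that queries the view gets the cleansed data.
rows = conn.execute(
    "SELECT id, name, country FROM clean_customers ORDER BY id"
).fetchall()
print(rows)
```

Consumers query `clean_customers` without repeating the trimming, defaulting and filtering rules, which is the kind of reuse the slide describes.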
  • 17. Tackling the Large Dataset Problem
  • 18. Smart Query Acceleration for Analytics. It enables the persistence of aggregates to accelerate the execution of analytical queries. Common joins, aggregations and filters can be precomputed (in the cache or in a data source) and used as starting points to accelerate queries. Diagram: tables Store (400 rows), Customer (1M rows) and Sales (300M rows); summaries "Sales by Customer, Store, Date" and "Sales by Customer"; example queries "Sales by date, with store and customer information?" and "Sales with customer information?"
  • 19. Smart Query Acceleration for Analytics: Summaries. Commonly joined fact and dimension tables are precomputed and used to accelerate future queries. Execution time by system: other systems, >500 secs; Denodo (without summary), ~13 secs; Denodo (with summary), ~1.4 secs. Data: Historical Sales, 220M rows; Trailing Twelve Months Sales, 68M rows; Date_Dim, 73K rows; Store, containing StoreID, store_name and City; "Summ 1", a 300K-row summary of the "Total Sales by StoreID and Day" query, stored in Redshift.
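
The idea behind the summaries on slides 18 and 19 can be sketched as follows: an aggregate is computed once at a fine grain, and coarser queries are answered from it instead of from the full fact table. This is only an illustration using Python's built-in `sqlite3` module; the table names, columns and figures are hypothetical, and the query is rewritten by hand here, so the sketch says nothing about how any product's optimizer selects a summary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical fact table (tiny here; hundreds of millions of rows on the slides).
    CREATE TABLE sales (store_id INTEGER, sale_date TEXT,
                        customer_id INTEGER, amount REAL);
    INSERT INTO sales VALUES
        (1, '2020-11-03', 10, 20.0),
        (1, '2020-11-03', 11, 5.0),
        (1, '2020-11-04', 10, 7.5),
        (2, '2020-11-03', 12, 12.5);

    -- Precomputed once; plays the role of a summary by store and day.
    CREATE TABLE sales_by_store_day AS
    SELECT store_id, sale_date, SUM(amount) AS total_amount
    FROM sales
    GROUP BY store_id, sale_date;
""")

# Total sales per store, computed from the full fact table.
from_fact = conn.execute(
    "SELECT store_id, SUM(amount) FROM sales "
    "GROUP BY store_id ORDER BY store_id"
).fetchall()

# The same answer, computed from the much smaller summary.
from_summary = conn.execute(
    "SELECT store_id, SUM(total_amount) FROM sales_by_store_day "
    "GROUP BY store_id ORDER BY store_id"
).fetchall()

summary_rows = conn.execute(
    "SELECT COUNT(*) FROM sales_by_store_day"
).fetchone()[0]
print(from_fact, from_summary, summary_rows)
```

Because SUM can be re-aggregated, the coarser per-store totals can be derived from the per-store, per-day summary, which has fewer rows to scan than the fact table.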
  • 21. McCormick Uses Denodo to Provide Data to Its AI Project. Background: McCormick’s AI and machine learning based project required data that was stored in internal systems spread across 4 different continents and in spreadsheets. Portions of the data in the internal systems and spreadsheets needed to be masked when shared with McCormick's research partner firms and unmasked when shared internally. McCormick wanted to create a data service that could simplify the process of data access and data sharing across the organisation and be used by the analytics teams for their machine learning projects.
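
The masking requirement on slide 21 (the same data masked for partners, unmasked internally) can be pictured with a small sketch. It is not McCormick's or Denodo's implementation; the field names, the audience labels and the `view_for` helper are all hypothetical and exist only to show one record served differently per audience.

```python
# Hypothetical set of fields treated as sensitive for external sharing.
SENSITIVE_FIELDS = {"supplier", "unit_cost"}


def view_for(record: dict, audience: str) -> dict:
    """Return the record as a given audience should see it.

    Internal users get the values unchanged; any other audience
    gets the sensitive fields replaced by a mask.
    """
    if audience == "internal":
        return dict(record)
    return {
        key: ("***" if key in SENSITIVE_FIELDS else value)
        for key, value in record.items()
    }


record = {"product": "P-001", "supplier": "Acme", "unit_cost": 1.25}
internal = view_for(record, "internal")
partner = view_for(record, "partner")
print(internal)
print(partner)
```

Keeping the rule in a single access layer means the underlying data is stored once and the masking decision is made at read time.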
  • 22. Data Quality; Multiple Brands; Which Data to Use?
  • 23. McCormick: Multi-purpose Data Lake. Solution Highlights: agile data delivery; high level of reuse; single discovery & consumption platform.
  • 24. Data Virtualization Benefits for McCormick. Machine learning and applications were able to access refreshed, validated and indexed data in real time, without replication, from the Denodo enterprise data service. The Denodo enterprise data service gave business users the capability to compare data in multiple systems. Spreadsheets are now the exception. The quality of proposed data and services is ensured.
  • 25. Data Virtualization Benefits for AI and Machine Learning Projects. ✓ Denodo can play a key role in the data science ecosystem to reduce data exploration and analysis timeframes. ✓ It extends and integrates with the capabilities of notebooks, Python, R, etc. to improve the toolset of the data scientist. ✓ It provides a modern “SQL-on-Anything” engine. ✓ It can leverage Big Data technologies like Spark (as a data source, an ingestion tool and for external processing) to work efficiently with large data volumes. ✓ It facilitates collaboration across the data community as a single platform for all data requirements.
  • 28. Virtual Hands-on Lab: Thursday 26 November 2020, 8:30am – 12:00pm. https://denodo.link/3kjtQgR
  • 29. Test Drive: Access Denodo Platform in the Cloud! Take a Test Drive today! Get started today: https://denodo.link/3kjtNSd
  • 30. Thanks! www.denodo.com info@denodo.com © Copyright Denodo Technologies. All rights reserved. Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without the prior written authorization from Denodo Technologies.