©2018 Dataiku, Inc. | www.dataiku.com | contact@dataiku.com | @dataiku
DATA SCIENCE SALON
MIAMI 2018
Stop Wasting Time – Case Studies in
Production Machine Learning
Yashas Vaidya
Technical Lead, Alliances-US
yashas.vaidya@dataiku.com
2 X 2
• When putting things into production ...
• why do Data Scientists take too long?
• why does DS take too long?
• Two case studies getting around those issues
• Speeding up ETL steps to create insights
• Case: Starting with a template and making reusable analytics
Two questions, two case studies
Data Science in production
• Business lines are often the sponsors and need to see ROI from projects
• Creates value across the organization and provides DS teams with recognition
• From strategic insights to augmented (or automated) decision making
Why the focus?
Wait a minute …
Where in the world is Ken?
Re-introduction
• Evangelizer in the Alliances team
• Training and empowering consulting and technical partners
• ABD ...
I’m Yashas and I am ...
Data Scientists take too long
They mostly spend their time doing ETL
Source: 2017 Data Scientist Report
CrowdFlower: https://visit.crowdflower.com/WC-2017-Data-Science-Report_LP.html
Why do Data Scientists take so long
• H1: Data scientists have a hard time getting their hands on good-quality data and thus must spend their time scrubbing it
• H2: Given their background, they are actually pretty good at cleaning and organizing data, so they focus on what they can do well and control
Several hypotheses
Some evidence for both
Data scientists are usually happy data scientists, overwhelmed by data
Source: 2017 Data Scientist Report
CrowdFlower: https://visit.crowdflower.com/WC-2017-Data-Science-Report_LP.html
Getting to production is hard to do
• different teams in charge
• lack of established platforms or frameworks
• overall low maturity on deployment skills and experience
McKinsey Institute: « less than 10% of data science projects are deployed into
production […] with average deployment times of 9 to 12 months »
Why is data science hard to productionalize?
Case Study 1
First case study today
Regulatory Compliance
A bit of context
• New European credit reporting regulation – AnaCredit
• Every financial institution must report a monthly dataset of loans over 25k with up to 95 attributes
• instrument data
• counterparty data
• liability data…
• Progressive rollout in 2018 for 8 countries
• Large banks and insurance companies have provisioned around 10-15M € for the project
→ How can we provide an innovative, differentiating solution to this problem?
AnaCredit regulatory reporting
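As a rough illustration of the reporting rule above, the sketch below selects the loans that would go into one monthly dataset. The `Loan` fields, the sample figures, and the reading of "over 25k" as an inclusive threshold are assumptions made for the example; they are not taken from the regulation or from the project.

```python
from dataclasses import dataclass
from datetime import date

# "loans over 25k" on the slide; treating the bound as inclusive is an assumption.
REPORTING_THRESHOLD = 25_000


@dataclass
class Loan:
    loan_id: str
    counterparty_id: str
    amount: float          # commitment amount (currency assumed to be EUR)
    reference_date: date   # month-end date the record refers to


def reportable_loans(loans, year, month, threshold=REPORTING_THRESHOLD):
    """Return the loans for the given month whose amount meets the threshold."""
    return [
        loan for loan in loans
        if loan.reference_date.year == year
        and loan.reference_date.month == month
        and loan.amount >= threshold
    ]


# Made-up sample records.
loans = [
    Loan("L-001", "CP-1", 40_000, date(2018, 9, 30)),
    Loan("L-002", "CP-2", 12_500, date(2018, 9, 30)),   # below the threshold
    Loan("L-003", "CP-1", 25_000, date(2018, 9, 30)),
    Loan("L-004", "CP-3", 90_000, date(2018, 8, 31)),   # different month
]

september = reportable_loans(loans, 2018, 9)
print([loan.loan_id for loan in september])
```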
Building the AnaCredit solution
• Key Usage Scenarios
• Build and automate financial reporting in the same format across 8 countries
• Take into account corrections and quality tests, as well as rejections
• Provide broader reporting on data quality and insights on portfolio management
• A best-of-breed approach to the project
• Data Management & Predictive – Dataiku DSS
• Regulatory watch and Risk Compliance – PwC
• Self Service Analytics, Insights and Reporting – Tableau Software
Usage Scenarios and Approach
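The "quality tests, as well as rejections" scenario can be pictured as a rule pass that splits records into accepted and rejected sets and keeps the reasons. This is only a minimal sketch: the rules, field names, and country subset below are hypothetical and do not come from the actual solution.

```python
# Hypothetical subset of reporting countries, used only for this example.
ALLOWED_COUNTRIES = {"FR", "DE", "NL"}


def check_record(record):
    """Return the list of rule violations for one loan record (empty if clean)."""
    errors = []
    if not record.get("counterparty_id"):
        errors.append("missing counterparty_id")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount <= 0:
        errors.append("amount must be a positive number")
    if record.get("country") not in ALLOWED_COUNTRIES:
        errors.append("unknown country code")
    return errors


def run_quality_tests(records):
    """Split records into accepted ones and rejected ones with their errors."""
    accepted, rejected = [], []
    for record in records:
        errors = check_record(record)
        if errors:
            rejected.append({"record": record, "errors": errors})
        else:
            accepted.append(record)
    return accepted, rejected


# Made-up sample records.
records = [
    {"loan_id": "L-001", "counterparty_id": "CP-1", "amount": 40_000, "country": "FR"},
    {"loan_id": "L-002", "counterparty_id": "", "amount": 30_000, "country": "DE"},
    {"loan_id": "L-003", "counterparty_id": "CP-3", "amount": -5, "country": "XX"},
]

accepted, rejected = run_quality_tests(records)
reject_rate = len(rejected) / len(records)
print(f"accepted={len(accepted)} rejected={len(rejected)} reject_rate={reject_rate:.0%}")
for item in rejected:
    print(item["record"]["loan_id"], item["errors"])
```

Keeping the error list next to each rejected record is what makes the broader data quality reporting possible later on.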
Data Management
Focus on Agility and Repeatability
Data Processing Flow for Template 1
Working « prototype » of the dataset and templates within 2 months
• 2 Data Engineers
• 1 Business Analyst
• 1 Risk Analyst
• Agile, documented, re-usable, runs « at scale » in the company’s IS
• Articulated with ETL and DI tools
Data Management
Instrument Data
Data Management
Counterparty Data
Data Management
Open Data
Data Management
Processing and Delivering XML Templates
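To make the XML delivery step concrete, here is a minimal sketch that serializes instrument records with Python's standard `xml.etree.ElementTree`. The element and attribute names (`Report`, `Instrument`, `CounterpartyId`, `Amount`) are invented for the example and are not the official reporting schema.

```python
import xml.etree.ElementTree as ET


def build_report(reference_month, instruments):
    """Serialize instrument records into a small XML document (hypothetical layout)."""
    root = ET.Element("Report", attrib={"referenceMonth": reference_month})
    for inst in instruments:
        node = ET.SubElement(root, "Instrument", attrib={"id": inst["id"]})
        ET.SubElement(node, "CounterpartyId").text = inst["counterparty_id"]
        ET.SubElement(node, "Amount").text = f"{inst['amount']:.2f}"
    return ET.tostring(root, encoding="unicode")


# Made-up sample record.
xml_text = build_report("2018-09", [
    {"id": "L-001", "counterparty_id": "CP-1", "amount": 40000},
])
print(xml_text)
```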
Analytics and Reporting
• Data Quality and Compliance
• Evolution of errors and rejections over time
• Risk portfolio reporting
• Predictive insights
End user dashboards, improve analytics culture
Machine Learning
Fast exploration of multiple use cases to build business cases
• Automatic enrichment of data through fuzzy matching of LEI records
• Predicting late-payment outcomes across the loan lifecycle
• Anomaly detection on loans (Isolation Forest)
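For readers unfamiliar with Isolation Forest, the idea is that unusual points are separated from the rest by fewer random splits. The sketch below is a simplified, single-feature variant written with the standard library only: it draws random splits on the fly instead of building and storing trees, and it skips subsampling. The loan amounts, `n_trees`, and `max_depth` are illustrative assumptions. In practice a library implementation such as scikit-learn's `IsolationForest` would be the usual choice.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def isolation_depth(x, values, rng, depth=0, max_depth=10):
    """Number of random splits needed to isolate x from the other values."""
    if depth >= max_depth or len(values) <= 1:
        return depth + average_path_length(len(values))
    low, high = min(values), max(values)
    if low == high:
        return depth + average_path_length(len(values))
    split = rng.uniform(low, high)
    # Keep only the points that fall on the same side of the split as x.
    same_side = [v for v in values if (v < split) == (x < split)]
    return isolation_depth(x, same_side, rng, depth + 1, max_depth)


def anomaly_scores(values, n_trees=200, seed=0):
    """Score each value in (0, 1); scores closer to 1 mean easier to isolate."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    rng = random.Random(seed)
    norm = average_path_length(len(values))
    scores = []
    for x in values:
        total = sum(isolation_depth(x, values, rng) for _ in range(n_trees))
        scores.append(2.0 ** (-(total / n_trees) / norm))
    return scores


# Made-up loan amounts with one obvious outlier.
amounts = [30_000, 32_000, 35_000, 31_000, 40_000, 38_000,
           45_000, 33_000, 36_000, 42_000, 39_000, 5_000_000]
scores = anomaly_scores(amounts)
top = max(range(len(amounts)), key=scores.__getitem__)
print("most anomalous amount:", amounts[top])
```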
AnaCredit dataset is a great playground for ML
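The fuzzy matching of LEI records mentioned on this slide could look roughly like the following, which compares normalized counterparty names with the standard library's `difflib.SequenceMatcher`. The LEI codes, company names, suffix list, and the 0.85 cut-off are all made up for the example; the matching method used in the project is not described in the slides.

```python
from difflib import SequenceMatcher

# Hypothetical legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"sa", "ag", "ltd", "inc", "gmbh", "plc"}


def normalize(name):
    """Lowercase, drop punctuation, and remove common legal-form suffixes."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(tok for tok in cleaned.split() if tok not in LEGAL_SUFFIXES)


def best_lei_match(name, lei_records, min_ratio=0.85):
    """Return (lei, ratio) for the closest legal name, or (None, ratio) if too far."""
    target = normalize(name)
    best_lei, best_ratio = None, 0.0
    for lei, legal_name in lei_records.items():
        ratio = SequenceMatcher(None, target, normalize(legal_name)).ratio()
        if ratio > best_ratio:
            best_lei, best_ratio = lei, ratio
    if best_ratio < min_ratio:
        return None, best_ratio
    return best_lei, best_ratio


# Placeholder codes and names, not real LEI records.
lei_records = {
    "EXAMPLELEI0000000001": "Acme Bank S.A.",
    "EXAMPLELEI0000000002": "Globex Insurance AG",
}

print(best_lei_match("Acme Bnk SA", lei_records))
print(best_lei_match("Initech Holdings", lei_records))
```

A threshold like `min_ratio` trades missed matches against false enrichments, so it would normally be tuned on a manually checked sample.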
Case Study 2
Real-time recommendation system at scale
• Power a sales platform used by 4,000 different clients
• First project
• Provide customized recommendations (3 models per client)
• Recommendations should be available in real time (API endpoint)
A production conundrum
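One way to picture the conundrum: with 3 models for each client, every real-time request has to be routed to the right model out of thousands. The sketch below keeps an in-memory registry keyed by client and model name. The class, the model names, and the toy models are hypothetical and stand in for whatever actually serves the predictions.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[dict], List[str]]

# Hypothetical names for the three models each client gets.
MODEL_NAMES = ("cross_sell", "up_sell", "next_best_offer")


class RecommendationRouter:
    """Route a request to the model registered for a (client, model name) pair."""

    def __init__(self):
        self._models: Dict[Tuple[str, str], Model] = {}

    def register(self, client_id: str, model_name: str, model: Model) -> None:
        self._models[(client_id, model_name)] = model

    def recommend(self, client_id: str, model_name: str, features: dict) -> List[str]:
        try:
            model = self._models[(client_id, model_name)]
        except KeyError:
            raise LookupError(f"no model {model_name!r} for client {client_id!r}") from None
        return model(features)

    def __len__(self) -> int:
        return len(self._models)


def make_toy_model(client_id: str, model_name: str) -> Model:
    """Stand-in for a trained model: tags each basket item with its origin."""
    return lambda features: [f"{client_id}:{model_name}:{item}" for item in features["basket"]]


router = RecommendationRouter()
clients = ["client-a", "client-b"]
for client_id in clients:
    for model_name in MODEL_NAMES:
        router.register(client_id, model_name, make_toy_model(client_id, model_name))

print(len(router))
print(router.recommend("client-b", "up_sell", {"basket": ["sku-1"]}))
```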
How is Dataiku used in production?
What does it mean to a customer of a customer?
[Diagram labels: Data, DSS, Prediction, Customer]
What had to be done?
[Diagram labels: Design Environment, Automation Environment, API Environment; Main Proj, Proj A, Proj B; Service A, Service B (shown twice); Data table A, Data table B, Dev Data table]
What are the technical issues?
• We must take one project in Design and turn it into many in Automation
• We must connect to one ingestion dataset in Design and many in Automation
• We must turn one service in Design into many different services on the API nodes
One to Many Projects
• Solution: Create a bundle and upload it to Automation multiple times
• Solution: Connect to different databases on the Automation node
• Solution: Each new project generates its own service on the Automation node
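The three solutions amount to fanning one template out into a per-client deployment plan. The sketch below only builds that plan as plain data; it does not call any Dataiku API. The naming scheme for project keys, connections, and services is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Deployment:
    project_key: str                  # project created on the Automation node
    bundle_id: str                    # same bundle reused for every client
    connection_remap: Dict[str, str]  # Design connection -> client database
    service_id: str                   # API service generated by this project


def plan_deployments(bundle_id: str, clients: List[str],
                     design_connection: str = "dev_db") -> List[Deployment]:
    """Fan one Design bundle out into one deployment per client (hypothetical naming)."""
    plans = []
    for client in clients:
        plans.append(Deployment(
            project_key=f"RECO_{client.upper()}",
            bundle_id=bundle_id,
            connection_remap={design_connection: f"{client}_db"},
            service_id=f"reco-{client}",
        ))
    return plans


plans = plan_deployments("v1", ["a", "b"])
for plan in plans:
    print(plan.project_key, plan.connection_remap, plan.service_id)
```

Generating the plan from a single list of clients keeps the one-to-many mapping in one place, which matters once the list grows to thousands of entries.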
• January 2013: Dataiku founded, Paris
• February 2014: DSS 1.0, the 1st tool worldwide integrating visual data preparation and machine learning
• January 2015: 20 employees, 30 customers, $4M seed
• April 2015: DSS 2.0, real-time collaboration, Spark integration
• April 2015: Office in New York
• October 2016: DSS 3.0, 300% yearly growth, $14M Series A
• February 2017: DSS 4.0, office in London
• August 2017: $28M Series B, 100+ employees, 100+ customers
A brief history of Dataiku and DSS
Consumer Goods
Consumer Electronics
Technology
Financial Services
Healthcare
Media
Consulting
Transportation
Travel
E-Retail
125+ Customers
3X YoY Growth in 2017
150% Yearly Net Retention
Powering over 125 customers across 20 countries and multiple industries
Customers - Leaders in their industries
#1 Insurance Brand
#1 Pharma Brand
#1 Financial Information Company
#1 Flash Sales Company
#1 Car Sharing Company
#1 Cosmetics Company
#3 CPG Company
Horizontal Collaboration vs. Vertical Collaboration
[Diagram roles: Data Engineer, Data Analyst, Data Scientist, Business Leader, Line-of-business Data Consumer]
Thanks for listening!

Mais conteúdo relacionado

Mais procurados

Idiots guide to setting up a data science team
Idiots guide to setting up a data science teamIdiots guide to setting up a data science team
Idiots guide to setting up a data science teamAshish Bansal
 
Accenture Big Data Expo
Accenture Big Data ExpoAccenture Big Data Expo
Accenture Big Data ExpoBigDataExpo
 
H2O World - Advanced Analytics at Macys.com - Daqing Zhao
H2O World - Advanced Analytics at Macys.com - Daqing ZhaoH2O World - Advanced Analytics at Macys.com - Daqing Zhao
H2O World - Advanced Analytics at Macys.com - Daqing ZhaoSri Ambati
 
Building a data platform tnt
Building a data platform tntBuilding a data platform tnt
Building a data platform tntBigDataExpo
 
The Five Data Questions
The Five Data QuestionsThe Five Data Questions
The Five Data Questionscrystalpullen
 
Staffing your analytics team: 6 skill sets
Staffing your analytics team:  6 skill setsStaffing your analytics team:  6 skill sets
Staffing your analytics team: 6 skill setsDavid Stephenson, Ph.D.
 
Data science team, a practice to setup
Data science team, a practice to setupData science team, a practice to setup
Data science team, a practice to setupOmid Mogharian
 
1140 track 1 weiss_using his mac
1140 track 1 weiss_using his mac1140 track 1 weiss_using his mac
1140 track 1 weiss_using his macRising Media, Inc.
 
H2O World - Machine Learning for non-data scientists
H2O World - Machine Learning for non-data scientistsH2O World - Machine Learning for non-data scientists
H2O World - Machine Learning for non-data scientistsSri Ambati
 
925 plenary rexer_using our laptop
925 plenary rexer_using our laptop925 plenary rexer_using our laptop
925 plenary rexer_using our laptopRising Media, Inc.
 
Valuing the data asset
Valuing the data assetValuing the data asset
Valuing the data assetBala Iyer
 
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Domino Data Lab
 
Operationalizing Data Science: The Right Architecture and Tools
Operationalizing Data Science: The Right Architecture and ToolsOperationalizing Data Science: The Right Architecture and Tools
Operationalizing Data Science: The Right Architecture and ToolsVMware Tanzu
 
Five Pitfalls when Operationalizing Data Science and a Strategy for Success
Five Pitfalls when Operationalizing Data Science and a Strategy for SuccessFive Pitfalls when Operationalizing Data Science and a Strategy for Success
Five Pitfalls when Operationalizing Data Science and a Strategy for SuccessVMware Tanzu
 
Slides: Five Data Valuation Pillars
Slides:  Five Data Valuation PillarsSlides:  Five Data Valuation Pillars
Slides: Five Data Valuation PillarsJohn Furrier
 

Mais procurados (20)

Idiots guide to setting up a data science team
Idiots guide to setting up a data science teamIdiots guide to setting up a data science team
Idiots guide to setting up a data science team
 
Accenture Big Data Expo
Accenture Big Data ExpoAccenture Big Data Expo
Accenture Big Data Expo
 
H2O World - Advanced Analytics at Macys.com - Daqing Zhao
H2O World - Advanced Analytics at Macys.com - Daqing ZhaoH2O World - Advanced Analytics at Macys.com - Daqing Zhao
H2O World - Advanced Analytics at Macys.com - Daqing Zhao
 
Building a data platform tnt
Building a data platform tntBuilding a data platform tnt
Building a data platform tnt
 
The Five Data Questions
The Five Data QuestionsThe Five Data Questions
The Five Data Questions
 
Staffing your analytics team: 6 skill sets
Staffing your analytics team:  6 skill setsStaffing your analytics team:  6 skill sets
Staffing your analytics team: 6 skill sets
 
Data science team, a practice to setup
Data science team, a practice to setupData science team, a practice to setup
Data science team, a practice to setup
 
1415 gold sanford
1415 gold sanford1415 gold sanford
1415 gold sanford
 
1140 track 1 weiss_using his mac
1140 track 1 weiss_using his mac1140 track 1 weiss_using his mac
1140 track 1 weiss_using his mac
 
H2O World - Machine Learning for non-data scientists
H2O World - Machine Learning for non-data scientistsH2O World - Machine Learning for non-data scientists
H2O World - Machine Learning for non-data scientists
 
Andreas weigend
Andreas weigendAndreas weigend
Andreas weigend
 
925 plenary rexer_using our laptop
925 plenary rexer_using our laptop925 plenary rexer_using our laptop
925 plenary rexer_using our laptop
 
Notilyze SAS
Notilyze SASNotilyze SAS
Notilyze SAS
 
Valuing the data asset
Valuing the data assetValuing the data asset
Valuing the data asset
 
Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...Moving Data Science from an Event to A Program: Considerations in Creating Su...
Moving Data Science from an Event to A Program: Considerations in Creating Su...
 
Into the Big Data Future with Watson Analytics
Into the Big Data Future with Watson AnalyticsInto the Big Data Future with Watson Analytics
Into the Big Data Future with Watson Analytics
 
Operationalizing Data Science: The Right Architecture and Tools
Operationalizing Data Science: The Right Architecture and ToolsOperationalizing Data Science: The Right Architecture and Tools
Operationalizing Data Science: The Right Architecture and Tools
 
Five Pitfalls when Operationalizing Data Science and a Strategy for Success
Five Pitfalls when Operationalizing Data Science and a Strategy for SuccessFive Pitfalls when Operationalizing Data Science and a Strategy for Success
Five Pitfalls when Operationalizing Data Science and a Strategy for Success
 
Evaluation of big data analysis
Evaluation of big data analysisEvaluation of big data analysis
Evaluation of big data analysis
 
Slides: Five Data Valuation Pillars
Slides:  Five Data Valuation PillarsSlides:  Five Data Valuation Pillars
Slides: Five Data Valuation Pillars
 

Semelhante a Data Science Salon: Quit Wasting Time – Case Studies in Production Machine Learning

Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsDenodo
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?Denodo
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureDATAVERSITY
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Denodo
 
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Denodo
 
ADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced Analytics
ADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced AnalyticsADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced Analytics
ADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced AnalyticsDATAVERSITY
 
Achieve New Heights with Modern Analytics
Achieve New Heights with Modern AnalyticsAchieve New Heights with Modern Analytics
Achieve New Heights with Modern AnalyticsSense Corp
 
OpenWorld: 4 Real-world Cloud Migration Case Studies
OpenWorld: 4 Real-world Cloud Migration Case StudiesOpenWorld: 4 Real-world Cloud Migration Case Studies
OpenWorld: 4 Real-world Cloud Migration Case StudiesDatavail
 
Advanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryAdvanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryMark Constable
 
Unlock your core business assets for the hybrid cloud with addi webinar dec...
Unlock your core business assets for the hybrid cloud with addi   webinar dec...Unlock your core business assets for the hybrid cloud with addi   webinar dec...
Unlock your core business assets for the hybrid cloud with addi webinar dec...Sherri Hanna
 
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...Fred Isbell
 
[AIIM] Getting Stuff Done with Content - Tony Peleska and Jordan Jones
[AIIM] Getting Stuff Done with Content - Tony Peleska and Jordan Jones[AIIM] Getting Stuff Done with Content - Tony Peleska and Jordan Jones
[AIIM] Getting Stuff Done with Content - Tony Peleska and Jordan JonesAIIM International
 
EPM Cloud in Real Life: 2 Real-world Cloud Migration Case Studies
EPM Cloud in Real Life: 2 Real-world Cloud Migration Case StudiesEPM Cloud in Real Life: 2 Real-world Cloud Migration Case Studies
EPM Cloud in Real Life: 2 Real-world Cloud Migration Case StudiesDatavail
 
Agile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data PresentationAgile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data PresentationVishal Kumar
 
Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Roger Barga
 
WebXpress Business Intelligence Capability
WebXpress Business Intelligence CapabilityWebXpress Business Intelligence Capability
WebXpress Business Intelligence CapabilityWebXpress.IN
 
How to Capitalize on Big Data with Oracle Analytics Cloud
How to Capitalize on Big Data with Oracle Analytics CloudHow to Capitalize on Big Data with Oracle Analytics Cloud
How to Capitalize on Big Data with Oracle Analytics CloudPerficient, Inc.
 
Developing a Modernization Strategy: Evaluating the Options by Chris Koppe
Developing a Modernization Strategy: Evaluating the Options by Chris KoppeDeveloping a Modernization Strategy: Evaluating the Options by Chris Koppe
Developing a Modernization Strategy: Evaluating the Options by Chris KoppeFresche Solutions
 
GraphTalk Berlin - Einführung in Graphdatenbanken
GraphTalk Berlin - Einführung in GraphdatenbankenGraphTalk Berlin - Einführung in Graphdatenbanken
GraphTalk Berlin - Einführung in GraphdatenbankenNeo4j
 
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...Precisely
 

Semelhante a Data Science Salon: Quit Wasting Time – Case Studies in Production Machine Learning (20)

Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard Rails
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
 
When and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data ArchitectureWhen and How Data Lakes Fit into a Modern Data Architecture
When and How Data Lakes Fit into a Modern Data Architecture
 
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
Why Your Data Science Architecture Should Include a Data Virtualization Tool ...
 
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
 
ADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced Analytics
ADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced AnalyticsADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced Analytics
ADV Slides: What Happened of Note in 1H 2020 in Enterprise Advanced Analytics
 
Achieve New Heights with Modern Analytics
Achieve New Heights with Modern AnalyticsAchieve New Heights with Modern Analytics
Achieve New Heights with Modern Analytics
 
OpenWorld: 4 Real-world Cloud Migration Case Studies
OpenWorld: 4 Real-world Cloud Migration Case StudiesOpenWorld: 4 Real-world Cloud Migration Case Studies
OpenWorld: 4 Real-world Cloud Migration Case Studies
 
Advanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project DeliveryAdvanced Project Data Analytics for Improved Project Delivery
Advanced Project Data Analytics for Improved Project Delivery
 
Unlock your core business assets for the hybrid cloud with addi webinar dec...
Unlock your core business assets for the hybrid cloud with addi   webinar dec...Unlock your core business assets for the hybrid cloud with addi   webinar dec...
Unlock your core business assets for the hybrid cloud with addi webinar dec...
 
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
Building a Business Case for Innovation: Project Considerations for Cloud, Mo...
 
[AIIM] Getting Stuff Done with Content - Tony Peleska and Jordan Jones
[AIIM] Getting Stuff Done with Content - Tony Peleska and Jordan Jones[AIIM] Getting Stuff Done with Content - Tony Peleska and Jordan Jones
[AIIM] Getting Stuff Done with Content - Tony Peleska and Jordan Jones
 
EPM Cloud in Real Life: 2 Real-world Cloud Migration Case Studies
EPM Cloud in Real Life: 2 Real-world Cloud Migration Case StudiesEPM Cloud in Real Life: 2 Real-world Cloud Migration Case Studies
EPM Cloud in Real Life: 2 Real-world Cloud Migration Case Studies
 
Agile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data PresentationAgile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data Presentation
 
Barga Galvanize Sept 2015
Barga Galvanize Sept 2015Barga Galvanize Sept 2015
Barga Galvanize Sept 2015
 
WebXpress Business Intelligence Capability
WebXpress Business Intelligence CapabilityWebXpress Business Intelligence Capability
WebXpress Business Intelligence Capability
 
How to Capitalize on Big Data with Oracle Analytics Cloud
How to Capitalize on Big Data with Oracle Analytics CloudHow to Capitalize on Big Data with Oracle Analytics Cloud
How to Capitalize on Big Data with Oracle Analytics Cloud
 
Developing a Modernization Strategy: Evaluating the Options by Chris Koppe
Developing a Modernization Strategy: Evaluating the Options by Chris KoppeDeveloping a Modernization Strategy: Evaluating the Options by Chris Koppe
Developing a Modernization Strategy: Evaluating the Options by Chris Koppe
 
GraphTalk Berlin - Einführung in Graphdatenbanken
GraphTalk Berlin - Einführung in GraphdatenbankenGraphTalk Berlin - Einführung in Graphdatenbanken
GraphTalk Berlin - Einführung in Graphdatenbanken
 
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
Foundational Strategies for Trust in Big Data Part 1: Getting Data to the Pla...
 

Mais de Formulatedby

Data Science Salon: An Experiment on Data Science Algorithms Enabled by a Pil...
Data Science Salon: An Experiment on Data Science Algorithms Enabled by a Pil...Data Science Salon: An Experiment on Data Science Algorithms Enabled by a Pil...
Data Science Salon: An Experiment on Data Science Algorithms Enabled by a Pil...Formulatedby
 
Data Science Salon: Are you sure you're an ethical technologist?: Build your ...
Data Science Salon: Are you sure you're an ethical technologist?: Build your ...Data Science Salon: Are you sure you're an ethical technologist?: Build your ...
Data Science Salon: Are you sure you're an ethical technologist?: Build your ...Formulatedby
 
Data Science Salon: In your own words: computing customer similarity from tex...
Data Science Salon: In your own words: computing customer similarity from tex...Data Science Salon: In your own words: computing customer similarity from tex...
Data Science Salon: In your own words: computing customer similarity from tex...Formulatedby
 
Data Science Salon: nterpretable Predictive Models in the Healthcare Domain
Data Science Salon: nterpretable Predictive Models in the Healthcare DomainData Science Salon: nterpretable Predictive Models in the Healthcare Domain
Data Science Salon: nterpretable Predictive Models in the Healthcare DomainFormulatedby
 
Data Science Salon: Applications of Embeddings and Deep Learning at Groupon
Data Science Salon: Applications of Embeddings and Deep Learning at GrouponData Science Salon: Applications of Embeddings and Deep Learning at Groupon
Data Science Salon: Applications of Embeddings and Deep Learning at GrouponFormulatedby
 
Data Science Salon: Kaggle 1st Place in 30 minutes: Putting AutoML to Work wi...
Data Science Salon: Kaggle 1st Place in 30 minutes: Putting AutoML to Work wi...Data Science Salon: Kaggle 1st Place in 30 minutes: Putting AutoML to Work wi...
Data Science Salon: Kaggle 1st Place in 30 minutes: Putting AutoML to Work wi...Formulatedby
 
Data Science Salon: Smart Cities
Data Science Salon: Smart Cities Data Science Salon: Smart Cities
Data Science Salon: Smart Cities Formulatedby
 
Data Science Salon: Building a Data Driven Product Mindset
Data Science Salon: Building a Data Driven Product MindsetData Science Salon: Building a Data Driven Product Mindset
Data Science Salon: Building a Data Driven Product MindsetFormulatedby
 
Data Science Salon: Introduction to Machine Learning - Marketing Use Case
Data Science Salon: Introduction to Machine Learning - Marketing Use CaseData Science Salon: Introduction to Machine Learning - Marketing Use Case
Data Science Salon: Introduction to Machine Learning - Marketing Use CaseFormulatedby
 
Data Science Salon: Data visualization and Analysis in the Florida Panthers H...
Data Science Salon: Data visualization and Analysis in the Florida Panthers H...Data Science Salon: Data visualization and Analysis in the Florida Panthers H...
Data Science Salon: Data visualization and Analysis in the Florida Panthers H...Formulatedby
 
Data Science Salon: Machine Learning for Personalized Cancer Vaccines
Data Science Salon: Machine Learning for Personalized Cancer VaccinesData Science Salon: Machine Learning for Personalized Cancer Vaccines
Data Science Salon: Machine Learning for Personalized Cancer VaccinesFormulatedby
 
Data Science Salon: Enabling self-service predictive analytics at Bidtellect
Data Science Salon: Enabling self-service predictive analytics at BidtellectData Science Salon: Enabling self-service predictive analytics at Bidtellect
Data Science Salon: Enabling self-service predictive analytics at BidtellectFormulatedby
 
Data Science Salon: MCL Clustering of Sparse Graphs
Data Science Salon: MCL Clustering of Sparse GraphsData Science Salon: MCL Clustering of Sparse Graphs
Data Science Salon: MCL Clustering of Sparse GraphsFormulatedby
 
Data Science Salon: Applying Machine Learning to Modernize Business Processes
Data Science Salon: Applying Machine Learning to Modernize Business ProcessesData Science Salon: Applying Machine Learning to Modernize Business Processes
Data Science Salon: Applying Machine Learning to Modernize Business ProcessesFormulatedby
 
Data Science Salon: Deep Learning as a Product @ Scribd
Data Science Salon: Deep Learning as a Product @ ScribdData Science Salon: Deep Learning as a Product @ Scribd
Data Science Salon: Deep Learning as a Product @ ScribdFormulatedby
 
Data Science Salon: The Age of Co-creation
Data Science Salon: The Age of Co-creationData Science Salon: The Age of Co-creation
Data Science Salon: The Age of Co-creationFormulatedby
 
Data Science Salon: A Journey of Deploying a Data Science Engine to Production
Data Science Salon: A Journey of Deploying a Data Science Engine to ProductionData Science Salon: A Journey of Deploying a Data Science Engine to Production
Data Science Salon: A Journey of Deploying a Data Science Engine to ProductionFormulatedby
 
Data Science Salon: Culture, Data Engineering and Hamburger Stands: Thoughts ...
Data Science Salon: Culture, Data Engineering and Hamburger Stands: Thoughts ...Data Science Salon: Culture, Data Engineering and Hamburger Stands: Thoughts ...
Data Science Salon: Culture, Data Engineering and Hamburger Stands: Thoughts ...Formulatedby
 

Mais de Formulatedby (18)

Data Science Salon: An Experiment on Data Science Algorithms Enabled by a Pil...
Data Science Salon: An Experiment on Data Science Algorithms Enabled by a Pil...Data Science Salon: An Experiment on Data Science Algorithms Enabled by a Pil...
Data Science Salon: An Experiment on Data Science Algorithms Enabled by a Pil...
 
Data Science Salon: Are you sure you're an ethical technologist?: Build your ...
Data Science Salon: Are you sure you're an ethical technologist?: Build your ...Data Science Salon: Are you sure you're an ethical technologist?: Build your ...
Data Science Salon: Are you sure you're an ethical technologist?: Build your ...
 
Data Science Salon: In your own words: computing customer similarity from tex...
Data Science Salon: In your own words: computing customer similarity from tex...Data Science Salon: In your own words: computing customer similarity from tex...
Data Science Salon: In your own words: computing customer similarity from tex...
 
Data Science Salon: nterpretable Predictive Models in the Healthcare Domain
Data Science Salon: nterpretable Predictive Models in the Healthcare DomainData Science Salon: nterpretable Predictive Models in the Healthcare Domain
Data Science Salon: nterpretable Predictive Models in the Healthcare Domain
 
Data Science Salon: Applications of Embeddings and Deep Learning at Groupon


Data Science Salon: Quit Wasting Time – Case Studies in Production Machine Learning

  • 1. ©2018 Dataiku, Inc. | www.dataiku.com | contact@dataiku.com | @dataiku DATA SCIENCE SALON MIAMI 2018 Stop Wasting Time – Case Studies in Production Machine Learning Yashas Vaidya Technical Lead, Alliances-US yashas.vaidya@dataiku.com
  • 2. 2 X 2 • When putting things into production ... • why do Data Scientists take too long? • why does DS take too long? • Two case studies getting around those issues • Speeding up ETL steps to create insights • Case: Starting with a template and making reusable analytics Two questions, two case studies
  • 3. Data Science in production • Business lines are often the sponsors and need to see ROI from projects • Creates value across the organization and provides DS teams with recognition • From strategic insights to augmented (or automated) decision making Why the focus?
  • 4. Wait a minute … Where in the world is Ken?
  • 5. Re-introduction • Evangelizer in the Alliances team, • Training and empowering consulting and technical partners • ABD ... I’m Yashas and I am ...
  • 6. Data Scientists take too long They mostly spend their time doing ETL Source: 2017 Data Scientist Report CrowdFlower: https://visit.crowdflower.com/WC-2017-Data-Science-Report_LP.html
  • 7. Why do Data Scientists take so long • H1: Data scientists have a hard time getting their hands on good quality data and thus must scrub away • H2: Given their background, they are actually pretty good at cleaning and organizing data. So they focus on what they can do well and control. Several hypotheses
  • 8. Some evidence for both Data scientists are usually happy data scientists, overwhelmed by data Source: 2017 Data Scientist Report CrowdFlower: https://visit.crowdflower.com/WC-2017-Data-Science-Report_LP.html
  • 9. Getting to production is hard to do • different teams in charge • lack of established platforms or frameworks • overall low maturity on deployment skills and experience McKinsey Institute: « less than 10% of data science projects are deployed into production […] with average deployment times of 9 to 12 months » Why is data science hard to productionalize?
  • 11. First case study today Regulatory Compliance
  • 12. A bit of context • New European credit reporting regulation – AnaCredit • Every financial institution must report a monthly dataset of loans over 25k with up to 95 attributes • instrument data • counterparty data • liability data… • Progressive rollout in 2018 for 8 countries • Large Banks and Insurance companies have provisioned around 10-15M € for the project → How can we provide an innovative, differentiating solution to this problem? Anacredit regulatory reporting
  • 13. Building the Anacredit solution • Key Usage Scenarios • Build and automate financial reporting in the same format across 8 countries • Take into account corrections and quality tests, as well as rejections • Provide broader reporting on data quality and insights on portfolio management • A best-of-breed approach to the project • Data Management & Predictive – Dataiku DSS • Regulatory watch and Risk Compliance – PwC • Self Service Analytics, Insights and Reporting – Tableau Software Usage Scenarios and Approach
  • 14. Data Management Focus on Agility and Repeatability Data Processing Flow for Template 1 Working « prototype » of the dataset and templates within 2 months • 2 Data Engineers • 1 Business Analyst • 1 Risk Analyst • Agile, documented, re-usable, runs « at scale » in the company’s IS • Articulated with ETL and DI tools
  • 18. Data Management Processing and Delivering XML Templates
  • 19. Analytics and Reporting • Data Quality and Compliance • Evolutions over time of errors and rejects • Risk portfolio reporting • Predictive insights End user dashboards, improve analytics culture
  • 20. Machine Learning Fast exploration of multiple use cases to build business cases • Automatic enrichment of data through fuzzy matching of LEI records • Predicting outcome for late payments across loan lifecycle • Anomaly detection on loans (Isolation Forest) Anacredit dataset is a great playground for ML
  • 22. Real-time recommendation system at scale • Power a sales platform used by 4,000 different clients • First project • Provide customized recommendations (3 models per client) • Recommendations should be available in real time (API endpoint) A production conundrum
  • 23.
  • 24. How is Dataiku used in production? What does it mean to a customer of a customer? Data DSS Prediction Customer
  • 25. What had to be done? Design Environment Automation Environment API Environment Main Proj Proj A Proj B Service A Service B Service A Service B Data table A Data table B Dev Data table
  • 26. What are the technical issues? One to Many Projects • We must take one project in Design and turn it into many in Automation. Solution: Create a bundle and upload it to Automation multiple times • We must connect to one ingestion dataset in Design and many in Automation. Solution: Connect to different databases on the Automation node • We must turn one service in Design into many different services on the API nodes. Solution: Each new project generates its own service on the Automation node
  • 27. January 2013 Dataiku founded Paris February 2014 DSS 1.0 The 1st tool worldwide integrating visual data preparation and machine learning January 2015 20 Employees 30 Customers $4M Seed April 2015 DSS 2.0 Real-time collaboration Spark Integration April 2015 Office in New York October 2016 DSS 3.0 300% Yearly Growth $14M Series A February 2017 DSS 4.0 Office in London August 2017 $28M Series B 100+ Employees 100+ Customers A brief history of Dataiku and DSS
  • 28. Consumer Goods Consumer Electronics Technology Financial Services Healthcare Media Consulting Transportation Travel E-Retail 125+ Customers 3X YoY Growth in 2017 150% Yearly Net Retention Powering over 125 customers across 20 countries and multiple industries Customers - Leaders in their industries #1 Insurance Brand #1 Pharma Brand #1 Financial Information Company #1 Flash Sales Company #1 Car Sharing Company #1 Cosmetics Company #3 CPG Company
  • 29. Horizontal Collaboration vs. Vertical Collaboration [Diagram; roles shown: Data Engineer, Data Analyst, Data Scientist, Business Leader, Line-of-business Data Consumer]
  • 30. ©2018 Dataiku, Inc. | www.dataiku.com | contact@dataiku.com | @dataiku Thanks for listening!
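
Slide 20 mentions enriching the Anacredit data through fuzzy matching of LEI records, but the deck does not show how this was done. The following is only a minimal sketch of the general idea, using Python's standard-library `difflib` rather than whatever method the project actually used. The reference records, the LEI codes, the `normalize` and `best_lei_match` helpers, and the 0.8 threshold are all hypothetical, chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical reference table of (LEI code, legal name); the codes are made up.
LEI_REFERENCE = [
    ("LEI0000000000000001", "Acme Holdings Limited"),
    ("LEI0000000000000002", "Northwind Bank SA"),
    ("LEI0000000000000003", "Globex Insurance Group"),
]


def normalize(name):
    """Lowercase the name and replace punctuation with spaces."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())


def best_lei_match(counterparty_name, reference=LEI_REFERENCE, threshold=0.8):
    """Return (lei, legal_name, score) for the closest name, or None if below threshold."""
    query = normalize(counterparty_name)
    best = None
    for lei, legal_name in reference:
        # ratio() is a similarity score between 0.0 and 1.0.
        score = SequenceMatcher(None, query, normalize(legal_name)).ratio()
        if best is None or score > best[2]:
            best = (lei, legal_name, score)
    if best is not None and best[2] >= threshold:
        return best
    return None


if __name__ == "__main__":
    print(best_lei_match("ACME Holdings Ltd."))
    print(best_lei_match("Totally Unrelated Corp"))
```

The threshold trades false matches against missed ones. In a regulatory setting, borderline scores would presumably be routed to manual review rather than accepted automatically.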