Pipelines for model deployment
2017-04-25
1. Digital Origin introduction
2. A recurrent problem moving to production
3. H2O
4. Digital Origin pipeline
5. Sometimes it is harder than usual to automate: Rejection Inference
Digital Origin – Introduction
Digital Origin is a leading Spanish fintech company focused on technology-enabled consumer finance.
Founded in 2011. €15 million A round in 2015. 80 employees, with offices in Barcelona and Madrid.
Uniquely positioned to address the mainstream consumer finance market with a wide portfolio of instant, real-time products and a fully online process.
Over €150 million lent to date.
¡QuéBueno! was released in 2011: consumer finance microlending.
Paga+Tarde was released in 2015: consumer finance for eCommerce and in-store purchases.
[Diagram: Credit Risk Engine (CRE) overview]
• Fraud: Massive Fraud, Identity Fraud
• Risk: Not Willing to Pay, Default Risk
• Business: Product, UX, AR vs DR tradeoff
• Monitoring: Evaluation, Control & Alerts
Models: Risk Model, Graph relationships model, DNI Images models, Geo fraud model, Basket Model, Behavioural Model, Identity fraud model, Uplift models, CLTV models
Other labels: Marketing; Credit Cards / Returnings; QB; Device - Fingerprinting; User request; Configuration & Parameter Tuning; Reporting; Alerts; Design & Models params
A recurrent problem moving to production
Requirements differ between the development/design phase and production.
R&D Environment (Data Scientist profile)
• Interactive mode
• Friendly for discovery
• Language that is fast to develop in
• Easy to save state and continue later
• Access to mathematical libraries
• …
Prod. Environment (Engineer profile)
• Scalable architecture
• Error handling
• High availability
• Load balancing
• Reliable and stable
• …
Different languages imply twice the work or more.
Solution A: Python is well suited to both needs.
Solution B: an API approach gives some flexibility.
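One way to picture the API approach is a small HTTP/JSON scoring service: the model stays in the data scientist's language and production systems written in any language call it over the network. The sketch below is only an illustration built on the Python standard library. The coefficients, the feature names (`amount`, `age`) and the port are hypothetical, and a real service would load a trained model instead of hard-coded weights.

```python
import json
import math
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical coefficients; a real service would load a trained model.
WEIGHTS = {"amount": -0.002, "age": 0.03}
INTERCEPT = 0.5


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(features):
    """Score one request with the toy logistic model."""
    z = INTERCEPT + sum(
        weight * float(features.get(name, 0.0))
        for name, weight in WEIGHTS.items()
    )
    return sigmoid(z)


def handle_payload(body):
    """Turn a raw JSON request body into (HTTP status, response dict)."""
    try:
        features = json.loads(body)
        if not isinstance(features, dict):
            raise ValueError("expected a JSON object")
        return 200, {"score": score(features)}
    except (ValueError, TypeError) as exc:
        # Malformed JSON or non-numeric feature values end up here.
        return 400, {"error": str(exc)}


class ScoringHandler(BaseHTTPRequestHandler):
    """Expose handle_payload over HTTP POST."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_payload(self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(port=8080):
    """Run the scoring service until interrupted."""
    HTTPServer(("127.0.0.1", port), ScoringHandler).serve_forever()
```

Calling `serve()` starts the service; any client that can POST a JSON object such as `{"amount": 100, "age": 30}` receives a JSON response with a `score` field. Keeping the logic in `handle_payload` makes it testable without opening a socket.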
H2O - Architectures
• Open source API for Machine Learning
• Massively Scalable Big Data Analysis
• Easy-to-use WebUI (Jupyter – Python notebook)
• Familiar Interfaces: R, Python, Scala, Java, API, …
• Real-time Data Scoring
• Rapidly deploy models to production via POJO or model-optimized Java objects (MOJO)
• Algorithms
  • GLM
  • Random Forest
  • GBM
  • “Deep Learning”
  • Deep Water: TensorFlow, MXNet, Caffe, … (not yet)
  • …
https://www.h2o.ai/h2o/
H2O - Architectures
• Local
• Cluster
• Cluster + HDFS
• Cluster + Spark (Node 1 … Node N)
H2O - Performance
Reproducible benchmark: https://github.com/szilard/benchm-ml
[Charts: benchmark results for GLM, RF, GBM (setup A) and GBM (setup B)]
[Diagram: development and production pipeline]
Development (Node 1 … Node N, Hadoop ecosystem): Extract, Transform, Train models
Export POJO
Production: Transform, Scoring
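The development and production split above can be sketched in plain Python. This is an illustrative stand-in under stated assumptions, not the pipeline from the slides: a tiny logistic regression replaces the H2O model, a JSON string replaces the exported POJO, and the feature names (`amount`, `returning`) and training data are hypothetical. What it tries to show is that the same `transform` function runs on both sides, so training and scoring see identical features.

```python
import json
import math


def transform(record):
    """Feature engineering shared by development and production."""
    # Amount is scaled to thousands; returning customers get a flag.
    return [record["amount"] / 1000.0, 1.0 if record["returning"] else 0.0]


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(records, labels, epochs=500, lr=0.1):
    """Development side: fit a toy logistic regression with SGD."""
    rows = [transform(r) for r in records]
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = _sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return {"weights": weights, "bias": bias}


def export_model(model):
    """Serialize the trained parameters (stand-in for exporting a POJO)."""
    return json.dumps(model)


def load_scorer(artifact):
    """Production side: rebuild a scoring function from the artifact."""
    params = json.loads(artifact)

    def score(record):
        x = transform(record)  # same transform as in training
        z = params["bias"] + sum(w * v for w, v in zip(params["weights"], x))
        return _sigmoid(z)

    return score


# Development: train on hypothetical labelled history (1 = default).
history = [
    {"amount": 50, "returning": True},
    {"amount": 80, "returning": True},
    {"amount": 900, "returning": False},
    {"amount": 1200, "returning": False},
]
artifact = export_model(train(history, [0, 0, 1, 1]))

# Production: load the artifact and score an incoming request.
score = load_scorer(artifact)
print(round(score({"amount": 1000, "returning": False}), 3))
```

The exported artifact carries only the model parameters, so production needs no training code. The price is that `transform` must be kept identical in both environments, which is the duplication the slides warn about when the two sides use different languages.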
Digital Origin – Current Pipeline
[Diagram: data flow between production and data analytics]
Production: Credit Risk Engine (CRE), Front End, Back End, Production Databases
Data stores: Replica Databases, Analytics and Reporting databases
Data transfer: streaming, batch
Data Analytics activity: Data Science, daily activity and recurrent processes
Tools: Corporate Libraries, {{mustache}} query template
Outputs: Reporting, Alerts System, Services to other departments, CRE development, CRE param. tuning
Other labels: New Model, New Config
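The `{{mustache}}` query template in the diagram suggests that recurrent queries are written once with placeholders and rendered with different parameters. A minimal sketch of that idea follows. It handles only plain `{{variable}}` tags, not the sections or partials of full Mustache, and the table, column and parameter names are hypothetical. Values are inserted verbatim, so this assumes parameters come from trusted configuration and not from user input.

```python
import re

# Matches {{name}} with optional spaces inside the braces.
_TAG = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, context):
    """Replace each {{name}} tag with str(context[name])."""

    def replace(match):
        key = match.group(1)
        if key not in context:
            # Fail loudly rather than emit a half-rendered query.
            raise KeyError(f"missing template variable: {key}")
        return str(context[key])

    return _TAG.sub(replace, template)


# Hypothetical reporting query; names are illustrative only.
QUERY_TEMPLATE = (
    "SELECT user_id, amount FROM {{table}} "
    "WHERE created_at >= '{{start_date}}' AND product = '{{product}}'"
)

sql = render(
    QUERY_TEMPLATE,
    {"table": "loans", "start_date": "2017-04-01", "product": "QB"},
)
print(sql)
```

Raising on a missing variable is a deliberate choice here: a recurrent batch job that silently runs a query with an empty filter is harder to notice than one that fails at render time.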
THANKS!
Questions?
ralabern@digitalorigin.com
markus@digitalorigin.com

Digital Origin - Pipelines for model deployment

  • 1. Pipelines for model deployment 2017-04-25
  • 2. 1. Digital Origin introduction; 2. A recurrent problem moving to production; 3. H2O; 4. Digital Origin pipeline; 5. Sometimes it is harder than usual to automate: Rejection Inference
  • 4. Digital Origin – Introduction. Digital Origin is a leading Spanish fintech company focused on technology-enabled consumer finance. Founded in 2011. €15 million A-round in 2015. 80 employees with offices in Barcelona and Madrid. Uniquely positioned to address the mainstream consumer finance market with a wide portfolio of instant, real-time products and a fully online process. Over €150 million lent to date. ¡QuéBueno! was released in 2011: consumer finance microlending. Paga+Tarde was released in 2015: consumer finance for eCommerce and in-store.
  • 8. [Diagram labels] Fraud; Risk; Business; Monitoring; Massive Fraud; Identity Fraud; Not Willing to Pay; Default Risk; Product, UX, AR vs DR tradeoff; Evaluation; Control & Alerts; Marketing; Credit Cards / Returnings; QB; Device - Fingerprinting; User request; Graph relationships model; DNI Images models; Geo fraud model; Basket Model; Behavioural Model; Configuration & Parameter Tuning; Reporting; Uplift models; CLTV models; Identity fraud model; Alerts; CREDIT RISK ENGINE (CRE); Design & Models params; Risk Model
  • 10. A recurrent problem moving to production. R&D environment (data scientist profile): interactive mode; friendly for discovery; fast development language; easy to save a state to continue later on; access to mathematical libraries; … Production environment (engineer profile): scalable architecture; error handling; high availability; load balancing; reliable and stable; … There are different requirements in the development/design phase and once in production.
  • 11. A recurrent problem moving to production (same comparison as slide 10). Different languages imply twice the work or more.
  • 12. A recurrent problem moving to production (same comparison as slide 10). Solution A: Python is well suited to both needs.
  • 13. A recurrent problem moving to production (same comparison as slide 10). Solution B: an API approach gives some flexibility.
  • 14. A recurrent problem moving to production (same comparison as slide 10). ...
  • 16. H2O - Architectures: open source API for machine learning; massively scalable big data analysis; easy-to-use web UI (Jupyter – Python notebook); familiar interfaces: R, Python, Scala, Java, API, …; real-time data scoring; rapid deployment of models to production via POJO or model-optimized Java objects (MOJO). Algorithms: GLM; Random Forest; GBM; “Deep Learning”; Deep Water: TensorFlow, MXNet, Caffe, … (not yet); … https://www.h2o.ai/h2o/
  • 18. H2O - Architectures [diagram labels] Cluster + Spark; Node 1 … Node N
  • 19. H2O - Performance. Reproducible benchmark: https://github.com/szilard/benchm-ml [plot panels] GLM; RF; GBM (setup A); GBM (setup B)
  • 21. [Diagram labels] Fraud; Risk; Business; Monitoring; Massive Fraud; Identity Fraud; Not Willing to Pay; Default Risk; Product, UX, AR vs DR tradeoff; Evaluation; Control & Alerts; Marketing; Credit Cards / Returnings; QB; Device - Fingerprinting; User request; Graph relationships model; DNI Images models; Geo fraud model; Behavioural Model; Configuration & Parameter Tuning; Reporting; Uplift models; CLTV models; Identity fraud model; Alerts; CREDIT RISK ENGINE (CRE); Design & Models params; Risk Model
  • 22. Digital Origin – Introduction [diagram labels] Development; Production; Node 1 … Node N; Hadoop ecosystem; Extract; Transform; Train models; Transform; Scoring; Export POJO
  • 23. Digital Origin – Actual Pipeline [diagram labels] Data Analytics activity; Production; Credit Risk Engine (CRE); Reporting; Replica Databases; {{mustache}}; streaming; Query template; Tools; Corporate Libraries; batch; Data Science; Daily activity and recurrent processes; Analytics and Reporting databases; Production Databases; Alerts System; Services to other dep.; CRE development; CRE param. tuning; Front End; New Model; Back End; New Config
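
The labels on slide 22 suggest a development side (extract, transform, train models, export POJO) and a production side (transform, scoring), although the arrows of the diagram are not recoverable from this transcript. The sketch below illustrates that split in plain Python under loud assumptions: a tiny hand-written logistic regression stands in for an H2O model, a JSON string stands in for the exported POJO, and the feature names (`amount`, `age`) and training rows are made up. It only tries to show that the same `transform` function can serve both sides, so training and scoring see identical features.

```python
import json
import math


def transform(raw):
    """Shared feature transform, used by both training and scoring."""
    # Hypothetical features: log-scaled loan amount and age in decades.
    return [math.log1p(raw["amount"]), raw["age"] / 10.0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, labels, epochs=2000, lr=0.1):
    """Development side: fit a small logistic regression by gradient descent."""
    features = [transform(r) for r in rows]
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            error = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return {"weights": weights, "bias": bias}


def export_model(model):
    """Stand-in for the POJO export (assumption): serialize fitted parameters."""
    return json.dumps(model)


def score(exported, raw):
    """Production side: load the exported model and score one request."""
    model = json.loads(exported)
    x = transform(raw)
    return sigmoid(sum(w * v for w, v in zip(model["weights"], x)) + model["bias"])


# Made-up training data: label 1 means default.
rows = [
    {"amount": 100, "age": 45}, {"amount": 200, "age": 50},
    {"amount": 3000, "age": 20}, {"amount": 5000, "age": 22},
]
labels = [0, 0, 1, 1]

artifact = export_model(train(rows, labels))
low_risk = score(artifact, {"amount": 150, "age": 48})
high_risk = score(artifact, {"amount": 4000, "age": 21})
```

In this sketch the production side needs only the exported artifact and the shared `transform`; none of the training code or training data has to be deployed.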
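
Slide 23 labels a query template with `{{mustache}}` but does not show one. As a rough illustration, the sketch below fills `{{name}}` placeholders in a SQL string using only the standard library. It handles plain variable tags only, not the full Mustache syntax (sections, partials, escaping), and the table, column, and parameter names are hypothetical. The idea, assumed here rather than stated on the slide, is that one template can be rendered with different parameters for batch analytics and for production.

```python
import re

# Matches a simple {{name}} tag; other Mustache features are out of scope here.
TAG = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, params):
    """Replace each {{name}} tag with params[name]; fail on missing names."""
    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"missing template parameter: {name}")
        return str(params[name])
    return TAG.sub(substitute, template)


# Hypothetical query template shared by analytics and production code.
QUERY_TEMPLATE = (
    "SELECT user_id, amount FROM {{schema}}.loan_requests "
    "WHERE created_at >= '{{start_date}}'"
)

batch_query = render(
    QUERY_TEMPLATE, {"schema": "replica", "start_date": "2017-01-01"}
)
production_query = render(
    QUERY_TEMPLATE, {"schema": "production", "start_date": "2017-04-25"}
)
print(batch_query)
```

Values are pasted in as text, so this is only suitable for trusted parameters; user-supplied values should go through the database driver's parameter binding instead.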