SlideShare uma empresa Scribd logo
1 de 27
Baixar para ler offline
Mathieu DESPRIEE, Tinyclues
Using Deep Learning in
Production Pipelines
to Predict Consumers’ Interest
#SAISML2
Who am I ?
Mathieu DESPRIEE
ML Engineering Manager at
In charge of ML Scalability & Operability
Contributor to Spark, MxNET
2
AI-FIRST
SAAS SOLUTION
USING ANONYMIZED
FIRST PARTY DATA
SEAMLESSLY INTEGRATED WITH
YOUR MARKETING STACK
DESIGNED FOR
MARKETERS
TINYCLUES IN A FEW WORDS
DEEP LEARNING FOR CAMPAIGN
TARGETING & PLANNING
DRIVE MORE REVENUE
AND ENGAGEMENT
ACROSS ALL
CHANNELS
FIRST CAMPAIGNS IN
2 WEEKS
Instead of…
4
INTUITION
RFM
DEMOGRAPHICS
LAST PURCHASE
QUANTIFIED
ANTICIPATE & MAXIMIZE
CAMPAIGN PERFORMANCE
AI-FIRST: INTELLIGENT CAMPAIGNS IN JUST A FEW CLICKS
EASY
BUILD SEGMENTS IN MINUTES –
NO GUESSWORK &
NO DATA-SCIENCE NEEDED
STRATEGIC
PRIORITIZE BASED ON VALUE
& MARKETING FATIGUE
SPOT ON
TARGET THE FUTURE BUYERS FOR
ANY PRODUCT, IN THE DAYS
FOLLOWING A CAMPAIGN
START FROM YOUR
CAMPAIGN IDEAS
AND BUSINESS
GOALS
SUCCESS STORIES
RETAIL
+30%
CAMPAIGN REVENUE
RETAIL / FASHION
+151%
REVENUE PER EMAIL
HOSPITALITY
+178%
CAMPAIGN REVENUE
HOSPITALITY
+30%
CAMPAIGN REVENUE
RETAIL / FASHION
+60%
REVENUE PER EMAIL
E-COMMERCE
+80%
CAMPAIGN REVENUE
10M+
10M+
5M+
TRAVEL
+115%
CAMPAIGN REVENUE
10M+
$1B+
$1B+
$1B+
$1B+
$1B+
$1B+
RETAIL / FASHION
+20%
CAMPAIGN REVENUE
ONLINE & IN-STORE
1B+
customers customers customers
customers
How it works
7
common
data model
Marketer’s inputs
- topics
- business goals
customers,
products,
purchases,
clicks, pages views…
(1st
party data, with opt-in)
2 PB
500 TB hot
Unsupervised
Learning
Supervised
Learning
customers’
interests
topics
model
prediction =
f (cust., topic)
travel
retail
…
emailing
FB
mobile …
Feedback Loop
Predictive architecture
8
Encoding
One-hot
TopN
Hashers
…
⇓
Sparse
features
Unsupervised Learning
Multiple layers of
Unsupervised learning
To build meaningful
representations of
data, even on dirty
data
⇓
Embeddings:
Sparse & Dense
Features
- Relational model
- Dirty data
- Very sparse data
(10-100 K products,
niche or new products,
low activity customers…)
common
data model Sequence
Learning
Making
sense of
event
history
Feature
Banks
Model Banks
Supervised
Models
P(click)
P(purchase)
E(revenue)
…
Very wide models
(10K columns)
Working with events
• Events (page views, clicks…) are not always meaningful
individually, but sequences are
• Events are ordered, and are not independent from each
other
• Sequences are not all of the same length
• Can we find a model architecture leveraging on these
characteristics ?
9
RNN: Recurrent Neural Networks
• Networks with loops in them, allowing information to persist.
10
unrolled
• They also allow input sequences of variable length
– you just need to pad data with zeroes X0..Xk
Illustrations by Christopher Olah - Check out his blog! colah.github.io
LSTM: Long Short-Term Memory
• Classical RNN can’t learn long-term dependencies with
gradient descent because of vanishing or exploding
gradients: they tend to forget in the long-term
• LSTMs are explicitly designed to avoid this problem
– Invented by Hochreiter & Schmidhuber (1997)
– Many variants exist
• LSTM have great results in speech recognition,
translation, and are building block of assistants.
11
LSTM
• Cell-state is transmitted from cell to cell (top line)
without going through a non-linear activation function
• The rest of the cell controls:
– how the input and previous outputs modifies the cell state,
– how the state and input are combined together to form output
12
Example of LSTM for purchase prediction
• Inputs are events (page views, clicks), padded to
a max sequence length, and passed through an
embedding layer
• LSTM output is aggregated with a pooling layer,
and some densely connected layers
• Through gradient descent, the network will learn
what event sequences are more likely to lead to
the target (purchase)
13
Event History…
Embedding
… LSTM
∑ Pooling (sum, mean…)
p(purchase)
Fully Connected
+ Classifier (logistic)…
“embedding”: representing an information by a
vector in a space, capturing its essential meaning
Bringing some focus: Attention Model
• Attention models are a way to make the
network focus on elements of interest
• Introduce the notion of “Query” = what the
network should focus on (topic)
• The attention network outputs weights W
that modulate the importance of each LSTM
output, given the topic
• This weighted and interpreted history is
then used in subsequent layers for
classification
14
Event History
Topic
Query
…
… LSTM
Embedding
Weighted history
⊗
⊗
softmax
W
Attention network
… à to final layers
Implementation & Execution Engine
• Desired technology for production:
– Easily distributable with Spark
– Callable from Scala
– Interoperability with Python (used in our Lab)
– High performance out-of-the-box
• should benefit from native libs
• ability to use GPU for training phases
– Good support of Sparse tensors
15
Apache MxNET
• Deep Learning framework, incubated at Apache
• Back-end in C++
• Multiple front-end languages: scala, python, Julia, R, …
• Integration with linalg native libs, including Intel MKL
• Benchmarks with GPUs are very good
• Aims at interoperability with others frameworks (ONNX,
MxNET as Keras backend, etc.)
16
LSTM w/ Attention in Python
- Written using (the very
young) Gluon API
- Similar to Keras
- Can be compiled into
a “Symbol” model
(symbolic graph of
execution)
17
Using in Scala a model trained in Python
18
Inference from a Spark pipeline
• Pay attention to resource usage on workers
– Ensure backend processing will have available CPU cores and plenty of RAM
• Pay attention to data copy operation that will happen between Spark
Vectors (from java memory) and MxNET native backend
19
Make model and
parameters available
to workers
(HDFS, broadcasting)
DataFrame
.mapPartitions
Group rows
into mini-
batches
Stack vectors
into NDArray
matrices /
tensors
Predict
Unstack
output
matrices
mapPartitions func
Initialize model only once per worker
Learnings about MxNET in Scala
• MxNET is very young
– Mixed API styles
– Scala API is lagging Python’s
• No sparse tensors yet
– Unhelpful error messages, especially when they come from the C++ backend
– In Lab context (notebook, fast iteration), it feels heavy
– Some non-trivial boilerplate code to write
– ONNX interop has limitations
• Yet, it’s very promising, and fast out-of-the-box
– Many binaries available with hw optimization
– Usable in Scala/Spark pipelines
20
Take aways
• To find the “tiny clues” within your data, you need powerful
learning algorithms to discover their latent meaning
• LSTM with Attention Models are especially good when it
comes to sequences of events
• Using inference models from Scala/Spark code is easy and
performant with MxNET, even if this framework has still a lot
of limitations
21
Thank you !
Questions ?
Reach me on twitter @mdespriee
22
Backup
23
Classical RNN have limitations
• Vanilla RNN can’t learn long-term dependencies with
gradient descent because of vanishing or exploding
gradients
24
Illustrations by Christopher Olah (check out his blog!) http://colah.github.io/
RNN
25
Input from upstream layers:
Each X is a value at a point in
time, represented by a Feature
Vector
Batched
samples
X0 .. Xt
Feature Vectors
(embedded)
Input /
Upstream layers
Output /
Deeper layers
“embedding”: a compact
mathematical representation of an
information into a vector, capturing
its semantics
MKL impact
26
CPU times: user 4min 12s, sys: 10.4 s, total: 4min 22s Wall time: 2min 2s
CPU times: user 15min 41s, sys: 40.6 s, total: 16min 21s Wall time: 59.6 s
Performance of training
27
0 20 40 60 80 100 120 140
CPU std
CPU + MKL
1 GPU
LSTM w/attention - 10K end-users
Wall time in seconds
0 500 1000 1500 2000 2500 3000 3500 4000 4500
CPU + MKL
1 GPU
LSTM w/attention - 1M end-users
Wall time in seconds
Disclaimer: These measurements
are indicative, and vary a lot
depending on many parameters.

Mais conteúdo relacionado

Mais procurados

Introduction to Deep Learning, Keras, and TensorFlow
Introduction to Deep Learning, Keras, and TensorFlowIntroduction to Deep Learning, Keras, and TensorFlow
Introduction to Deep Learning, Keras, and TensorFlow
Sri Ambati
 
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Simplilearn
 

Mais procurados (20)

Rnn & Lstm
Rnn & LstmRnn & Lstm
Rnn & Lstm
 
Deep Learning for Machine Translation
Deep Learning for Machine TranslationDeep Learning for Machine Translation
Deep Learning for Machine Translation
 
Factorization Machines and Applications in Recommender Systems
Factorization Machines and Applications in Recommender SystemsFactorization Machines and Applications in Recommender Systems
Factorization Machines and Applications in Recommender Systems
 
Algorithmic Music Recommendations at Spotify
Algorithmic Music Recommendations at SpotifyAlgorithmic Music Recommendations at Spotify
Algorithmic Music Recommendations at Spotify
 
Uplift models
Uplift modelsUplift models
Uplift models
 
Drifting Away: Testing ML Models in Production
Drifting Away: Testing ML Models in ProductionDrifting Away: Testing ML Models in Production
Drifting Away: Testing ML Models in Production
 
Deep Learning - CNN and RNN
Deep Learning - CNN and RNNDeep Learning - CNN and RNN
Deep Learning - CNN and RNN
 
Recommendation Systems
Recommendation SystemsRecommendation Systems
Recommendation Systems
 
Introduction to Deep Learning, Keras, and TensorFlow
Introduction to Deep Learning, Keras, and TensorFlowIntroduction to Deep Learning, Keras, and TensorFlow
Introduction to Deep Learning, Keras, and TensorFlow
 
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
Supervised and Unsupervised Learning In Machine Learning | Machine Learning T...
 
Prediction of potential customers for term deposit
Prediction of potential customers for term depositPrediction of potential customers for term deposit
Prediction of potential customers for term deposit
 
Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement Learning
 
Deep Reinforcement Learning
Deep Reinforcement LearningDeep Reinforcement Learning
Deep Reinforcement Learning
 
An introduction to deep reinforcement learning
An introduction to deep reinforcement learningAn introduction to deep reinforcement learning
An introduction to deep reinforcement learning
 
Introduction to Machine learning ppt
Introduction to Machine learning pptIntroduction to Machine learning ppt
Introduction to Machine learning ppt
 
Meta-Learning with Memory Augmented Neural Networks
Meta-Learning with Memory Augmented Neural NetworksMeta-Learning with Memory Augmented Neural Networks
Meta-Learning with Memory Augmented Neural Networks
 
LSTM Basics
LSTM BasicsLSTM Basics
LSTM Basics
 
CNN Attention Networks
CNN Attention NetworksCNN Attention Networks
CNN Attention Networks
 
Social Media Sentiment Analysis
Social Media Sentiment AnalysisSocial Media Sentiment Analysis
Social Media Sentiment Analysis
 
Steffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SFSteffen Rendle, Research Scientist, Google at MLconf SF
Steffen Rendle, Research Scientist, Google at MLconf SF
 

Semelhante a Using Deep Learning in Production Pipelines to Predict Consumers’ Interest with Mathieu Despriee

(Current22) Let's Monitor The Conditions at the Conference
(Current22) Let's Monitor The Conditions at the Conference(Current22) Let's Monitor The Conditions at the Conference
(Current22) Let's Monitor The Conditions at the Conference
Timothy Spann
 

Semelhante a Using Deep Learning in Production Pipelines to Predict Consumers’ Interest with Mathieu Despriee (20)

Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...
 
Benefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformBenefits of a Homemade ML Platform
Benefits of a Homemade ML Platform
 
ROS 2 AI Integration Working Group 1: ALMA, SustainML & ROS 2 use case
ROS 2 AI Integration Working Group 1: ALMA, SustainML & ROS 2 use case ROS 2 AI Integration Working Group 1: ALMA, SustainML & ROS 2 use case
ROS 2 AI Integration Working Group 1: ALMA, SustainML & ROS 2 use case
 
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...
What's inside the black box? Using ML to tune and manage Kafka. (Matthew Stum...
 
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech TalksA Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
 
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech TalksA Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
A Deeper Dive into Apache MXNet - March 2017 AWS Online Tech Talks
 
2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetup2018 03 25 system ml ai and openpower meetup
2018 03 25 system ml ai and openpower meetup
 
Scala Days Highlights | BoldRadius
Scala Days Highlights | BoldRadiusScala Days Highlights | BoldRadius
Scala Days Highlights | BoldRadius
 
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar)
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar) Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar)
Using the FLiPN Stack for Edge AI (Flink, NiFi, Pulsar)
 
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
 
(Current22) Let's Monitor The Conditions at the Conference
(Current22) Let's Monitor The Conditions at the Conference(Current22) Let's Monitor The Conditions at the Conference
(Current22) Let's Monitor The Conditions at the Conference
 
The Fast Path to Building Operational Applications with Spark
The Fast Path to Building Operational Applications with SparkThe Fast Path to Building Operational Applications with Spark
The Fast Path to Building Operational Applications with Spark
 
Novi sad ai event 1-2018
Novi sad ai event 1-2018Novi sad ai event 1-2018
Novi sad ai event 1-2018
 
MLOps with Kubeflow
MLOps with Kubeflow MLOps with Kubeflow
MLOps with Kubeflow
 
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
PHP At 5000 Requests Per Second: Hootsuite’s Scaling StoryPHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
PHP At 5000 Requests Per Second: Hootsuite’s Scaling Story
 
Machine learning from software developers point of view
Machine learning from software developers point of viewMachine learning from software developers point of view
Machine learning from software developers point of view
 
Startup.Ml: Using neon for NLP and Localization Applications
Startup.Ml: Using neon for NLP and Localization Applications Startup.Ml: Using neon for NLP and Localization Applications
Startup.Ml: Using neon for NLP and Localization Applications
 
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
Embrace Sparsity At Web Scale: Apache Spark MLlib Algorithms Optimization For...
 
FPGAs as Components in Heterogeneous HPC Systems (paraFPGA 2015 keynote)
FPGAs as Components in Heterogeneous HPC Systems (paraFPGA 2015 keynote) FPGAs as Components in Heterogeneous HPC Systems (paraFPGA 2015 keynote)
FPGAs as Components in Heterogeneous HPC Systems (paraFPGA 2015 keynote)
 
OpenMP tasking model: from the standard to the classroom
OpenMP tasking model: from the standard to the classroomOpenMP tasking model: from the standard to the classroom
OpenMP tasking model: from the standard to the classroom
 

Mais de Databricks

Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 

Mais de Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Último

Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
amitlee9823
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
amitlee9823
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
amitlee9823
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
amitlee9823
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
JoseMangaJr1
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
amitlee9823
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
amitlee9823
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
amitlee9823
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
amitlee9823
 

Último (20)

Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 

Using Deep Learning in Production Pipelines to Predict Consumers’ Interest with Mathieu Despriee

  • 1. Mathieu DESPRIEE, Tinyclues Using Deep Learning in Production Pipelines to Predict Consumers’ Interest #SAISML2
  • 2. Who am I ? Mathieu DESPRIEE ML Engineering Manager at In charge of ML Scalability & Operability Contributor to Spark, MxNET 2
  • 3. AI-FIRST SAAS SOLUTION USING ANONYMIZED FIRST PARTY DATA SEAMLESSLY INTEGRATED WITH YOUR MARKETING STACK DESIGNED FOR MARKETERS TINYCLUES IN A FEW WORDS DEEP LEARNING FOR CAMPAIGN TARGETING & PLANNING DRIVE MORE REVENUE AND ENGAGEMENT ACROSS ALL CHANNELS FIRST CAMPAIGNS IN 2 WEEKS
  • 5. QUANTIFIED ANTICIPATE & MAXIMIZE CAMPAIGN PERFORMANCE AI-FIRST: INTELLIGENT CAMPAIGNS IN JUST A FEW CLICKS EASY BUILD SEGMENTS IN MINUTES – NO GUESSWORK & NO DATA-SCIENCE NEEDED STRATEGIC PRIORITIZE BASED ON VALUE & MARKETING FATIGUE SPOT ON TARGET THE FUTURE BUYERS FOR ANY PRODUCT, IN THE DAYS FOLLOWING A CAMPAIGN START FROM YOUR CAMPAIGN IDEAS AND BUSINESS GOALS
  • 6. SUCCESS STORIES RETAIL +30% CAMPAIGN REVENUE RETAIL / FASHION +151% REVENUE PER EMAIL HOSPITALITY +178% CAMPAIGN REVENUE HOSPITALITY +30% CAMPAIGN REVENUE RETAIL / FASHION +60% REVENUE PER EMAIL E-COMMERCE +80% CAMPAIGN REVENUE 10M+ 10M+ 5M+ TRAVEL +115% CAMPAIGN REVENUE 10M+ $1B+ $1B+ $1B+ $1B+ $1B+ $1B+ RETAIL / FASHION +20% CAMPAIGN REVENUE ONLINE & IN-STORE 1B+ customers customers customers customers
  • 7. How it works 7 common data model Marketer’s inputs - topics - business goals customers, products, purchases, clicks, pages views… (1st party data, with opt-in) 2 PB 500 TB hot Unsupervised Learning Supervised Learning customers’ interests topics model prediction = f (cust., topic) travel retail … emailing FB mobile … Feedback Loop
  • 8. Predictive architecture 8 Encoding One-hot TopN Hashers … ⇓ Sparse features Unsupervised Learning Multiple layers of Unsupervised learning To build meaningful representations of data, even on dirty data ⇓ Embeddings: Sparse & Dense Features - Relational model - Dirty data - Very sparse data (10-100 K products, niche or new products, low activity customers…) common data model Sequence Learning Making sense of event history Feature Banks Model Banks Supervised Models P(click) P(purchase) E(revenue) … Very wide models (10K columns)
  • 9. Working with events • Events (page views, clicks…) are not always meaningful individually, but sequences are • Events are ordered, and are not independent from each other • Sequences are not all of the same length • Can we find a model architecture leveraging on these characteristics ? 9
  • 10. RNN: Recurrent Neural Networks • Networks with loops in them, allowing information to persist. 10 unrolled • They also allow input sequences of variable length – you just need to pad data with zeroes X0..Xk Illustrations by Christopher Olah - Check out his blog! colah.github.io
  • 11. LSTM: Long Short-Term Memory • Classical RNN can’t learn long-term dependencies with gradient descent because of vanishing or exploding gradients: they tend to forget in the long-term • LSTMs are explicitly designed to avoid this problem – Invented by Hochreiter & Schmidhuber (1997) – Many variants exist • LSTM have great results in speech recognition, translation, and are building block of assistants. 11
  • 12. LSTM • Cell-state is transmitted from cell to cell (top line) without going through a non-linear activation function • The rest of the cell controls: – how the input and previous outputs modifies the cell state, – how the state and input are combined together to form output 12
  • 13. Example of LSTM for purchase prediction • Inputs are events (page views, clicks), padded to a max sequence length, and passed through an embedding layer • LSTM output is aggregated with a pooling layer, and some densely connected layers • Through gradient descent, the network will learn what event sequences are more likely to lead to the target (purchase) 13 Event History… Embedding … LSTM ∑ Pooling (sum, mean…) p(purchase) Fully Connected + Classifier (logistic)… “embedding”: representing an information by a vector in a space, capturing its essential meaning
  • 14. Bringing some focus: Attention Model • Attention models are a way to make the network focus on elements of interest • Introduce the notion of “Query” = what the network should focus on (topic) • The attention network outputs weights W that modulate the importance of each LSTM output, given the topic • This weighted and interpreted history is then used in subsequent layers for classification 14 Event History Topic Query … … LSTM Embedding Weighted history ⊗ ⊗ softmax W Attention network … à to final layers
  • 15. Implementation & Execution Engine • Desired technology for production: – Easily distributable with Spark – Callable from Scala – Interoperability with Python (used in our Lab) – High performance out-of-the-box • should benefit from native libs • ability to use GPU for training phases – Good support of Sparse tensors 15
  • 16. Apache MxNET • Deep Learning framework, incubated at Apache • Back-end in C++ • Multiple front-end languages: scala, python, Julia, R, … • Integration with linalg native libs, including Intel MKL • Benchmarks with GPUs are very good • Aims at interoperability with others frameworks (ONNX, MxNET as Keras backend, etc.) 16
  • 17. LSTM w/ Attention in Python - Written using (the very young) Gluon API - Similar to Keras - Can be compiled into a “Symbol” model (symbolic graph of execution) 17
  • 18. Using in Scala a model trained in Python 18
  • 19. Inference from a Spark pipeline • Pay attention to resource usage on workers – Ensure backend processing will have available CPU cores and plenty of RAM • Pay attention to data copy operation that will happen between Spark Vectors (from java memory) and MxNET native backend 19 Make model and parameters available to workers (HDFS, broadcasting) DataFrame .mapPartitions Group rows into mini- batches Stack vectors into NDArray matrices / tensors Predict Unstack output matrices mapPartitions func Initialize model only once per worker
  • 20. Learnings about MxNET in Scala • MxNET is very young – Mixed API styles – Scala API is lagging Python’s • No sparse tensors yet – Unhelpful error messages, especially when they come from the C++ backend – In Lab context (notebook, fast iteration), it feels heavy – Some non-trivial boilerplate code to write – ONNX interop has limitations • Yet, it’s very promising, and fast out-of-the-box – Many binaries available with hw optimization – Usable in Scala/Spark pipelines 20
  • 21. Take aways • To find the “tiny clues” within your data, you need powerful learning algorithms to discover their latent meaning • LSTM with Attention Models are especially good when it comes to sequences of events • Using inference models from Scala/Spark code is easy and performant with MxNET, even if this framework has still a lot of limitations 21
  • 22. Thank you ! Questions ? Reach me on twitter @mdespriee 22
  • 24. Classical RNN have limitations • Vanilla RNN can’t learn long-term dependencies with gradient descent because of vanishing or exploding gradients 24 Illustrations by Christopher Olah (check out his blog!) http://colah.github.io/
  • 25. RNN 25 Input from upstream layers: Each X is a value at a point in time, represented by a Feature Vector Batched samples X0 .. Xt Feature Vectors (embedded) Input / Upstream layers Output / Deeper layers “embedding”: a compact mathematical representation of an information into a vector, capturing its semantics
  • 26. MKL impact 26 CPU times: user 4min 12s, sys: 10.4 s, total: 4min 22s Wall time: 2min 2s CPU times: user 15min 41s, sys: 40.6 s, total: 16min 21s Wall time: 59.6 s
  • 27. Performance of training 27 0 20 40 60 80 100 120 140 CPU std CPU + MKL 1 GPU LSTM w/attention - 10K end-users Wall time in seconds 0 500 1000 1500 2000 2500 3000 3500 4000 4500 CPU + MKL 1 GPU LSTM w/attention - 1M end-users Wall time in seconds Disclaimer: These measurements are indicative, and vary a lot depending on many parameters.