Pragmatic Deep Learning for image labelling
An application to a travel recommendation engine
Outline
Introduction and Context
Iterative building of a recommender system
Labeling Images
Pragmatic deep learning for dummies
Post Processing
AKA: Image for BI on steroids
Results
More images!
Dataiku
•  Founded in 2013
•  90 + employees, 100 + clients
•  Paris, New York, London, San Francisco, Singapore
Data science software publisher, maker of Dataiku DSS
DESIGN
PREPARE: Load and prepare your data
MODEL: Build your models
ANALYSE: Visualize and share your work
PRODUCTION
AUTOMATE: Re-execute your workflow at ease
MONITOR: Follow your production environment
SCORE: Get predictions in real time
E-business vacation retailer
Negotiate the best price for their clients
Discount luxury
Key Figures
Sale Image is paramount
Purchase is impulsive
18 million clients
Hundreds of sales opened every day
Specificities
Highly temporary sales
-> Classical recommender systems fail
-> Linked to seasonal events (Christmas, ski, summer)
Expensive Product
-> Few recurrent buyers
-> Appearance counts a lot
Iterative Building of a Recommender System
Basic Recommendation Engines
Other Factors
One Meta Model to Rule Them All
Recommenders as features
Machine learning to optimize purchasing probability
Combine
Recommend
Describe
Cleaning, combining and enrichment of data
Recommendation Engines
Optimization of home display
The application automatically runs and compiles heterogeneous data
Generation of recommendations based on user behaviour
Every customer is shown the 10 sales they are most likely to buy
Customer visits
Purchases
Sales Images
Meta model combines recommendations to directly optimize purchasing probability
Meta Model
Recommender system for Home Page Ordering
+7% revenue
Sales information
(A/B testing)
Batch Scoring every night
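The meta model idea (recommenders as features, machine learning to optimize purchasing probability) can be sketched as a logistic regression over the scores of the basic recommenders. This is only an illustration in plain Python, not the production model: the three feature names, the toy data and the training settings are all assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def purchase_probability(x, weights, bias):
    """Predicted probability that the customer buys the sale described by x."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

def train_meta_model(rows, labels, lr=0.5, epochs=2000):
    """Fit logistic regression weights by batch gradient descent.

    rows   : one feature vector per (customer, sale) pair; each feature is
             the score given by one basic recommender (hypothetical layout).
    labels : 1 if the customer bought the sale, else 0.
    """
    n_features = len(rows[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(rows, labels):
            err = purchase_probability(x, weights, bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def top_sales(candidates, weights, bias, k=10):
    """Rank candidate sales by predicted purchase probability and keep the top k."""
    scored = [(purchase_probability(x, weights, bias), sale_id)
              for sale_id, x in candidates.items()]
    scored.sort(reverse=True)
    return [sale_id for _, sale_id in scored[:k]]

# Toy data (hypothetical): [collaborative score, content-based score, popularity score]
rows = [[0.9, 0.8, 0.5], [0.1, 0.2, 0.4], [0.7, 0.9, 0.6], [0.2, 0.1, 0.3]]
labels = [1, 0, 1, 0]
weights, bias = train_meta_model(rows, labels)

candidates = {"sale_a": [0.8, 0.9, 0.5], "sale_b": [0.1, 0.3, 0.2], "sale_c": [0.5, 0.4, 0.6]}
ranking = top_sales(candidates, weights, bias, k=2)
```

In a nightly batch, the same ranking step would be run for every customer with k=10 to order the home page.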
Why use Image ?
We want to distinguish « Sun and Beach » from « Ski »
A picture is worth a thousand words
Sales Images
Integrating Image Information
Labeling Model
Pool + Palm Trees Hotel
+ Mountains
Pool + Forest + Hotel + Sea
Sea + Beach + Forest + Hotel
Sales descriptions vector
Content-based Recommender System
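One way to read this slide: each sale becomes a vector of image labels, and a content-based recommender compares those vectors with a profile built from the sales a customer visited. The sketch below assumes binary label vectors, an averaged profile and cosine similarity; none of these choices is stated in the slides, and the label vocabulary and sale ids are hypothetical.

```python
import math

LABELS = ["pool", "palm trees", "hotel", "mountains", "forest", "sea", "beach"]

def to_vector(tags):
    """Binary description vector: 1.0 if the label was detected in the sale's images."""
    return [1.0 if label in tags else 0.0 for label in LABELS]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def user_profile(visited_vectors):
    """Average the description vectors of the sales a customer visited."""
    n = len(visited_vectors)
    return [sum(column) / n for column in zip(*visited_vectors)]

def recommend(profile, sales, k=2):
    """Return the ids of the k sales most similar to the profile."""
    ranked = sorted(sales, key=lambda sale_id: cosine(profile, sales[sale_id]), reverse=True)
    return ranked[:k]

sales = {
    "sale_1": to_vector({"pool", "palm trees", "hotel"}),
    "sale_2": to_vector({"mountains", "forest", "hotel"}),
    "sale_3": to_vector({"sea", "beach", "forest", "hotel"}),
}
profile = user_profile([
    to_vector({"pool", "hotel", "sea"}),
    to_vector({"pool", "beach", "hotel"}),
])
best = recommend(profile, sales, k=1)
```

Here the customer mostly looked at pool sales, so the pool sale comes first even though the beach sale shares more labels in total.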
Image Labelling For Recommendation Engine
Pragmatic Deep learning for “Dummies”
Using Deep Learning models
Common Issues
“I don’t have a GPU server” “I don’t have a deep learning expert”
“I don’t have labelled data” (or too few) “I don’t have the time to wait for model training”
“I don’t want to pay for private APIs” / “I’m afraid their labelling will change over time”
“I don’t have labelled data (or too few)”
-> Is there similar data ?
Solution 1 : Pre trained models
PLACES DATABASE: 205 categories, 2.5 M images
US
SUN DATABASE: 307 categories, 110 K images
tower: 0.53
skyscraper: 0.26
swimming_pool/outdoor: 0.65
inn/outdoor: 0.06
Solution 1 : Pre trained models
If there is open data, there is an open pre trained model !
•  Kudos to the community
•  Check the licensing
Example with Places (Caffe Model Zoo) :
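A pre-trained scene model returns one score per category, and only the strongest few are kept as labels. The helper below is a minimal sketch of that last step. It does not load or call the Caffe model; the function name, the threshold and the grouping of the example scores per image are assumptions.

```python
def top_labels(scores, k=2, threshold=0.05):
    """Return the k highest-scoring (label, score) pairs with score >= threshold."""
    kept = [(label, score) for label, score in scores.items() if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]

# Scores taken from the example outputs on the slide; grouping them per image is an assumption
image_1 = {"tower": 0.53, "skyscraper": 0.26}
image_2 = {"swimming_pool/outdoor": 0.65, "inn/outdoor": 0.06}

labels_1 = top_labels(image_1)
labels_2 = top_labels(image_2, threshold=0.1)
```

Raising the threshold drops weak guesses such as inn/outdoor, at the cost of fewer labels per image.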
  
	
  
Solution 2 : Transfer Learning
Credit : Fei-Fei Li & Andrej Karpathy & Justin Johnson, http://cs231n.stanford.edu/slides/winter1516_lecture11.pdf
[Diagram] Places database and SUN database -> Training (optional) -> Pre-trained model (VGG16, Caffe Model Zoo), e.g. tower: 0.53, skyscraper: 0.26
[Diagram] Our data -> Transferred data : last convolutional layer features -> Re-training -> Re-trained model (TensorFlow, 2 fully connected layers)
[Diagram] Hardware labels : GPU, CPU, GPU
Leverage existing knowledge !
Solution 2 : Transfer Learning
Accuracy: 72%, Top-5 accuracy: 90% > state of the art on dataset alone
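The two figures measure different things: accuracy counts an image as correct only when the best-scoring class is the true one, while top-5 accuracy accepts the true class anywhere among the five best. A small sketch of the metric follows, with hypothetical toy scores and a smaller k so the effect is visible.

```python
def top_k_accuracy(score_rows, true_labels, k=5):
    """Share of samples whose true class is among the k highest-scoring classes."""
    hits = 0
    for scores, truth in zip(score_rows, true_labels):
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)

# Toy scores for 3 images over 4 classes (hypothetical values)
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: best guess is right
    [0.4, 0.3, 0.2, 0.1],  # true class 1: only the second guess is right
    [0.5, 0.2, 0.2, 0.1],  # true class 3: missed even with two guesses
]
truth = [1, 1, 3]

top1 = top_k_accuracy(scores, truth, k=1)
top2 = top_k_accuracy(scores, truth, k=2)
```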
  
Post Treatment & Results
(Or how we transfer the labelling information)
Using Image Information for BI on steroids
Labels post-processing
Issue with our approach:
Complementary information
Redundant information
Solution : Non-negative Matrix Factorization (NMF)
Dimension reduction, Explainability, Sparsity, Balancedness
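To make the factorization concrete: NMF approximates the non-negative image-by-label score matrix V as W x H, where W gives each image's weight on a few topics and H describes each topic as a mix of labels. The sketch below uses the classic multiplicative update rules in plain Python. The slides do not say which NMF implementation was used, so this is an assumed stand-in; the toy matrix and label names are hypothetical.

```python
import random

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(v, w, h):
    """Squared reconstruction error between v and w @ h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factorize v (images x labels) into w (images x topics) and h (topics x labels)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(n_topics)]
        # w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n)]
    return w, h

# Toy label scores (hypothetical); columns: tower, skyscraper, pool, beach
v = [
    [0.5, 0.3, 0.0, 0.0],
    [0.6, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.2],
    [0.0, 0.0, 0.6, 0.3],
]
w, h = nmf(v, n_topics=2)
```

Because every entry stays non-negative, each topic reads as a positive mix of labels, which helps when the topics have to be explained to business users.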
  
Image content detection
Topic scores determine the importance of topics in an image
TOPIC: TOPIC SCORE (%)
Golf course – Fairway – Putting green: 31
Hotel – Inn – Apartment building outdoor: 30
Swimming pool – Lido Deck – Hot tub outdoor: 22
Beach – Coast – Harbor: 17

TOPIC: TOPIC SCORE (%)
Tower – Skyscraper – Office building: 62
Bridge – River – Viaduct: 38
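The topic scores in each table add up to 100, which suggests they are an image's topic weights rescaled to percentages. A minimal sketch of that rescaling follows. The raw weights, the cut-off and the function name are hypothetical, and the exact normalization used in the talk is not given.

```python
def topic_scores(weights, names, min_percent=5):
    """Convert one image's non-negative topic weights into rounded percentages."""
    total = sum(weights)
    if total == 0:
        return []
    scored = [(name, round(100 * weight / total)) for name, weight in zip(names, weights)]
    scored = [pair for pair in scored if pair[1] >= min_percent]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

names = [
    "Tower – Skyscraper – Office building",
    "Bridge – River – Viaduct",
    "Beach – Coast – Harbor",
]
# Hypothetical raw topic weights for one image
result = topic_scores([1.24, 0.76, 0.0], names)
```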
  
Results ?
1) Visits :
•  France and Morocco
•  Pool displayed
2) First Recommendation
•  Mostly France & Mediterranean
•  Fails to display pools
3) Only Images recommendation
•  Pool all around the world
•  Does not respect budget
4) Third column = Right Mix
Conclusion
Do iterative data science !
Start simple and grow
Evaluate at each step
Image labelling = BI on steroids
Transfer Learning
Kick-start your project
Gain time and money
Any Data Scientist can do it
Deep Learning
Don’t start from scratch !
Is there existing data ?
Is there a pre-trained model ?
Learned along the way
What’s next ?
Attractiveness = % visits with tag / % sales with tag
For ski sales, indoor pictures perform better
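The attractiveness ratio can be computed directly from the tagged sales and the visit log. The sketch below is one possible reading of the formula; the data layout, the sale ids and the toy visits are hypothetical.

```python
def attractiveness(tag, sales_tags, visits):
    """Share of visits on sales carrying the tag, divided by the share of sales carrying it.

    sales_tags : dict mapping sale id -> set of image tags
    visits     : list of visited sale ids, one entry per visit
    A value above 1 means the tag draws more than its share of visits.
    """
    sales_with_tag = sum(1 for tags in sales_tags.values() if tag in tags)
    if sales_with_tag == 0 or not visits:
        return 0.0
    visits_with_tag = sum(1 for sale_id in visits if tag in sales_tags[sale_id])
    share_of_visits = visits_with_tag / len(visits)
    share_of_sales = sales_with_tag / len(sales_tags)
    return share_of_visits / share_of_sales

# Hypothetical ski sales, tagged by the kind of picture shown
sales_tags = {
    "s1": {"ski", "indoor"},
    "s2": {"ski", "outdoor"},
    "s3": {"ski", "outdoor"},
    "s4": {"ski", "indoor"},
}
visits = ["s1", "s1", "s4", "s2", "s1", "s4"]

indoor = attractiveness("indoor", sales_tags, visits)
outdoor = attractiveness("outdoor", sales_tags, visits)
```

In this toy log, indoor pictures cover half of the sales but five visits out of six, so their ratio is well above 1.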
  
	
  
What’s Next ?
Kenya
Prague
Berlin
Cambodia
What’s Next ? Customize the Image !
Kenya
Prague
Berlin
Cambodia
Thank you for your attention !
Solution 3 : What about APIs ?
What about APIs ? Use for generating labels !
How to steal a model:
•  1) Score part of the database for training
•  2) Train a model
•  3) Score your entire database !
(Or don’t, it’s illegal)
But I have only 5000 requests ?
-> Use Transfer Learning !
What about APIs ? Use for generating labels !
Experiment:
•  5000 requests on API
-> 4500 for training, 500 for validation
-> 180 classes to predict
•  Transfer learning with MIT Places Pre-trained Model
•  scikit-learn multilabel model
•  One-vs-the-rest
•  Untuned logistic regression
(demo, not used in any real project)
(Or don’t, it’s illegal)
What about APIs ? Results
Accuracy: 95
Recall: 80
Precision: 75
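For a multilabel model, these three numbers can be computed over every (image, label) cell. The slides do not say how the figures were averaged, so the micro-averaged version below is an assumption, shown on hypothetical toy predictions.

```python
def multilabel_metrics(y_true, y_pred):
    """Micro-averaged accuracy, precision and recall over all (image, label) cells.

    y_true, y_pred : lists of equal-length 0/1 lists, one row per image.
    """
    tp = fp = fn = tn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Toy ground truth and predictions for 2 images over 4 labels (hypothetical)
y_true = [[1, 0, 1, 0], [0, 1, 0, 0]]
y_pred = [[1, 0, 0, 0], [0, 1, 1, 0]]
accuracy, precision, recall = multilabel_metrics(y_true, y_pred)
```

With many classes, most cells are true negatives, so accuracy tends to sit above precision and recall.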
  
Label: Probability
landscape: 1.0000
sky: 1.0000
outdoors: 1.0000
nature: 1.0000
rock: 1.0000
travel: 1.0000
sunset: 0.9998
no person: 0.9996
water: 0.9990
park: 0.9849
river: 0.9678
scenic: 0.8031

Label: Probability
beach: 1.0000
summer: 1.0000
sand: 1.0000
tropical: 1.0000
travel: 1.0000
seascape: 1.0000
ocean: 1.0000
relaxation: 1.0000
island: 1.0000
idyllic: 1.0000
seashore: 0.9998
water: 0.9997
(demo, not used in any real project)
