SlideShare uma empresa Scribd logo
1 de 19
Baixar para ler offline
Machine Learning @ Netflix
(and some lessons learned)
Yves Raimond (@moustaki)
Research/Engineering Manager
Search & Recommendations
Algorithm Engineering
Netflix evolution
Netflix scale
● > 69M members
● > 50 countries
● > 1000 device types
● > 3B hours/month
● 36% of peak US downstream traffic
Recommendations @ Netflix
● Goal: Help members find content
to watch and enjoy to maximize
satisfaction and retention
● Over 80% of what people watch
comes from our recommendations
● Top Picks, Because you Watched,
Trending Now, Row Ordering,
Evidence, Search, Search
Recommendations, Personalized
Genre Rows, ...
▪ Regression (Linear, logistic, elastic net)
▪ SVD and other Matrix Factorizations
▪ Factorization Machines
▪ Restricted Boltzmann Machines
▪ Deep Neural Networks
▪ Markov Models and Graph Algorithms
▪ Clustering
▪ Latent Dirichlet Allocation
▪ Gradient Boosted Decision Trees/Random Forests
▪ Gaussian Processes
▪ …
Models & Algorithms
Some lessons learned
Build the offline experimentation
framework first
When tackling a new problem
● What offline metrics can we compute that capture what online improvements we’
re actually trying to achieve?
● How should the input data to that evaluation be constructed (train, validation,
test)?
● How fast and easy is it to run a full cycle of offline experimentations?
○ Minimize time to first metric
● How replicable is the evaluation? How shareable are the results?
○ Provenance (see Dagobah)
○ Notebooks (see Jupyter, Zeppelin, Spark Notebook)
When tackling an old problem
● Same…
○ Were the metrics designed when first running experimentation in that space still appropriate now?
Think about distribution from the
outermost layers
1. For each combination of hyper-parameter
(e.g. grid search, random search, gaussian processes…)
2. For each subset of the training data
a. Multi-core learning (e.g. HogWild)
b. Distributed learning (e.g. ADMM, distributed L-BFGS, …)
When to use distributed learning?
● The impact of communication overhead when building distributed ML
algorithms is non-trivial
● Is your data big enough that the distribution offsets the communication overhead?
Example: Uncollapsed Gibbs sampler for LDA
(more details here)
Design production code to be
experimentation-friendly
Idea Data
Offline
Modeling
(R, Python,
MATLAB, …)
Iterate
Implement in
production
system (Java,
C++, …)
Missing post-
processing logic
Performance
issues
Actual
outputProduction environment
(A/B test) Code
discrepancies
Final
model
Data
discrepancies
Example development process
Avoid dual implementations
Shared Engine
Experiment
code
Production
code
ProductionExperiment
To be continued...
We’re hiring!
Yves Raimond (@moustaki)

Mais conteúdo relacionado

Mais procurados

10 Insightful Quotes On Designing A Better Customer Experience
10 Insightful Quotes On Designing A Better Customer Experience10 Insightful Quotes On Designing A Better Customer Experience
10 Insightful Quotes On Designing A Better Customer ExperienceYuan Wang
 
What Would Steve Do? 10 Lessons from the World's Most Captivating Presenters
What Would Steve Do? 10 Lessons from the World's Most Captivating PresentersWhat Would Steve Do? 10 Lessons from the World's Most Captivating Presenters
What Would Steve Do? 10 Lessons from the World's Most Captivating PresentersHubSpot
 
How AI is going to change the world _M.Mujeeb Riaz.pdf
How AI is going to change the world _M.Mujeeb Riaz.pdfHow AI is going to change the world _M.Mujeeb Riaz.pdf
How AI is going to change the world _M.Mujeeb Riaz.pdfMujeeb Riaz
 
Design for Startups - Build Better Products, Not More Features
Design for Startups - Build Better Products, Not More FeaturesDesign for Startups - Build Better Products, Not More Features
Design for Startups - Build Better Products, Not More FeaturesVitaly Golomb
 
The Science of Story: How Brands Can Use Storytelling To Get More Customers
The Science of Story: How Brands Can Use Storytelling To Get More CustomersThe Science of Story: How Brands Can Use Storytelling To Get More Customers
The Science of Story: How Brands Can Use Storytelling To Get More CustomersDigital Surgeons
 
Blueprint ChatGPT Lunch & Learn
Blueprint ChatGPT Lunch & LearnBlueprint ChatGPT Lunch & Learn
Blueprint ChatGPT Lunch & Learngnakan
 
10 Powerful Body Language Tips for your next Presentation
10 Powerful Body Language Tips for your next Presentation10 Powerful Body Language Tips for your next Presentation
10 Powerful Body Language Tips for your next PresentationSOAP Presentations
 
A non-technical introduction to ChatGPT - SEDA.pptx
A non-technical introduction to ChatGPT - SEDA.pptxA non-technical introduction to ChatGPT - SEDA.pptx
A non-technical introduction to ChatGPT - SEDA.pptxSue Beckingham
 
The AI Revolution - Aaron Stelle WFG Title
The AI Revolution - Aaron Stelle WFG TitleThe AI Revolution - Aaron Stelle WFG Title
The AI Revolution - Aaron Stelle WFG TitleAaron Stelle
 
50 Essential Content Marketing Hacks (Content Marketing World)
50 Essential Content Marketing Hacks (Content Marketing World)50 Essential Content Marketing Hacks (Content Marketing World)
50 Essential Content Marketing Hacks (Content Marketing World)Heinz Marketing Inc
 
The Future of Everything
The Future of EverythingThe Future of Everything
The Future of EverythingCharbel Zeaiter
 
Mind-Blowing Facts About National Parks
Mind-Blowing Facts About National ParksMind-Blowing Facts About National Parks
Mind-Blowing Facts About National ParksEthos3
 
Five Killer Ways to Design The Same Slide
Five Killer Ways to Design The Same SlideFive Killer Ways to Design The Same Slide
Five Killer Ways to Design The Same SlideCrispy Presentations
 
ChatGPT Evaluation for NLP
ChatGPT Evaluation for NLPChatGPT Evaluation for NLP
ChatGPT Evaluation for NLPXiachongFeng
 
The power of creative collaboration
The power of creative collaborationThe power of creative collaboration
The power of creative collaborationTable19
 
Design Sprints for Innovation
Design Sprints for InnovationDesign Sprints for Innovation
Design Sprints for InnovationDave Hogue
 
How I got 2.5 Million views on Slideshare (by @nickdemey - Board of Innovation)
How I got 2.5 Million views on Slideshare (by @nickdemey - Board of Innovation)How I got 2.5 Million views on Slideshare (by @nickdemey - Board of Innovation)
How I got 2.5 Million views on Slideshare (by @nickdemey - Board of Innovation)Board of Innovation
 
The Science of Memorable Presentations
The Science of Memorable PresentationsThe Science of Memorable Presentations
The Science of Memorable PresentationsEthos3
 
20 Tweetable Quotes to Inspire Marketing & Design Creative Genius
20 Tweetable Quotes to Inspire Marketing & Design Creative Genius20 Tweetable Quotes to Inspire Marketing & Design Creative Genius
20 Tweetable Quotes to Inspire Marketing & Design Creative GeniusIMPACT Branding & Design LLC
 

Mais procurados (20)

10 Insightful Quotes On Designing A Better Customer Experience
10 Insightful Quotes On Designing A Better Customer Experience10 Insightful Quotes On Designing A Better Customer Experience
10 Insightful Quotes On Designing A Better Customer Experience
 
What Would Steve Do? 10 Lessons from the World's Most Captivating Presenters
What Would Steve Do? 10 Lessons from the World's Most Captivating PresentersWhat Would Steve Do? 10 Lessons from the World's Most Captivating Presenters
What Would Steve Do? 10 Lessons from the World's Most Captivating Presenters
 
Design Your Career 2018
Design Your Career 2018Design Your Career 2018
Design Your Career 2018
 
How AI is going to change the world _M.Mujeeb Riaz.pdf
How AI is going to change the world _M.Mujeeb Riaz.pdfHow AI is going to change the world _M.Mujeeb Riaz.pdf
How AI is going to change the world _M.Mujeeb Riaz.pdf
 
Design for Startups - Build Better Products, Not More Features
Design for Startups - Build Better Products, Not More FeaturesDesign for Startups - Build Better Products, Not More Features
Design for Startups - Build Better Products, Not More Features
 
The Science of Story: How Brands Can Use Storytelling To Get More Customers
The Science of Story: How Brands Can Use Storytelling To Get More CustomersThe Science of Story: How Brands Can Use Storytelling To Get More Customers
The Science of Story: How Brands Can Use Storytelling To Get More Customers
 
Blueprint ChatGPT Lunch & Learn
Blueprint ChatGPT Lunch & LearnBlueprint ChatGPT Lunch & Learn
Blueprint ChatGPT Lunch & Learn
 
10 Powerful Body Language Tips for your next Presentation
10 Powerful Body Language Tips for your next Presentation10 Powerful Body Language Tips for your next Presentation
10 Powerful Body Language Tips for your next Presentation
 
A non-technical introduction to ChatGPT - SEDA.pptx
A non-technical introduction to ChatGPT - SEDA.pptxA non-technical introduction to ChatGPT - SEDA.pptx
A non-technical introduction to ChatGPT - SEDA.pptx
 
The AI Revolution - Aaron Stelle WFG Title
The AI Revolution - Aaron Stelle WFG TitleThe AI Revolution - Aaron Stelle WFG Title
The AI Revolution - Aaron Stelle WFG Title
 
50 Essential Content Marketing Hacks (Content Marketing World)
50 Essential Content Marketing Hacks (Content Marketing World)50 Essential Content Marketing Hacks (Content Marketing World)
50 Essential Content Marketing Hacks (Content Marketing World)
 
The Future of Everything
The Future of EverythingThe Future of Everything
The Future of Everything
 
Mind-Blowing Facts About National Parks
Mind-Blowing Facts About National ParksMind-Blowing Facts About National Parks
Mind-Blowing Facts About National Parks
 
Five Killer Ways to Design The Same Slide
Five Killer Ways to Design The Same SlideFive Killer Ways to Design The Same Slide
Five Killer Ways to Design The Same Slide
 
ChatGPT Evaluation for NLP
ChatGPT Evaluation for NLPChatGPT Evaluation for NLP
ChatGPT Evaluation for NLP
 
The power of creative collaboration
The power of creative collaborationThe power of creative collaboration
The power of creative collaboration
 
Design Sprints for Innovation
Design Sprints for InnovationDesign Sprints for Innovation
Design Sprints for Innovation
 
How I got 2.5 Million views on Slideshare (by @nickdemey - Board of Innovation)
How I got 2.5 Million views on Slideshare (by @nickdemey - Board of Innovation)How I got 2.5 Million views on Slideshare (by @nickdemey - Board of Innovation)
How I got 2.5 Million views on Slideshare (by @nickdemey - Board of Innovation)
 
The Science of Memorable Presentations
The Science of Memorable PresentationsThe Science of Memorable Presentations
The Science of Memorable Presentations
 
20 Tweetable Quotes to Inspire Marketing & Design Creative Genius
20 Tweetable Quotes to Inspire Marketing & Design Creative Genius20 Tweetable Quotes to Inspire Marketing & Design Creative Genius
20 Tweetable Quotes to Inspire Marketing & Design Creative Genius
 

Destaque

Oracle Sql Tuning
Oracle Sql TuningOracle Sql Tuning
Oracle Sql TuningChris Adkin
 
Metaprogramming JavaScript
Metaprogramming  JavaScriptMetaprogramming  JavaScript
Metaprogramming JavaScriptdanwrong
 
Principles and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at EtsyPrinciples and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at EtsyMike Brittain
 
C the basic concepts
C the basic conceptsC the basic concepts
C the basic conceptsAbhinav Vatsa
 
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About It
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About ItWhy Project Managers (Understandably) Hate the CMMI -- and What to Do About It
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About ItLeading Edge Process Consultants LLC
 
Project Management With Scrum
Project Management With ScrumProject Management With Scrum
Project Management With ScrumTommy Norman
 
A Simple Introduction To CMMI For Beginer
A Simple Introduction To CMMI For BeginerA Simple Introduction To CMMI For Beginer
A Simple Introduction To CMMI For BeginerManas Das
 
Organizational communication
Organizational communicationOrganizational communication
Organizational communicationNingsih SM
 
Capability Maturity Model
Capability Maturity ModelCapability Maturity Model
Capability Maturity ModelUzair Akram
 
Capability maturity model cmm lecture 8
Capability maturity model cmm lecture 8Capability maturity model cmm lecture 8
Capability maturity model cmm lecture 8Abdul Basit
 
Gear Cutting Presentation for Polytechnic College Students of India
Gear Cutting Presentation for Polytechnic College Students of IndiaGear Cutting Presentation for Polytechnic College Students of India
Gear Cutting Presentation for Polytechnic College Students of Indiakichu
 
Root cause analysis - tools and process
Root cause analysis - tools and processRoot cause analysis - tools and process
Root cause analysis - tools and processCharles Cotter, PhD
 
Introduction to Cyber Security
Introduction to Cyber SecurityIntroduction to Cyber Security
Introduction to Cyber SecurityStephen Lahanas
 
Object Oriented Analysis and Design
Object Oriented Analysis and DesignObject Oriented Analysis and Design
Object Oriented Analysis and DesignHaitham El-Ghareeb
 
Agile Transformation and Cultural Change
 Agile Transformation and Cultural Change Agile Transformation and Cultural Change
Agile Transformation and Cultural ChangeJohnny Ordóñez
 
Evolution of Microsoft windows operating systems
Evolution of Microsoft windows operating systemsEvolution of Microsoft windows operating systems
Evolution of Microsoft windows operating systemsSai praveen Seva
 
An Overview of User Acceptance Testing (UAT)
An Overview of User Acceptance Testing (UAT)An Overview of User Acceptance Testing (UAT)
An Overview of User Acceptance Testing (UAT)Usersnap
 

Destaque (20)

Oracle Sql Tuning
Oracle Sql TuningOracle Sql Tuning
Oracle Sql Tuning
 
Metaprogramming JavaScript
Metaprogramming  JavaScriptMetaprogramming  JavaScript
Metaprogramming JavaScript
 
Capability maturity model
Capability maturity modelCapability maturity model
Capability maturity model
 
Organizational Communication
Organizational CommunicationOrganizational Communication
Organizational Communication
 
Principles and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at EtsyPrinciples and Practices in Continuous Deployment at Etsy
Principles and Practices in Continuous Deployment at Etsy
 
C the basic concepts
C the basic conceptsC the basic concepts
C the basic concepts
 
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About It
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About ItWhy Project Managers (Understandably) Hate the CMMI -- and What to Do About It
Why Project Managers (Understandably) Hate the CMMI -- and What to Do About It
 
Project Management With Scrum
Project Management With ScrumProject Management With Scrum
Project Management With Scrum
 
A Simple Introduction To CMMI For Beginer
A Simple Introduction To CMMI For BeginerA Simple Introduction To CMMI For Beginer
A Simple Introduction To CMMI For Beginer
 
Organizational communication
Organizational communicationOrganizational communication
Organizational communication
 
Capability Maturity Model
Capability Maturity ModelCapability Maturity Model
Capability Maturity Model
 
Capability maturity model cmm lecture 8
Capability maturity model cmm lecture 8Capability maturity model cmm lecture 8
Capability maturity model cmm lecture 8
 
Gear Cutting Presentation for Polytechnic College Students of India
Gear Cutting Presentation for Polytechnic College Students of IndiaGear Cutting Presentation for Polytechnic College Students of India
Gear Cutting Presentation for Polytechnic College Students of India
 
6 Thinking Hats
6 Thinking Hats6 Thinking Hats
6 Thinking Hats
 
Root cause analysis - tools and process
Root cause analysis - tools and processRoot cause analysis - tools and process
Root cause analysis - tools and process
 
Introduction to Cyber Security
Introduction to Cyber SecurityIntroduction to Cyber Security
Introduction to Cyber Security
 
Object Oriented Analysis and Design
Object Oriented Analysis and DesignObject Oriented Analysis and Design
Object Oriented Analysis and Design
 
Agile Transformation and Cultural Change
 Agile Transformation and Cultural Change Agile Transformation and Cultural Change
Agile Transformation and Cultural Change
 
Evolution of Microsoft windows operating systems
Evolution of Microsoft windows operating systemsEvolution of Microsoft windows operating systems
Evolution of Microsoft windows operating systems
 
An Overview of User Acceptance Testing (UAT)
An Overview of User Acceptance Testing (UAT)An Overview of User Acceptance Testing (UAT)
An Overview of User Acceptance Testing (UAT)
 

Semelhante a Paris ML meetup

Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018HJ van Veen
 
Joker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistJoker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistAlexey Zinoviev
 
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...Dataiku
 
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...Infoshare
 
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...Brocade
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflowCharmi Chokshi
 
A step towards machine learning at accionlabs
A step towards machine learning at accionlabsA step towards machine learning at accionlabs
A step towards machine learning at accionlabsChetan Khatri
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxIvo Andreev
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroDaniel Marcous
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixJustin Basilico
 
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Ali Alkan
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or realityAwantik Das
 

Semelhante a Paris ML meetup (20)

Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018Hacking Predictive Modeling - RoadSec 2018
Hacking Predictive Modeling - RoadSec 2018
 
Joker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data ScientistJoker'14 Java as a fundamental working tool of the Data Scientist
Joker'14 Java as a fundamental working tool of the Data Scientist
 
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...Dataiku   hadoop summit - semi-supervised learning with hadoop for understand...
Dataiku hadoop summit - semi-supervised learning with hadoop for understand...
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
JavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJS
JavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJSJavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJS
JavaScript and Artificial Intelligence by Aatman & Sagar - AhmedabadJS
 
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
infoShare AI Roadshow 2018 - Adam Karwan (Groupon) - Jak wykorzystać uczenie ...
 
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
Recent Advances in Machine Learning: Bringing a New Level of Intelligence to ...
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
 
Unit no_1.pptx
Unit no_1.pptxUnit no_1.pptx
Unit no_1.pptx
 
tensorflow.pptx
tensorflow.pptxtensorflow.pptx
tensorflow.pptx
 
Apache Mahout
Apache MahoutApache Mahout
Apache Mahout
 
A step towards machine learning at accionlabs
A step towards machine learning at accionlabsA step towards machine learning at accionlabs
A step towards machine learning at accionlabs
 
Machine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackboxMachine learning for IoT - unpacking the blackbox
Machine learning for IoT - unpacking the blackbox
 
Production-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to heroProduction-Ready BIG ML Workflows - from zero to hero
Production-Ready BIG ML Workflows - from zero to hero
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at Netflix
 
Data science
Data scienceData science
Data science
 
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
Makine Öğrenmesi, Yapay Zeka ve Veri Bilimi Süreçlerinin Otomatikleştirilmesi...
 
AI hype or reality
AI  hype or realityAI  hype or reality
AI hype or reality
 
A Kaggle Talk
A Kaggle TalkA Kaggle Talk
A Kaggle Talk
 
machine learning
machine learningmachine learning
machine learning
 

Mais de Yves Raimond

Time, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsTime, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsYves Raimond
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender SystemsYves Raimond
 
(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learning(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learningYves Raimond
 
Recommending for the World
Recommending for the WorldRecommending for the World
Recommending for the WorldYves Raimond
 
Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015Yves Raimond
 
Utilisation du Web Semantique pour les sites de la BBC
Utilisation du Web Semantique pour les sites de la BBCUtilisation du Web Semantique pour les sites de la BBC
Utilisation du Web Semantique pour les sites de la BBCYves Raimond
 
Linked Data on the BBC
Linked Data on the BBCLinked Data on the BBC
Linked Data on the BBCYves Raimond
 
Publishing and interlinking music-related data on the Web
Publishing and interlinking music-related data on the WebPublishing and interlinking music-related data on the Web
Publishing and interlinking music-related data on the WebYves Raimond
 
Linked data and applications
Linked data and applicationsLinked data and applications
Linked data and applicationsYves Raimond
 
Towards a musical Semantic Web
Towards a musical Semantic WebTowards a musical Semantic Web
Towards a musical Semantic WebYves Raimond
 

Mais de Yves Raimond (11)

Time, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender SystemsTime, Context and Causality in Recommender Systems
Time, Context and Causality in Recommender Systems
 
Deep Learning for Recommender Systems
Deep Learning for Recommender SystemsDeep Learning for Recommender Systems
Deep Learning for Recommender Systems
 
(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learning(Some) pitfalls of distributed learning
(Some) pitfalls of distributed learning
 
Recommending for the World
Recommending for the WorldRecommending for the World
Recommending for the World
 
Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015Spark Meetup @ Netflix, 05/19/2015
Spark Meetup @ Netflix, 05/19/2015
 
Utilisation du Web Semantique pour les sites de la BBC
Utilisation du Web Semantique pour les sites de la BBCUtilisation du Web Semantique pour les sites de la BBC
Utilisation du Web Semantique pour les sites de la BBC
 
Linked Data on the BBC
Linked Data on the BBCLinked Data on the BBC
Linked Data on the BBC
 
Publishing and interlinking music-related data on the Web
Publishing and interlinking music-related data on the WebPublishing and interlinking music-related data on the Web
Publishing and interlinking music-related data on the Web
 
Linked data and applications
Linked data and applicationsLinked data and applications
Linked data and applications
 
Web of data
Web of dataWeb of data
Web of data
 
Towards a musical Semantic Web
Towards a musical Semantic WebTowards a musical Semantic Web
Towards a musical Semantic Web
 

Último

AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesPrabhanshu Chaturvedi
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Christo Ananth
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLPVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLManishPatel169454
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Bookingroncy bisnoi
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...SUHANI PANDEY
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduitsrknatarajan
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Christo Ananth
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfRagavanV2
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 

Último (20)

AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and Properties
 
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELLPVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
UNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular ConduitsUNIT-II FMM-Flow Through Circular Conduits
UNIT-II FMM-Flow Through Circular Conduits
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...
 
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdf
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 

Paris ML meetup

  • 1.
  • 2. Machine Learning @ Netflix (and some lessons learned) Yves Raimond (@moustaki) Research/Engineering Manager Search & Recommendations Algorithm Engineering
  • 4. Netflix scale ● > 69M members ● > 50 countries ● > 1000 device types ● > 3B hours/month ● 36% of peak US downstream traffic
  • 5. Recommendations @ Netflix ● Goal: Help members find content to watch and enjoy to maximize satisfaction and retention ● Over 80% of what people watch comes from our recommendations ● Top Picks, Because you Watched, Trending Now, Row Ordering, Evidence, Search, Search Recommendations, Personalized Genre Rows, ...
  • 6. ▪ Regression (Linear, logistic, elastic net) ▪ SVD and other Matrix Factorizations ▪ Factorization Machines ▪ Restricted Boltzmann Machines ▪ Deep Neural Networks ▪ Markov Models and Graph Algorithms ▪ Clustering ▪ Latent Dirichlet Allocation ▪ Gradient Boosted Decision Trees/Random Forests ▪ Gaussian Processes ▪ … Models & Algorithms
  • 8. Build the offline experimentation framework first
  • 9. When tackling a new problem ● What offline metrics can we compute that capture what online improvements we’ re actually trying to achieve? ● How should the input data to that evaluation be constructed (train, validation, test)? ● How fast and easy is it to run a full cycle of offline experimentations? ○ Minimize time to first metric ● How replicable is the evaluation? How shareable are the results? ○ Provenance (see Dagobah) ○ Notebooks (see Jupyter, Zeppelin, Spark Notebook)
  • 10. When tackling an old problem ● Same… ○ Were the metrics designed when first running experimentation in that space still appropriate now?
  • 11. Think about distribution from the outermost layers
  • 12. 1. For each combination of hyper-parameter (e.g. grid search, random search, gaussian processes…) 2. For each subset of the training data a. Multi-core learning (e.g. HogWild) b. Distributed learning (e.g. ADMM, distributed L-BFGS, …)
  • 13. When to use distributed learning? ● The impact of communication overhead when building distributed ML algorithms is non-trivial ● Is your data big enough that the distribution offsets the communication overhead?
  • 14. Example: Uncollapsed Gibbs sampler for LDA (more details here)
  • 15. Design production code to be experimentation-friendly
  • 16. Idea Data Offline Modeling (R, Python, MATLAB, …) Iterate Implement in production system (Java, C++, …) Missing post- processing logic Performance issues Actual outputProduction environment (A/B test) Code discrepancies Final model Data discrepancies Example development process
  • 17. Avoid dual implementations Shared Engine Experiment code Production code ProductionExperiment