Watch the video with slide synchronization on InfoQ.com!
http://www.infoq.com/presentations/machine-learning-netflix-2014
InfoQ.com: News & Community Site 
• 750,000 unique visitors/month 
• Published in 4 languages (English, Chinese, Japanese and Brazilian 
Portuguese) 
• Post content from our QCon conferences 
• News 15-20 / week 
• Articles 3-4 / week 
• Presentations (videos) 12-15 / week 
• Interviews 2-3 / week 
• Books 1 / month
Presented at QCon New York 
www.qconnewyork.com 
Purpose of QCon 
- to empower software development by facilitating the spread of 
knowledge and innovation 
Strategy 
- practitioner-driven conference designed for YOU: influencers of 
change and innovation in your teams 
- speakers and topics driving the evolution and innovation 
- connecting and catalyzing the influencers and innovators 
Highlights 
- attended by more than 12,000 delegates since 2007 
- held in 9 cities worldwide
Machine Learning At 
Netflix Scale 
Aish Fenton 
Manager - Research Engineering 
@aishfenton
Everything is a 
recommendation
Top Picks for Aish
Movies based on books
Because you watched Bob’s Burgers
Rank based on your taste 
75% of plays come 
from homepage
Back Story…
What we were interested in: 
▪High quality recommendations 
Proxy question: 
▪Accuracy in predicted rating 
▪Improve by 10% = $1million! 
[Figure: predicted vs. actual rating]
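The Netflix Prize scored rating accuracy with root mean squared error (RMSE), and the 10% target was a relative reduction in that error. A minimal sketch of both calculations follows; the rating lists are hypothetical values chosen for illustration, not data from the talk.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_rmse, new_rmse):
    """Fractional reduction in RMSE versus a baseline (0.10 means 10%)."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Hypothetical ratings, for illustration only.
actual = [4.0, 3.0, 5.0, 2.0]
baseline = [3.0, 3.0, 4.0, 3.0]
improved = [3.5, 3.0, 4.5, 2.5]

gain = relative_improvement(rmse(baseline, actual), rmse(improved, actual))
print(f"relative improvement: {gain:.0%}")
```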
Top two results still used in production! 
SVD, RBMs
[Figure: 2006 and 2013 compared]
• > 44M members 
• > 40 countries 
• > 5B hours in Q3 2013 
• Log 100B events/day 
• 31.62% of peak US downstream traffic
Data and Models
▪> 40M subscribers 
▪Ratings: ~5M/day 
▪Searches: >3M/day 
▪Plays: > 50M/day 
▪Streamed hours: 
o 5B hours in Q3 2013 
Data sources: Geo Info, Time, Impressions, Ratings, Device Info, Metadata, Social, Plays, Demographics, Member Behavior
[Figure: latent user vector (Aish) and latent item vector (House of Cards)]
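[Figure: ratings matrix R factorized into user factors U (u1, u2, u3; Aish) and movie factors M (m1, m2, m3; House of Cards); the entry for Aish and House of Cards is 3.53]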
My rating for House of Cards = Mean Rating + My Bias + Movie Bias + Interaction
3.55 = 2.50 + (-1.5) + 1.2 + pq
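The decomposition above can be sketched in a few lines. The mean rating and the two biases are the slide's numbers; the latent vectors `p` and `q` are made-up values, chosen so that their dot product supplies the remaining 1.35.

```python
def predict_rating(global_mean, user_bias, item_bias, user_vec, item_vec):
    """Biased matrix-factorization prediction: mean + biases + p.q."""
    interaction = sum(p * q for p, q in zip(user_vec, item_vec))
    return global_mean + user_bias + item_bias + interaction


# Hypothetical latent vectors (not from the talk); p.q = 0.6 + 0.6 + 0.15 = 1.35
p = [0.5, 1.0, 0.3]   # latent user vector
q = [1.2, 0.6, 0.5]   # latent item vector

rating = predict_rating(2.50, -1.5, 1.2, p, q)
print(round(rating, 2))
```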
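[Figure: ratings tensor R factorized into user factors U (u1, u2, u3), movie factors M (m1, m2, m3), and time factors T (t1, t2, t3); labels shown: Aish, House of Cards, 3.53, 1.34, 2.35]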
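Adding time as a third factor turns the ratings matrix into a tensor. The slides do not say which factorization form is used; the sketch below assumes a CP-style (PARAFAC) model, where each latent dimension multiplies one user, one movie, and one time weight. All vector values are hypothetical.

```python
def predict_with_time(user_vec, item_vec, time_vec):
    """CP-style tensor prediction: sum over k of u_k * m_k * t_k."""
    return sum(u * m * t for u, m, t in zip(user_vec, item_vec, time_vec))


# Hypothetical two-dimensional factors, for illustration only.
u = [1.0, 2.0]   # user factors
m = [0.5, 1.0]   # movie factors
t = [2.0, 0.5]   # time-slice factors

print(predict_with_time(u, m, t))
```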
▪Matrix/Tensor Factorization 
▪Regression models (Logistic, Linear, Elastic nets) 
▪Factorization Machines 
▪Restricted Boltzmann Machines 
▪Markov Chains & other graph models 
▪Clustering / Topic Models 
▪Neural Networks 
▪Association Rules 
▪GBDT/RF 
▪…
[Chart: Improvement Over Baseline (axis 0% to 300%) for Popularity, + Ratings, + More Features & Optimized Models]
Anatomy of a 
Machine Learning 
Platform
[Diagram: Problem, Data, Experiment, Offline Test / Metrics, Produce Model]
[Diagram: architecture with Offline, Near-line, and Online tiers. Components: Event Distribution, UI Clients, Online Algs, Model Trainer, Pre-compute, AB Test Metrics, API Layer, Monitoring, Hadoop / Data Warehouse, Experimentation Platform, S3 / HDFS, Offline Metrics, Query Tools, Models]
Many different types of data… 
▪App Logs 
▪User Actions 
▪Ratings 
▪Plays 
▪Queue Adds 
▪Algo Actions 
▪Impressions (Presentation Bias) 
▪Context 
▪Device Info 
▪User Demographics 
▪Social 
▪Time 
▪…
[Diagram: architecture repeated, with Embedded labels added]
Weights 
Real-time popularity of movie
Example: Neural Network Training
[Diagram: neural network with Input, Hidden Layer(s), and Output; parameters θ]
Neural Network Training 
1,536 cores 
G2 Instances 
$0.60 per hour
But… things can go astray
[Diagram: R factorized into U and M, with parts labeled Pre-compute and Online (u1, u2, u3)]
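One way to read the pre-compute/online split is that item factors are computed in a batch job while a user's vector is refreshed as new events arrive. The sketch below assumes that reading: it updates a user vector with a few regularized gradient steps against fixed item factors, then ranks items by dot product. The function names, learning rate, regularization, and factor values are all hypothetical, and this is not the method described in the talk.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def update_user_vector(user_vec, item_vec, rating, lr=0.05, reg=0.02, steps=50):
    """Online step: nudge the user vector toward a new rating, item factors fixed."""
    u = list(user_vec)
    for _ in range(steps):
        err = rating - dot(u, item_vec)
        u = [ui + lr * (err * qi - reg * ui) for ui, qi in zip(u, item_vec)]
    return u


def rank_items(user_vec, item_factors):
    """Score every pre-computed item vector and return titles, best first."""
    scores = {title: dot(user_vec, vec) for title, vec in item_factors.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical pre-computed item factors (the offline output).
item_factors = {"House of Cards": [1.0, 0.5], "Bob's Burgers": [0.1, 1.0]}

# Online: a new rating of 4.0 for House of Cards arrives for a cold-start user.
user = update_user_vector([0.0, 0.0], item_factors["House of Cards"], 4.0)
print(rank_items(user, item_factors))
```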
[Diagram: architecture repeated, with annotations: "Aish played HoC", "Publish new model", "for Aish"]
Aish Fenton 
@aishfenton 
https://www.linkedin.com/profile/view?id=47917219