Exploring content recommendation
Felipe Besson
@fmbesson
March, 2013
“A lot of times, people don't know what they
want until you show it to them.”
Steve Jobs
“We don't make money when we sell things;
we make money when we help customers
make purchase decisions.”
Jeff Bezos, Amazon
Why is recommendation important?
Apache Mahout
An Apache project to build scalable machine learning libraries
● Focused on large data sets
● Adaptation of standard machine learning algorithms
● Runs on Apache Hadoop (map/reduce paradigm)
  … or on a non-Hadoop node
Who is using Mahout?
Source: https://cwiki.apache.org/MAHOUT/powered-by-mahout.html
Supported core algorithms
● Classification
● Clustering
● Recommendation
● Pattern Mining
● Regression
● Dimension Reduction
● Evolutionary Algorithms
● Vector Similarity
Mahout Recommender
Collaborative filtering
● People often get the best recommendations from someone with similar taste
● People tend to like things that are similar to other things they like
● There are patterns in people's likes and dislikes
John      Bob
movie1    movie1
movie2    movie2
movie4    movie42
movie5
Will Bob like movie4 and movie5?
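The John and Bob example can be sketched in a few lines of Python. This is only an illustration of the user-based idea, not Mahout code. It assumes one reading of the slide, in which John likes movie1, movie2, movie4 and movie5, and Bob likes movie1, movie2 and movie42. The Jaccard coefficient and the `recommend` helper are choices made for this sketch.

```python
def jaccard(a, b):
    """Overlap between two sets of liked items, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, likes):
    """Score items the target has not seen by the similarity of users who like them."""
    seen = likes[target]
    scores = {}
    for other, items in likes.items():
        if other == target:
            continue
        sim = jaccard(seen, items)
        for item in items - seen:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first, ties broken by item name.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Assumed reading of the slide's two columns.
likes = {
    "John": {"movie1", "movie2", "movie4", "movie5"},
    "Bob": {"movie1", "movie2", "movie42"},
}

print(recommend("Bob", likes))
```

Because John and Bob share two of the five movies between them, John's remaining movies become candidates for Bob with a score equal to their similarity.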
Mahout Recommender
Available recommenders
● Item based
● User based
Execution modes
● Taste: online but not distributed
● Hadoop: offline (batch) but distributed
Parameters
● Many coefficients to calculate user and item similarity and neighborhood
● Data model abstractions
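To make the item-based variant and the role of a similarity coefficient more concrete, here is a minimal Python sketch. It is not Mahout's implementation: cosine similarity is just one possible coefficient, and the toy preference triples and function names are hypothetical.

```python
import math
from collections import defaultdict


def item_vectors(prefs):
    """Group (user, item, value) triples into one {user: value} vector per item."""
    vecs = defaultdict(dict)
    for user, item, value in prefs:
        vecs[item][user] = value
    return dict(vecs)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def predict(user, item, prefs):
    """Estimate a preference as the similarity-weighted average of the user's other ratings."""
    vecs = item_vectors(prefs)
    target = vecs.get(item, {})
    num = den = 0.0
    for other, vec in vecs.items():
        if other == item or user not in vec:
            continue
        sim = cosine(target, vec)
        num += sim * vec[user]
        den += abs(sim)
    return num / den if den else None


# Hypothetical toy data: (user_id, item_id, preference_value)
prefs = [(1, "a", 1.0), (1, "b", 1.0), (2, "a", 1.0), (2, "b", 1.0), (2, "c", 0.5)]
print(predict(1, "c", prefs))
```

A user-based recommender follows the same pattern with the roles swapped: vectors are built per user, and the neighborhood is the set of most similar users rather than items.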
Mahout Recommender (Hadoop)
Input (one record per line): user_id, item_id, preference_value (optional)
1, 23, 0.9
1, 15, 0.5
1, 89, 0.1
2, 11, 0.3
2, 15, 0.2
9, 10, 0.5
9, 99, 0.9
9, 11, 0.1
8, 11, 0.5
...
Output (one line per user): user_id: [recommended_item, score; …]
1: [10, 0.93; 11, 0.84; … ]
2: [23, 0.72; 17, 0.60; … ]
8: [121, 0.98; 23, 0.78; … ]
17: [12, 0.89; 32, 0.56; … ]
42: [129, 0.92; 98, 0.45; … ]
...
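A small Python sketch of reading the input records and printing results in the notation used on this slide. The function names are hypothetical, the default preference of 1.0 for a missing third field is an assumption, and the output layout follows the slide rather than the exact file format the Hadoop job writes.

```python
def parse_preferences(lines):
    """Parse 'user_id, item_id[, preference_value]' records into tuples."""
    prefs = []
    for line in lines:
        line = line.strip()
        if not line or line == "...":
            continue  # skip blanks and the slide's elision marker
        fields = [f.strip() for f in line.split(",")]
        user_id, item_id = int(fields[0]), int(fields[1])
        # Assumption: a missing preference value is treated as 1.0.
        value = float(fields[2]) if len(fields) > 2 else 1.0
        prefs.append((user_id, item_id, value))
    return prefs


def format_recommendations(user_id, scored_items):
    """Render one user's recommendations in the slide's notation."""
    body = "; ".join(f"{item}, {score:.2f}" for item, score in scored_items)
    return f"{user_id}: [{body}]"


print(parse_preferences(["1, 23, 0.9", "2, 11"]))
print(format_recommendations(1, [(10, 0.93), (11, 0.84)]))
```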
First try!
Movie recommendation
Netflix dataset (http://www.netflixprize.com/)
● # of user tastes: 2,817,131
● # of movies: 17,770
● # of users: 472,891
Environment and performance
● Hadoop pseudo-distributed
● Computer
  ● Intel® Core™ i5-3317U CPU @ 1.70GHz × 4
  ● 6 GB RAM
● Total time: ~16 minutes
How to run?
1. Copy the input file to HDFS (Hadoop Distributed File System)
  hadoop fs -put qualifying.txt /netflix/input/data.txt
2. Run the recommender
  hadoop jar core/target/mahout-core-0.8-SNAPSHOT-job.jar \
    org.apache.mahout.cf.taste.hadoop.item.RecommenderJob \
    -Dmapred.input.dir=/netflix/input/data.txt \
    -Dmapred.output.dir=/netflix/output \
    --numRecommendations 10 \
    --similarityClassname SIMILARITY_LOGLIKELIHOOD
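The SIMILARITY_LOGLIKELIHOOD option selects a similarity based on the log-likelihood ratio. The Python sketch below shows the general technique on a 2×2 co-occurrence table. It is not taken from the Mahout sources, and the final mapping of the ratio into the range [0, 1) is an assumption made for illustration.

```python
import math


def x_log_x(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    """Unnormalized Shannon entropy of a list of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    """
    k11: users who interacted with both items
    k12: users with item A but not item B
    k21: users with item B but not item A
    k22: users with neither item
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))


def llr_similarity(k11, k12, k21, k22):
    """Assumed mapping of the ratio into [0, 1): higher means more related."""
    return 1.0 - 1.0 / (1.0 + log_likelihood_ratio(k11, k12, k21, k22))


print(llr_similarity(10, 0, 0, 10))   # items that always occur together
print(llr_similarity(10, 10, 10, 10)) # items that look independent
```

The ratio is zero when the two items co-occur exactly as often as chance predicts, and it grows as the co-occurrence becomes harder to explain by chance.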
Results
Recommender analyzer
https://github.com/besson/recommender_analyzer
http://rec-analyzer.herokuapp.com/
References
Sean Owen, Robin Anil, Ted Dunning, and Ellen Friedman. Mahout in Action. Manning Publications, 2011.
Thanks
Felipe Besson
@fmbesson
