SlideShare a Scribd company logo
1 of 27
Roberto Turrin - Moviri, ContentWise R&D 
Daniele Loiacono – Politecnico di Milano, DEIB 
Andreas Lommatzsch - TU Berlin, DAI-Labor 
An analysis of the 2014 RecSys Challenge
Challenge description 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
•Engagement
Challenge description 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
•Engagement 
•Context: 
–movie ratings tweeted by users (using smartphone) with IMDb 
account connected to their Twitter accounts. 
–data about the tweets 
"I rated The Matrix 9/10 
http://www.imdb.com/title/tt0133093/ #IMDb"
Challenge description 
•Task: predicting which movies generate the highest user 
engagement 
–participant's algorithms should generate a ranked list of 
tweets which are ranked based on the amount of 
interaction. 
–The interaction is defined as the sum of retweet and 
favorite count. 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge
Challenge description 
•The evaluation is based on nDCG@10. 
– computed for each user in the test set 
– then averaged over all users. 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
Rival
•Twitter Users Don't Always Click the Links They Retweet 
–a weak correlation between retweets and clicks 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
Blind retweets 
http://blog.hubspot.com/blog/tabid/6307/bid/33815/New-Data-Indicates-Twitter-Users-Don-t-Always-Click-the-Links-They-Retweet-INFOGRAPHIC.aspx
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
Agenda 
•base analysis 
•enrichment 
•exploration 
•reference predictor
Analysis Enrichment Exploration Predictor 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
Observations 
•Each user has the same impact on the overall performance, 
regardless of how many tweets he/she posted. 
•User role in the social network does not influence his/her 
nDCG@10
Analysis Enrichment Exploration Predictor 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
Dataset 
Dataset Users Items Tweets Dates 
Training 22,079 170,285 170,285 28/02/2013 - 
08/01/2014 
Test 5,717 4,226 21,285 08/01/2014 - 
11/02/2014 
Evaluation 5,514 4,559 21,287 11/02/2014 – 
24/03/2014 
All 24,924 15,142 212,857 28/02/2013 – 
24/03/2014
Analysis Enrichment Exploration Predictor 
• user identifier 
• tweet identifier 
• timestamp of the message 
• rating (in a 1-to-10 rating scale) 
• additional information such as 
– IMDb url of the movie 
– the tweet has mentions 
– it is a retweet 
– tweet language 
– some user properties (e.g., the number of followers and the number 
of friends). 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
Dataset
Analysis Enrichment Exploration Predictor 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
#tweets 
#Users 
All Engaging 
1 9477 3394 
2-5 6406 1020 
6-10 2319 106 
11-20 1837 35 
21-50 1482 16 
51+ 562 52 
22079 4577 
21% users have engaging tweets 
79% users have no engaging tweets
Analysis Enrichment Exploration Predictor 
Linked data 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
Enrichment 
RecSysChallenge 
dataset 
Freebase IMDb 
(94%) 
Data 
processing
Analysis Enrichment Exploration Predictor 
Enrichment: Freebase 
• type, category, and genre, runtime and censure rating of the 
movie 
• movie release date 
• main movie language and main country 
• number of won awards and estimated budget 
• if the movie was adapted from a book 
• number of festivals the movies attended 
• if the movie is part of a series or a prequel/sequel 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge
Analysis Enrichment Exploration Predictor 
Enrichment: IMDb 
• average IMDb rating 
• number of IMDb raters 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge
Analysis Enrichment Exploration Predictor 
Enrichment: data processing 
• related number of days between movie release and tweet 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge
Analysis Enrichment Exploration Predictor 
Problem re-formulation 
"engaging" "non-engaging" 
binary problem 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge
Analysis Enrichment Exploration Predictor 
Binary upper bound 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
Tweet 
engaging 
non-engaging 
nDCG@10 = 0.9877
Analysis Enrichment Exploration Predictor 
Non-engaging baseline 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
Tweet 
non-engaging 
nDCG@10 = 0.7509
Analysis Enrichment Exploration Predictor 
Nominal attributes 
• 10% of Arabic, German tweets are engaging 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
vs. 4.5% of English tweets 
• 6.4% of January tweets are engaging 
vs. 4% of September tweets 
• 11% of German movies tweets are engaging 
vs. 4.5% of English movie
Analysis Enrichment Exploration Predictor 
Numeric/Boolean attributes 
• 22% of engaging tweets have mentions vs. 0% of non-engaging tweets 
• 24% of engaging tweets are a retweet itself vs. 0.9% of non-engaging tweets 
• 17% of engaging tweets have been retweeted vs. 0% of non-engaging tweets 
• Avg.#rating of engaging is 184 vs. 161 of non-engaging 
• Difference between user rating and IMDb rating 
0.74 for engaging vs. 0.27 for non-engaging 
• Engaging tweets have won 2.74 awards vs. 2.28 
and attended 2 festival vs. 1.5 (on average) 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge
Analysis Enrichment Exploration Predictor 
Machine learning 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
•Naive Bayes 
•Bayesian Networks 
•Decision Trees 
•Pair learning
Analysis Enrichment Exploration Predictor 
Machine learning: decision tree 
is retweet? 
engaging 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
yes 
no 
has been 
retweeted? 
yes no 
engaging 
has 
mentions? 
yes no 
engaging 
non 
engaging
Analysis Enrichment Exploration Predictor 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
Linear model 
# Added attrib wi nDCG@10 increment 
1 User rating 1000 0.8131
Analysis Enrichment Exploration Predictor 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
Linear model 
# Added attrib wi nDCG@10 increment 
1 User rating 1000 0.8131 
2 #user followers 10 0.8146 0.0015
Analysis Enrichment Exploration Predictor 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
Linear model 
# Added attrib wi nDCG@10 increment 
1 User rating 1000 0.8131 
2 #user followers 10 0.8146 0.0015 
3 #user favorites 1 0.8168 0.0022 
4 #user friends -3 0.8200 0.0032 
5 Tweet language n.a. 0.8212 0.0012
Analysis Enrichment Exploration Predictor 
eng(t) = 
• rating(t) + 
• has_mentions(t) + 
• has_retweets(t) + 
• is_a_retweet(t) 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
..and finally 
nDCG@10 = 0.8352
•Unbalance towards tweets with no engagement 
•Most relevant attributes related to Tweet content, e.g.,: 
rating, mentions, retweet status 
Roberto TURRIN - An analysis of the 2014 RecSys Challenge 
Conclusion 
0.75 0.84 0.99 
nDCG@10

More Related Content

Similar to Turrin rec syschallenge_presentation_@recsys2014

Social Media Report Uk hotel chains
Social Media Report Uk hotel chainsSocial Media Report Uk hotel chains
Social Media Report Uk hotel chains
SocialWin
 
Intro to R and Data Mining 2012 09 27
Intro to R and Data Mining 2012 09 27Intro to R and Data Mining 2012 09 27
Intro to R and Data Mining 2012 09 27
Raj Kasarabada
 
0022.social win report fast food chains-u.k.
0022.social win report fast food chains-u.k.0022.social win report fast food chains-u.k.
0022.social win report fast food chains-u.k.
SocialWin
 

Similar to Turrin rec syschallenge_presentation_@recsys2014 (20)

A Two Step Ranking Solution for Twitter User Engagement
A Two Step Ranking Solution for Twitter User Engagement�A Two Step Ranking Solution for Twitter User Engagement�
A Two Step Ranking Solution for Twitter User Engagement
 
Social Media Report Uk hotel chains
Social Media Report Uk hotel chainsSocial Media Report Uk hotel chains
Social Media Report Uk hotel chains
 
Social measurementwebinarsep2013
Social measurementwebinarsep2013Social measurementwebinarsep2013
Social measurementwebinarsep2013
 
Intro to R and Data Mining 2012 09 27
Intro to R and Data Mining 2012 09 27Intro to R and Data Mining 2012 09 27
Intro to R and Data Mining 2012 09 27
 
MediaEval 2018: NewsREEL Multimedia at MediaEval 2018: News Recommendation wi...
MediaEval 2018: NewsREEL Multimedia at MediaEval 2018: News Recommendation wi...MediaEval 2018: NewsREEL Multimedia at MediaEval 2018: News Recommendation wi...
MediaEval 2018: NewsREEL Multimedia at MediaEval 2018: News Recommendation wi...
 
Recsys 2014 Tutorial - The Recommender Problem Revisited
Recsys 2014 Tutorial - The Recommender Problem RevisitedRecsys 2014 Tutorial - The Recommender Problem Revisited
Recsys 2014 Tutorial - The Recommender Problem Revisited
 
Social Media Report UK newspapers 2014
Social Media Report UK newspapers 2014Social Media Report UK newspapers 2014
Social Media Report UK newspapers 2014
 
Social Media Report European Monarchies 2014
Social Media Report European Monarchies 2014Social Media Report European Monarchies 2014
Social Media Report European Monarchies 2014
 
Practical UX Methods - as presented at FOWD 2014
Practical UX Methods - as presented at FOWD 2014Practical UX Methods - as presented at FOWD 2014
Practical UX Methods - as presented at FOWD 2014
 
0022.social win report fast food chains-u.k.
0022.social win report fast food chains-u.k.0022.social win report fast food chains-u.k.
0022.social win report fast food chains-u.k.
 
TV Show Popularity Prediction using Sentiment Analysis in Social Network
TV Show Popularity Prediction using Sentiment Analysis in Social NetworkTV Show Popularity Prediction using Sentiment Analysis in Social Network
TV Show Popularity Prediction using Sentiment Analysis in Social Network
 
Social Media report banks UK
Social Media report banks UKSocial Media report banks UK
Social Media report banks UK
 
Social Media Report soft drinks UK
Social Media Report soft drinks UKSocial Media Report soft drinks UK
Social Media Report soft drinks UK
 
Online Reputation Monitoring in Twitter from an Information Access Perspective
Online Reputation Monitoring in Twitter from an Information Access PerspectiveOnline Reputation Monitoring in Twitter from an Information Access Perspective
Online Reputation Monitoring in Twitter from an Information Access Perspective
 
Analyzing Real Time News
Analyzing Real Time NewsAnalyzing Real Time News
Analyzing Real Time News
 
IRJET- Reality Show Analytics for TRP Ratings Based on Viewer’s Opinion
IRJET- Reality Show Analytics for TRP Ratings Based on Viewer’s OpinionIRJET- Reality Show Analytics for TRP Ratings Based on Viewer’s Opinion
IRJET- Reality Show Analytics for TRP Ratings Based on Viewer’s Opinion
 
Social Media Masterclass for BPMA Conference 2014
Social Media Masterclass for BPMA Conference 2014Social Media Masterclass for BPMA Conference 2014
Social Media Masterclass for BPMA Conference 2014
 
The Content Selfie: What to Serve Your Customers to Make Them Want You
The Content Selfie: What to Serve Your Customers to Make Them Want YouThe Content Selfie: What to Serve Your Customers to Make Them Want You
The Content Selfie: What to Serve Your Customers to Make Them Want You
 
Francia Sandoval UX Portfolio
Francia Sandoval UX PortfolioFrancia Sandoval UX Portfolio
Francia Sandoval UX Portfolio
 
Twitter - Measuring Reach and Engagement
Twitter - Measuring Reach and EngagementTwitter - Measuring Reach and Engagement
Twitter - Measuring Reach and Engagement
 

Recently uploaded

No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
Sheetaleventcompany
 
Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac Folorunso
Kayode Fayemi
 

Recently uploaded (20)

SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, YardstickSaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
 
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510
 
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docxANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
 
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
Re-membering the Bard: Revisiting The Compleat Wrks of Wllm Shkspr (Abridged)...
 
Report Writing Webinar Training
Report Writing Webinar TrainingReport Writing Webinar Training
Report Writing Webinar Training
 
lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.
 
Causes of poverty in France presentation.pptx
Causes of poverty in France presentation.pptxCauses of poverty in France presentation.pptx
Causes of poverty in France presentation.pptx
 
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
 
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdfThe workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
 
Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac Folorunso
 
My Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle BaileyMy Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle Bailey
 
Dreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video TreatmentDreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video Treatment
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
 
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxMohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
 
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night EnjoyCall Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
Call Girl Number in Khar Mumbai📲 9892124323 💞 Full Night Enjoy
 
Dreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio IIIDreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio III
 
Air breathing and respiratory adaptations in diver animals
Air breathing and respiratory adaptations in diver animalsAir breathing and respiratory adaptations in diver animals
Air breathing and respiratory adaptations in diver animals
 
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdfAWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
 

Turrin rec syschallenge_presentation_@recsys2014

  • 1. Roberto Turrin - Moviri, ContentWise R&D Daniele Loiacono – Politecnico di Milano, DEIB Andreas Lommatzsch - TU Berlin, DAI-Labor An analysis of the 2014 RecSys Challenge
  • 2. Challenge description Roberto TURRIN - An analysis of the 2014 RecSys Challenge •Engagement
  • 3. Challenge description Roberto TURRIN - An analysis of the 2014 RecSys Challenge •Engagement •Context: –movie ratings tweeted by users (using smartphone) with IMDb account connected to their Twitter accounts. –data about the tweets "I rated The Matrix 9/10 http://www.imdb.com/title/tt0133093/ #IMDb"
  • 4. Challenge description •Task: predicting which movies generate the highest user engagement –participant's algorithms should generate a ranked list of tweets which are ranked based on the amount of interaction. –The interaction is defined as the sum of retweet and favorite count. Roberto TURRIN - An analysis of the 2014 RecSys Challenge
  • 5. Challenge description •The evaluation is based on nDCG@10. – computed for each user in the test set – then averaged over all users. Roberto TURRIN - An analysis of the 2014 RecSys Challenge Rival
  • 6. •Twitter Users Don't Always Click the Links They Retweet –a weak correlation between retweets and clicks Roberto TURRIN - An analysis of the 2014 RecSys Challenge Blind retweets http://blog.hubspot.com/blog/tabid/6307/bid/33815/New-Data-Indicates-Twitter-Users-Don-t-Always-Click-the-Links-They-Retweet-INFOGRAPHIC.aspx
  • 7. Roberto TURRIN - An analysis of the 2014 RecSys Challenge Agenda •base analysis •enrichment •exploration •reference predictor
  • 8. Analysis Enrichment Exploration Predictor Roberto TURRIN - An analysis of the 2014 RecSys Challenge Observations •Each user has the same impact on the overall performance, regardless of how many tweets he/she posted. •User role in the social network does not influence his/her nDCG@10
  • 9. Analysis Enrichment Exploration Predictor Roberto TURRIN - An analysis of the 2014 RecSys Challenge Dataset Dataset Users Items Tweets Dates Training 22,079 170,285 170,285 28/02/2013 - 08/01/2014 Test 5,717 4,226 21,285 08/01/2014 - 11/02/2014 Evaluation 5,514 4,559 21,287 11/02/2014 – 24/03/2014 All 24,924 15,142 212,857 28/02/2013 – 24/03/2014
  • 10. Analysis Enrichment Exploration Predictor • user identifier • tweet identifier • timestamp of the message • rating (in a 1-to-10 rating scale) • additional information such as – IMDb url of the movie – the tweet has mentions – it is a retweet – tweet language – some user properties (e.g., the number of followers and the number of friends). Roberto TURRIN - An analysis of the 2014 RecSys Challenge Dataset
  • 11. Analysis Enrichment Exploration Predictor Roberto TURRIN - An analysis of the 2014 RecSys Challenge #tweets #Users All Engaging 1 9477 3394 2-5 6406 1020 6-10 2319 106 11-20 1837 35 21-50 1482 16 51+ 562 52 22079 4577 21% users have engaging tweets 79% users have no engaging tweets
  • 12. Analysis Enrichment Exploration Predictor Linked data Roberto TURRIN - An analysis of the 2014 RecSys Challenge Enrichment RecSysChallenge dataset Freebase IMDb (94%) Data processing
  • 13. Analysis Enrichment Exploration Predictor Enrichment: Freebase • type, category, and genre, runtime and censure rating of the movie • movie release date • main movie language and main country • number of won awards and estimated budget • if the movie was adapted from a book • number of festivals the movies attended • if the movie is part of a series or a prequel/sequel Roberto TURRIN - An analysis of the 2014 RecSys Challenge
  • 14. Analysis Enrichment Exploration Predictor Enrichment: IMDb • average IMDb rating • number of IMDb raters Roberto TURRIN - An analysis of the 2014 RecSys Challenge
  • 15. Analysis Enrichment Exploration Predictor Enrichment: data processing • related number of days between movie release and tweet Roberto TURRIN - An analysis of the 2014 RecSys Challenge
  • 16. Analysis Enrichment Exploration Predictor Problem re-formulation "engaging" "non-engaging" binary problem Roberto TURRIN - An analysis of the 2014 RecSys Challenge
  • 17. Analysis Enrichment Exploration Predictor Binary upper bound Roberto TURRIN - An analysis of the 2014 RecSys Challenge Tweet engaging non-engaging nDCG@10 = 0.9877
  • 18. Analysis Enrichment Exploration Predictor Non-engaging baseline Roberto TURRIN - An analysis of the 2014 RecSys Challenge Tweet non-engaging nDCG@10 = 0.7509
  • 19. Analysis Enrichment Exploration Predictor Nominal attributes • 10% of Arabic, German tweets are engaging Roberto TURRIN - An analysis of the 2014 RecSys Challenge vs. 4.5% of English tweets • 6.4% of January tweets are engaging vs. 4% of September tweets • 11% of German movies tweets are engaging vs. 4.5% of English movie
  • 20. Analysis Enrichment Exploration Predictor Numeric/Boolean attributes • 22% of engaging tweets have mentions vs. 0% of non-engaging tweets • 24% of engaging tweets are a retweet itself vs. 0.9% of non-engaging tweets • 17% of engaging tweets have been retweeted vs. 0% of non-engaging tweets • Avg.#rating of engaging is 184 vs. 161 of non-engaging • Difference between user rating and IMDb rating 0.74 for engaging vs. 0.27 for non-engaging • Engaging tweets have won 2.74 awards vs. 2.28 and attended 2 festival vs. 1.5 (on average) Roberto TURRIN - An analysis of the 2014 RecSys Challenge
  • 21. Analysis Enrichment Exploration Predictor Machine learning Roberto TURRIN - An analysis of the 2014 RecSys Challenge •Naive Bayes •Bayesian Networks •Decision Trees •Pair learning
  • 22. Analysis Enrichment Exploration Predictor Machine learning: decision tree is retweet? engaging Roberto TURRIN - An analysis of the 2014 RecSys Challenge yes no has been retweeted? yes no engaging has mentions? yes no engaging non engaging
  • 23. Analysis Enrichment Exploration Predictor Roberto TURRIN - An analysis of the 2014 RecSys Challenge Linear model # Added attrib wi nDCG@10 increment 1 User rating 1000 0.8131
  • 24. Analysis Enrichment Exploration Predictor Roberto TURRIN - An analysis of the 2014 RecSys Challenge Linear model # Added attrib wi nDCG@10 increment 1 User rating 1000 0.8131 2 #user followers 10 0.8146 0.0015
  • 25. Analysis Enrichment Exploration Predictor Roberto TURRIN - An analysis of the 2014 RecSys Challenge Linear model # Added attrib wi nDCG@10 increment 1 User rating 1000 0.8131 2 #user followers 10 0.8146 0.0015 3 #user favorites 1 0.8168 0.0022 4 #user friends -3 0.8200 0.0032 5 Tweet language n.a. 0.8212 0.0012
  • 26. Analysis Enrichment Exploration Predictor eng(t) = • rating(t) + • has_mentions(t) + • has_retweets(t) + • is_a_retweet(t) Roberto TURRIN - An analysis of the 2014 RecSys Challenge ..and finally nDCG@10 = 0.8352
  • 27. •Unbalance towards tweets with no engagement •Most relevant attributes related to Tweet content, e.g.,: rating, mentions, retweet status Roberto TURRIN - An analysis of the 2014 RecSys Challenge Conclusion 0.75 0.84 0.99 nDCG@10