A Recommender System for Predicting User Engagement in Twitter


The RecSys Challenge is a traditional competition among Recommender Systems (RS) researchers. The 2014 edition focuses on predicting the amount of interaction achieved by tweets related to movies. In this paper, we present our approach to the 2014 RecSys Challenge. The approach consists of three steps: (i) binary classification methods split the tweets into two lists, those with user engagement equal to zero and those with user engagement different from zero; (ii) each list is sorted using regression methods; and (iii) the two lists are concatenated into the final sorted list of tweets. To validate our approach, we tested 126 configurations and verified that the setting using the MovieTweetings dataset, the Naïve Bayes classifier and Linear Regression obtained the best result: nDCG@10 = 0.9037242.
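As a rough illustration of the three steps above, the following Python sketch splits tweets with a binary classifier, sorts each list by a regression score, and concatenates the results. It is not the authors' implementation, which the poster says was built on the Weka API. The function `rank_tweets`, the toy classifier and regressor, and the feature names are hypothetical. Placing the non-zero list first and sorting in descending order are assumptions.

```python
from typing import Callable, Dict, List

Tweet = Dict[str, float]  # hypothetical feature dictionary for one tweet


def rank_tweets(
    tweets: List[Tweet],
    is_engaged: Callable[[Tweet], bool],
    predict_engagement: Callable[[Tweet], float],
) -> List[Tweet]:
    """Classify, sort each list by a regression score, then concatenate."""
    # Step (i): binary classification into engaged / not engaged.
    engaged = [t for t in tweets if is_engaged(t)]
    not_engaged = [t for t in tweets if not is_engaged(t)]

    # Step (ii): sort each list by predicted engagement, highest first.
    engaged.sort(key=predict_engagement, reverse=True)
    not_engaged.sort(key=predict_engagement, reverse=True)

    # Step (iii): concatenate (assumption: predicted-engaged tweets come first).
    return engaged + not_engaged


# Toy stand-ins for trained models (assumptions, not the authors' models).
def toy_classifier(tweet: Tweet) -> bool:
    return tweet["followers_count"] >= 100


def toy_regressor(tweet: Tweet) -> float:
    return 0.01 * tweet["followers_count"] + tweet["movie_rating"]


tweets = [
    {"id": 1, "followers_count": 50, "movie_rating": 9},
    {"id": 2, "followers_count": 500, "movie_rating": 6},
    {"id": 3, "followers_count": 120, "movie_rating": 8},
    {"id": 4, "followers_count": 10, "movie_rating": 3},
]

ranked_ids = [t["id"] for t in rank_tweets(tweets, toy_classifier, toy_regressor)]
print(ranked_ids)
```

In this toy run, tweet 1 has a higher regression score than tweet 3 but still ranks below it, because the classifier placed it in the zero-engagement list. That is the intended effect of classifying before sorting.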


Jonathas Magalhães (2), Rubens Pessoa, Cleyton Souza, Evandro Costa, Joseana Fechine
Intelligent, Personalized and Social Technologies Group (1)
FEDERAL UNIVERSITY OF CAMPINA GRANDE
FEDERAL UNIVERSITY OF ALAGOAS
RECSYS CHALLENGE 2014
(1) More information at http://www.grouptips.org.
(2) Corresponding author, e-mail: jonathas@copin.ufcg.edu.br.

INTRODUCTION
The 2014 RecSys Challenge [1] consists of ordering tweets shared by users on IMDb according to the amount of interaction that they received. The interaction of a tweet is defined as the sum of the number of retweets and favorites that it received. Our objective is to present a contestant approach to the 2014 RecSys Challenge.

COMPOSING AND PRE-PROCESSING THE DATASET
We use two datasets:
● The expanded MovieTweetings dataset [2], distributed by the organizers of the challenge, with the following attributes: movie id, movie rating, crawled time, tweet time, followers count, statuses count, favourites count and engagement.
● The IMDb dataset, which provides additional information about the movies referenced by the tweets in order to complement the MovieTweetings dataset, with the following attributes: IMDb rating, IMDb votes count and movie year.

OVERVIEW OF THE RECOMMENDER SYSTEM
Our approach is divided into three steps:
● Classification;
● Regression; and
● Ordering Results.
In the classification and regression steps we use the Weka API to train the models.
Figure 1: Overview of the Recommender System.

CLASSIFICATION STEP
We use three classifiers: Naïve Bayes, Support Vector Machines (SVM) and the Nearest Neighbor algorithm IBk.
Table 1: Classification models and their parameters.
We also implement a classifier that combines them using Voting. In other words, an instance is assigned to a given class if that class obtains the required majority among the models presented.

REGRESSION STEP
In this work we use three different regressors: Linear Regression, Pace Regression and the model tree induction algorithm M5Base, which is an extension of Quinlan's algorithm to the regression task.
Table 2: Regression models and their parameters.
Besides the models presented in Table 2, we implemented three methods to combine them: Average, Median and Ranking.

METHODOLOGY
Table 3 summarizes the factors and the levels used in each one. Considering these factors and levels, we have an experimental design with 2 * 7 * 9 = 126 treatments without replication. We use the normalized Discounted Cumulative Gain (nDCG) metric to compare the methods.
Table 3: Experimental factors and their levels.

RESULTS
Table 4 presents the nDCG@10 results of the ten best configurations of our approach.
Table 4: The nDCG@10 of the 10 best configurations.

REFERENCES
[1] A. Said, S. Dooms, B. Loni, and D. Tikk. Recommender systems challenge 2014. In Proceedings of the Eighth ACM Conference on Recommender Systems, RecSys '14, New York, NY, USA, 2014. ACM.
[2] S. Dooms, T. De Pessemier, and L. Martens. MovieTweetings: a movie rating dataset collected from Twitter. In Workshop on Crowdsourcing and Human Computation for Recommender Systems, CrowdRec at RecSys 2013, 2013.
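The nDCG metric used to compare the configurations can be sketched as follows. This is a minimal Python version of one common formulation (linear gain, log2 discount), applied to a single ranked list. The challenge's official evaluator may differ, for example by computing the score per user and averaging. The names `dcg_at_k` and `ndcg_at_k` are illustrative, not taken from the poster.

```python
import math
from typing import Sequence


def dcg_at_k(gains: Sequence[float], k: int) -> float:
    """Discounted cumulative gain of the first k items (linear gain)."""
    # Position i (0-based) is discounted by log2(i + 2), so rank 1 has weight 1.
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(engagement_in_predicted_order: Sequence[float], k: int = 10) -> float:
    """nDCG@k: DCG of the predicted order divided by DCG of the ideal order."""
    ideal_order = sorted(engagement_in_predicted_order, reverse=True)
    ideal_dcg = dcg_at_k(ideal_order, k)
    if ideal_dcg == 0:
        # Convention chosen here (an assumption): no engagement at all scores 0.
        return 0.0
    return dcg_at_k(engagement_in_predicted_order, k) / ideal_dcg


# True engagement values listed in the order the system ranked the tweets.
perfect = ndcg_at_k([5, 3, 0, 0])
imperfect = ndcg_at_k([0, 5, 3, 0])
print(perfect, imperfect)
```

A ranking that places the most engaged tweets first scores 1.0, and pushing an engaged tweet down the list lowers the score. This is why separating zero-engagement tweets before sorting can help.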

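The combination strategies mentioned in the classification and regression steps can be illustrated with a short Python sketch, assuming simple majority voting over the classifiers' labels and the mean or median of the regressors' predictions. The poster does not define the "Ranking" combination or the exact majority rule, so the former is omitted and the latter is an assumption. The function names are hypothetical.

```python
from collections import Counter
from statistics import mean, median
from typing import Sequence


def majority_vote(labels: Sequence[int]) -> int:
    """Return the label predicted by most classifiers.

    With three binary classifiers a tie cannot occur.
    """
    return Counter(labels).most_common(1)[0][0]


def combine_predictions(predictions: Sequence[float], method: str = "average") -> float:
    """Combine the regressors' outputs for one tweet into a single score."""
    if method == "average":
        return mean(predictions)
    if method == "median":
        return median(predictions)
    raise ValueError(f"unknown combination method: {method}")


# Hypothetical outputs of three classifiers (1 = engagement different from zero).
label = majority_vote([1, 0, 1])

# Hypothetical outputs of three regressors for the same tweet.
predictions = [2.0, 3.0, 10.0]
avg = combine_predictions(predictions, "average")
med = combine_predictions(predictions, "median")
print(label, avg, med)
```

The median is less sensitive than the average to a single regressor producing an extreme prediction, which is one reason to test both.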
