* Short introduction to myself (where I am from, what my hobbies are)
* Presenting my research activities over the last 2 years, with a more detailed presentation of the last paper I wrote with Xavier Amatriain, to be presented at UMAP 2011
Denis studying to become faculty researcher through PhD research on music recommendations
1. Denis studying/working to be a faculty/researcher (Denis Parra || Denis Parra-Santander) PhD Student http://www.sis.pitt.edu/~dparra/ March 18th 2011 PAWS Lab – School of Information Sciences – University of Pittsburgh
2. What is this presentation about? A short introduction of myself. A description of my research interests and what I have been doing about them in recent years.
3. I.1 Where are you from? I am from Chile, a country that looks like a chile pepper but where, paradoxically, people don’t eat much spicy food. Chile ≠ [red hot chile pepper] && Chile ≠ México
4. I.2 Are you from Santiago, the capital? Good try. One third of the 16 million Chileans live in Santiago. But Chile is a looong country: the north is hot and dry, and the south is very cold. I live in Valdivia, a city with rainy weather. [Map labels: Very Hot! / Here I Live! Valdivia / Very Cold!]
5. I.3 Which activities do you like to do? I like playing tennis, running & rowing. I like writing poetry. Check some poems here in Spanish (translated to English). I like reading novels; my favorite authors are J. L. Borges, Fyodor Dostoyevsky & James Joyce (right now I’m reading a novel by Roberto Bolaño). I like listening to music, from Blues to Lady Gaga, by way of Pink Floyd, Radiohead and Los Jaivas. I like watching movies like “A Clockwork Orange” by S. Kubrick and “Underground” by E. Kusturica. I also like surrealistic movies like “The Holy Mountain” by Alejandro Jodorowsky.
6. I.4 OK, but now let’s talk about work… (1997-2002) I have a BS in Engineering with emphasis in Informatics from Universidad Austral de Chile. This is a 6-year program; my undergrad thesis was titled “SPORAS: An Adaptive Web Platform based on a Multiagent System and Ontologies” (my first link to Dr. Brusilovsky’s field, Adaptive Hypermedia). Then I worked on several e-learning projects, developing on open-source LMSs such as Dokeos and Moodle (2003-2004). Later on, I worked as IT Manager and consultant for an aquaculture company, Aqua Cards, in the south of Chile (2005-2007). I was also teaching OOP (Java), Matlab, and Introduction to Software Engineering (2004, 2006-2007). In 2007 I co-founded a company, Perceptum TI.
7. I.5 … and what about research? In 2008 I started the PhD program and joined the PAWS lab (led by Dr. Brusilovsky) … so here is where this presentation starts. Tag-based recommendations. Spreading Activation for recommender systems. Related projects: CourseAgent, TagTheMap, Conference Navigator, Latent Communities. This Presentation: “Walk the Talk”: Mapping explicit
8. I.6.1 Tag-based recommendations Main topic: in many systems most items lack ratings, which pushes us to look for alternative ways to apply user- and item-based Collaborative Filtering. We explore 2 variants: neighbor-weighted and tag-based BM25. Presented a workshop paper at HT ’09, p1. Presented a short paper (poster) at RecSys ’09, p2. Presented a short paper at WI ’09, p3.
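To make the idea of tag-based BM25 more concrete, here is a minimal sketch. It is not the method from the papers cited on this slide: it assumes the textbook Okapi BM25 formula, treats a bag of query tags (for example, tags taken from a user or a neighbor) as the query and the tags applied to each item as the document. The function name, the parameters `k1` and `b`, and the toy data are all hypothetical.

```python
import math
from collections import Counter

def bm25_tag_scores(query_tags, item_tags, k1=1.2, b=0.75):
    """Score each item against a bag of query tags with standard Okapi BM25.

    query_tags: list of tags used as the query (hypothetical input).
    item_tags:  dict item_id -> list of tags applied to the item (repeats allowed).
    """
    n_items = len(item_tags)
    avg_len = sum(len(tags) for tags in item_tags.values()) / n_items
    # Document frequency: number of items that carry each tag at least once.
    df = Counter(tag for tags in item_tags.values() for tag in set(tags))
    scores = {}
    for item, tags in item_tags.items():
        tf = Counter(tags)
        score = 0.0
        for tag in set(query_tags):
            if tag not in tf:
                continue
            idf = math.log(1 + (n_items - df[tag] + 0.5) / (df[tag] + 0.5))
            norm = tf[tag] + k1 * (1 - b + b * len(tags) / avg_len)
            score += idf * tf[tag] * (k1 + 1) / norm
        scores[item] = score
    return scores

# Toy data, invented for illustration only.
items = {
    "a": ["rock", "rock", "guitar"],
    "b": ["jazz", "piano"],
    "c": ["rock", "pop"],
}
scores = bm25_tag_scores(["rock", "guitar"], items)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Items that share no tag with the query get a score of 0, and rare tags weigh more than common ones through the IDF term.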
9. I.6.2 Spreading activation Presented a paper at a workshop of RecSys 2009, p4. Goal: find a way to apply spreading activation to recommendations in order to: (1) make use of the multidimensional network structure of folksonomies (users, items, tags); (2) find a scalable algorithm (compared to state-of-the-art FolkRank, SVD and LDA-based approaches) that makes use of local topology/neighborhood.
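The general idea of spreading activation over a folksonomy can be sketched as follows. This is a generic textbook version, not the algorithm from the workshop paper: it assumes the user-tag-item structure has been flattened into plain undirected edges, and the `decay` and `steps` parameters, the node naming scheme, and the toy graph are hypothetical.

```python
from collections import defaultdict

def spread_activation(edges, seeds, decay=0.5, steps=2):
    """Propagate energy from seed nodes through an undirected graph.

    edges: iterable of (node, node) pairs linking users, tags and items.
    seeds: dict node -> initial energy (e.g. the target user with 1.0).
    At each step a node passes decay * energy, split evenly, to its neighbors,
    so only the local neighborhood of the seeds is ever visited.
    """
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    activation = defaultdict(float)
    for node, energy in seeds.items():
        activation[node] += energy
    frontier = dict(seeds)
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            neighbors = graph[node]
            if not neighbors:
                continue
            share = decay * energy / len(neighbors)
            for neighbor in neighbors:
                next_frontier[neighbor] += share
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)

# Toy folksonomy, invented for illustration (u: user, t: tag, i: item).
edges = [
    ("u:ana", "t:rock"),
    ("u:ana", "i:album2"),
    ("t:rock", "i:album1"),
    ("t:rock", "i:album3"),
]
activation = spread_activation(edges, {"u:ana": 1.0})
# Candidate recommendations: items the seed user is not already linked to.
candidates = {n: a for n, a in activation.items()
              if n.startswith("i:") and n != "i:album2"}
print(candidates)
```

Because each step only touches neighbors of the current frontier, the cost depends on the size of the local neighborhood rather than on the whole network.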
11. Part II: so finally… This project is based on the work of my internship at Telefónica Research (Barcelona, Spain) in the summer of 2010. Paper submitted to UMAP 2011: “Walk the Talk: Analyzing the relation between implicit and explicit feedback for preference elicitation” (I am co-author with Dr. Xavier Amatriain).
12. II.1 Introduction (1/2) Explicit feedback: scarce (people are not especially eager to rate). Implicit feedback: less scarce, but (Hu et al., 2008): there is no negative feedback; it is noisy; preference vs. confidence; lack of evaluation metrics.
13. II.1 Introduction (2/2) Which variables better account for the number of times a user listens to online albums? Is it possible to map implicit behavior to explicit preference (ratings)? Study with Last.fm users: Part I: demographics and online music consumption. Part II: rating 100 albums collected from their Last.fm user profile.
16. II.2.2 Survey Part I Prerequisites: 18 years old and a minimum playcount of 5,000 (scrobbles). Users: 151 started, 127 completed, 114 remained after filtering outliers. 82% were male and 18% were female. They came from 23 different countries; the main ones were Spain (25 users), UK (16 users), and U.S. (15 users). 80% used the internet 20 or more hours per week. 50% of users listened to music for over 20 hours per week. 9% did not attend music concerts. 30% went to 11 or more concerts a year. 35% said that they only read music magazines or blogs sometimes, but 20% did it every week. 50% of our subjects admitted rating music online never or seldom. 45% of our subjects said they bought 1 to 10 physical records a year. However, a non-negligible 18% said they did not buy any. 35% of our subjects reported never buying music online, while 8% said they did it once a month or more. 14% preferred to listen to single tracks while over 45% preferred listening to full albums. The other 40% reported listening to music either way.
17. II.2.3 Survey Part II For item (album) sampling, we accounted for: Implicit Feedback (IF): playcount for a user on a given item, changed to a [1-3] scale, where 3 means more listened to. Global Popularity (GP): global playcount for all users on a given item, changed to a [1-3] scale, where 3 means more listened to. Recentness (R): time elapsed since the user played a given item, changed to a [1-3] scale, where 3 means listened to more recently.
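The slide does not say how raw values were mapped to the [1-3] scale. One plausible scheme, shown here purely as an assumption, is rank-based tertiles: sort the values and assign the lowest third to 1, the middle third to 2, and the top third to 3. The function name and the sample numbers are hypothetical.

```python
def to_three_level_scale(values):
    """Map raw values to levels 1..3 by rank-based tertiles (assumed scheme).

    Ties are broken by input order, since Python's sort is stable.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    levels = [0] * n
    for rank, idx in enumerate(order):
        levels[idx] = 1 + (3 * rank) // n
    return levels

# Hypothetical playcounts of one user for six albums.
playcounts = [5, 120, 40, 300, 12, 75]
if_levels = to_three_level_scale(playcounts)

# Recentness: less elapsed time should map to a higher level,
# so the elapsed days are negated before binning.
days_since_play = [2, 400, 30]
recentness_levels = to_three_level_scale([-d for d in days_since_play])

print(if_levels, recentness_levels)
```

The same function would serve for IF and GP directly, while recentness needs the sign flip because the raw quantity is elapsed time.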
18. II.3 General Analysis Initial assumption: Rating and IF (# playcount) must be strongly correlated.
24. II.3.5 General Analysis - Findings We “see” a strong positive correlation between ratings and implicit feedback. We “see” some level of positive correlation between ratings and recentness. We don’t expect a significant relation between ratings and global popularity. On demographic data: only the preference for listening to tracks or albums shows a significant effect (using ANOVA).
25. II.4 Regression Analysis Including Recentness (RE) increases R² by more than 10% [1 -> 2]. Including GP increases R², but not much compared to RE + IF [1 -> 3]. Not including GP, but including the interaction between IF and RE, increases the variance of the DV explained by the regression model [2 -> 4].
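The comparison of models 1 to 4 can be sketched with ordinary least squares. The model definitions below are an assumption read off the slide (1: IF; 2: IF + RE; 3: IF + RE + GP; 4: IF + RE + IF×RE), and the data are synthetic, generated only so the code runs; none of the numbers come from the study.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_ols(rows, y):
    """Least squares via the normal equations; rows include the intercept column."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def r_squared(rows, y, beta):
    """Share of the variance of y explained by the fitted model."""
    mean_y = sum(y) / len(y)
    ss_res = sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
                 for r, t in zip(rows, y))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1 - ss_res / ss_tot

# Synthetic data (NOT the study's data): IF, RE, GP levels in 1..3 and a rating.
rng = random.Random(42)
data = []
for _ in range(300):
    imp, rec, pop = rng.randint(1, 3), rng.randint(1, 3), rng.randint(1, 3)
    rating = 0.5 + 0.6 * imp + 0.3 * rec + 0.15 * imp * rec + rng.gauss(0, 0.5)
    data.append((imp, rec, pop, rating))

designs = {
    1: lambda f, r, g: [1.0, f],
    2: lambda f, r, g: [1.0, f, r],
    3: lambda f, r, g: [1.0, f, r, g],
    4: lambda f, r, g: [1.0, f, r, f * r],
}
y = [d[3] for d in data]
r2 = {}
for model, design in designs.items():
    rows = [design(f, r, g) for f, r, g, _ in data]
    r2[model] = r_squared(rows, y, fit_ols(rows, y))
print({m: round(v, 3) for m, v in r2.items()})
```

Adding a predictor to a nested OLS model can never lower R² on the training data, which is why the slide reports the size of each increase rather than its direction.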
26. II.4.1 Regression Analysis We tested the conclusions of the regression analysis by predicting the score, using RMSE and 10-fold cross-validation. The results of the regression analysis are supported.
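A minimal sketch of RMSE under 10-fold cross-validation follows. For brevity it assumes a single predictor and a closed-form simple linear regression rather than the full models from the study; the function names, the fold assignment, and the synthetic data are hypothetical.

```python
import math
import random

def fit_line(xs, ys):
    """Closed-form simple linear regression y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def cv_rmse(xs, ys, k=10, seed=0):
    """RMSE over all held-out predictions from k-fold cross-validation."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    squared_error = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_error += sum((ys[i] - (a + b * xs[i])) ** 2 for i in fold)
    return math.sqrt(squared_error / len(xs))

# Noise-free data: a linear model should predict held-out points exactly.
exact_x = [1, 2, 3] * 10
exact_y = [2 * x + 1 for x in exact_x]
rmse_exact = cv_rmse(exact_x, exact_y)

# Synthetic noisy data (NOT the study's data).
rng = random.Random(1)
noisy_x = [rng.randint(1, 3) for _ in range(200)]
noisy_y = [1 + x + rng.gauss(0, 0.5) for x in noisy_x]
rmse_noisy = cv_rmse(noisy_x, noisy_y)
print(rmse_exact, rmse_noisy)
```

Each observation is predicted by a model that never saw it during fitting, so the RMSE reflects predictive accuracy rather than in-sample fit.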
27. II.4.2 Regression Analysis Including “track or CD”: including this variable, which seemed to have an effect in the general analysis, helped to improve the accuracy of the model.
28. II.5 Conclusions Using a linear model, implicit feedback (playcount) and recentness can help to predict explicit feedback (ratings). Global popularity doesn’t show a significant improvement in the prediction task (discussion). Our model can help to relate implicit and explicit feedback, helping to evaluate and compare explicit and implicit recommender systems. Ongoing work?
29. THANKS for spending your time listening to this talk Questions? dap89@pitt.edu