1. Processing Data Streams in Real Time
Tobias Heintz¹  Benjamin Kille²
¹ plista GmbH
² Technische Universität Berlin
September 26, 2014
2. Table of Contents
Introduction
Recommender Systems
Unpersonalised Recommendation
Collaborative Filtering
Content-based Filtering
Evaluation
News Recommendation
Big Data Issues
3. Who are we?
I Tobias Heintz, plista GmbH
I Benjamin Kille, Technische Universität Berlin
plista GmbH
Pioneers in targeted advertising and content distribution.
I founded 31 July, 2008
I incorporated in the WPP Group as of 1 January, 2014
I headquarters in Berlin, Germany
I 120 employees (30 % R&D)
Technische Universität Berlin
I >30 000 enrolled students
I 331 professors
I >2600 researchers
4. What problems do we address?
Recommender Systems
We will introduce recommender systems; we will discuss a variety
of algorithms; we will explore how to evaluate recommender
systems.
News
We will talk about specific challenges when recommending news; we
will illustrate issues arising as systems fail to build
comprehensive user profiles; we will depict how news evolving over
time affect recommender systems.
Big Data
We will exemplify in what way news represent a source of big data;
we will introduce a system which grants researchers access to big
data; we will show how you can compete with your own approaches.
7. Why are these problems important?
Users increasingly face information overload as they interact with
item collections. For instance:
I 43 000 000 songs on Apple's iTunes
I 100 h of video are uploaded to YouTube every minute
I 3 000 000 movies on IMDb
I ...
Collections continue to grow, causing ever more severe information
overload. The same holds for news articles.
12. …filter. More formally, a general-purpose
recommender system is a triple (U, I, φ).
U → set of users {u1, u2, ..., uM}
I → set of items {i1, i2, ..., iN}
φ → a filter function
13. The performance of different recommendation algorithms typically
depends on φ.
14. Filter Functions
Filter functions take a user u, the entire item collection I, and a
model M. They return the subset of items to be recommended, Î:
φ(u, I, M) = Î
Recommender systems' success or failure strongly depends on the
model M, in particular on how accurately the model reflects actual
user preferences. M may take various kinds of input, as we will
discuss for a selection of recommendation algorithms.
17. Most-Popular Recommendation
M orders the item collection according to the number of
interactions, K ≥ L ≥ M ≥ N.
(Figure: items ordered by their interaction counts K, L, M, N;
the most popular item comes first.)
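Stated as code, a most-popular recommender reduces to counting. A minimal sketch in Python, assuming the interaction log is simply a list of (user, item) pairs (the data layout is an assumption for illustration):

```python
from collections import Counter

def most_popular(interactions, n=3):
    """Rank items by interaction count and return the top n."""
    counts = Counter(item for _, item in interactions)
    return [item for item, _ in counts.most_common(n)]

# Toy interaction log: (user, item) pairs.
log = [("anna", "district9"), ("bob", "district9"),
       ("clara", "district9"), ("anna", "elektra"),
       ("bob", "elektra"), ("dan", "aviator")]

print(most_popular(log, n=2))  # ['district9', 'elektra']
```

Updating M is a single counter increment per interaction, which is why this baseline is so cheap to maintain.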
18. Summary: Unpersonalised Recommenders
Advantages
I low computational complexity
I easy to update M
I domain independent
Disadvantages
I disregard personal taste
I disregard context
I high chance of recommending already known or unpopular items
19. Collaborative Filtering
Basic Assumptions
I systems have access to users' preferences
I users with similar tastes in the past will continue to like
similar items
I systems have means to compare users' tastes
Distinctions
I model-based vs memory-based
I item-based vs user-based
22. Example
User profile for Anna: [Bad Boys, District 9, Elektra]
23. Example
Anna → [Bad Boys, District 9, Elektra]
Bob → [Aviator, Bad Boys, District 9, Elektra]
Clara → [Cars, District 9, Elektra]
Dan → [Aviator]
(Figure: users connected to the movies they interacted with.)
28. Preference Elicitation
Explicit Preferences
I Likes
I Thumbs Up/Down
I Ratings
I Comments
I Purchase
Implicit Preferences
I Click
I Dwell Time
I Returns
How can we measure whether, and how much, users like items?
29. Collaborative Filtering Algorithms with Ratings
Memory-based
Algorithm uses the complete set of data in the recommendation
process. M contains the full rating matrix.
I user-based k-nearest neighbour
I item-based k-nearest neighbour
Model-based
Algorithm learns a model M and uses it to recommend items.
I matrix factorisation with ALS
I matrix factorisation with SGD
30.–32. User-based k-nearest Neighbour
Input: M × N rating matrix R, similarity measure σ(u, v)
(Figure: the binary rating matrix built from the example profiles:)
         Aviator  Bad Boys  Cars  District 9  Elektra
Anna        0        1       0        1          1
Bob         1        1       0        1          1
Clara       0        0       1        1          1
Dan         1        0       0        0          0
33. Similarity Measures
Number of items in common:
σ(u, v) = Σ_{i∈I} I(i),  where I(i) = 1 if both u and v liked i, 0 otherwise
Cosine similarity:
σ(u, v) = (u · v) / (||u|| ||v||)
Pearson's correlation coefficient:
σ(u, v) = cov(u, v) / (std(u) std(v))
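The cosine and Pearson measures are easy to compute directly. A sketch with NumPy, using binary profile vectors in the spirit of the example (assumed encoding: 1 = liked, 0 = no interaction):

```python
import numpy as np

def cosine_sim(u, v):
    """Cosine similarity: (u . v) / (||u|| ||v||)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def pearson_sim(u, v):
    """Pearson's correlation: cov(u, v) / (std(u) std(v)),
    computed by centring the vectors first."""
    uc, vc = u - u.mean(), v - v.mean()
    return float(uc @ vc / (np.linalg.norm(uc) * np.linalg.norm(vc)))

# Binary preferences over Aviator, Bad Boys, Cars, District 9, Elektra.
anna = np.array([0.0, 1.0, 0.0, 1.0, 1.0])
bob = np.array([1.0, 1.0, 0.0, 1.0, 1.0])
print(cosine_sim(anna, bob))   # ~0.866
print(pearson_sim(anna, bob))  # ~0.612
```

Pearson corrects for users' individual rating levels, which matters more for numerical ratings than for the binary case shown here.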
34.–35. User-based k-nearest Neighbour
Input: M × N rating matrix R, similarity measure σ(u, v)
(Figure: the M × M user-user similarity matrix; diagonal entries
equal 1, and sim(Anna, Bob) = sim(Bob, Anna). Anna's similarity
vector reads [1, sBob, sClara, sDan].)
36. User-based k-nearest Neighbour
Input: M × N rating matrix R, similarity measure σ(u, v)
(Figure: the rating matrix with a missing entry, e.g. Anna's
preference for Aviator, marked "?".)
38. User-based k-nearest Neighbour
user profile:
u = (r(i1), r(i2), ..., r(iN))
similarity vector:
σ(u, ·) = (σ(u, v1), σ(u, v2), ..., σ(u, u), ..., σ(u, vM))
preference prediction:
r̂(j) = σ(u, ·) · R_{·,j}  (the similarity-weighted ratings all users gave item j)
Result
We obtain a prediction for each item's preference and can rank
them accordingly. The algorithm returns as many items as
requested starting from the top rank.
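Putting the pieces together, the user-based prediction can be sketched in a few lines. For brevity this sketch weights by cosine similarity over all users rather than only the k nearest (an assumption, not the slide's exact algorithm); the toy matrix encodes the example profiles:

```python
import numpy as np

# Binary rating matrix (rows: Anna, Bob, Clara, Dan;
# columns: Aviator, Bad Boys, Cars, District 9, Elektra).
R = np.array([[0.0, 1.0, 0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0, 1.0, 1.0],
              [1.0, 0.0, 0.0, 0.0, 0.0]])

def user_based_scores(R, u):
    """Score every item for user u as the cosine-similarity-weighted
    sum of the other users' ratings."""
    norms = np.linalg.norm(R, axis=1)
    sims = R @ R[u] / (norms * norms[u])  # similarity vector sigma(u, .)
    sims[u] = 0.0                          # exclude the user themself
    return sims @ R                        # one prediction per item

scores = user_based_scores(R, 0)           # predictions for Anna
unseen = np.where(R[0] == 0)[0]            # Aviator and Cars are unrated
recommendation = unseen[np.argmax(scores[unseen])]
print(recommendation)  # 0 -> Aviator scores higher than Cars
```

Anna's nearest neighbour Bob liked Aviator, so Aviator outranks Cars, which only Clara liked.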
39.–40. Item-based k-nearest Neighbour
Input: M × N rating matrix R, similarity measure σ(i, j)
(Figure: the same rating matrix, now read column by column to
compare items.)
41. Item-based k-nearest Neighbour
Input: M × N rating matrix R, similarity measure σ(i, j)
(Figure: item column vectors, e.g. Bad Boys = (1, 1, 0, 0) over
Anna, Bob, Clara, Dan.)
42. Similarity Measures
Number of users in common:
σ(i, j) = Σ_{u∈U} I(u),  where I(u) = 1 if both i and j are liked by u, 0 otherwise
Cosine similarity:
σ(i, j) = (i · j) / (||i|| ||j||)
Pearson's correlation coefficient:
σ(i, j) = cov(i, j) / (std(i) std(j))
43. Item-based k-nearest Neighbour
Input: M × N rating matrix R, similarity measure σ(i, j)
(Figure: the N × N item-item similarity matrix over Aviator, Bad
Boys, Cars, District 9, Elektra; diagonal entries equal 1, and
sim(Aviator, Bad Boys) = sim(Bad Boys, Aviator).)
44. Item-based k-nearest Neighbour
Input: M × N rating matrix R, similarity measure σ(i, j)
(Figure: the rating matrix with a missing entry marked "?".)
46. Item-based k-nearest Neighbour
item profile:
i = (r(u1), r(u2), ..., r(uM))
similarity vector:
σ(i, ·) = (σ(i, j1), σ(i, j2), ..., σ(i, i), ..., σ(i, jN))
preference prediction:
r̂(u) = σ(i, ·) · R_{u,·}  (the similarity-weighted ratings user u gave the items)
Result
We obtain a prediction for each item's preference and can rank
them accordingly. The algorithm returns as many items as
requested starting from the top rank.
47. Matrix Factorisation
Input: M N rating matrix R
R =
[ ·  1  ·  1  1 ]
[ 1  1  ·  1  1 ]
[ ·  ·  1  1  1 ]
[ 1  ·  ·  ·  · ]
(· marks a missing preference)
Goal
Fill the gaps of missing preferences.
48. Matrix Factorisation
Idea
Project preferences into low dimensional space to detect latent
structures.
R_{M×N} ≈ P_{M×K} Q_{K×N},  with K ≪ M, N
Problem
How to determine P and Q?
49. Matrix Factorisation
Learning P and Q
Input: Error metric
E(P, Q, R) = Σ_{(u,i)∈R} (r(u, i) - P_u Q_i)²   (quadratic error)
E(P, Q, R) = Σ_{(u,i)∈R} |r(u, i) - P_u Q_i|    (absolute error)
50. Matrix Factorisation
Stochastic Gradient Descent
Optimise error metric by selecting data points at random.
I initialise P;Q with small random values
I pick a preference (u; i) at random
I determine the gradient at that point
I adjust P;Q accordingly
I continue
Alternating Least Squares
Optimise either P or Q, keeping the other fixed.
I initialise P, Q with small random values
I optimise error metric with respect to P
I optimise error metric with respect to Q
I continue
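The SGD loop above can be sketched in a few lines. A minimal illustration using the quadratic error, with missing entries stored as NaN (an assumed encoding; no regularisation, so this is a sketch rather than a production implementation):

```python
import numpy as np

def mf_sgd(R, K=2, steps=8000, lr=0.05, seed=0):
    """Factorise R into P (M x K) and Q (N x K) by stochastic gradient
    descent on the quadratic error over the observed (non-NaN) entries."""
    rng = np.random.default_rng(seed)
    M, N = R.shape
    P = rng.normal(scale=0.1, size=(M, K))  # small random initialisation
    Q = rng.normal(scale=0.1, size=(N, K))
    observed = np.argwhere(~np.isnan(R))
    for _ in range(steps):
        u, i = observed[rng.integers(len(observed))]  # random preference
        err = R[u, i] - P[u] @ Q[i]                   # residual at (u, i)
        pu = P[u].copy()
        P[u] += lr * err * Q[i]                       # gradient steps on P
        Q[i] += lr * err * pu                         # ... and on Q
    return P, Q

# Toy matrix with gaps (NaN) to fill, matching the earlier example.
R = np.array([[np.nan, 1.0, np.nan, 1.0, 1.0],
              [1.0, 1.0, np.nan, 1.0, 1.0],
              [np.nan, np.nan, 1.0, 1.0, 1.0],
              [1.0, np.nan, np.nan, np.nan, np.nan]])
P, Q = mf_sgd(R)
print(np.round(P @ Q.T, 2))  # predictions, including the former gaps
```

The reconstructed matrix P Qᵀ is dense, so the gaps of missing preferences get predicted values for free.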
52. Summary: Collaborative Filtering
Advantages
I takes personal taste into account
I successful in the Netflix Prize competition
I domain-independent
Disadvantages
I cold-start problem
I sparsity
I grey sheep
53. Cold-Start Problem
I users without known preferences
I items without preferences
I similarity measures fail
I inconclusive latent factors
54. Grey Sheep
I users rate all their items around the average
I user profile: [3, 3, 3, 3, ..., 3]
I collaborative systems cannot distinguish good from bad items
56. Content-based Filtering
Idea
Suggest items which are similar to items users have liked.
Similarity
I based on content → features
I depending on the domain
62. Content-based Filtering
Input: user profile, item collection, item features, and similarity
measure
Features
▪ Name/ID
▪ Meta data
▪ Content
▪ audio stream → songs
▪ video stream → movies
▪ text → books, news articles
65. Content-based Filtering
Similarity: Example
I keyword overlap → text
I average colour match → images/video
I maximum amplitude → audio/sound
I common actors → movies
I common interests → friends/partnership
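For text, keyword overlap can be made concrete as a Jaccard coefficient (one common normalisation; the slide does not fix a particular formula):

```python
def keyword_overlap(a, b):
    """Jaccard overlap of two keyword sets: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Hypothetical keyword sets for two news articles.
article_a = {"election", "berlin", "politics"}
article_b = {"berlin", "politics", "economy"}
print(keyword_overlap(article_a, article_b))  # 0.5
```

Normalising by the union keeps long, keyword-rich items from dominating the similarity scores.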
66. Summary: Content-based Filtering
Advantages
I considers personal taste
I high explainability
Disadvantages
I cost-intensive for high-volume content, e.g., video
I low serendipity
I user cold-start
67. Evaluation
Important aspects
I how well does the system predict preferences?
I how often do users receive useful suggestions?
I how long does it take for the system to provide suggestions?
I how many requests cannot be answered?
I how often do users return to the site?
I how often do users purchase/rent/consume items which the
system had recommended?
I how well did users perceive the system?
68. Evaluation: Rating Prediction
Goal
The evaluation ought to show how well the system estimates
preferences.
Assumptions
I system can access recorded explicit numerical preferences
I tastes remain stable over time
I the more accurate the system estimates preferences, the more
suited the suggestions
Metrics
I root mean squared error: sqrt( (1/|R|) Σ_{(u,i)∈R} (r(u, i) - r̂(u, i))² )
I mean absolute error: (1/|R|) Σ_{(u,i)∈R} |r(u, i) - r̂(u, i)|
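Both metrics are one-liners over (rating, prediction) pairs; a sketch with invented toy numbers:

```python
import math

def rmse(pairs):
    """Root mean squared error over (rating, prediction) pairs."""
    return math.sqrt(sum((r - p) ** 2 for r, p in pairs) / len(pairs))

def mae(pairs):
    """Mean absolute error over (rating, prediction) pairs."""
    return sum(abs(r - p) for r, p in pairs) / len(pairs)

# Recorded ratings r(u, i) next to the system's estimates.
pairs = [(5, 4), (3, 3), (1, 2), (4, 2)]
print(rmse(pairs))  # sqrt(6/4) ~ 1.22
print(mae(pairs))   # 1.0
```

RMSE penalises the one error of size 2 more heavily than MAE does, which is the practical difference between the two.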
69. Evaluation: Ranking
Goal
The evaluation ought to show how well the system ranks items
according to users' preferences.
Assumptions
I system can access preference relations between items
I tastes remain stable over time
I the better the system ranks items, the more suited the
suggestions
Metrics
I normalised discounted cumulative gain: nDCG = DCG / IDCG
I mean reciprocal rank: MRR = (1/|U|) Σ_{u∈U} 1/rank_u
70. Evaluation: Top-N
Goal
The evaluation ought to show how well the system selects the top
suggestions.
Assumptions
I system can access preference relations between items
I tastes remain stable over time
I the better the system selects the top suggestions, the more
suited they are
Metrics
I precision@N = TP / (TP + FP)
I recall@N = TP / (TP + FN)
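With a ranked suggestion list, TP is simply the number of relevant items among the top N; a sketch with invented toy data:

```python
def precision_recall_at_n(ranked, relevant, n):
    """precision@N = TP / (TP + FP); recall@N = TP / (TP + FN)."""
    tp = len(set(ranked[:n]) & relevant)  # relevant items in the top N
    return tp / n, tp / len(relevant)

ranked = ["a", "b", "c", "d", "e"]   # system's ranked suggestions
relevant = {"b", "e", "f"}           # items the user actually liked
p, r = precision_recall_at_n(ranked, relevant, n=3)
print(p, r)  # 1/3 each: only "b" appears in the top 3
```

Note the trade-off: raising N can only increase recall, while precision usually drops.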
71. Evaluation: Problems
I explicit preferences may not be available
I tastes change over time
I recorded data does not fully reflect the current situation
Solution
Access real systems with current user interactions to see whether a
method performs better than the existing one → second part of the
tutorial
72. Summary: Recommender Systems
I support users by suggesting interesting items
I counteract information overload
I unpersonalised recommender
I collaborative filtering
I user-based k-nearest neighbour
I item-based k-nearest neighbour
I matrix factorisation
I content-based
76. News Recommendation: Special Characteristics
Collection Dynamics
I thousands of new articles published daily
I older articles' relevancy decays
Contextual Differences
I users perceive recommendations differently
I devices render recommendations differently
I dependence on time of day and day of week
Popularity Bias
I few items receive a lot of attention
I most items receive hardly any attention
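Decaying relevancy and popularity bias are often combined via a time-decayed popularity score. A sketch under assumed parameters (the six-hour half-life is illustrative, not plista's actual setting):

```python
import math

def decayed_popularity(interactions, now, half_life_hours=6.0):
    """Rank articles by click counts that decay exponentially with
    click age, so fresh articles can overtake older ones."""
    decay = math.log(2) / half_life_hours
    scores = {}
    for article, t in interactions:  # t: click time in hours
        scores[article] = scores.get(article, 0.0) + math.exp(-decay * (now - t))
    return sorted(scores, key=scores.get, reverse=True)

# An old article with many clicks vs a fresh one with fewer clicks.
log = [("old_story", 0.0)] * 10 + [("fresh_story", 23.0)] * 3
print(decayed_popularity(log, now=24.0))  # fresh_story ranks first
```

With a plain most-popular baseline the old article would win 10 to 3; the decay term reverses that, which matches how quickly news loses relevancy.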
81. Big Data
Goal
Intelligent real-time processing of huge amounts of data.
Recommender Systems → personalisation
I volume → amount of data to be stored increases
I variety → heterogeneous data
I velocity → data streams in (near) real-time
I veracity → noisy data
83. Do news fulfil the requirements of big data?
Volume
hundreds of GB every day ✓
Variety
news entail textual data and images, inducing some variety
Velocity
news arise continuously → second part of the tutorial ✓
Veracity
news have some consistent attributes (headline, text), but also
comprise features which may be missing or wrong (date, location,
image)