ECIR 2019 - Information Retrieval Models for Contact Recommendation in Social Networks

  1. IRG, IR Group @UAM. Information Retrieval Models for Contact Recommendation in Social Networks. 41st European Conference on Information Retrieval (ECIR 2019), Cologne, Germany, 15 April 2019. Javier Sanz-Cruzado and Pablo Castells (@JavierSanzCruza, @pcastells), Universidad Autónoma de Madrid, http://ir.ii.uam.es
  2. Motivation
     Online social networks:
     • Appeared in the early 2000s
     • Among the most used web applications
     • New challenges for IR
     [Diagram: Social IR covers Search and Social Recommendation; Social Recommendation covers Content Recommendation and Contact Recommendation.]
  3. Item recommendation
     [Figure: users × items rating matrix with known ratings (1 to 5) and unknown entries (?) to be predicted.]
  4. Contact recommendation
     [Figure: users × users rating matrix with known link weights, unknown entries (?) and an empty diagonal (-). Rating matrix = adjacency matrix.]
  5. Contact recommendation approaches
     [Diagram: contact recommendation draws on machine learning, specific models and adaptations of recommender systems. Information Retrieval?]
  6. Relation between tasks
     [Diagram: three parallel schemes. IR task: a query and a document are connected through terms; is the document a relevant result? Collaborative filtering: a target user and a candidate item are connected through neighbor users; is it a relevant item? Contact recommendation: a target user and a candidate user; is there a relevant link?]
  7. Relation between tasks
     [Same diagram; the contact recommendation scheme now includes the neighbor user that connects the target user and the candidate user.]
  8. Relation between tasks
     [Diagram: contact recommendation only, with target user, neighbor user, candidate user and relevant link; labels "Und" and "In".]
  9. An example: BM25
     Original formulation:
     $$f_q(d) = \sum_{t \in d \cap q} \frac{(k+1)\,\mathrm{freq}(t,d)}{k\left(1 - b + b\,\frac{|d|}{\mathrm{avg}_{d'}|d'|}\right) + \mathrm{freq}(t,d)}\,\mathrm{RSJ}(t)$$
     $$\mathrm{RSJ}(t) = \log\frac{|D| - |D_t| + 0.5}{|D_t| + 0.5}$$
     Where (IR element → contact recommendation counterpart):
     • $d$: document → $\Gamma^d(v)$: candidate user
     • $q$: query → $\Gamma^q(u)$: target user
     • $t$: term → $t$: neighbor user
     • $D$: set of all documents → $\mathcal{U}$: all users
     • $D_t$: documents containing $t$ → $\Gamma^d_{\mathrm{inv}}(t)$: users $v$ containing $t$ in $\Gamma^d(v)$
     • $\mathrm{freq}(t,d)$: frequency of $t$ in $d$ → $w^d(t,v)$: edge weight
     • $|d|$: length of document $d$ → $\mathrm{len}^l(v) = \sum_{x \in \Gamma^l(v)} w^l(x,v)$
  10. An example: BM25
     Adapted to contact recommendation:
     $$f_u(v) = \sum_{t \in \Gamma^q(u) \cap \Gamma^d(v)} \frac{(k+1)\,w^d(t,v)\,\mathrm{RSJ}(t)}{k\left(1 - b + b\,\frac{\mathrm{len}^l(v)}{\mathrm{avg}_{v'}|\Gamma^l(v')|}\right) + w^d(t,v)}$$
     $$\mathrm{RSJ}(t) = \log\frac{|\mathcal{U}| - |\Gamma^d_{\mathrm{inv}}(t)| + 0.5}{|\Gamma^d_{\mathrm{inv}}(t)| + 0.5}$$
     The symbols are those of the mapping on the previous slide.
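The adapted BM25 formula can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes an unweighted, undirected graph, so that the query, document and length neighborhoods coincide and every edge weight is 1. The function name `bm25_recommend`, the input layout (a dict from each user to the set of its neighbors, with every user present as a key) and the default values of `k` and `b` are all hypothetical choices made for the example.

```python
import math


def bm25_recommend(neighbors, target, k=1.0, b=0.1, top_n=10):
    """Rank candidate contacts for `target` with BM25 over neighborhoods.

    Assumption: `neighbors` maps every user to the set of its neighbors in an
    unweighted, undirected graph, so all edge weights w(t, v) equal 1.
    """
    users = list(neighbors)
    n_users = len(users)
    # Average "document length": here, the average neighborhood size.
    avg_len = sum(len(neighbors[v]) for v in users) / n_users

    # Inverted index: neighbor user t ("term") -> users v whose neighborhood contains t.
    inverted = {}
    for v in users:
        for t in neighbors[v]:
            inverted.setdefault(t, set()).add(v)

    scores = {}
    for t in neighbors[target]:  # the target's neighborhood plays the role of the query
        postings = inverted.get(t, set())
        df = len(postings)
        rsj = math.log((n_users - df + 0.5) / (df + 0.5))
        for v in postings:
            # Skip the target itself and users it is already linked to.
            if v == target or v in neighbors[target]:
                continue
            w = 1.0  # edge weight in an unweighted graph (assumption)
            norm = k * (1 - b + b * len(neighbors[v]) / avg_len)
            scores[v] = scores.get(v, 0.0) + (k + 1) * w * rsj / (norm + w)

    # Highest score first; ties broken by user id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

Note that RSJ becomes negative for neighbor users linked to more than half of all users, exactly as it does for very frequent terms in text retrieval.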
  11. Experiments
      • Offline evaluation
      • Data extracted from Twitter (REST API)
      • Interaction graphs: (u, v) ∈ E ⟺ u retweets or mentions v
      • Snowball sampling
      • Two samples:
        1. 1 month: all tweets between 19 June and 19 July 2015
        2. 200 tweets: the last 200 tweets by each user before 2 August 2015
  12. Evaluation methodology
      • Networks are split into training and test sets (temporal split)
        – 1 month: interactions before 12 July / after 12 July
        – 200 tweets: interactions in the first 80% of tweets / remaining 20%
      • Recommendations are computed over the training data
        – Reciprocal links are not recommended
      • Evaluated using IR metrics on the test set: P@10, R@10, nDCG@10
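The evaluation protocol above can be sketched as follows. This is a simplified illustration under stated assumptions, not the evaluation code used in the experiments: interactions are given as hypothetical `(timestamp, u, v)` tuples, links already present in the training set are dropped from the test set (an assumption), relevance is binary, and all function names are made up for the example.

```python
import math


def temporal_split(interactions, cutoff):
    """Split (timestamp, u, v) interactions into training and test edge sets.

    Assumption: test links that already appear in training are discarded,
    so the test set only contains new links.
    """
    train = {(u, v) for ts, u, v in interactions if ts < cutoff}
    test = {(u, v) for ts, u, v in interactions if ts >= cutoff} - train
    return train, test


def precision_at_k(ranking, relevant, k=10):
    """Fraction of the top-k recommended users that are relevant."""
    return sum(1 for v in ranking[:k] if v in relevant) / k


def recall_at_k(ranking, relevant, k=10):
    """Fraction of the relevant users that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for v in ranking[:k] if v in relevant) / len(relevant)


def ndcg_at_k(ranking, relevant, k=10):
    """nDCG with binary relevance and a log2 rank discount."""
    dcg = sum(1 / math.log2(i + 2) for i, v in enumerate(ranking[:k]) if v in relevant)
    ideal = sum(1 / math.log2(i + 2) for i in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0
```

For a target user `u`, `relevant` would be the set of users `v` such that `(u, v)` is in the test set, and `ranking` the list of recommended users computed from the training graph.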
  13. Algorithms
      • IR models
        – Vector space model (VSM)
        – BIR
        – BM25
        – Query Likelihood (QL): Jelinek-Mercer, Dirichlet and Laplace smoothing
      • General collaborative filtering
        – User-based / Item-based kNN
        – Implicit matrix factorization (iMF)
      • Specific approaches
        – Adamic-Adar
        – Most common neighbors (MCN)
        – Personalized PageRank
        – Jaccard similarity
        – Money
      • Sanity check: random and most popular
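Three of the specific approaches listed above (most common neighbors, Jaccard similarity and Adamic-Adar) have compact textbook definitions, sketched here for reference. The sketch assumes an undirected, unweighted graph stored as a dict from each user to the set of its neighbors; the function names and the data layout are illustrative, and the variants evaluated in the experiments may differ (for example in how edge direction is handled).

```python
import math


def mcn(neighbors, u, v):
    """Most common neighbors: number of neighbors shared by u and v."""
    return len(neighbors[u] & neighbors[v])


def jaccard(neighbors, u, v):
    """Jaccard similarity: shared neighbors over the union of both neighborhoods."""
    union = neighbors[u] | neighbors[v]
    return len(neighbors[u] & neighbors[v]) / len(union) if union else 0.0


def adamic_adar(neighbors, u, v):
    """Adamic-Adar: shared neighbors weighted by the inverse log of their degree."""
    return sum(
        1 / math.log(len(neighbors[t]))
        for t in neighbors[u] & neighbors[v]
        if len(neighbors[t]) > 1  # guard against log(1) = 0
    )
```

All three score a candidate `v` for a target `u` from the same common-neighbor set that BM25 sums over, which is why they are natural baselines for the IR models.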
  14. Results (P@10)

      Algorithm               1 month   200 tweets
      BM25                    0.0691    0.0572
      BIR                     0.0675    0.0534
      QLL                     0.0609    0.0490
      QLJM                    0.0580    0.0492
      QLD                     0.0441    0.0482
      VSM                     0.0191    0.0268
      Money                   0.0772    0.0476
      Adamic-Adar             0.0676    0.0532
      MCN                     0.0631    0.0501
      Personalized PageRank   0.0598    0.0336
      Jaccard                 0.0226    0.0304
      iMF                     0.0834    0.0541
      User-based kNN          0.0805    0.0479
      Item-based kNN          0.0739    0.0360
      Popularity              0.0255    0.0225
      Random                  0.0009    0.0003

      • IR models are effective
        – Probabilistic models among top 5
        – BM25 best in “200 tweets”
        – VSM lowest performing IR model
      • Rest of algorithms:
        – Implicit MF is best.
        – Adamic-Adar and MCN are competitive.
        – Jaccard is not very competitive.
        – The rest seem very graph-dependent.
  15. Efficiency
      BM25 is much faster than MF
  16. Conclusions
      • IR models can be applied to contact recommendation
      • They provide an effective and efficient solution
      • Future work:
        – More IR models
        – Effects of sampling methods on accuracy
        – Beyond accuracy: novelty, diversity and effects on the network
  17. Thank you for your attention! Questions?

Editor's notes

  1. Presentation time: 15 minutes + 5 minutes (questions). Outline: Presentation; Motivation; Unified formulation of IR models for contact recommendation; An example: BM25; Experiments: accuracy; Experiments: efficiency; Conclusions. Slide text: Hello, my name is Javier Sanz-Cruzado, and I am going to present the work I have developed with Pablo Castells, titled Information Retrieval Models for Contact Recommendation in Social Networks. In this work, we explore the adaptation of several Information Retrieval models, originally designed for textual search, as algorithms for recommending people to people in social network environments.
  2. Since their appearance at the beginning of the century, online social networks have become some of the most used and important applications on the Web. The availability of new kinds of information, such as the relations between users or the different contents they generate, has given rise to new challenges for information retrieval, in both search and recommendation tasks. In our work, we focus on recommender systems. In that field, we can distinguish two tasks, according to what we want to recommend. First, content recommendation, for finding relevant posts on Facebook, tweets on Twitter, etc. Second, recommending other people in the network whom you might be interested in connecting with: the so-called contact recommendation. We focus on this second perspective. But why?
  3. The reason is the following. Although the contact recommendation task might seem similar to other recommendation scenarios (and, indeed, could be treated like any other one), it has several particularities that make it quite special. First, in classical recommendation scenarios, users and items have traditionally been separate sets. In contact recommendation, that no longer holds: the set of items is taken directly from the set of users of the system. Second, in addition to the “rating matrix”, we have additional information about how the different users are related to each other: the structure of the network.
  4. (Same notes as item 3.)
  5. Although simple, these particularities have led to the development of multiple specific algorithms for this task, originating in very different fields, such as link prediction, classical recommendation or machine learning. Given that variety of algorithms, we ask the following: is it possible to apply IR models to recommend people in social networks?
  6. To answer that question, we first need to establish equivalences between the elements of the contact recommendation task (users and the interactions between them) and the spaces involved in text search (queries, documents and terms). The slide shows the relationships between the elements of the different tasks. As we can see, the scheme of both tasks is quite similar: in the IR task, given a query, we need to obtain relevant documents using the textual representation of both elements; in contact recommendation, given a target user, we need to identify suitable people, using the connections that both users have in common in the graph. The idea is then to fold the three IR spaces into just one: the users in the network. How?
  7. (Same notes as item 6.)
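One way to picture this folding is as an inverted index over neighborhoods: each candidate user is a "document" whose "terms" are its neighbors, and the target user's neighborhood is the "query". The sketch below is a hypothetical illustration of that idea, not the formulation from the paper. It assumes a directed edge list of `(u, v)` pairs, and the orientation names `"in"`, `"out"` and `"und"` as well as the function names are assumptions made for the example.

```python
def neighborhoods(edges, orientation):
    """Build neighbor sets from directed (u, v) edges.

    orientation: "out" (users u points to), "in" (users pointing to u)
    or "und" (both directions merged).
    """
    nb = {}
    for u, v in edges:
        nb.setdefault(u, set())
        nb.setdefault(v, set())
        if orientation in ("out", "und"):
            nb[u].add(v)
        if orientation in ("in", "und"):
            nb[v].add(u)
    return nb


def candidates(edges, target, query_orient="und", doc_orient="in"):
    """Return candidate users and the 'terms' (neighbor users) they share with the target."""
    query = neighborhoods(edges, query_orient).get(target, set())
    docs = neighborhoods(edges, doc_orient)

    # Inverted index: term (neighbor user) -> posting list of candidate users.
    index = {}
    for v, terms in docs.items():
        for t in terms:
            index.setdefault(t, set()).add(v)

    found = {}
    for t in query:
        for v in index.get(t, set()):
            if v != target:
                found.setdefault(v, set()).add(t)
    return found
```

Any IR scoring function can then be applied to the matched terms of each candidate, just as a search engine scores the documents retrieved from its posting lists.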
  8. To illustrate all of the above, let’s see an example, adapting one of the best-known and most effective state-of-the-art IR models: BM25. The slide shows the original formulation. To be able to use it, we need to map all the elements of that formula (documents, query, terms, etc.) to the space of users.
  9. (Same notes as item 8.)
  10. Once we have seen that those algorithms can be adapted, we should check whether they are effective as contact recommendation algorithms. To test that, we extracted two interaction graphs from Twitter, using the snowball sampling method. For the first sample, we obtained all the interactions between 10,000 users over the course of a month and, for the second, we extracted the interactions in the last 200 tweets published by a set of 10,000 users.
  11. Both networks were divided into training and test sets using a temporal split. Recommendations were computed over the training data and evaluated on the test set using IR metrics such as precision, recall and nDCG. To prevent users without test links from distorting the evaluation, we only generated recommendations for users with links in the test set.
  12. We applied several IR algorithms, such as the vector space model, BM25 variants and query likelihood, as well as classical recommendation approaches and specific state-of-the-art algorithms.
  13. The table on the screen shows the results of our experiments in terms of P@10. We highlight two algorithms, implicit matrix factorization and BM25, since each was the best on one of the tested graphs.
  14. Do IR models have any additional advantage over other state-of-the-art algorithms, such as matrix factorization or nearest-neighbor approaches? In order to check that, we (brief description of the experiment…) and we saw that IR models. Recommendation time: how much time is spent generating recommendations. Update time: given new data, how much time (in seconds) is needed to re-train the algorithm? This is supported by a theoretical analysis, which is included in the paper (as well as in the backup slides).
  15. Hopefully intuitive. Explains why and what kind of popularity. Advantages: algorithmic design justification, deeper understanding of behavior, enabling extensions and variations. Item-based better, particularly when observation bias is removed.