Towards effective research recommender systems for repositories

In this paper, we argue why and how the integration of recommender systems for research can enhance the functionality and user experience in repositories. We present the latest technical innovations in the CORE Recommender, which provides research article recommendations across the global network of repositories and journals. The CORE Recommender has recently been redeveloped and released into production in the CORE system, and has also been deployed in several third-party repositories. We explain the design choices of this unique system and the evaluation processes we have in place to continue raising the quality of the provided recommendations. Drawing on our experience, we discuss the main challenges in offering a state-of-the-art recommender solution for repositories. We highlight two key limitations of the current repository infrastructure with respect to developing research recommender systems: 1) the lack of a standardised protocol and capabilities for exposing anonymised user-interaction logs, which are critically important input data for recommender systems based on collaborative filtering, and 2) the lack of a voluntary global sign-on capability in repositories, which would enable the creation of personalised recommendation and notification solutions based on past user interactions.

  1. Towards effective research recommender systems for repositories
     Petr Knoth, Lucas Anastasiou, Aristotelis Charalampous, Matteo Cancellieri, Samuel Pearce and Nancy Pontika
     CORE, Knowledge Media Institute, The Open University, United Kingdom
  2. In this talk
       • Why recommender systems for repositories
       • Building recommender systems for research
       • The CORE recommender system
       • CPA-based recommender systems
  3. Why recommender systems for repositories
  4. Discoverability of content in repositories
       • Criticism of repositories (Salo, 2008; Konkiel, 2012; Acharya, 2015)
     “[The institutional repository] is like a roach motel. Data goes in, but it doesn’t come out.” (Salo, 2008)
  5. Exploratory research activities
       • Focus on SEO, but …
       • Researchers are often involved in exploratory information-seeking activities (Marchionini, 2008):
  6. Accessibility
     Accessibility can be defined in terms of the likelihood of expressing a query that leads to the retrieval of a resource, and the position of that resource in the search results list. [Azzopardi and Vinay, 2008]
  7. Why is it needed?
     By integrating a recommender into repositories, we can increase the number of incoming links to relevant resources, thereby increasing their accessibility.
  8. What use case do research recommender systems address?
     “As a user, I want to receive recommendations about content (papers, datasets, software, people to follow, grant opportunities, methods, conferences, etc.) that is of interest to me, so I can continuously increase knowledge in my field.” [Next Generation Repositories WG, 2017]
  9. What effect can it have?
       • Increased accessibility means more engagement and a higher Click-Through Rate (CTR)
       • People access resources on CORE twice as often via its recommender system as via search.
  10. Recommender systems and social networks
       • Recsys proven to drive engagement and growth of networks
       • Enabler of successful solutions.
  11. Building recsys for research
  12. Common recsys approaches
     Collaborative filtering (CF) vs content-based filtering (CBF)
  13. Common recsys approaches
     Personalised vs non-personalised
  14. Aspects to optimise a recommender system for
       • Type of recs:
           • Novelty/recency
           • Relevance
           • Familiarity
           • Serendipity
       • Audience: post-doc, lecturer, professor, etc.
       • Use case: reviewing literature, staying in touch, learning about a new field
     More information at: http://dl.acm.org/citation.cfm?id=3057152
  15. The CORE recommender system
  16. Recommendations as a service
     CORE provides a non-personalised CBF-based recommendation system for articles from across the global network of repositories.
       • 8 million full texts
       • 77 million metadata records
       • 2683 repositories
  17. Recommendable items in the CORE recommender
     Recs delivered via:
       • a plugin (EPrints, DSpace + general repository)
       • the CORE API
  18. How does the CORE recommender work?
       • Currently an article-article recommender system
       • Preprocessing prior to recsys: enrichment with e.g. identifiers, document type and citation data
       • Similarity measure/ranking function
       • Post-filtering using record quality
       • Feedback (crowdsourcing a blacklist)
  19. What features is the similarity calculation based on?
       • Features:
           • Textual: title, authors, abstract, full text!
           • Recency: publication year
           • Popularity: citations, readership
           • Document type (thesis/article/slides)
           • Others: subject field, Citation Proximity Analysis/co-citations (in progress)
  20. Combining features
       • Evaluating different ranking functions (P, R, MAP, etc.):
           • Weights for boosting
           • Scaling function (e.g. exponential decay for recency)
       • Offline ground truths:
           • MAG citation assumption
           • MAG co-citation assumption
       • Learning to rank (not done yet)
       • Online A/B testing (not done yet)
  21. Distinctive features of the CORE recommender
       • Use of full text rather than just abstracts
       • Ensures open access availability of recommended articles (recsys for all)
       • Free service
       • Integration with repositories and availability over API
     More information: https://arxiv.org/abs/1705.00578
  22. Discussion & Challenges
       • Effective content harvesting and TDM from repositories
       • We are losing valuable interaction data due to our distributed infrastructure => voluntary global sign-on:
           • Personalisation
           • Collaborative filtering
       • Citation data: movement towards making them public
       • Online vs offline testing (Beel & Langer, 2015)
  23. CPA-based recommender systems
  24. Can we do better than co-citations?
       • Build on the idea introduced by Beel et al. [1], putting the concept of Citation Proximity Analysis (CPA) into practice.
       • Extending the co-citation assumption (“the more often two articles are co-cited in a document, the more likely they are related”) by taking proximity into account
       • Introducing and evaluating new proximity functions
  25. Method
  26. Proximity functions
       1. Co-citations baseline (no proximity information)
       2. ProxMin
       3. ProxSum
       4. ProxMean
  27. Initial evaluation
       • Corpus of 350k papers from CORE
       • 6 topics, 5 recs each, 4 systems, 10 annotators (1,200 judgements), Fleiss’s kappa
       • ProxSum improves P@5 by 25% over the baseline
  28. Conclusions
       • Research recommenders have value, but their potential is not fully realised yet
       • Data availability and interoperability needed:
           • Recommendable content/entities
           • User interaction data and profiles
       • Recommendations as a service: CORE recommender
       • Repositories should allow CF and personalisation by:
           • Exposing interaction data
           • Voluntary global sign-on
       • Both advocated by the COAR Next Generation Repositories WG
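
Slides 19, 20 and 27 mention boosting weights, an exponential-decay scaling function for recency, and precision at 5 (P@5), but the transcript gives no formulas. The following is a minimal Python sketch of how such pieces could fit together. It is not the CORE implementation: the `Candidate` fields, the weight values, the five-year half-life and the log-damped citation count are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate article with illustrative (hypothetical) feature values."""
    doc_id: str
    text_similarity: float  # assumed to be a similarity score in [0, 1]
    year: int               # publication year
    citations: int          # citation count, used as a popularity signal


def recency_boost(year: int, current_year: int, half_life_years: float = 5.0) -> float:
    """Exponential decay: the boost halves every `half_life_years` (assumed value)."""
    age = max(0, current_year - year)
    return 0.5 ** (age / half_life_years)


def score(c: Candidate, current_year: int,
          w_text: float = 1.0, w_recency: float = 0.3, w_popularity: float = 0.2) -> float:
    """Weighted sum of features; the weights are hypothetical boosting factors."""
    popularity = math.log1p(c.citations)  # dampen heavy-tailed citation counts
    return (w_text * c.text_similarity
            + w_recency * recency_boost(c.year, current_year)
            + w_popularity * popularity)


def rank(candidates: list[Candidate], current_year: int, k: int = 5) -> list[str]:
    """Return the ids of the k highest-scoring candidates, best first."""
    ranked = sorted(candidates, key=lambda c: score(c, current_year), reverse=True)
    return [c.doc_id for c in ranked[:k]]


def precision_at_k(recommended: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the top-k recommendations that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in recommended[:k] if doc_id in relevant)
    return hits / k
```

In an offline evaluation of the kind described on slide 20, `relevant` could be derived from a ground-truth assumption (for example, articles cited or co-cited with the query article), and different weight settings could be compared by their average `precision_at_k` over many query articles.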