Rec4LRW – Scientific Paper Recommender System for Literature Review and Writing

Presentation given at ICADIWT'15 on 12 February 2015 in Hong Kong

  1. REC4LRW – SCIENTIFIC PAPER RECOMMENDER SYSTEM FOR LITERATURE REVIEW AND WRITING
     Aravind Sesagiri Raamkumar, Schubert Foo & Natalie Pang
     Wee Kim Wee School of Communication and Information, Nanyang Technological University, Singapore
     Presentation for ICADIWT’15, 12th February 2015
  2. What are we concerned about?
     “How to get the best set of relevant documents for a researcher’s literature review and publication purposes?”
     How (Process) + Relevant (User-specific) + Literature Review & Publication (Requirement)
  3. RELATED AREAS OF RESEARCH
     • Literature Review: enumerates the different stages, steps and activities involved in a researcher’s literature review
     • Scientific Information Seeking: Information Behavior (IB) research has modeled user information-seeking activities at an abstract level (Case, 2012)
     • Recommender Systems (RS): the most relevant area, as an RS can collect user requirements in a flexible manner along with personalization, use the wisdom of the crowd, and provide output at any stage and in different forms (Burke, 2002)
  4. RECOMMENDER SYSTEMS (RS)
     What is a Recommender System?
     “Any system that produces individualized recommendations as output or has the effect of guiding the user in a personalized way to interesting or useful objects in a large space of possible objects” (Burke, 2002)
     (Source: IMDB.com)
     Why are Recommender Systems required?
     • Inability of Information Retrieval (IR) systems to capture contextual dimensions
     • Inability of current systems to provide personalized outputs
  5. USE OF RS IN SCHOLARLY DOMAIN
     RS have previously been used for scholarly recommendations in the following scenarios:
     • Identifying conference reviewers (Basu et al, 2001)
     • Identifying topical experts (Chen et al, 2013)
     • Identifying potential co-authors for a paper (Huynh et al, 2012)
     • Recommending similar research papers (Liang et al, 2011)
     • Recommending reading list of papers (Ekstrand et al, 2010)
     Techniques used in RS
     • Collaborative Filtering (CF)
     • Content-based (CB) recommendation algorithm (more or less IR)
     • Hybrid versions involving CF and CB, combined with techniques such as topic models, language models, and citation graphs
  6. RELATED WORK
     Recommending papers for information seeking tasks (McNee, 2006)
     • Theoretical model, “Human Recommender Interaction (HRI)”, conceptualized for recommending papers for six information seeking tasks
     • Experience level connected to RS metrics through aspects (e.g. correctness, trust)
     Recommending reading list of research papers
     • CF recommender reinforced with graph ranking algorithms (PageRank, HITS and SALSA) (Ekstrand et al, 2010)
     • Latent Dirichlet Allocation (LDA) (Jardine, 2014) and hybrid approaches based on multiple similarity measures (Bae et al, 2014)
     Finding similar papers based on a seed set of papers
     • Metadata-based similarity (Martin et al, 2013) and citation-based similarity (Liang, 2011) approaches to identify relevant papers
     • Data items such as title, abstract, keywords, bibliographic references and citation web are used
     Few online stand-alone citation RS
     • RefSeer is a citation RS built on top of CiteSeer digital library data (Hwang, 2003)
     • theadvisor is a recent online citation RS that recommends papers based on a seed set of papers (Küçüktunç, 2013)
     • Docear is a reference management tool with an inbuilt recommendation module (Beel, 2013)
  7. WHAT’S MISSING THEN?
     Plenitude of diverse techniques with different data items
     ⇒ Difficult proposition for replication
     ⇒ Lack of intermediate structure
     Lack of interconnection between sequential tasks
     ⇒ Researchers’ selection of papers evolves through tasks in a natural setting
     Use of ‘Article Type’ as a contextual dimension
     ⇒ Article type ranges from journal survey/review papers and journal case studies to conference long papers and short papers
     ⇒ Useful in shortlisting papers for inclusion in manuscripts
  8. INTRODUCING REC4LRW…
     A Scientific Paper RS for Literature Review and Writing
  9. CRITERIA USED IN REC4LRW (1)
     First set of criteria, capturing the relations between a research paper and its bibliography
     • References Count (RC)
       • This data can potentially be used to set the number of recommendations in the list provided to the user
     • Grey Literature Percentage (GL)
       • Non-scientific references which are yet to be formally published are referred to as grey literature
       • Used to calculate the extent to which grey literature references are included in papers
     • Coverage (C)
       • Measures how well the bibliography covers the important papers for the topic(s) addressed in the main paper
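The References Count (RC) and Grey Literature Percentage (GL) criteria from the slide above could be computed roughly as follows. This is only a minimal sketch: the `Reference` record, the `GREY_TYPES` set and the sample bibliography are hypothetical, since the slides do not say which reference types Rec4LRW treats as grey literature or how references are stored.

```python
from dataclasses import dataclass

# Assumption: reference types counted as grey literature. The slides only
# describe grey literature as references that are not yet formally published.
GREY_TYPES = {"technical report", "thesis", "working paper", "preprint", "web page"}


@dataclass
class Reference:
    title: str
    ref_type: str  # e.g. "journal", "conference", "technical report"


def references_count(bibliography):
    """References Count (RC): number of entries in a paper's bibliography."""
    return len(bibliography)


def grey_literature_percentage(bibliography):
    """Grey Literature Percentage (GL): share of grey references, in percent."""
    if not bibliography:
        return 0.0
    grey = sum(1 for ref in bibliography if ref.ref_type.lower() in GREY_TYPES)
    return 100.0 * grey / len(bibliography)


# Hypothetical bibliography of one paper.
bibliography = [
    Reference("Paper A", "journal"),
    Reference("Paper B", "conference"),
    Reference("Report C", "technical report"),
    Reference("Thesis D", "thesis"),
]

print(references_count(bibliography))            # 4
print(grey_literature_percentage(bibliography))  # 50.0
```

Coverage (C) is left out of the sketch because it needs a notion of "important papers for the topic" that the slides do not define.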
  10. CRITERIA USED IN REC4LRW (2)
      Second set of criteria, capturing the relations between the research paper and each reference in its bibliography
      • Recency (RE)
        • Shows how recent the referenced papers are in the bibliographies of papers
        • Calculated as the difference in years between the publication dates of the parent paper and each reference
      • Textual Similarity (TS)
        • Calculates the topical similarity between the parent paper and the references
        • Semantic Textual Similarity (STS) and Letter-pair Similarity are the preferred methods
      • Specificity (S)
        • A vertical characteristic, as it looks at the relations from a top-down perspective (similar to broad-narrow relations in thesauri)
        • Measurement will make use of the keywords specified by the author(s) in the article metadata
      • Citation Count (CC)
        • Identifies the extent to which the citation count of references is given importance in the target artefact
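Two of the criteria on the slide above lend themselves to a short illustration: Recency (a difference in publication years) and Letter-pair Similarity. The sketch below is an assumption about how they might be computed, not the authors' implementation. Averaging the year gaps in `mean_recency` is a choice made here for illustration, and `letter_pair_similarity` uses the common formulation of the measure as a Dice coefficient over adjacent letter pairs.

```python
def recency_gaps(parent_year, reference_years):
    """Difference in years between the parent paper and each reference."""
    return [parent_year - year for year in reference_years]


def mean_recency(parent_year, reference_years):
    """Average gap in years (aggregating by mean is an assumption)."""
    gaps = recency_gaps(parent_year, reference_years)
    return sum(gaps) / len(gaps) if gaps else 0.0


def letter_pairs(text):
    """Adjacent letter pairs of every whitespace-separated word, upper-cased."""
    pairs = []
    for word in text.upper().split():
        pairs.extend(word[i:i + 2] for i in range(len(word) - 1))
    return pairs


def letter_pair_similarity(a, b):
    """Dice coefficient over letter pairs: 2 * |shared pairs| / |all pairs|."""
    pairs_a = letter_pairs(a)
    pairs_b = letter_pairs(b)
    total = len(pairs_a) + len(pairs_b)
    if total == 0:
        return 0.0
    matches = 0
    for pair in pairs_a:
        if pair in pairs_b:
            pairs_b.remove(pair)  # each pair in b can be matched only once
            matches += 1
    return 2.0 * matches / total


print(recency_gaps(2015, [2014, 2010, 2002]))      # [1, 5, 13]
print(letter_pair_similarity("France", "French"))  # 0.4
```

In the second call, "FRANCE" and "FRENCH" each yield five letter pairs and share two of them (FR and NC), giving 2 × 2 / 10 = 0.4.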
  11. CRITERIA MEASUREMENT STEPS
  12. TASKS HANDLED IN REC4LRW
      Literature Review
      • Task 1: Building a reading list of research papers
      • Task 2: Finding similar papers based on a set of papers
      Manuscript Writing
      • Task 3: Shortlisting papers from the final reading list for inclusion in the manuscript, based on article type
  13. FIRST TASK IN REC4LRW
  14. SECOND TASK IN REC4LRW
  15. THIRD TASK IN REC4LRW
  16. WORKFLOW OF TASKS IN REC4LRW USER INTERFACE
  17. METHODOLOGY
      Stages
      Stage 1 (S1): Criteria measurement for the articles in the ACM dataset
      Stage 2 (S2): Building of the recommender system
      Stage 3 (S3): Offline and user evaluation
      Dataset used
      Prominent sources such as CiteSeer, ACL and CiteULike were considered
      The ACM dataset was shortlisted as it provides an extensive set of research articles from the Computer Science discipline, along with full text for the majority of papers

      Feature                                                  Periodicals   Proceedings
      Count of total articles                                  77437         84111
      Count of articles satisfying qualification requirement   19040         20022
      Period covered                                           1954-2011     1951-2011
  18. DEVELOPMENT OF REC4LRW
      Technical Details
      Databases for storage and basic querying
      • BaseX XML store and MySQL
      Implementation of CB recommender
      • Apache Lucene used for the text-search-based retrieval process of the CB recommender
      Implementation of CF recommender
      • Apache Mahout used for building the CF recommender system
      Web application for conducting the user experiments
      • A custom PHP web application will be built so that the experiment URL can be sent to participants
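To make the collaborative filtering (CF) idea behind the recommender concrete, here is a small, library-free sketch of item-based CF with cosine similarity. It does not use Apache Mahout and is not the Rec4LRW implementation. The `interactions` data, the function names and the scoring rule (summing similarities to the papers a user already has) are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical implicit feedback: which researcher has saved which paper.
interactions = {
    "u1": {"p1", "p2", "p3"},
    "u2": {"p1", "p2"},
    "u3": {"p2", "p4"},
}


def item_users(interactions):
    """Invert user -> papers into paper -> users."""
    index = {}
    for user, papers in interactions.items():
        for paper in papers:
            index.setdefault(paper, set()).add(user)
    return index


def cosine(users_a, users_b):
    """Cosine similarity between two papers, each seen as a set of users."""
    if not users_a or not users_b:
        return 0.0
    return len(users_a & users_b) / sqrt(len(users_a) * len(users_b))


def recommend(user, interactions, top_n=3):
    """Rank unseen papers by summed similarity to the user's saved papers."""
    index = item_users(interactions)
    seen = interactions.get(user, set())
    scores = {}
    for candidate, candidate_users in index.items():
        if candidate in seen:
            continue
        score = sum(cosine(candidate_users, index[paper]) for paper in seen)
        if score > 0:
            scores[candidate] = score
    # Highest score first; ties broken by paper id for a stable order.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]


print(recommend("u2", interactions))  # ['p3', 'p4']
```

For `u2`, paper `p3` ranks above `p4` because it is co-saved with both of `u2`'s papers, whereas `p4` is co-saved only with `p2`.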
  19. CLOSING REMARKS
      • Application of RS in academic databases and digital libraries provides benefits for both researchers and system designers
      • Rec4LRW addresses:
        • The whole lifecycle of scientific publication through interconnected tasks
        • With flexible recommender criteria
        • With customizable recommendation techniques
      • Offline evaluations and user evaluations will be conducted to verify the effectiveness of Rec4LRW
  20. THANK YOU
