Stefanie Haustein & Vincent Larivière (2014). Mendeley as a Source of Readership by Students and Postdocs? Evaluating Article Usage by Academic Status. Presentation at IATUL 2014. http://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=2033&context=iatul
1. Mendeley as a Source of Readership by Students and Postdocs? Evaluating Article Usage by Academic Status
Stefanie Haustein & Vincent Larivière
stefanie.haustein@umontreal.ca
@stefhaustein
2. Introduction
Measuring use of scholarly documents
• reshelving, interlibrary loan
• citation analysis
• electronic full text access
• social reference managers and bookmarking systems
3. Introduction
Research questions:
• Can Mendeley readership counts be used to monitor use
of scholarly documents?
• Does use differ between scientific fields?
• Can different user sectors and user types be identified
based on the academic status?
• Can the data be used to determine whether specific user
groups predict citation impact?
5. Introduction
• 2.8 million users, 275,860 groups, 535 million user documents (02/2014)
• monthly growth rate of 3.7% (documents) and 2.3% (users) in 2013
• 68 million unique publications (08/2012; 281 million user
documents)
Mendeley statistics based on monthly user counts from 10/2010 to 02/2014 on the Mendeley website accessed through the Internet Archive
6. Data sets & methods
• 1,161,145 PubMed papers covered by WoS
• publication years: 2010-2012
• document types: articles & reviews
• NSF disciplines: Biomedical Research, Clinical Medicine,
Health, Psychology (journal-based classification)
• open citation window
• Mendeley readership data collected via API
• Levenshtein distance (5%) to account for errors in
metadata
• document title (long titles = 70 characters, 5 words)
• document title and first author name (short titles)
1.7% false positives, 0.7% false negatives
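The matching step on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that "5%" means an edit distance of at most 5% of the length of the compared string, and that a "long" title has at least 70 characters and at least 5 words. The function names `levenshtein`, `is_long_title` and `same_paper` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace before comparing."""
    return " ".join(text.lower().split())


def is_long_title(title: str) -> bool:
    # Assumption: "long" means at least 70 characters and at least 5 words.
    return len(title) >= 70 and len(title.split()) >= 5


def same_paper(wos_title, wos_author, mendeley_title, mendeley_author,
               tolerance=0.05):
    """Match on title alone for long titles, on title plus first author otherwise."""
    a, b = normalize(wos_title), normalize(mendeley_title)
    if not is_long_title(wos_title):
        a += " " + normalize(wos_author)
        b += " " + normalize(mendeley_author)
    # Assumption: the 5% threshold is relative to the longer compared string.
    allowed = tolerance * max(len(a), len(b))
    return levenshtein(a, b) <= allowed
```

With these assumptions, a long title with a single typo still matches, while two records with the same short title but different first authors do not.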
7. Data sets & methods
• aggregating reader counts of multiple entries
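A minimal sketch of this aggregation step, assuming each matched Mendeley entry is available as a (paper identifier, reader count) pair. The identifiers and counts below are invented for illustration.

```python
from collections import defaultdict


def aggregate_readers(matches):
    """Sum the reader counts of all Mendeley entries matched to the same paper."""
    totals = defaultdict(int)
    for paper_id, readers in matches:
        totals[paper_id] += readers
    return dict(totals)


# Hypothetical input: paper "PMID:1" was found as two separate Mendeley entries.
matches = [("PMID:1", 12), ("PMID:1", 3), ("PMID:2", 7)]
print(aggregate_readers(matches))  # {'PMID:1': 15, 'PMID:2': 7}
```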
8. Data sets & methods
• number of readers per academic status
• number of missing readership status per paper
Example (one paper):
7 PhD students (29%)
5 Master students (21%)
2 Doctoral students (8%)
14 readers with readership status available (58%)
10 readers with missing readership status (42%)
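The example on this slide can be reproduced with a few lines of arithmetic. The sketch assumes that the total reader count of the paper is known (24, the sum of the 14 available and 10 missing readers) and that only the three largest status groups are reported. The function name `status_shares` is hypothetical.

```python
def status_shares(total_readers, top_statuses):
    """Return readers with known status, readers with missing status, and shares."""
    known = sum(top_statuses.values())
    missing = total_readers - known
    shares = {status: n / total_readers for status, n in top_statuses.items()}
    return known, missing, shares


known, missing, shares = status_shares(
    24, {"PhD students": 7, "Master students": 5, "Doctoral students": 2}
)
print(known, missing)  # 14 10
for status, share in shares.items():
    print(f"{status}: {share:.0%}")
print(f"available: {known / 24:.0%}, missing: {missing / 24:.0%}")
```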
9. Data sets & methods
• aggregating academic status information
10. Results: disciplines
• two-thirds of papers saved at least once on Mendeley
• reader rate comparable to citation rate
• Spearman correlations between citations and reader counts
moderately positive (ρ=0.445** for all documents / ρ=0.512** for documents with readers)
• academic status not available for 30% of reader counts
Columns: NSF discipline; papers (PubMed & WoS); mean citation rate; papers with readers (%, mean reader rate, mean citation rate, ρ); readership status (available, missing)
all disciplines 1,161,145 7.5 65.9% 9.6 8.9 0.512 ** 70.0% 30.0%
Biomedical Research 286,398 10.3 72.4% 14.3 11.8 0.575 ** 69.5% 30.5%
Clinical Medicine 779,707 6.8 62.8% 7.6 8.2 0.492 ** 70.5% 29.5%
Health 59,073 4.4 67.0% 6.5 4.3 0.434 ** 72.8% 27.2%
Psychology 35,967 6.1 81.0% 14.0 6.6 0.545 ** 67.5% 32.5%
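The ρ values on this slide are Spearman rank correlations between citation and reader counts. As a rough sketch of how such a value is obtained, the snippet below ranks both variables (tied values share the mean rank) and computes the Pearson correlation of the ranks. The citation and reader counts are invented, and the helper names are hypothetical; this is not the authors' code.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical citation and reader counts for five papers.
citations = [0, 2, 5, 9, 30]
readers = [1, 0, 8, 7, 40]
print(round(spearman(citations, readers), 3))  # 0.8
```

Rank-based correlation is a common choice for citation and readership counts because both distributions are highly skewed.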
11. Results: specialties
Differences between specialties
Columns: NSF discipline or specialty; papers with readers (%); mean reader rate; ρ
all disciplines 65.9% 9.6 0.512 **
Biomedical Research 72.4% 14.3 0.575 **
Anatomy & Morphology 68.2% 5.5 0.380 **
Biochem & Mol Biol 71.6% 12.4 0.550 **
Biomedical Engineering 74.9% 10.4 0.513 **
Biophysics 78.6% 11.8 0.537 **
Cell Biol, Cytol & Histol 74.7% 14.3 0.584 **
Embryology 79.2% 13.2 0.649 **
Gen Biomed Research 72.5% 35.1 0.689 **
Genetics & Heredity 74.1% 17.3 0.558 **
Microbiology 72.7% 10.4 0.555 **
Microscopy 72.5% 6.7 0.494 **
Misc Biomed Research 74.3% 8.8 0.585 **
Nutrition & Dietetic 66.9% 6.6 0.494 **
Parasitology 66.0% 6.1 0.436 **
Physiology 72.1% 8.0 0.457 **
Virology 68.9% 7.1 0.534 **
Figure: Biomedical Research specialties. Size of data points represents mean reader rate. Labeled points: General Biomedical Research, Anatomy & Morphology, Embryology.
12. Results: specialties
Differences between specialties
Size of data points represents mean reader rate.
Columns: NSF discipline or specialty; papers with readers (%); mean reader rate; ρ
all disciplines 65.9% 9.6 0.512 **
Clinical Medicine 62.8% 7.6 0.492 **
Addictive Diseases 68.2% 5.8 0.436 **
Allergy 69.8% 8.3 0.582 **
Anesthesiology 63.0% 6.8 0.497 **
Arthritis & Rheumatology 63.3% 6.3 0.488 **
Cancer 62.8% 7.3 0.550 **
Cardiovascular System 56.6% 7.4 0.555 **
Dentistry 68.5% 5.6 0.398 **
Dermat & Venereal Dis 51.3% 4.2 0.433 **
Endocrinology 64.4% 7.1 0.518 **
Environ & Occupat Health 66.1% 6.9 0.501 **
Fertility 64.4% 4.3 0.417 **
Gastroenterology 58.1% 6.0 0.508 **
Gen & Internal Medicine 51.8% 8.2 0.519 **
Geriatrics 73.5% 7.5 0.494 **
Hematology 59.5% 6.9 0.557 **
Immunology 65.8% 9.1 0.561 **
Misc Clinical Medicine 70.6% 9.1 0.458 **
Labeled points in figure: Psychiatry, Neurology & Neurosurgery, Veterinary Medicine
13. Results: specialties
Differences between specialties
Size of data points represents mean reader rate.
Columns: NSF discipline or specialty; papers with readers (%); mean reader rate; ρ
all disciplines 65.9% 9.6 0.512 **
Clinical Medicine 62.8% 7.6 0.492 **
Nephrology 63.9% 5.3 0.458 **
Neurol & Neurosurgery 73.1% 13.6 0.554 **
Obstetrics & Gynecology 60.4% 4.3 0.420 **
Ophthalmology 63.0% 4.4 0.486 **
Orthopedics 66.0% 6.9 0.449 **
Otorhinolaryngology 59.7% 4.1 0.383 **
Pathology 60.1% 5.3 0.503 **
Pediatrics 62.0% 5.8 0.469 **
Pharmacology 63.4% 6.5 0.501 **
Pharmacy 55.9% 4.8 0.405 **
Psychiatry 72.1% 9.2 0.583 **
Radiol & Nucl Medicine 63.9% 6.8 0.467 **
Respiratory System 65.1% 6.8 0.487 **
Surgery 58.0% 4.2 0.420 **
Tropical Medicine 65.4% 5.8 0.478 **
Urology 54.8% 4.1 0.432 **
Veterinary Medicine 66.3% 7.5 0.236 **
Labeled points in figure: Psychiatry, Neurology & Neurosurgery, Veterinary Medicine
14. Results: specialties
Differences between specialties
Size of data points represents mean reader rate. Labeled point in figure: Geriatrics & Gerontology.
Columns: NSF discipline or specialty; papers with readers (%); mean reader rate; ρ
all disciplines 65.9% 9.6 0.512 **
Health 67.0% 6.5 0.434 **
Geriatrics & Gerontology 69.8% 7.3 0.540 **
Health Policy & Services 66.1% 6.8 0.421 **
Nursing 62.0% 5.1 0.378 **
Public Health 66.0% 6.0 0.439 **
Rehabilitation 73.0% 8.0 0.434 **
Social Sciences, Biomed 76.0% 9.2 0.495 **
Social Studies of Med 49.5% 3.1 0.281 **
Speech-Lang Path & Audio 79.0% 7.7 0.436 **
Psychology 81.0% 14.0 0.545 **
Behav Sci & Compl Psych 83.4% 12.2 0.503 **
Clinical Psychology 80.7% 11.1 0.536 **
Develop & Child Psych 80.3% 13.2 0.531 **
Experimental Psychology 85.6% 19.2 0.582 **
General Psychology 68.5% 9.3 0.493 **
Human Factors 84.2% 9.2 0.434 **
Misc Psychology 79.3% 11.4 0.531 **
Psychoanalysis 39.5% 3.6 0.137
Social Psychology 82.4% 24.8 0.687 **
Labeled points in figure: Social Studies of Medicine, Psychoanalysis, Social Psychology
16. Results: sectors
• Biomedical Research papers mostly used by readers from
scientific sector
• more professionals in Clinical Medicine
• more educational and professional users in Health
• more educational, less professional users in Psychology
Columns: NSF discipline; papers with readers (%, mean reader rate, mean citation rate, ρ); sector type of readership status (Scientific, Educational, Professional, missing)
all disciplines 65.9% 9.6 8.9 0.512 ** 48.5% 15.7% 5.8% 30.0%
Biomedical Research 72.4% 14.3 11.8 0.575 ** 54.9% 12.0% 2.6% 30.5%
Clinical Medicine 62.8% 7.6 8.2 0.492 ** 44.2% 17.6% 8.7% 29.5%
Health 67.0% 6.5 4.3 0.434 ** 38.1% 27.3% 7.4% 27.2%
Psychology 81.0% 14.0 6.6 0.545 ** 46.6% 19.0% 1.8% 32.5%
17. Results: sectors
Spearman correlation between citations and reader counts
Figure: percentage of readers per sector (Professional, Educational, Scientific, missing; axis 0% to 100%) and Spearman's ρ (axis 0.0 to 1.0) for each NSF specialty, listed from Veterinary Medicine to Gen Biomed Research. Trend line: y = 0.0031x + 0.3823, R² = 0.433.
20. Results: users
Biomedical Research: ρ between citations and reader counts per reader group
Columns: reader group; all documents (n=207,255); documents with 100% available reader status (n=80,858)
all readers 0.575** 0.234**
Postdoc 0.559** 0.135**
PhD Student 0.534** 0.183**
Researcher (Academic) 0.478** 0.059**
Student (Postgraduate) 0.435** 0.071**
Researcher (Non-Academic) 0.426** 0.059**
Professor 0.410** 0.066**
Assistant Professor 0.396** 0.074**
Student (Bachelor) 0.353** 0.049**
Associate Professor 0.318** 0.042**
Other Professional 0.224** 0.040
Librarian 0.089** 0.051
All documents
• postdocs and PhD students most similar to citations
• librarians least similar
100% status info
• PhD students and postdocs most similar to citations
• other professionals and associate professors least similar
21. Results: users
Clinical Medicine: ρ between citations and reader counts per reader group
Columns: reader group; all documents (n=489,597); documents with 100% available reader status (n=258,656)
all readers 0.492** 0.238**
Researcher (Academic) 0.451** 0.075**
Researcher (Non-Academic) 0.425** 0.093**
PhD Student 0.410** 0.174**
Postdoc 0.408** 0.121**
Assistant Professor 0.364** 0.067**
Professor 0.361** 0.059**
Associate Professor 0.317** 0.056**
Other Professional 0.300** 0.079**
Student (Postgraduate) 0.183** 0.050**
Student (Bachelor) 0.137** 0.030**
Librarian 0.055** 0.029**
All documents
• researchers most similar to citations
• librarians least similar
100% status info
• PhD students and postdocs most similar to citations
• Bachelor students and librarians least similar
22. Results: users
Health: ρ between citations and reader counts per reader group
Columns: reader group; all documents (n=39,564); documents with 100% available reader status (n=19,955)
all readers 0.434** 0.196**
PhD Student 0.340** 0.127**
Researcher (Academic) 0.329** 0.099**
Researcher (Non-Academic) 0.320** 0.038
Postdoc 0.307** 0.000
Student (Postgraduate) 0.282** 0.093**
Professor 0.280** 0.021
Assistant Professor 0.276** 0.004
Other Professional 0.266** 0.076**
Associate Professor 0.250** 0.044
Student (Bachelor) 0.214** 0.058**
Librarian 0.083** -0.028
All documents
• low correlations
• PhD students, researchers and postdocs most similar
100% status info
• PhD students most similar to citations
• no similarity for librarians and postdocs
23. Results: users
Psychology: ρ between citations and reader counts per reader group
Series: all documents (n=29,121); documents with 100% available reader status (n=7,932)
Reader groups as listed on the slide: all readers, PhD Student, Postdoc, Student (Postgraduate), Professor, Researcher (Academic), Assistant Professor, Student (Bachelor), Researcher (Non-Academic), Other Professional, Associate Professor, Librarian
Bar values as listed on the slide: 0.545**, 0.480**, 0.480**, 0.425**, 0.403**, 0.400**, 0.368**, 0.356**, 0.321**, 0.299**, 0.189**, 0.282**, 0.158**, 0.125**, 0.082**, 0.070, 0.052, 0.076*, 0.048, 0.107**, 0.037, 0.120**, 0.048, -0.069
All documents
• PhD students and postdocs most similar to citations
• librarians least similar
100% status info
• PhD students, postdocs and other professionals most similar to citations
• negative correlation for librarians
24. Conclusions: general results
• Mendeley is an important source of data on document usage
• 2.8 million users, 535 million user documents
• 65.9% of sampled documents saved 9.6 times on average
• reader counts reflect similar but broader use of scholarly
documents than citations
• Spearman’s ρ = 0.445**/0.512**
• PhD students, postgraduate students and postdocs are the largest
user groups, librarians the smallest
26. Limitations
• metadata quality
• academic status self-reported
need to verify whether accurate and up-to-date
• restriction to the top 3 user groups per paper
• differences between user groups cannot be determined due
to data restriction
• similarity with citation patterns of different user groups cannot
be accurately determined
• even more problematic for countries and disciplines
complete data needed for detailed and accurate statistics
27. Thank you for your attention!
Stefanie Haustein
Questions?
stefanie.haustein@umontreal.ca
@stefhaustein
Editor's Notes
reshelving, interlibrary loan: + actual use; - time-consuming, local
citation analysis: + global, fast; - captures only specific use by citing author
full text access, downloads of electronic papers: + automatic, fast, broad definition of use; - manipulable, not available
Social reference managers and bookmarking systems
organization of scientific literature
Suggested to provide user statistics of scholarly documents
linear growth
purchase by Elsevier in April 2013 no visible effect
highest coverage in Psychology
high coverage and high reader rate: broad distribution and intensive use
particularly high reader rate in Psychology (14.0 vs. 6.6 citation rate) and Biomedical Research (14.3)
moderate positive correlation with citations
0.512** = only documents with readers (765,537)
0.445** = all documents (1.16 million)
intensity, coverage, correlation with citations associated
high coverage in Biomedical Research: Embryology, Biophysics
high reader rate and correlation with citation in General Biomedical Research
low reader rate and correlation with citations for Anatomy & Morphology
high coverage in Clinical Medicine: Geriatrics, Neurology & Neurosurgery
high reader rate and correlation with citation in Neurology & Neurosurgery, Immunology, Allergy
Veterinary Medicine: still a high reader rate but a low correlation
papers are used intensively, but not the ones that are frequently cited
Scientific: 48.5% - 62.7%
Educational: 15.7% - 24.4%
Professional: 5.8% - 12.9%
no changes, safe to assume that missing readers are distributed almost equally
Microscopy most scientific (63.2%), Veterinary Medicine least (27.7%)
Veterinary Medicine most professionals (18.6%), Experimental Psychology (0.7%) least
Dentistry most Educational (32.5%), General Biomedical Research least (8.6%)
Pearson between % and Spearman correlations:
% Scientific 0.352
% Educational -0.561
% Professional -0.485
% (Educational + Professional) -0.669
If there are more practical readers (not doing research), correlations are lower
PhD students majority (29.5% - 33.2%)
Postgraduate students / postdocs
Academic researcher / other professional / assistant professor / Bachelor’s student / non-academic researcher
professor / associate professor
librarians smallest group (0.4% - 1.4%)
Because of the restriction to the top 3, however, the distribution of the other user groups cannot be clearly determined. Unlike on the level of sectors, rankings change with the percentage of available readership status. Considering only those papers with 100% of readership status available, we observe that 33.2% of users are PhD students, 17.7% postgraduate students, 11.1% postdocs, 7.2% researchers at an academic institution, 7.0% other professionals, 5.5% assistant professors, 4.7% Bachelor students, 4.5% researchers at a non-academic institution, 4.0% professors, 3.6% associate professors and 1.4% librarians. Since readership information is completely lost for all but the top 3 user groups per paper, those user groups that are frequently cut off are underestimated. As the change of ranks between postgraduate students and postdocs, and between academic researchers, other professionals, assistant professors and Bachelor students in Figure 3 suggests, this underestimation affects user groups differently.
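The effect of the top 3 restriction can be illustrated with a toy example. The reader distributions below are invented, and the helper names are hypothetical; the sketch only shows the mechanism, not the size of the bias in the real data. Groups that rarely rank among the three largest per paper lose readers or disappear entirely, while the share of the dominant group is inflated.

```python
from collections import Counter


def truncate_to_top3(status_counts):
    """Keep only the three largest status groups of one paper."""
    return dict(Counter(status_counts).most_common(3))


def group_shares(papers):
    """Share of each status group among all readers with a known status."""
    totals = Counter()
    for counts in papers:
        totals.update(counts)
    all_readers = sum(totals.values())
    return {status: n / all_readers for status, n in totals.items()}


# Hypothetical complete reader distributions for three papers (invented numbers).
papers = [
    {"PhD Student": 6, "Postdoc": 3, "Professor": 2, "Librarian": 1},
    {"PhD Student": 5, "Student (Postgraduate)": 4, "Postdoc": 2, "Professor": 1},
    {"PhD Student": 4, "Postdoc": 3, "Professor": 2, "Librarian": 1},
]

true_shares = group_shares(papers)
visible_shares = group_shares([truncate_to_top3(p) for p in papers])

for status in sorted(true_shares, key=true_shares.get, reverse=True):
    print(f"{status}: true {true_shares[status]:.1%}, "
          f"visible {visible_shares.get(status, 0.0):.1%}")
```

In this toy data, librarians vanish from the visible distribution and professors are underestimated, whereas PhD students appear more dominant than they are.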