1. Funded by the Competence Centers program (Kompetenzzentrenprogramm)
ISSI 2013 – Altmetrics 2
15 July 2013
know-center.tugraz.at
Download vs. Citation vs. Readership Data: The Case of an Information Systems Journal (RiP)*
Christian Schlögl, Juan Gorraiz, Christian Gumpenberger, Kris Jack, Peter Kraker
* Research in Progress
2. © Know-Center 2011
Introduction
Many studies have compared download and citation data
(Moed 2005, Bollen & Van De Sompel 2008, Schlögl &
Gorraiz 2011)
Possible sources for download data
Repositories/preprint archives
Open access journals
E-journals
Recently, online reference systems have received a lot of
attention as a possible source for altmetrics
A few studies have compared readership and citation data
(Bar-Ilan 2012, Li and Thelwall 2012, Kraker et al. 2012)
In this study, we compare citations, downloads, and
readership for the Journal of Strategic Information Systems
3.
Research Questions
Are the most cited articles also the most downloaded ones, and
the ones found most frequently in user libraries
of the collaborative reference management system
Mendeley?
Do citations, downloads, and readership have different
obsolescence characteristics at publication level?
Are there other features in which citation, download and
readership data differ?
4.
Data
The Journal of Strategic Information Systems (JoSIS)
“The Journal of Strategic Information Systems focuses on the
management, business and organizational issues
associated with the introduction and utilization of information
systems as a strategic tool, and considers these issues in a
global context.” http://www.journals.elsevier.com/the-journal-of-strategic-information-systems/
Period of analysis: 2002-2011; 321 documents
Data sources:
ScienceDirect (SD): monthly download data (PDF & HTML)
Scopus: monthly citation data
Mendeley: monthly additions to user libraries (full length
articles)
5.
Mendeley
Online reference management system
Organizing personal research library
Creating user profile
Reading and annotating PDFs
Forming private and public groups
Sharing of references/PDFs
Crowdsourced Mendeley research catalog
2.5 million users
428 million user documents
~75 million unique articles
http://www.mendeley.com/research-papers/
6.
Methodology
Preprocessing
Matching documents between ScienceDirect and Scopus
No unique key shared by SD and Scopus; document types differ between SD and Scopus
Matching via title, journal, vol/issue, page
Matching documents between Scopus and Mendeley via title (Levenshtein ratio 1/15.83) – found all but 5
Descriptive statistics
Document types, publication dates, downloads, readers
Correlation analysis
Downloads vs. cites, readers vs. cites, downloads vs. readers
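The title-based matching in the preprocessing step can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the slide does not define "Levenshtein ratio", so the code assumes it means edit distance divided by the length of the longer title, with 1/15.83 as the maximum allowed value. The function names (`levenshtein`, `normalize`, `match_title`) and the sample titles are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize(title: str) -> str:
    """Lower-case and collapse whitespace so formatting differences do not count."""
    return " ".join(title.lower().split())


# Assumed reading of the slide's "Levenshtein ratio 1/15.83":
# distance / length of the longer title must not exceed this value.
MAX_RATIO = 1 / 15.83


def match_title(title, candidates, max_ratio=MAX_RATIO):
    """Return the closest candidate title, or None if none is close enough."""
    query = normalize(title)
    best, best_ratio = None, None
    for candidate in candidates:
        target = normalize(candidate)
        longest = max(len(query), len(target)) or 1
        ratio = levenshtein(query, target) / longest
        if best_ratio is None or ratio < best_ratio:
            best, best_ratio = candidate, ratio
    if best_ratio is not None and best_ratio <= max_ratio:
        return best
    return None


# Hypothetical titles, for illustration only.
catalog = ["Strategic Alignment of Information Systems.", "An unrelated paper"]
print(match_title("Strategic alignment of information systems", catalog))
```

Under this reading, a title may differ from its match by roughly one character per 16 characters of length, which tolerates small punctuation or spelling differences between databases.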
7.
Results
Downloads per document type
Full length articles (FLAs) are the most downloaded document type (94.1% of all downloads)
All other document types are downloaded at a considerably lower level
Document type          n   % docs   % downloads   Downloads per doc (relational)
Announcement           5     1.6%          0.4%    5.9
Book review            4     1.2%          0.3%    5.5
Contents list         29     9.0%          0.4%    1.0
Editorial Board       29     9.0%          0.6%    1.5
Editorial             49    15.3%          3.3%    4.6
Erratum                1     0.3%          0.1%    5.7
Full length article  181    56.4%         94.1%   35.4
Index                 12     3.7%          0.2%    1.3
Miscellaneous          9     2.8%          0.2%    1.8
Publishers note        2     0.6%          0.2%    7.0
All                  321     100%          100%
Source: ScienceDirect; n=321
8.
Results
Print publication delay
FLAs are published online more than 1.5 months before print
publication on average.
Document type          n   Online date - print publication date (mean days)
Announcement           5    -13.2
Book review            4    -40.5
Contents list         29     12.9
Editorial Board       29     12.9
Editorial             49      9.0
Erratum                1   -145.0
Full length article  181    -49.8
Index                 12     -4.9
Miscellaneous          9     32.9
Publishers note        2    -13.0
All                  321    -24.9
Source: ScienceDirect; n=321
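The overall figure of -24.9 days is consistent with a document-weighted mean of the per-type values. The short check below uses only the numbers from the table; the variable names are illustrative.

```python
# (document type, n, mean days between online date and print publication date)
rows = [
    ("Announcement", 5, -13.2),
    ("Book review", 4, -40.5),
    ("Contents list", 29, 12.9),
    ("Editorial Board", 29, 12.9),
    ("Editorial", 49, 9.0),
    ("Erratum", 1, -145.0),
    ("Full length article", 181, -49.8),
    ("Index", 12, -4.9),
    ("Miscellaneous", 9, 32.9),
    ("Publishers note", 2, -13.0),
]

total_docs = sum(n for _, n, _ in rows)

# Weight each type's mean by its number of documents.
weighted_mean = sum(n * days for _, n, days in rows) / total_docs

print(total_docs, round(weighted_mean, 1))  # 321 -24.9
```

Because FLAs make up more than half of the documents, their early online availability dominates the overall mean.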
9.
Results
Downloads per publication year (relational)
Download maximum in many cases 1 year after publication
Most downloads in a single year for FLAs published in 2011
                 DL-year
PY      n    2002  2003  2004  2005  2006  2007  2008  2009  2010  2011     all  DL/FLA
2002   13     1.0   2.3   1.7   1.3   1.2   1.4   2.4   2.8   2.8   2.7    19.6    7.4x
2003   21     0.0   1.3   2.2   1.0   1.0   0.9   1.5   1.3   1.5   1.1    11.9    2.8x
2004   17       -     -   1.7   2.6   2.1   2.2   2.4   2.7   2.9   2.3    18.9    5.5x
2005   18       -     -     -   1.7   2.3   1.8   2.0   2.4   2.6   2.2    15.0    4.1x
2006   14       -     -     -   0.2   2.4   2.1   1.8   2.1   2.0   2.0    12.5    4.4x
2007   18       -     -     -     -   0.0   2.7   3.6   3.4   3.5   2.9    16.1    4.4x
2008   16       -     -     -     -     -   0.0   2.9   3.5   3.0   2.4    11.8    3.6x
2009   14       -     -     -     -     -     -     -   3.1   4.0   3.1    10.2    3.6x
2010   21       -     -     -     -     -     -     -     -   3.9   4.4     8.3    2.0x
2011   29       -     -     -     -     -     -     -     -   0.3   5.6     5.9    1.0x
all   181     1.0   3.7   5.6   6.8   8.9  11.1  16.6  21.4  26.4  29.0   130.4
Source: ScienceDirect; FLA only (n=181)
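The DL/FLA column appears to express downloads per FLA for each publication year relative to the 2011 cohort (1.0x). That reading is an assumption, since the slide does not define the column; the sketch below tests it using the "n" and "all" columns from the table.

```python
# publication year -> (number of FLAs, relational downloads over all years)
downloads = {
    2002: (13, 19.6), 2003: (21, 11.9), 2004: (17, 18.9), 2005: (18, 15.0),
    2006: (14, 12.5), 2007: (18, 16.1), 2008: (16, 11.8), 2009: (14, 10.2),
    2010: (21, 8.3), 2011: (29, 5.9),
}

# Downloads per FLA for each cohort.
per_fla = {py: total / n for py, (n, total) in downloads.items()}

# Assumed normalization: express each cohort relative to the 2011 cohort.
baseline = per_fla[2011]
relative = {py: round(value / baseline, 1) for py, value in per_fla.items()}

for py, factor in relative.items():
    print(py, f"{factor}x")
```

Under this assumption the computed factors agree with the table for every year except 2010, which comes out at 1.9x rather than 2.0x; this may be a rounding effect, because the inputs are themselves rounded to one decimal.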
10.
Results
Citations per document type
Different document types in Scopus and ScienceDirect (FLA ≈
articles + conference papers + reviews)
Ca. 25% of all documents not cited (primarily editorials,
conference papers and recent publications)
Doc type           no. docs   % uncited   Cites   Cites per doc type
Article                 151         15%    2563   14.8
Conference paper         13         69%       8    0.4
Editorial                33         79%      13    0.2
Review                   18          6%     383   20.2
All                     215         27%    2967   10.9
Source: Scopus; n=215
11.
Results
Citations per publication year
Only a few documents are cited in their publication year; the citation
maximum is reached several years after publication
In contrast, downloads reach their maximum in the year of
publication or one year later
                  Citation year
Pub year    n   2002  2003  2004  2005  2006  2007  2008  2009  2010  2011    all   cites per doc
2002       13      2    19    38    69    88   105   158   165   194   199   1037   79.8
2003       14      -     1     6    21    27    39    35    41    40    39    249   17.8
2004       17      -     -     0    15    40    56    74    78    88   107    458   26.9
2005       19      -     -     -     0    16    46    78    76    93    99    408   21.5
2006       14      -     -     -     1     2    14    31    31    53    49    181   12.9
2007       18      -     -     -     -     -     1    31    74    92    85    283   15.7
2008       15      -     -     -     -     -     -     3    30    69    83    185   12.3
2009       14      -     -     -     -     -     -     -     3    34    57     94    6.7
2010       18      -     -     -     -     -     -     -     -     5    40     45    2.5
2011        8      -     -     -     -     -     -     -     -     -    14     14    1.8
all       150      2    20    44   106   173   261   410   498   668   772   2954
Source: Scopus; Document types: articles, reviews, conference papers; only cited documents
(n=150)
Table annotations (callouts on the slide):
Special Issue on “Trust in the Digital Economy”
Special Issue with conference papers
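The statement that the citation maximum is reached several years after publication can be checked directly against the table. The sketch below computes, for three publication years, the lag between publication and the year with the most citations. The counts are copied from the table; the helper `peak_lag` is illustrative.

```python
def peak_lag(pub_year, first_year, counts):
    """Years between publication and the (first) year with the most citations.

    `counts` holds yearly citation counts starting at `first_year`.
    """
    peak_index = max(range(len(counts)), key=lambda i: counts[i])
    return (first_year + peak_index) - pub_year


# publication year -> (first citation year in the table, yearly counts up to 2011)
cohorts = {
    2003: (2003, [1, 6, 21, 27, 39, 35, 41, 40, 39]),
    2006: (2005, [1, 2, 14, 31, 31, 53, 49]),
    2007: (2007, [1, 31, 74, 92, 85]),
}

lags = {py: peak_lag(py, first, counts) for py, (first, counts) in cohorts.items()}
print(lags)  # {2003: 6, 2006: 4, 2007: 3}
```

The observation window ends in 2011, so a peak in a late year may still be overtaken by later citations; the lags are lower bounds in that sense.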
12.
Results
Readers per print publication year
Mendeley's relative youth (established 2008) and the strong growth of its user
base since then (now 2.5 million) make obsolescence analyses
difficult; weighting by user/document growth is needed.
                  Readership years
Pub year    n   2008  2009  2010  2011  2012 (to July)    all   Readers per doc
2002       13      7    30   126   245             183    591   45.5
2003       21      1    29    58   108             145    341   17.1
2004       17     11    36   107   158             165    477   28.1
2005       18      2    31    79   141             151    404   23.8
2006       14      6    39    88   128             148    409   29.2
2007       18      4    45   129   222             209    609   35.8
2008       16      7    36    99   182             164    488   32.5
2009       14      0    27   111   127             150    415   29.6
2010       21      0     0    84   238             191    513   24.4
2011       29      0     0     4   208             282    494   17.6
all       181     38   273   885  1757            1852   4741
Source: Mendeley; FLA only (n=181)
13.
Results
Downloads vs. readers vs. cites (only FLAs)
Moderate to high correlation (Spearman) between downloads and readers (r=0.73)
and between downloads and citations (r=0.77)
Moderate correlation between citations and readers (r=0.51)
[Figure: three scatter plots: downloads vs. readers (r=0.73, n=181), downloads vs. cites (r=0.77, n=151), readers vs. cites (r=0.51, n=151)]
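The slides report Spearman correlations, which compare the rank order of two measures rather than their raw values. A minimal, dependency-free sketch of that computation follows. It is not the authors' code, and the sample numbers are made up for illustration; they are not JoSIS data.

```python
def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-article counts, for illustration only.
sample_downloads = [120, 340, 90, 410, 275]
sample_cites = [4, 15, 6, 30, 9]
print(round(spearman(sample_downloads, sample_cites), 2))
```

Working on ranks makes the measure robust to the skewed distributions typical of download and citation counts, where a few articles attract most of the activity.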
14.
Results
Readership structure of Mendeley articles
2/3 of readership counts come from students
Researchers + Post Docs + Profs ≈ 1/4 of all readership counts
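[Pie chart: share of readership counts by Mendeley user category]
Slice labels as extracted: 32%, 7%, 19%, 6%, 5%, 5%, 2%, 5%, 3%, 3%, 5%, 3%, 4%, 1%, 0%
Legend: Student (PhD), Student (doctoral), Student (MA), Student (postgr.), Student (BA), Lecturer, Sen. Lecturer, Researcher (academic), Researcher (non-academic), Post Doc, Assist. Prof., Assoc. Prof., Prof., other, Librarian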
Source: Mendeley; doc type: FLA; n=4741
15.
Conclusions
Comparison of different measures not always easy
Different obsolescence characteristics of downloads and
cites (readership to be determined)
Moderate to high correlation between downloads and cites
Moderate correlation between cites and readership data
For representative usage measures, we need to understand
their characteristics on a large scale
To fully understand usage and impact of an article, it will be
important to have many complementary measures with
transparent biases
On the one hand, we need open bibliometric data, on the
other hand, we need a better understanding of the research
process
16.
Thank you very much for your
attention!
Christian Schlögl, Juan Gorraiz,
Christian Gumpenberger, Kris Jack,
Peter Kraker
pkraker@know-center.at