Comparing Published Scientific
Journal Articles
to Their Pre-print Versions
Martin Klein Peter Broadwell
@mart1nkle1n @peterbroadwell
with Sharon E. Farb and Todd Grappone
@farbthink, @liber8er
{martinklein,broadwell,farb,grappone}@library.ucla.edu
University of California Los Angeles
Comparing Published Scientific Journal Articles
to Their Pre-print Versions
@mart1nkle1n #jcdl2016, Newark, NJ, 06/21/2016
Scientific Output in Numbers
Global STM publishing market > $25 billion
• 55% of this from USA
• 28% from Europe, Middle East
• Journals core part of scholarly communication process
• English language journal revenue: ~ $10 billion
• ~ 70% of that out of libraries’ budget
• > 28k scholarly peer-reviewed journals (+3.5% p.a.)
• ~ 2.5 million articles per year (+3% p.a.)
• 21% of research papers from USA
“STM Report: An Overview of Scientific and Scholarly Publishing”, Mark Ware and Michael Mabe, March 2015
University of California Publication Impact
“Research Performance of the UC System,” Elsevier, March 2015
Open Access by Disciplines
“Open Access to the Scientific Journal Literature: Situation 2009”, Björk B-C et al. 2010
http://dx.doi.org/10.1371/journal.pone.0011273
Open Access Rate Overall
2010
“Open Access to the Scientific Journal Literature: Situation 2009”, Björk B-C et al.
(http://dx.doi.org/10.1371/journal.pone.0011273)
→ 20.4% OA rate
2015
“Open Access and Sources of Full-Text Articles in Google Scholar in Different
Subject Fields”, Jamali and Nabavi
(http://dx.doi.org/10.1007/s11192-015-1642-2)
→ 61.1% OA rate
Pre-print v. Final Published
arXiv.org
• Average annual operating cost for 2013 - 2017:
$826,000
Final Published
• English language STM journals: $10 billion in 2013
http://arxiv.org/help/support/faq#3D
“STM Report: An Overview of Scientific and Scholarly Publishing”, Mark Ware and Michael Mabe, March 2015
Role of Publisher
• Entrepreneur
• Copyediting
• Tagging
• Marketer
• Distributor
• E-Host
Value of Publisher
“Once you’ve gone through the peer review process, if you look
at the article that is actually published in a journal, it looks
radically different [to the one submitted due to] that process of
transformation, the copy-editing, the database linking, the data
visualisation tools, making sure that the metadata for the article
is all right, so when people come to [Elsevier database]
ScienceDirect or type a search into Google, they can actually
find what they are looking for on their platforms.”
Gemma Hersh
http://www.thebookseller.com/news/elsevier-defends-its-value-after-open-access-disputes-328037
Working Assumptions
1. If the publishers’ argument is valid, the text of a
pre-print paper should vary significantly from its
corresponding post-print version.
2. By applying standard similarity measures, we
should be able to detect and quantify such
differences.
Assembling a pre-print corpus
Source: arXiv.org
• 1.1 million publication records
• Metadata (typical DC, including DOI) obtained
via OAI-PMH interface
• PDF versions of articles available via Amazon’s
S3 service (using “requester pays” option)
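The metadata step above could be sketched as follows. This is a minimal illustration, not the authors' harvesting code: it assumes the Dublin Core record carries the DOI as a `dc:identifier` value prefixed with `doi:`, and the sample record, its identifier, and the function name `extract_dois` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OAI-PMH and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, hand-written record for illustration only. A real response
# would come from a ListRecords request with metadataPrefix=oai_dc.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:0000.00000</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An Example Title</dc:title>
          <dc:identifier>http://example.org/abs/0000.00000</dc:identifier>
          <dc:identifier>doi:10.1234/example.doi</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_dois(xml_text):
    """Map each record identifier to its DOI, or None when no DOI is found.

    Assumption: the DOI appears as a dc:identifier starting with "doi:".
    """
    root = ET.fromstring(xml_text)
    result = {}
    for record in root.iter("{%s}record" % NS["oai"]):
        rec_id = record.findtext("oai:header/oai:identifier", namespaces=NS)
        doi = None
        for ident in record.iter("{%s}identifier" % NS["dc"]):
            text = (ident.text or "").strip()
            if text.lower().startswith("doi:"):
                doi = text[len("doi:"):]
                break
        result[rec_id] = doi
    return result


print(extract_dois(SAMPLE))
```

Records without a DOI map to `None`, which makes it easy to count the share of records that can be matched at all.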
Finding a matching post-print corpus
1. Extract DOIs from arXiv metadata
• 44.5% of articles have a DOI
2. CrossRef’s Metadata Search API
• Match by DOI
• Download article & metadata in XML/PDF
→ Results in:
• 11,017 full text articles
• Majority published by Elsevier between 2003 and 2015
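A rough sketch of the DOI matching step is shown below. It is an assumption-laden illustration: it builds lookup URLs for the public Crossref REST API route `/works/{doi}` rather than the Metadata Search API named on the slide, it makes no network calls, and the helper names `normalize_doi`, `crossref_works_url`, and `match_by_doi` are hypothetical.

```python
from urllib.parse import quote

# Prefixes that commonly precede a DOI in metadata fields.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw):
    """Strip a known prefix and lowercase (DOIs are case-insensitive)."""
    doi = raw.strip()
    lowered = doi.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()


def crossref_works_url(doi):
    """Build a lookup URL for one DOI (assumed Crossref REST route)."""
    return "https://api.crossref.org/works/" + quote(normalize_doi(doi), safe="/")


def match_by_doi(preprint_dois, published_dois):
    """Return {preprint id: DOI} for pre-prints whose DOI is in published_dois."""
    published = {normalize_doi(d) for d in published_dois}
    matches = {}
    for preprint_id, doi in preprint_dois.items():
        if doi and normalize_doi(doi) in published:
            matches[preprint_id] = normalize_doi(doi)
    return matches


print(crossref_works_url("doi:10.1016/J.PhysLetB.2006.10.068"))
```

Normalizing both sides before comparing avoids missed matches caused by case differences or resolver prefixes in the metadata.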
Text Comparison Methods
1. Length ratio
2. Levenshtein ratio
3. Cosine similarity
4. Jaccard coefficient
5. Sorensen similarity
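The five measures listed above can be sketched in a few lines each. The slide does not give exact definitions, so the choices here are assumptions: length and Levenshtein ratios work on characters, the Levenshtein ratio is taken as one minus the edit distance divided by the longer length, and cosine, Jaccard, and Sorensen work on lowercased whitespace tokens.

```python
import math
from collections import Counter


def length_ratio(a, b):
    """Shorter length divided by longer length, in characters."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else min(len(a), len(b)) / longest


def levenshtein_distance(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def levenshtein_ratio(a, b):
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein_distance(a, b) / longest


def tokens(text):
    """Assumed tokenization: lowercase, split on whitespace."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine of the angle between term-frequency vectors."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Size of the token-set intersection over size of the union."""
    sa, sb = set(tokens(a)), set(tokens(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def sorensen(a, b):
    """Twice the token-set intersection over the sum of the set sizes."""
    sa, sb = set(tokens(a)), set(tokens(b))
    total = len(sa) + len(sb)
    return 2 * len(sa & sb) / total if total else 1.0


pre = "A study of pre-print versions"
post = "A study of published versions"
for measure in (length_ratio, levenshtein_ratio, cosine_similarity, jaccard, sorensen):
    print(measure.__name__, round(measure(pre, post), 3))
```

All five return values between 0 and 1, with 1 meaning most similar, which matches the scale used on the comparison charts.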
Comparison of Sections
“Analyzing News Events in Non-Traditional Digital Library Collections”, M. Klein, P. Broadwell, 2015
http://dx.doi.org/10.1145/2756406.2756948
Title Comparison
Explore our findings at http://sologlo.library.ucla.edu/prepost
[Figure: papers per similarity bin, from 1 ... 0.9 down to 0.1 ... 0 (1 = most similar), for Length, Levenshtein, Cosine, Sorensen, and Jaccard. Axes: Papers; % of all papers.]
Abstract Comparison
[Figure: papers per similarity bin, from 1 ... 0.9 down to 0.1 ... 0 (1 = most similar), for Length, Levenshtein, Cosine, Sorensen, and Jaccard. Axes: Papers; % of all papers.]
Explore our findings at http://sologlo.library.ucla.edu/prepost
10.1016/j.physletb.2006.10.068
Physics Letters B
Body Comparison
[Figure: papers per similarity bin, from 1 ... 0.9 down to 0.1 ... 0 (1 = most similar), for Length, Levenshtein, Cosine, Sorensen, and Jaccard. Axes: Papers; % of all papers.]
Explore our findings at http://sologlo.library.ucla.edu/prepost
Publication Dates
[Figure: number of papers by number of days between pre-print and final published version, in 90-day bins from 1−90 to 631−720 plus >720, split into “Pre-print first” and “Final published first”. Axes: Papers; Number of days.]
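The binning behind this chart can be sketched as follows. This is only an illustration under stated assumptions: the input is a list of (pre-print date, final published date) pairs, same-day pairs are skipped because the chart shows no bin for zero days, and the function names `bin_label` and `bin_publication_gaps` are hypothetical.

```python
from collections import Counter
from datetime import date

BIN_WIDTH = 90        # days per bin, as on the chart
LAST_BIN_END = 720    # gaps above this fall into the ">720" bin


def bin_label(days):
    """Return the chart bin for a positive gap in days, e.g. 91 -> '91-180'."""
    if days > LAST_BIN_END:
        return ">720"
    low = ((days - 1) // BIN_WIDTH) * BIN_WIDTH + 1
    return "%d-%d" % (low, low + BIN_WIDTH - 1)


def bin_publication_gaps(pairs):
    """Count papers per (ordering, bin) from (preprint_date, published_date) pairs."""
    counts = Counter()
    for preprint_date, published_date in pairs:
        delta = (published_date - preprint_date).days
        if delta == 0:
            continue  # assumption: same-day pairs are not plotted
        order = "Pre-print first" if delta > 0 else "Final published first"
        counts[(order, bin_label(abs(delta)))] += 1
    return counts


# Hypothetical dates, for illustration only.
example = [
    (date(2006, 1, 1), date(2006, 3, 1)),   # pre-print 59 days earlier
    (date(2010, 1, 1), date(2009, 1, 1)),   # published 365 days earlier
]
for key, count in sorted(bin_publication_gaps(example).items()):
    print(key, count)
```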
Assembling a pre-print corpus
Source: arXiv.org
• 1.1 million publication records
• metadata (typical DC, including DOI) obtained
via OAI-PMH interface
• PDF versions of articles available via Amazon’s
S3 service (using “requester pays” option)
• *Latest version used if multiple available*
• 35% of all arXiv papers have > 1 version
• 58% of our matched papers have > 1 version
• Repeat experiment with *earliest version*
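Publication Dates of Earliest Versions
[Figure: number of papers by number of days between earliest pre-print version and final published version, in 90-day bins from 1−90 to 631−720 plus >720, split into “Pre-print first” and “Final published first”. Axes: Papers; Number of days.]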
Title Deltas
[Figure: deltas per similarity bin, from 1 ... 0.9 down to 0.1 ... 0, for Length, Levenshtein, Cosine, Sorensen, and Jaccard. Axes: Papers; % of all papers.]
Abstract Deltas
[Figure: deltas per similarity bin, from 1 ... 0.9 down to 0.1 ... 0, for Length, Levenshtein, Cosine, Sorensen, and Jaccard. Axes: Papers; % of all papers.]
Body Deltas
[Figure: deltas per similarity bin, from 1 ... 0.9 down to 0.1 ... 0, for Length, Levenshtein, Cosine, Sorensen, and Jaccard. Axes: Papers; % of all papers.]
Discussion & Future Work
• Single corpus experiment
• Pre-print/final published matches based on:
• DOIs
• CrossRef API results
• UCLA serial subscriptions (majority Elsevier
publications)
• Expand to other disciplines/publishers
• Overlay with ISI Impact factor and usage statistics
• Refine extraction/comparison of authors and
references
• Operate at scale

Mais conteúdo relacionado

Mais procurados

A replication crisis in the making: how we reward unreliable science
A replication crisis in the making: how we reward unreliable scienceA replication crisis in the making: how we reward unreliable science
A replication crisis in the making: how we reward unreliable scienceBjörn Brembs
 
Bibliosight Project - JournalTOCs Workshop
Bibliosight Project - JournalTOCs WorkshopBibliosight Project - JournalTOCs Workshop
Bibliosight Project - JournalTOCs Workshopazami
 
Why canceling subscriptions may just yet save scholarship
Why canceling subscriptions may just yet save scholarshipWhy canceling subscriptions may just yet save scholarship
Why canceling subscriptions may just yet save scholarshipBjörn Brembs
 
Forging New Links: Libraries in the Semantic Web
Forging New Links: Libraries in the Semantic WebForging New Links: Libraries in the Semantic Web
Forging New Links: Libraries in the Semantic WebGillian Byrne
 
RPI Research in Linked Open Government Systems
RPI Research in Linked Open Government SystemsRPI Research in Linked Open Government Systems
RPI Research in Linked Open Government SystemsJames Hendler
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Herbert Van de Sompel
 
BIBFRAME : the future of cataloguing?
BIBFRAME : the future of cataloguing?BIBFRAME : the future of cataloguing?
BIBFRAME : the future of cataloguing?Thomas Meehan
 
Open Access NBIC Workshop April 19, 2011
Open Access NBIC Workshop April 19, 2011Open Access NBIC Workshop April 19, 2011
Open Access NBIC Workshop April 19, 2011Philip Bourne
 
How to build your own citation index
How to build your own citation indexHow to build your own citation index
How to build your own citation indexGESIS
 
Linked Open Data for Libraries
Linked Open Data for LibrariesLinked Open Data for Libraries
Linked Open Data for LibrariesLukas Koster
 
Federated Search Falls Short
Federated Search Falls ShortFederated Search Falls Short
Federated Search Falls Shortslknight
 
Giving researchers credit for data
Giving researchers credit for dataGiving researchers credit for data
Giving researchers credit for dataJisc
 
Crossref webinar - Maintaining your metadata - latest
Crossref webinar - Maintaining your metadata - latestCrossref webinar - Maintaining your metadata - latest
Crossref webinar - Maintaining your metadata - latestCrossref
 

Mais procurados (20)

A replication crisis in the making: how we reward unreliable science
A replication crisis in the making: how we reward unreliable scienceA replication crisis in the making: how we reward unreliable science
A replication crisis in the making: how we reward unreliable science
 
Bibliosight Project - JournalTOCs Workshop
Bibliosight Project - JournalTOCs WorkshopBibliosight Project - JournalTOCs Workshop
Bibliosight Project - JournalTOCs Workshop
 
Why canceling subscriptions may just yet save scholarship
Why canceling subscriptions may just yet save scholarshipWhy canceling subscriptions may just yet save scholarship
Why canceling subscriptions may just yet save scholarship
 
ER&L KBART Update
ER&L KBART UpdateER&L KBART Update
ER&L KBART Update
 
Creating Pockets of Persistence
Creating Pockets of PersistenceCreating Pockets of Persistence
Creating Pockets of Persistence
 
Forging New Links: Libraries in the Semantic Web
Forging New Links: Libraries in the Semantic WebForging New Links: Libraries in the Semantic Web
Forging New Links: Libraries in the Semantic Web
 
RPI Research in Linked Open Government Systems
RPI Research in Linked Open Government SystemsRPI Research in Linked Open Government Systems
RPI Research in Linked Open Government Systems
 
Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013Hiberlink: Investigating Reference Rot, December 2013
Hiberlink: Investigating Reference Rot, December 2013
 
Semantic Web Applications in Libraries: The Road to BIBFRAME
Semantic Web Applications in Libraries: The Road to BIBFRAMESemantic Web Applications in Libraries: The Road to BIBFRAME
Semantic Web Applications in Libraries: The Road to BIBFRAME
 
BIBFRAME : the future of cataloguing?
BIBFRAME : the future of cataloguing?BIBFRAME : the future of cataloguing?
BIBFRAME : the future of cataloguing?
 
MLA CE Course: Third-Party PubMed Tools
MLA CE Course: Third-Party PubMed ToolsMLA CE Course: Third-Party PubMed Tools
MLA CE Course: Third-Party PubMed Tools
 
Third-Party PubMed Tools
Third-Party PubMed ToolsThird-Party PubMed Tools
Third-Party PubMed Tools
 
Presentation1
Presentation1Presentation1
Presentation1
 
Open Access NBIC Workshop April 19, 2011
Open Access NBIC Workshop April 19, 2011Open Access NBIC Workshop April 19, 2011
Open Access NBIC Workshop April 19, 2011
 
How to build your own citation index
How to build your own citation indexHow to build your own citation index
How to build your own citation index
 
Linked Open Data for Libraries
Linked Open Data for LibrariesLinked Open Data for Libraries
Linked Open Data for Libraries
 
Federated Search Falls Short
Federated Search Falls ShortFederated Search Falls Short
Federated Search Falls Short
 
Giving researchers credit for data
Giving researchers credit for dataGiving researchers credit for data
Giving researchers credit for data
 
Bracke may4-1
Bracke may4-1Bracke may4-1
Bracke may4-1
 
Crossref webinar - Maintaining your metadata - latest
Crossref webinar - Maintaining your metadata - latestCrossref webinar - Maintaining your metadata - latest
Crossref webinar - Maintaining your metadata - latest
 

Destaque

Jason chinchilla
Jason chinchillaJason chinchilla
Jason chinchillaJason Paz
 
Companies that produce & distribute rn b genre
Companies that produce & distribute rn b genreCompanies that produce & distribute rn b genre
Companies that produce & distribute rn b genrefahrinsultana
 
Ood启思录01
Ood启思录01Ood启思录01
Ood启思录01yiditushe
 
Carol vernallis theory
Carol vernallis theoryCarol vernallis theory
Carol vernallis theoryfahrinsultana
 
Interrogating the Politics and Performativity of Web Archiving
Interrogating the Politics and Performativity of Web ArchivingInterrogating the Politics and Performativity of Web Archiving
Interrogating the Politics and Performativity of Web ArchivingJessica Ogden
 

Destaque (7)

Jason chinchilla
Jason chinchillaJason chinchilla
Jason chinchilla
 
Companies that produce & distribute rn b genre
Companies that produce & distribute rn b genreCompanies that produce & distribute rn b genre
Companies that produce & distribute rn b genre
 
Ood启思录01
Ood启思录01Ood启思录01
Ood启思录01
 
Carol vernallis theory
Carol vernallis theoryCarol vernallis theory
Carol vernallis theory
 
About Webtechnologies
About WebtechnologiesAbout Webtechnologies
About Webtechnologies
 
Interrogating the Politics and Performativity of Web Archiving
Interrogating the Politics and Performativity of Web ArchivingInterrogating the Politics and Performativity of Web Archiving
Interrogating the Politics and Performativity of Web Archiving
 
pi950.pdf
pi950.pdfpi950.pdf
pi950.pdf
 

Semelhante a Comparing Published Scientific Journal Articles to Their Pre-print Versions

Preprints: a journey though time
Preprints: a journey though timePreprints: a journey though time
Preprints: a journey though timeGraham Steel
 
Publishing and impact Wageningen University IL for PhD 20141202
Publishing and impact  Wageningen University IL for PhD 20141202Publishing and impact  Wageningen University IL for PhD 20141202
Publishing and impact Wageningen University IL for PhD 20141202Hugo Besemer
 
British Library
British LibraryBritish Library
British Libraryclarivate
 
A Science Mapping Analysis Of Blood Donation Behaviour
A Science Mapping Analysis Of Blood Donation BehaviourA Science Mapping Analysis Of Blood Donation Behaviour
A Science Mapping Analysis Of Blood Donation BehaviourBria Davis
 
Author workshop TU Delft 20111122
Author workshop TU Delft 20111122Author workshop TU Delft 20111122
Author workshop TU Delft 20111122Anke Versteeg
 
STRETCHING THE BOUNDARIES OF PUBLISHING: ALTERNATIVES
STRETCHING THE BOUNDARIES OF PUBLISHING: ALTERNATIVESSTRETCHING THE BOUNDARIES OF PUBLISHING: ALTERNATIVES
STRETCHING THE BOUNDARIES OF PUBLISHING: ALTERNATIVESNicolaie Constantinescu
 
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology:  A Large-Scale Taxonomy of Research AreasThe Computer Science Ontology:  A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research AreasAngelo Salatino
 
Publish be cited, or perish
Publish be cited, or perishPublish be cited, or perish
Publish be cited, or perishWouter Gerritsma
 
The future of scholarly publishing: where do we go from here?
The future of scholarly publishing: where do we go from here? The future of scholarly publishing: where do we go from here?
The future of scholarly publishing: where do we go from here? Research Information Network
 
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research AreasThe Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research AreasAngelo Salatino
 
Where to publish_130709
Where to publish_130709Where to publish_130709
Where to publish_130709opl10
 
The Initiative for Open Citations and the OpenCitations Corpus
The Initiative for Open Citations and the OpenCitations CorpusThe Initiative for Open Citations and the OpenCitations Corpus
The Initiative for Open Citations and the OpenCitations CorpusUniversity of Bologna
 
Publishing and impact 20141028
Publishing and impact 20141028Publishing and impact 20141028
Publishing and impact 20141028Hugo Besemer
 
Science in the context of journals, Open, and the future
Science in the context of journals, Open, and the futureScience in the context of journals, Open, and the future
Science in the context of journals, Open, and the futureBenjamin Laken
 
Holy Cross Lunch and Learn
Holy Cross Lunch and LearnHoly Cross Lunch and Learn
Holy Cross Lunch and Learnrachelmccullough
 

Semelhante a Comparing Published Scientific Journal Articles to Their Pre-print Versions (20)

Preprints: a journey though time
Preprints: a journey though timePreprints: a journey though time
Preprints: a journey though time
 
Publishing and impact Wageningen University IL for PhD 20141202
Publishing and impact  Wageningen University IL for PhD 20141202Publishing and impact  Wageningen University IL for PhD 20141202
Publishing and impact Wageningen University IL for PhD 20141202
 
British Library
British LibraryBritish Library
British Library
 
A Science Mapping Analysis Of Blood Donation Behaviour
A Science Mapping Analysis Of Blood Donation BehaviourA Science Mapping Analysis Of Blood Donation Behaviour
A Science Mapping Analysis Of Blood Donation Behaviour
 
Author workshop TU Delft 20111122
Author workshop TU Delft 20111122Author workshop TU Delft 20111122
Author workshop TU Delft 20111122
 
STRETCHING THE BOUNDARIES OF PUBLISHING: ALTERNATIVES
STRETCHING THE BOUNDARIES OF PUBLISHING: ALTERNATIVESSTRETCHING THE BOUNDARIES OF PUBLISHING: ALTERNATIVES
STRETCHING THE BOUNDARIES OF PUBLISHING: ALTERNATIVES
 
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology:  A Large-Scale Taxonomy of Research AreasThe Computer Science Ontology:  A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
 
Publish be cited, or perish
Publish be cited, or perishPublish be cited, or perish
Publish be cited, or perish
 
SciVerse @ TJU
SciVerse @ TJUSciVerse @ TJU
SciVerse @ TJU
 
Peer Review and Science2.0
Peer Review and Science2.0Peer Review and Science2.0
Peer Review and Science2.0
 
The future of scholarly publishing: where do we go from here?
The future of scholarly publishing: where do we go from here? The future of scholarly publishing: where do we go from here?
The future of scholarly publishing: where do we go from here?
 
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research AreasThe Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
The Computer Science Ontology: A Large-Scale Taxonomy of Research Areas
 
Stevan Harnad - Scholarly/Scientific Impact Metrics in the Open Access Era
Stevan Harnad - Scholarly/Scientific Impact Metrics in the Open Access EraStevan Harnad - Scholarly/Scientific Impact Metrics in the Open Access Era
Stevan Harnad - Scholarly/Scientific Impact Metrics in the Open Access Era
 
Open Access Publishing: More Readers, More Impact
Open Access Publishing: More Readers, More ImpactOpen Access Publishing: More Readers, More Impact
Open Access Publishing: More Readers, More Impact
 
Where to publish_130709
Where to publish_130709Where to publish_130709
Where to publish_130709
 
The Initiative for Open Citations and the OpenCitations Corpus
The Initiative for Open Citations and the OpenCitations CorpusThe Initiative for Open Citations and the OpenCitations Corpus
The Initiative for Open Citations and the OpenCitations Corpus
 
Publishing and impact 20141028
Publishing and impact 20141028Publishing and impact 20141028
Publishing and impact 20141028
 
Eps
EpsEps
Eps
 
Science in the context of journals, Open, and the future
Science in the context of journals, Open, and the futureScience in the context of journals, Open, and the future
Science in the context of journals, Open, and the future
 
Holy Cross Lunch and Learn
Holy Cross Lunch and LearnHoly Cross Lunch and Learn
Holy Cross Lunch and Learn
 

Mais de Martin Klein

On the Persistence of Persistent Identifiers of the Scholarly Web
On the Persistence of Persistent Identifiers of the Scholarly WebOn the Persistence of Persistent Identifiers of the Scholarly Web
On the Persistence of Persistent Identifiers of the Scholarly WebMartin Klein
 
On the Persistence of Persistent Identifiers of the Scholarly Web
 On the Persistence of Persistent Identifiers of the Scholarly Web On the Persistence of Persistent Identifiers of the Scholarly Web
On the Persistence of Persistent Identifiers of the Scholarly WebMartin Klein
 
An Institutional Perspective to Rescue Scholarly Orphans
An Institutional Perspective to Rescue Scholarly OrphansAn Institutional Perspective to Rescue Scholarly Orphans
An Institutional Perspective to Rescue Scholarly OrphansMartin Klein
 
Who is Asking - Humans and Machines Experience a Different Scholarly Web
Who is Asking - Humans and Machines  Experience a Different Scholarly WebWho is Asking - Humans and Machines  Experience a Different Scholarly Web
Who is Asking - Humans and Machines Experience a Different Scholarly WebMartin Klein
 
The Memento Tracer Framework: Balancing Quality and Scalability for Web Arch...
The Memento Tracer Framework: Balancing Quality and Scalability  for Web Arch...The Memento Tracer Framework: Balancing Quality and Scalability  for Web Arch...
The Memento Tracer Framework: Balancing Quality and Scalability for Web Arch...Martin Klein
 
Memento Tracer An Innovative Approach Towards Balancing Scale and Fidelity f...
Memento Tracer An Innovative Approach Towards Balancing  Scale and Fidelity f...Memento Tracer An Innovative Approach Towards Balancing  Scale and Fidelity f...
Memento Tracer An Innovative Approach Towards Balancing Scale and Fidelity f...Martin Klein
 
Comparing the Performance of OAI-PMH with ResourceSync
Comparing the Performance of OAI-PMH with ResourceSyncComparing the Performance of OAI-PMH with ResourceSync
Comparing the Performance of OAI-PMH with ResourceSyncMartin Klein
 
Evaluating Memento Service Optimizations
Evaluating Memento Service OptimizationsEvaluating Memento Service Optimizations
Evaluating Memento Service OptimizationsMartin Klein
 
An Institutional Perspective to Rescue Scholarly Orphans
An Institutional Perspective to Rescue Scholarly OrphansAn Institutional Perspective to Rescue Scholarly Orphans
An Institutional Perspective to Rescue Scholarly OrphansMartin Klein
 
A Vision of the Library’s Role in Archiving Scholarly Artifacts
A Vision of the Library’s Role  in Archiving Scholarly ArtifactsA Vision of the Library’s Role  in Archiving Scholarly Artifacts
A Vision of the Library’s Role in Archiving Scholarly ArtifactsMartin Klein
 
First Steps in Research Data Management Under Constraints of a National Secur...
First Steps in Research Data Management Under Constraints of a National Secur...First Steps in Research Data Management Under Constraints of a National Secur...
First Steps in Research Data Management Under Constraints of a National Secur...Martin Klein
 
Smart Routing of Memento Requests
Smart Routing of Memento RequestsSmart Routing of Memento Requests
Smart Routing of Memento RequestsMartin Klein
 
Building Event Collections from Crawling Web Archives
Building Event Collections from Crawling Web ArchivesBuilding Event Collections from Crawling Web Archives
Building Event Collections from Crawling Web ArchivesMartin Klein
 
A Web-Centric Pipeline for Archiving Scholarly Artifacts
A Web-Centric Pipeline for Archiving Scholarly ArtifactsA Web-Centric Pipeline for Archiving Scholarly Artifacts
A Web-Centric Pipeline for Archiving Scholarly ArtifactsMartin Klein
 
Focused Crawl of Web Archives to Build Event Collections
Focused Crawl of Web Archives to Build Event CollectionsFocused Crawl of Web Archives to Build Event Collections
Focused Crawl of Web Archives to Build Event CollectionsMartin Klein
 
Creating Topical Collections: Web Archives vs. Live Web
Creating Topical Collections:Web Archives vs. Live WebCreating Topical Collections:Web Archives vs. Live Web
Creating Topical Collections: Web Archives vs. Live WebMartin Klein
 
Robust Linking to Web Resources
Robust Linking to Web ResourcesRobust Linking to Web Resources
Robust Linking to Web ResourcesMartin Klein
 
Signposting for Repositories
Signposting for RepositoriesSignposting for Repositories
Signposting for RepositoriesMartin Klein
 
Discovering Scholarly Orphans Using ORCID
Discovering Scholarly Orphans Using ORCIDDiscovering Scholarly Orphans Using ORCID
Discovering Scholarly Orphans Using ORCIDMartin Klein
 
Using the Memento Framework to Assess Content Drift in Scholarly Communication
Using the Memento Framework to Assess Content Drift in Scholarly CommunicationUsing the Memento Framework to Assess Content Drift in Scholarly Communication
Using the Memento Framework to Assess Content Drift in Scholarly CommunicationMartin Klein
 

Comparing Published Scientific Journal Articles to Their Pre-print Versions

  • 1. Comparing Published Scientific Journal Articles to Their Pre-print Versions. Martin Klein, Peter Broadwell (@mart1nkle1n, @peterbroadwell), with Sharon E. Farb and Todd Grappone (@farbthink, @liber8er). {martinklein,broadwell,farb,grappone}@library.ucla.edu, University of California Los Angeles
  • 2. Comparing Published Scientific Journal Articles to Their Pre-print Versions, @mart1nkle1n #jcdl2016, Newark, NJ, 06/21/2016
      Scientific Output in Numbers
      Global STM publishing market > $25 billion
      • 55% of this from USA
      • 28% from Europe, Middle East
      • Journals core part of scholarly communication process
      • English language journal revenue: ~ $10 billion
      • ~ 70% of that out of libraries’ budget
      • > 28k scholarly peer-reviewed journals (+3.5% p.a.)
      • ~ 2.5 million articles per year (+3% p.a.)
      • 21% of research papers from USA
      “STM Report: An Overview of Scientific and Scholarly Publishing”, Mark Ware and Michael Mabe, March 2015
  • 3. University of California Publication Impact. “Research Performance of the UC System,” Elsevier, March 2015
  • 4. Open Access by Disciplines. “Open Access to the Scientific Journal Literature: Situation 2009”, Björk B-C et al., 2010 (http://dx.doi.org/10.1371/journal.pone.0011273)
  • 5. Open Access Rate Overall. 2010: “Open Access to the Scientific Journal Literature: Situation 2009”, Björk B-C et al. (http://dx.doi.org/10.1371/journal.pone.0011273)
  • 6. Open Access Rate Overall. 2010: “Open Access to the Scientific Journal Literature: Situation 2009”, Björk B-C et al. (http://dx.doi.org/10.1371/journal.pone.0011273) → 20.4% OA rate
  • 7. Open Access Rate Overall. 2010: “Open Access to the Scientific Journal Literature: Situation 2009”, Björk B-C et al. (http://dx.doi.org/10.1371/journal.pone.0011273) → 20.4% OA rate. 2015: “Open Access and Sources of Full-Text Articles in Google Scholar in Different Subject Fields”, Jamali and Nabavi (http://dx.doi.org/10.1007/s11192-015-1642-2)
  • 8. Open Access Rate Overall. 2010: “Open Access to the Scientific Journal Literature: Situation 2009”, Björk B-C et al. (http://dx.doi.org/10.1371/journal.pone.0011273) → 20.4% OA rate. 2015: “Open Access and Sources of Full-Text Articles in Google Scholar in Different Subject Fields”, Jamali and Nabavi (http://dx.doi.org/10.1007/s11192-015-1642-2) → 61.1% OA rate
  • 9. Pre-print v. Final Published
      arXiv.org
      • Average annual operating cost for 2013 - 2017: $826,000
      Final Published
      • English language STM journals: $10 billion in 2013
      http://arxiv.org/help/support/faq#3D
      “STM Report: An Overview of Scientific and Scholarly Publishing”, Mark Ware and Michael Mabe, March 2015
  • 10. Role of Publisher
      • Entrepreneur
      • Copyediting
      • Tagging
      • Marketer
      • Distributor
      • E-Host
  • 11. Value of Publisher. “Once you’ve gone through the peer review process, if you look at the article that is actually published in a journal, it looks radically different [to the one submitted due to] that process of transformation, the copy-editing, the database linking, the data visualisation tools, making sure that the metadata for the article is all right, so when people come to [Elsevier database] ScienceDirect or type a search into Google, they can actually find what they are looking for on their platforms.” Gemma Hersh, http://www.thebookseller.com/news/elsevier-defends-its-value-after-open-access-disputes-328037
  • 12. Working Assumptions
      1. If the publishers’ argument is valid, the text of a pre-print paper should vary significantly from its corresponding post-print version.
      2. By applying standard similarity measures, we should be able to detect and quantify such differences.
  • 13. Assembling a pre-print corpus
      Source: arXiv.org
      • 1.1 million publication records
      • Metadata (typical DC, including DOI) obtained via OAI-PMH interface
      • PDF versions of articles available via Amazon’s S3 service (using “requester pays” option)
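The metadata harvest named on this slide can be sketched as a sequence of OAI-PMH ListRecords requests. This is only a minimal illustration of how such request URLs are formed: the endpoint `http://export.arxiv.org/oai2` and the helper name `list_records_url` are assumptions, since the slides say only "OAI-PMH interface" and "typical DC".

```python
from urllib.parse import urlencode

# Assumed endpoint; the slides do not name the URL that was harvested.
BASE_URL = "http://export.arxiv.org/oai2"


def list_records_url(base_url=BASE_URL, metadata_prefix="oai_dc",
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper).

    The first request names the metadata format (oai_dc = Dublin Core).
    Follow-up requests carry only the resumptionToken returned by the
    previous response, because OAI-PMH treats it as an exclusive argument.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)
```

A harvester would fetch the first URL, read the `resumptionToken` element from the response, and repeat until no token is returned.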
  • 14. Finding a matching post-print corpus
      1. Extract DOIs from arXiv metadata
         • 44.5% of articles have DOI
      2. CrossRef’s Metadata Search API
         • Match by DOI
         • Download article & metadata in XML/PDF
      → Results in:
         • 11,017 full text articles
         • Majority published by Elsevier between 2003 and 2015
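Step 1 of this slide, pulling a DOI out of a Dublin Core record, might look roughly like the sketch below. The record layout, the regular expression, and the function names are assumptions for illustration; the slides do not say which DC element carries the DOI, so the sketch scans every DC element. The CrossRef lookup in step 2 is not shown.

```python
import re
import xml.etree.ElementTree as ET

DC_NS = "{http://purl.org/dc/elements/1.1/}"
# Loose DOI pattern (assumption): "10.", a registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def extract_doi(record_xml):
    """Return the first DOI-like string found in any DC element, or None."""
    root = ET.fromstring(record_xml)
    for elem in root.iter():
        if elem.tag.startswith(DC_NS) and elem.text:
            match = DOI_PATTERN.search(elem.text)
            if match:
                return match.group(0)
    return None


def doi_coverage(records):
    """Fraction of records that carry a DOI (the slide reports 44.5%)."""
    if not records:
        return 0.0
    return sum(1 for r in records if extract_doi(r) is not None) / len(records)


# Hypothetical records, made up for this sketch; only the DOI is taken
# from the talk (it is the Physics Letters B example on slide 21).
SAMPLE_WITH_DOI = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Hypothetical title</dc:title>
  <dc:identifier>http://arxiv.org/abs/0000.0000</dc:identifier>
  <dc:identifier>doi:10.1016/j.physletb.2006.10.068</dc:identifier>
</oai_dc:dc>"""

SAMPLE_WITHOUT_DOI = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Another hypothetical title</dc:title>
  <dc:identifier>http://arxiv.org/abs/0000.0001</dc:identifier>
</oai_dc:dc>"""
```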
  • 15. Text Comparison Methods
      1. Length ratio
      2. Levenshtein ratio
      3. Cosine similarity
      4. Jaccard coefficient
      5. Sorensen similarity
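The five measures listed on this slide have common textbook definitions, sketched below so that each returns a value in [0, 1] with 1 meaning identical. The slides do not give the exact formulas, tokenization, or normalization used, so the choices here (character-level length and Levenshtein, lower-cased whitespace tokens for the other three) are assumptions.

```python
import math
from collections import Counter


def length_ratio(a, b):
    """Shorter length divided by longer length, in characters."""
    if not a and not b:
        return 1.0
    return min(len(a), len(b)) / max(len(a), len(b))


def levenshtein_distance(a, b):
    """Edit distance via the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def levenshtein_ratio(a, b):
    """1 minus the edit distance normalized by the longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein_distance(a, b) / longest


def _tokens(text):
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine of the angle between term-frequency vectors."""
    va, vb = Counter(_tokens(a)), Counter(_tokens(b))
    if not va and not vb:
        return 1.0
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Shared distinct tokens divided by all distinct tokens."""
    sa, sb = set(_tokens(a)), set(_tokens(b))
    return 1.0 if not sa and not sb else len(sa & sb) / len(sa | sb)


def sorensen(a, b):
    """Sorensen-Dice: twice the shared tokens over the summed set sizes."""
    sa, sb = set(_tokens(a)), set(_tokens(b))
    return 1.0 if not sa and not sb else 2 * len(sa & sb) / (len(sa) + len(sb))
```

Length ratio and Levenshtein ratio are sensitive to the order and amount of text, whereas the three token-based measures ignore word order, which may be one reason to report all five side by side.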
  • 16. Comparison of Sections. “Analyzing News Events in Non-Traditional Digital Library Collections”, M. Klein, P. Broadwell, 2015 (http://dx.doi.org/10.1145/2756406.2756948)
  • 17. Comparison of Sections
  • 18. Title Comparison. [Figure: number of papers and % of all papers per similarity bin (1 = most similar; bins from 1 ... 0.9 down to 0.1 ... 0); series: Length, Levenshtein, Cosine, Sorensen, Jaccard, Percentage.] Explore our findings at http://sologlo.library.ucla.edu/prepost
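The ten similarity bins on this slide (1 ... 0.9 down to 0.1 ... 0) suggest a simple histogram over per-paper scores. The sketch below is one way to compute such counts and percentages; which bin a boundary value such as 0.9 falls into, and the function names, are assumptions not stated in the slides.

```python
def bin_labels():
    """Labels in the slide's order: '1 ... 0.9' first, '0.1 ... 0' last."""
    return [f"{(10 - i) / 10:g} ... {(9 - i) / 10:g}" for i in range(10)]


def bin_index(score):
    """Map a similarity in [0, 1] to a bin index (0 = most similar).

    Assumption: a boundary value belongs to the higher bin, so 0.9 is
    counted under '1 ... 0.9'.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("similarity score must lie in [0, 1]")
    return 9 - min(int(score * 10), 9)


def histogram(scores):
    """Return (label, count, percent of all papers) for each bin."""
    counts = [0] * 10
    for score in scores:
        counts[bin_index(score)] += 1
    total = len(scores)
    return [(label, count, 100.0 * count / total if total else 0.0)
            for label, count in zip(bin_labels(), counts)]
```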
  • 19. Comparison of Sections
  • 20. Abstract Comparison. [Figure: number of papers and % of all papers per similarity bin (1 = most similar; bins from 1 ... 0.9 down to 0.1 ... 0); series: Length, Levenshtein, Cosine, Sorensen, Jaccard, Percentage.] Explore our findings at http://sologlo.library.ucla.edu/prepost
  • 21. 10.1016/j.physletb.2006.10.068 (Physics Letters B)
  • 22. Comparison of Sections
  • 23. Body Comparison. [Figure: number of papers and % of all papers per similarity bin (1 = most similar; bins from 1 ... 0.9 down to 0.1 ... 0); series: Length, Levenshtein, Cosine, Sorensen, Jaccard, Percentage.] Explore our findings at http://sologlo.library.ucla.edu/prepost
  • 24. Publication Dates. [Figure: number of papers by number of days between the two versions (bins 1-90, 91-180, 181-270, 271-360, 361-450, 451-540, 541-630, 631-720, >720); series: Pre-print first, Final published first.]
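The 90-day buckets on this slide can be derived from a pair of dates per paper. The sketch below is a guess at that bookkeeping: the slides do not say which date fields were compared or how same-day pairs were handled, so the function name and the "same day" case are assumptions.

```python
from datetime import date


def lag_bin(preprint_date, published_date):
    """Return (which version came first, day-range label) for one paper."""
    delta = (published_date - preprint_date).days
    if delta == 0:
        # Not shown on the slide; kept separate here as an assumption.
        return ("same day", "0")
    first = "pre-print first" if delta > 0 else "final published first"
    days = abs(delta)
    if days > 720:
        return (first, ">720")
    lower = ((days - 1) // 90) * 90 + 1  # 1, 91, 181, ..., 631
    return (first, f"{lower}-{lower + 89}")
```

For example, a pre-print dated 2006-01-01 with a published version dated 2006-04-01 is 90 days apart and lands in the "1-90" bucket under "pre-print first".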
  • 25. Assembling a pre-print corpus
      Source: arXiv.org
      • 1.1 million publication records
      • Metadata (typical DC, including DOI) obtained via OAI-PMH interface
      • PDF versions of articles available via Amazon’s S3 service (using “requester pays” option)
      • *Latest version used if multiple available*
      • 35% of all arXiv papers have > 1 version
      • 58% of our matched papers have > 1 version
      • Repeat experiment with *earliest version*
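Choosing between the latest and the earliest version of a pre-print, as this slide describes, could be done by comparing version suffixes. The sketch assumes identifiers that end in "v" plus a number (for example "0000.0000v2"); the identifiers and helper names are hypothetical and the slides do not describe how versions were actually selected.

```python
import re

# Assumed identifier shape: any base identifier followed by "v<number>".
VERSION_RE = re.compile(r"^(?P<base>.+?)v(?P<version>\d+)$")


def version_number(versioned_id):
    """Extract the numeric version suffix from an identifier."""
    match = VERSION_RE.match(versioned_id)
    if match is None:
        raise ValueError(f"no version suffix in {versioned_id!r}")
    return int(match.group("version"))


def pick_version(versioned_ids, which="latest"):
    """Return the earliest or latest identifier among one paper's versions."""
    if which not in ("earliest", "latest"):
        raise ValueError("which must be 'earliest' or 'latest'")
    chooser = max if which == "latest" else min
    # Compare numerically so that v10 sorts after v2.
    return chooser(versioned_ids, key=version_number)
```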
  • 26. Publication Dates of Earliest Versions. [Figure: number of papers by number of days between the two versions (bins 1-90, 91-180, 181-270, 271-360, 361-450, 451-540, 541-630, 631-720, >720); series: Pre-print first, Final published first.]
  • 27. Title Deltas. [Figure: deltas in number of papers and % of all papers per similarity bin (bins from 1 ... 0.9 down to 0.1 ... 0); series: Length, Levenshtein, Cosine, Sorensen, Jaccard, Percentage.]
  • 30. Abstract Deltas. [Figure: deltas in number of papers and % of all papers per similarity bin (bins from 1 ... 0.9 down to 0.1 ... 0); series: Length, Levenshtein, Cosine, Sorensen, Jaccard, Percentage.]
  • 31. Body Deltas. [Figure: deltas in number of papers and % of all papers per similarity bin (bins from 1 ... 0.9 down to 0.1 ... 0); series: Length, Levenshtein, Cosine, Sorensen, Jaccard, Percentage.]
  • 32. Discussion & Future Work
      • Single corpus experiment
      • Pre-print/final published matches based on:
         • DOIs
         • CrossRef API results
         • UCLA serial subscriptions (majority Elsevier publications)
      • Expand to other disciplines/publishers
      • Overlay with ISI Impact factor and usage statistics
      • Refine extraction/comparison of authors and references
      • Operate at scale