Tracking citations to research
software via PIDs
Lars Holm Nielsen & Stephanie van de Sandt
CERN, IT Department, Digital Repositories Section
Zenodo
Upload, Describe, Publish
Zenodo + Software (2014 to 2019)
75% of the world’s software DOIs
The Asclepias project
• Brokering and harvesting scholarly links
• Open citation data in ADS, Crossref/DataCite Event Data and Europe PMC.
• ~6000 citations to Zenodo records (January 2019)
Roles of researcher
Author of scholarly manuscripts.
Developer of scientific software.
Credit
Software
The system
[Diagram: Author (Write paper), Publisher (Publish paper), Developer (Write software), Repository (Publish software), Discovery Services; labels: Citations, “Cite my software as …”, Find software]
Systemic key issues
1. Information loss
2. Dilution of citations
Information loss: Author
• What do I cite? Paper, software, software version
• Citation recommendations
• Reference manager (e.g. BibTeX, Endnote, …)
• Exists? Correct? BibTeX Latency
• “Software” type doesn’t exist.
• No version field support in BibTeX.
• Persistent identifier for software
• zero, one or more?
Include citation in paper
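
One common workaround for the two BibTeX gaps listed above can be sketched as follows. This is a minimal sketch under assumptions: the bibliography style is taken to have no software entry type and to ignore a version field, so the record is emitted as `@misc` with the version folded into the title. The function name `software_bibtex` and all sample metadata in the usage note are hypothetical.

```python
def software_bibtex(key, authors, title, version, year, doi):
    """Render a software record as a BibTeX @misc entry.

    Assumption: the target bibliography style has no software type and
    ignores a `version` field, so the version is folded into the title.
    """
    fields = {
        "author": " and ".join(authors),
        "title": f"{{{title}}}: version {version}",
        "year": str(year),
        "doi": doi,
        "howpublished": f"\\url{{https://doi.org/{doi}}}",
    }
    # One "name = {value}" line per field, in insertion order.
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields.items())
    return f"@misc{{{key},\n{body}\n}}"
```

Calling `software_bibtex("doe_examplepkg", ["Doe, Jane"], "examplepkg", "1.2.0", 2019, "10.1234/example.1")` with these placeholder values yields an entry that carries both the version and the DOI, so neither is lost when the reference manager exports it.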
The “Challenge”: a messy world
• Triangle.py
• 10.5281/zenodo.10598
• 10.5281/zenodo.11020
• Corner.py
• 10.5281/zenodo.45906
• 10.5281/zenodo.53155
• 10.5281/zenodo.591491 (Concept)
• JOSS
• 10.21105/joss.00024
• ASCL (Astronomy Source Code Library)
• https://ascl.net/1702.002
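
Before identifiers like the ones above can be compared, they need a common form. A minimal sketch, assuming a hypothetical helper `normalize_pid` and assuming that DOIs may arrive bare, with a `doi:` label, or as `https://doi.org/` URLs. DOI names are case-insensitive, so they are lowercased; anything that does not look like a DOI is passed through as a URL-style identifier.

```python
import re

# Loose DOI shape check (an assumption, not the full DOI grammar).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def normalize_pid(raw):
    """Return a (scheme, value) pair for a DOI or URL-style identifier."""
    value = raw.strip()
    # Strip common DOI resolver and label prefixes (case-insensitive).
    stripped = re.sub(
        r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", value, flags=re.IGNORECASE
    )
    if DOI_PATTERN.match(stripped):
        return ("doi", stripped.lower())
    return ("url", value)
```

With this, `https://doi.org/10.5281/Zenodo.45906` and `10.5281/zenodo.45906` normalize to the same pair, while `https://ascl.net/1702.002` stays a URL.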
Information loss: Publisher
• Policy prohibits software citation.
• Journal authoring system defects:
• Information from BibTeX is lost
• Crossref DOI ➞ JATS XML ➞ PDF
• Copy editors need training
• Journal ➞ Scientific society ➞ Publisher ➞ Vendor platform ➞ Outsourcing
Include citation in products
Information loss: Metadata quality
• Example (cite arXiv identifiers)
• yymm.nnnnv1 (published 2012)
• yymm.nnnnv7 (published 2017)
• Paper from 2015 cites “yymm.nnnn”
• Result: the 2015 paper appears to cite the 2017 software, because the metadata doesn’t say 2012.
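
The failure mode in this example can be avoided when the publication year of each version is known. A minimal sketch, assuming a hypothetical `resolve_version` helper that maps an unversioned identifier to the newest version that already existed when the citing paper appeared. Only the two versions dated on the slide are included; intermediate versions would be added to the mapping in the same way.

```python
def resolve_version(version_years, citing_year):
    """Pick the newest version published on or before the citing year.

    version_years maps a version label to its publication year.
    Returns None when no version predates the citing paper.
    """
    candidates = [
        (year, label)
        for label, year in version_years.items()
        if year <= citing_year
    ]
    if not candidates:
        return None
    # max() compares by year first, so the latest eligible version wins.
    return max(candidates)[1]


versions = {"v1": 2012, "v7": 2017}
print(resolve_version(versions, 2015))  # v1
```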
Information loss: Discovery Service
• Paper ingest workflow:
• 1) identify link
• 2) create/update local record?
• 3) attribute citation link.
• Policy prohibits software records.
• Ingestion workflow incapable of identifying software
• Non-trivial to identify local record.
Ingest paper and track citations
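
The three-step ingest workflow above can be sketched roughly as follows. This is an illustration under assumptions, not how any particular discovery service works: `ingest_paper`, the in-memory `records` dictionary and the loose DOI pattern are all hypothetical.

```python
import re

# Loose DOI matcher for free-text references (an assumption, not a full DOI grammar).
DOI_RE = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")

def ingest_paper(paper_id, references, records, citations):
    """Run the three ingest steps for one paper.

    records:   dict mapping DOI -> local record (created on demand)
    citations: list of (citing paper, cited DOI) pairs, appended in place
    """
    for reference in references:
        # 1) Identify links in the reference text.
        for match in DOI_RE.findall(reference):
            # Drop trailing punctuation picked up from the sentence.
            doi = match.rstrip(".,;)").lower()
            # 2) Create the local record if it does not exist yet.
            records.setdefault(doi, {"doi": doi, "type": "unknown"})
            # 3) Attribute the citation link.
            citations.append((paper_id, doi))
```

Step 2 is where the slide's difficulties show up: a service whose policy or workflow cannot create a software record has nothing to attach the citation to in step 3.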
Discovery service differences
• Europe PMC: 71 different publishers
• Springer, F1000, PLOS, PeerJ, Pensoft, Frontiers
• Crossref: 57 different publishers
• Springer, F1000, Pensoft, PeerJ, Wiley
• NASA ADS: 38 different publishers
• arXiv, American Astronomical Society, Springer, IOP, Oxford University Press, Elsevier
Dilution of citations: Developer/Repository
• Persistent identifier: Software, software paper, discovery system identifier (i.e. zero, one or more PIDs)
• Dynamic authorship
• Software name changes
• Granularity: DOI per version, module, module version…
Ensure software is citable
Dilution of citations: Citation recommendations
• Loss of specificity:
• “lmfit 0.9.5 or later [4] was used”
Dilution of citations: BibTeX latency
• trackpy
• Cite what you used
Systemic issues
How can we expect researchers to change culture, if we can’t even track citations to software?
Generality
• Search/Replace:
• “Software” with “Data”, … (except “Paper”)
• “Astronomy” with “Physics”, …
• Problems:
• Information loss, dilution of citations, closed proprietary systems.
The “fix” of a chain-linked system
Systemic issues need joint effort to be solved.
The “Fix”: Publisher
• Software citation policy
• Authoring system:
• Working with vendor to produce correct DOI metadata and JATS XML (machine readability).
The “Fix”: Discovery
• Ingestion workflow for software with DataCite DOI, handling:
• Synonymous PIDs
• Version relationships
• BibTeX generation fixes
The “Fix”: Repository
• DOI Versioning:
• Version relationships
• Version number field
• DataCite metadata
• Dynamic authorship
• BibTeX generation fixes
• GitHub integration
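
The version relationships and version number field above could be expressed in DataCite-style metadata roughly as sketched below. The helper name `version_metadata` and the DOIs in the usage note are placeholders; the property names (`version`, `relatedIdentifiers`, `relationType`) and the relation types `IsVersionOf` and `IsNewVersionOf` follow the DataCite metadata schema, but the exact record layout a given repository emits is an assumption here.

```python
def version_metadata(version, concept_doi, previous_doi=None):
    """Build the version-related part of a DataCite-style metadata record."""
    related = [{
        # Every version points at the DOI that stands for all versions.
        "relatedIdentifier": concept_doi,
        "relatedIdentifierType": "DOI",
        "relationType": "IsVersionOf",
    }]
    if previous_doi is not None:
        related.append({
            # Link to the immediately preceding release, if there is one.
            "relatedIdentifier": previous_doi,
            "relatedIdentifierType": "DOI",
            "relationType": "IsNewVersionOf",
        })
    return {
        "types": {"resourceTypeGeneral": "Software"},
        "version": version,
        "relatedIdentifiers": related,
    }
```

For example, `version_metadata("1.2", "10.1234/example.concept", previous_doi="10.1234/example.v1")` describes a release 1.2 that belongs to a concept record and supersedes an earlier release. A discovery service reading these links can tell that citations to either release concern the same software.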
The “Challenge”: Roll-up citations for software
• Goal: Proper credit for software
• Roll-up citations for software
• Synonymous PIDs (identifies a resource)
• Version relationships (identifies group resources)
• Citation relationships (links between groups of resources)
• Expert curation (actions in individual systems)
• Information needed by all:
• Discovery systems
• Repositories
• Problem: Share and exchange information about scholarly links.
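
The roll-up idea can be sketched as a small counting step. This is a minimal sketch under assumptions: the hypothetical `GROUP_OF` table maps each PID to one group identifier, here the two version DOIs and the concept DOI listed for Corner.py on the “messy world” slide, and the citing paper names used in the usage note are invented. Further synonymous PIDs would be added to the table in the same way.

```python
from collections import defaultdict

# PID -> group identifier. Version DOIs point at the concept DOI
# listed on the "messy world" slide; the mapping itself is illustrative.
GROUP_OF = {
    "10.5281/zenodo.45906": "10.5281/zenodo.591491",
    "10.5281/zenodo.53155": "10.5281/zenodo.591491",
    "10.5281/zenodo.591491": "10.5281/zenodo.591491",
}

def roll_up(citations, group_of):
    """Count distinct citing papers per group of related PIDs.

    citations is an iterable of (citing_paper, cited_pid) pairs.
    PIDs without a known group are counted on their own.
    """
    citing = defaultdict(set)
    for paper, pid in citations:
        citing[group_of.get(pid, pid)].add(paper)
    return {group: len(papers) for group, papers in citing.items()}
```

Using sets means a paper that cites two versions of the same software is credited once, which is the intent of rolling citations up to the software as a whole. The table is exactly the information the slide says discovery systems and repositories need to exchange.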
Software citation today
• Primarily self-citation (~80% of citations)
• Not necessarily bad (SW citation principles)
• Citation count >5 (~2%)
• Generic libraries (neural networks, stats visualisation, …)
• Citation recommendations in bad shape
• Each recommendation has a unique story
• The citation growth rate is higher than the growth rate of Zenodo uploads
• Software citation is in pretty bad shape … but don’t despair (still in its infancy)!
• Systemic issues can only be solved with joint efforts
• Problems exposed also impact PID knowledge graphs in general.
Thanks for listening…
