A decade's experiences in transparent and interactive publication of FAIR data and software via an end-to-end XML publishing platform
Scott Edmunds 0000-0001-6444-1436
https://www.telegraph.co.uk/technology/2020/05/16/neil-fergusons-imperial-model-could-devastating-software-mistake/
Scientists: need to convince public + politicians
The “Infodemic” Era
Imperial College: Report 9
GigaSolution: rewarding open data & code
http://gigasciencejournal.com/
Publishes “Data Notes” for CC0 data, “Tech Notes” for OSI software.
Transparent: Open Peer Review and linked to preprints. Mandates code in repo.
Integrated GigaDB repository: DataCite DOIs, no size limits, code snapshots, APC covers curation
http://gigadb.org/
GigaSolution: rewarding open data & code
[Figure: GigaScience software/workflow papers (Technical Notes) per year, 2012-2021, by workflow system: Galaxy, Snakemake, Nextflow, CWL, Other]
Changes in how research is shared: workflows
gigagalaxy.net
Experience publishing Galaxy workflows: 2013
https://doi.org/10.1186/2047-217X-3-23
• Downloadable as virtual hard-disk/available as Amazon Machine Image
• Unclear how to describe licensing & security issues?
Experience publishing VMs: 2014
https://doi.org/10.1186/s13742-015-0087-0
https://doi.org/10.1186/s13742-015-0073-6
• From 2015 increasing submissions leveraging containers
• Promoted experiments in standardization such as bioboxes
• Integrated with CodeOcean & tested with Gigantum
• Carried out reproducibility case-studies (can be expensive)
Experience publishing containers: 2015
Independent execution of computations underlying research articles.
Experience publishing CODECHECK: 2020
CODECHECK tackles one of the main challenges of computational research by supporting
codecheckers with a workflow, guidelines and tools to evaluate computer programs
underlying scientific papers. The independent time-stamped runs conducted by
codecheckers will award a “certificate of executable computation” and increase availability,
discovery and reproducibility of crucial artefacts for computational sciences.
https://codecheck.org.uk/
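The idea of an independent, time-stamped run can be illustrated with a small sketch. This is not the CODECHECK tooling: the `analysis` function, the `check_run` helper and the record fields are hypothetical names chosen for illustration, and comparing outputs by exact SHA-256 checksum is an assumption of this sketch (a real check may accept results that agree only approximately).

```python
import hashlib
from datetime import datetime, timezone


def analysis(values):
    """Stand-in for the authors' computation (hypothetical)."""
    mean = sum(values) / len(values)
    return f"n={len(values)} mean={mean:.3f}\n"


def checksum(text):
    """SHA-256 hex digest of a text output."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def check_run(compute, inputs, published_checksum):
    """Re-run a computation and record when it ran and whether it matched."""
    output_sha256 = checksum(compute(inputs))
    return {
        "checked_at": datetime.now(timezone.utc).isoformat(),
        "output_sha256": output_sha256,
        "matches_published": output_sha256 == published_checksum,
    }


# The "published" checksum would normally accompany the paper's outputs.
published = checksum(analysis([1.0, 2.0, 3.0, 4.0]))
record = check_run(analysis, [1.0, 2.0, 3.0, 4.0], published)
print(record)
```

A mismatch does not by itself show the paper is wrong; it only flags that the re-run output differs and needs a human look.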
Experience publishing CODECHECK: 2020
http://gigasciencejournal.com/blog/codecheck-certificate/
https://doi.org/10.1093/gigascience/giaa026
Experience publishing CODECHECK: 2020
https://www.nature.com/articles/d41586-020-01685-y
http://doi.org/10.5281/zenodo.3865491
Lessons learned in a decade of data & software publishing:
• Tech really the bottleneck
• Process much too slow & expensive
• Still too focused on narrative and static “version of record”
• Still not very FAIR
DATA, CODE, ENTITIES, FACTS, STABILITY
A new approach
Follow the Software Paradigm?
CODE, RELEASE, FORK, UPDATE, REPEAT
Deconstruct the “Version of Record”?
Move to new XML end-to-end pipeline
Custom end-to-end workflow makes integrations simpler with one integration point
Features of new journal:
Main advantage of workflow is XML from start to end
https://gigabytejournal.com/
• Several modules acting as one platform: no import/export of files, so fast and accurate
• Cutting out production allows huge time & cost saving (currently as little as 3.5 hours per paper)
• Any number of versions can be published instantly, including typographic-quality PDF or updates/forks
• Allows instantaneous switch of views
• Leverage embeddable dynamic content/widgets
• Initial focus on forkable open source products: data + software + update papers
Focusing beyond VoR allows different views…
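As a rough illustration of how a single XML source can feed several views, the sketch below renders one small document as both HTML and a plain-text outline. The element names are loosely JATS-like but heavily simplified, and the XML, `to_html` and `to_outline` are all hypothetical; this is not the journal's actual pipeline.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, simplified article XML used only for this example.
ARTICLE_XML = """<article>
  <title>Example data note</title>
  <sec><title>Background</title><p>Why the data were collected.</p></sec>
  <sec><title>Methods</title><p>How the data were produced.</p></sec>
</article>"""


def to_html(root):
    """Render the article as a minimal HTML fragment."""
    parts = [f"<h1>{escape(root.findtext('title'))}</h1>"]
    for sec in root.findall("sec"):
        parts.append(f"<h2>{escape(sec.findtext('title'))}</h2>")
        parts.extend(f"<p>{escape(p.text)}</p>" for p in sec.findall("p"))
    return "\n".join(parts)


def to_outline(root):
    """Render the same article as a plain-text outline of section titles."""
    lines = [root.findtext("title")]
    lines.extend(f"- {sec.findtext('title')}" for sec in root.findall("sec"))
    return "\n".join(lines)


root = ET.fromstring(ARTICLE_XML)
html_view = to_html(root)
outline_view = to_outline(root)
print(html_view)
print(outline_view)
```

Because both views are derived from the same parsed tree, a correction to the XML propagates to every view without a separate export step.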
What does focusing on Data + software + XML allow us to do?
https://doi.org/10.46471/gigabyte.1
https://doi.org/10.46471/gigabyte.6
High quality rich XML
CC-BY open licensed, open citations, open corpus
Structured schema.org metadata
No hiding of material in supplemental files
Maximise use of persistent identifiers (PIDs)
Who: ORCID IDs, CASRAI contributorship, Funder (FundRef), Institution (ROR)
What: Species (NCBI, FishBase), Cell/strain (RRID)
How: Equipment (RRID), Software (RRID, bio.tools)
Output: Data (accessions, DOIs), Results (DOIs)
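A minimal sketch of what PID-rich schema.org metadata might look like when serialized as JSON-LD. The `article_jsonld` helper, the DOI `10.1234/example` and the ROR URL are hypothetical placeholders; the ORCID iD is ORCID's documented fictitious example researcher. The property selection is an assumption, not the journal's actual schema.

```python
import json


def article_jsonld(title, doi, authors, license_url):
    """Build a schema.org ScholarlyArticle record keyed on persistent identifiers."""
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "@id": f"https://doi.org/{doi}",
        "name": title,
        "license": license_url,
        "author": [
            {
                "@type": "Person",
                "@id": f"https://orcid.org/{a['orcid']}",
                "name": a["name"],
                "affiliation": {"@type": "Organization", "@id": a["ror"]},
            }
            for a in authors
        ],
    }


record = article_jsonld(
    title="Example data note",
    doi="10.1234/example",  # hypothetical DOI
    authors=[
        {
            "name": "Josiah Carberry",
            "orcid": "0000-0002-1825-0097",  # ORCID's fictitious example iD
            "ror": "https://ror.org/EXAMPLE",  # hypothetical ROR identifier
        }
    ],
    license_url="https://creativecommons.org/licenses/by/4.0/",
)
print(json.dumps(record, indent=2))
```

Using identifier URLs as `@id` values, rather than free-text names, is what lets a machine reader link the author, institution and article to records held elsewhere.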
Helping to make research “AI-ready”
Thinking about users: machines
Interaction: increasing understanding & trust
https://doi.org/10.46471/gigabyte.13
Do you trust an immunoinformatics tool to predict whether memory T cells generated from previous exposure to common cold coronaviruses are cross-reactive against SARS-CoV-2?
Interaction: software and code via Stencila and CodeOcean
http://gigasciencejournal.com/blog/gigabyte-executable-research-articles/
Code Ocean “Compute Capsule”: readers can directly interact with software via an embedded version in the article, or deploy and run it in their own cloud computing environment.
Pop-out Stencila “Executable Research Article”: figures are accompanied by code blocks that can be edited and re-executed to immediately see the changes.
Interact with Stenci.la “code chunks” & Code Ocean “compute capsules” of COVID-19 immunoinformatics papers
https://doi.org/10.46471/gigabyte.13
A new way of publishing FAIR research with new tech
• Share & get credit for updatable data & software papers
• Follow the software paradigm, bring your research to life
• XML makes it much easier to embed interactive content
• Use automation & interaction to increase scrutiny & trust
• XML only workflow cuts time and cost to publish
• Rethink “Version of Record”: focus on facts/data/code & discard the packaging
Help us change scientific publishing, contact: editorial@gigabytejournal.com
https://gigabytejournal.com/
Thanks to:
Laurie Goodman, Publisher
Nicole Nogoy, Editor
Hans Zauner, Assistant Editor
Hongling Zhao, Assistant Editor
Peter Li, Head of IT
Chris Hunter, Lead BioCurator
Chris Armit, Data Scientist
Mary Ann Tuli, Data Editor
Rija Ménagé, Senior Software Engineer
Ken Cho, Systems Programmer Analyst
Chen Qi, Shenzhen Office
Follow us:
@GigaByteJournal
facebook.com/GigaScience
http://gigasciencejournal.com/blog/
Weibo & WeChat
https://gigabytejournal.com/
editorial@gigabytejournal.com
Questions?

IDW2022: A decades experiences in transparent and interactive publication of FAIR data and software via an end-to-end XML publishing platform

  • 1. A decades experiences in transparent and interactive publication of FAIR data and software via an end-to-end XML publishing platform Scott Edmunds 0000-0001-6444-1436
  • 3. GigaSolution: rewarding open data & code http://gigasciencejournal.com/ Publishes “Data Notes” for CC0 data, “Tech Notes” for OSI software. Transparent: Open Peer Review and linked to preprints. Mandates code in repo.
  • 4. GigaSolution: rewarding open data & code. Integrated GigaDB repository: DataCite DOIs, no size limits, code snapshots, APC covers curation. http://gigadb.org/
  • 5. Changes in how research is shared: workflows. [Chart: GigaScience software/workflow papers (Technical Notes), 2012-2021, by workflow system: Galaxy, Snakemake, Nextflow, CWL, Other]
  • 7. Experience publishing VMs: 2014. https://doi.org/10.1186/2047-217X-3-23 • Downloadable as virtual hard-disk/available as Amazon Machine Image • Unclear how to describe licensing & security issues?
  • 8. Experience publishing containers: 2015. https://doi.org/10.1186/s13742-015-0087-0 https://doi.org/10.1186/s13742-015-0073-6 • From 2015 increasing submissions leveraging containers • Promoted experiments in standardization such as bioboxes • Integrated with CodeOcean & tested with Gigantum • Carried out reproducibility case-studies (can be expensive)
  • 9. Experience publishing CODECHECK: 2020. Independent execution of computations underlying research articles. CODECHECK tackles one of the main challenges of computational research by supporting codecheckers with a workflow, guidelines and tools to evaluate computer programs underlying scientific papers. The independent time-stamped runs conducted by codecheckers will award a “certificate of executable computation” and increase availability, discovery and reproducibility of crucial artefacts for computational sciences. https://codecheck.org.uk/
  • 10. Experience publishing CODECHECK: 2020 http://gigasciencejournal.com/blog/codecheck-certificate/ https://doi.org/10.1093/gigascience/giaa026
  • 11. Experience publishing CODECHECK: 2020 https://www.nature.com/articles/d41586-020-01685-y http://doi.org/10.5281/zenodo.3865491
  • 12. Lessons learned in a decade of data & software publishing: • Tech really the bottleneck • Process much too slow & expensive • Still too focused on narrative and static “version of record” • Still not very FAIR
  • 13. DATA, CODE, ENTITIES, FACTS, STABILITY. A new approach: Follow the Software Paradigm? CODE, RELEASE, FORK, UPDATE, REPEAT. Deconstruct the “Version of Record”?
  • 14. Move to new XML end-to-end pipeline. Custom end-to-end workflow makes integrations simpler with one integration point.
  • 15. Features of new journal: https://gigabytejournal.com/ • Main advantage of workflow is XML from start to end • Several modules acting as one platform: no import/export of files, so fast and accurate • Cutting out production allows huge time & cost saving (currently as little as 3.5hrs per paper) • Any number of versions can be published instantly, including typographic quality PDF or updates/forks • Allows instantaneous switch of views • Leverage embeddable dynamic content/widgets • Initial focus on forkable open source products: data + software + update papers
  • 16. Focusing beyond VoR allows different views… What does focusing on Data + software + XML allow us to do? https://doi.org/10.46471/gigabyte.1
  • 17. Thinking about users: machines. https://doi.org/10.46471/gigabyte.6 • High quality rich XML • CC-BY open licensed, open citations, open corpus • Structured schema.org metadata • No hiding of material in supplemental files • Maximise use of persistent identifiers (PIDs). Who: ORCID IDs, CASRAI contributorship, Funder (Fundref), Institution (ROR). What: Species (NCBI, fishbase), Cell/strain (RRID). How: Equipment (RRID), Software (RRID, bio.tools). Output: Data (accessions, DOIs), Results (DOIs). • Helping to make research “AI-ready”
  • 18. Interaction: increasing understanding & trust https://doi.org/10.46471/gigabyte.13 Do you trust an immunoinformatics tool to predict whether memory T cells generated from previous exposure to common cold coronaviruses are cross-reactive against SARS-CoV-2?
  • 19. Interaction: software and code via Stencila and CodeOcean http://gigasciencejournal.com/blog/gigabyte-executable-research-articles/ Code Ocean “Compute Capsule”: readers can directly interact with software via an embedded version in the article, or deploy and run it in their own cloud computing environment. Popout Stencila “Executable Research Article”: figures are accompanied by code blocks that can be edited and re-executed to immediately see the changes.
  • 20. Interact with Stenci.la “code chunks” & Code Ocean “compute capsules” of COVID-19 immunoinformatics papers https://doi.org/10.46471/gigabyte.13
  • 21. A new way of publishing FAIR research with new tech • Share & get credit for updatable data & software papers • Follow the software paradigm, bring your research to life • XML makes it much easier to embed interactive content • Use automation & interaction to increase scrutiny & trust • XML only workflow cuts time and cost to publish • Rethink “Version of Record”: focus on facts/data/code & discard the packaging Help us change scientific publishing, contact: editorial@gigabytejournal.com https://gigabytejournal.com/
  • 22. Thanks to: Laurie Goodman, Publisher; Nicole Nogoy, Editor; Hans Zauner, Assistant Editor; Hongling Zhao, Assistant Editor; Peter Li, Head of IT; Chris Hunter, Lead BioCurator; Chris Armit, Data Scientist; Mary Ann Tulli, Data Editor; Rija Ménagé, Senior Software Engineer; Ken Cho, Systems Programmer Analyst; Chen Qi, Shenzhen Office. Follow us: @GigaByteJournal, facebook.com/GigaScience, http://gigasciencejournal.com/blog/ + Weibo & WeChat. https://gigabytejournal.com/ editorial@gigabytejournal.com
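
Slide 17 lists structured schema.org metadata and persistent identifiers (ORCID, ROR, RRID, DOIs) as the basis for machine-readable articles. As a rough illustration only, the Python sketch below assembles such a record as JSON-LD. It is not the journal's actual schema or pipeline: the function name `build_article_jsonld`, the choice of schema.org properties, and every identifier value are hypothetical placeholders.

```python
import json


def build_article_jsonld(title, doi, authors, software, datasets):
    """Return a schema.org ScholarlyArticle description as a JSON-LD dict.

    The property choices (mentions, isBasedOn, ...) are one plausible
    mapping, assumed for illustration; they are not taken from the slides.
    """
    return {
        "@context": "https://schema.org",
        "@type": "ScholarlyArticle",
        "name": title,
        "identifier": f"https://doi.org/{doi}",
        "license": "https://creativecommons.org/licenses/by/4.0/",
        # Who: each author carries an ORCID iD and an institutional ROR ID.
        "author": [
            {
                "@type": "Person",
                "name": a["name"],
                "identifier": f"https://orcid.org/{a['orcid']}",
                "affiliation": {
                    "@type": "Organization",
                    "identifier": f"https://ror.org/{a['ror']}",
                },
            }
            for a in authors
        ],
        # How: software used, identified by RRID.
        "mentions": [
            {
                "@type": "SoftwareApplication",
                "name": s["name"],
                "identifier": f"RRID:{s['rrid']}",
            }
            for s in software
        ],
        # Output: underlying datasets, identified by DOI.
        "isBasedOn": [
            {"@type": "Dataset", "identifier": f"https://doi.org/{d}"}
            for d in datasets
        ],
    }


# All values below are made-up placeholders, not real identifiers.
record = build_article_jsonld(
    title="Example data note",
    doi="10.1234/example.1",
    authors=[{"name": "A. Author", "orcid": "0000-0000-0000-0000", "ror": "00placeholder"}],
    software=[{"name": "ExampleTool", "rrid": "SCR_000000"}],
    datasets=["10.1234/example.dataset.1"],
)

print(json.dumps(record, indent=2))
```

Keeping every entity behind a resolvable identifier rather than a free-text name is what lets a harvester link the article to people, institutions, tools and data without parsing the narrative.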