FAIRy stories
Tales from building the
FAIR Research Commons
Carole Goble
The University of Manchester, UK
ELIXIR UK Head of Node
FAIRDOM Coordinator
Software Sustainability Institute UK
carole.goble@manchester.ac.uk
INCF Neuroinformatics 2019, Warsaw, September 1-2, 2019
FAIR Guiding Principles for Scientific Data Management and
Stewardship, Scientific Data 3, 160018 (2016)
doi:10.1038/sdata.2016.18
A Digital Object Research Commons
organising DOs for a field and across fields.
A “shared space” where investigators can store, share, access, connect
and interact with digital objects generated from research, and use
them. Not a Database or Data warehouse.
repositories
zoo
registries
zoo
https://medium.com/@rgrossman1/a-proposed-end-to-end-principle-for-data-commons-5872f2fa8a47 [Bob Grossman, 2018]
A Digital Object Research Commons
organising DOs for a field and across fields.
A “shared space” where investigators can store, share, access, connect
and interact with digital objects generated from research, and do
more data-intensive research. Not a Database or Data warehouse.
Ecosystem of pooled
community resources
Federation with many entry
points
Collectively created, owned or
shared by community
Mixed degrees of control
[Ian Fore, 2019]
We are all trying to build
A FAIR Research Commons
An “ad hoc” commons “in the Wild”
Using FAIR as a general principle
Fragmented
ecosystem of pooled
community resources
Distributed
federation with many
entry points and
many providers
Each has its APIs,
Web interfaces, Data
Submission, Tool
deployment
23 countries
15 communities
Including health
Held together with standards, metadata
mark-up, common identifiers, registries,
workflows, shared vision, hard work, love
and hope.
National
datasets
Community,
Public datasets
http://elixir-europe.org
Uber FAIR Life Science Commons
Federation over an ecosystem of different fields
Ecosystem of
FAIR innovative
tools
Publish
FAIR life
science data
A zoo of Catalogues of
tools, data, workflows,
computing resources …
Our first FAIRy tale:
Finding* stuff in a pre-existing ecosystem
EOSC Dataset Minimum Information
https://eosc-edmi.github.io/
Minimum information
metadata guideline to find and
access datasets reusing existing
data models and interfaces.
Conventions for using
schema.org
Find, Access and Index
Google Dataset Search
Small, Lightweight, Viral
A little bit of Semantics everywhere
*and a bit of provenance, licensing
Structured data descriptors in web
pages
Low barrier universal mark-up
Harvesting, indexing, search
Exchange & register without API
Automated curation
The Goldilocks Principle
Scale out mark-up for a federation
Dataset
Properties
91 -> 5 + 8
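The Goldilocks idea of a small profile over a large vocabulary can be sketched as a simple check: a handful of minimum properties that must be present and a few more that are recommended. The property lists below are assumptions chosen for illustration (they are not quoted from the slides), and `check_profile` is a hypothetical helper, not part of any Bioschemas tool.

```python
# Illustrative profile: 5 minimum + 8 recommended properties.
# These lists are assumed for the sketch, not taken from the slides.
MINIMUM = ["name", "description", "identifier", "keywords", "url"]
RECOMMENDED = [
    "citation", "creator", "distribution", "includedInDataCatalog",
    "license", "measurementTechnique", "variableMeasured", "version",
]


def check_profile(markup):
    """Report which profile properties are missing from a Dataset description."""
    # A property counts as present only if it has a non-empty value.
    present = {key for key, value in markup.items() if value not in (None, "", [])}
    return {
        "missing_minimum": [p for p in MINIMUM if p not in present],
        "missing_recommended": [p for p in RECOMMENDED if p not in present],
    }


def is_compliant(markup):
    """A description passes when every minimum property is present."""
    return not check_profile(markup)["missing_minimum"]


# Hypothetical dataset description; values are placeholders.
example = {
    "@type": "Dataset",
    "name": "Example dataset",
    "description": "A small illustrative record.",
    "identifier": "https://example.org/dataset/1",
    "keywords": ["example"],
    "url": "https://example.org/dataset/1",
    "license": "https://creativecommons.org/licenses/by/4.0/",
}

report = check_profile(example)
print(is_compliant(example), len(report["missing_recommended"]))
```

Keeping the minimum list short is the point: a small provider can pass the check with a few lines of markup, and the recommended list gives a path to richer descriptions.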
Data Exchange: Without an API
MarRef → BioSamples
https://github.com/EBIBioSamples/bioschemas_marref_demo/blob/master/Summary.md
Bioschemas markup added
to MarRef pages
Markup crawled using BuzzBang
Data included as a BioSample Curation
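The exchange described above (markup added to pages, crawled, then reused by another resource) relies on structured descriptions being embedded in ordinary web pages. A minimal sketch of the harvesting step, assuming the markup is JSON-LD inside a script element; this is not the BuzzBang crawler, and the page content, names and URLs are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed markup rather than fail the whole crawl


def harvest_datasets(html):
    """Return every schema.org Dataset description embedded in the page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    return [item for item in parser.items
            if isinstance(item, dict) and item.get("@type") == "Dataset"]


# Hypothetical page; in practice this would be fetched by a crawler.
PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Dataset",
 "name": "Example marine reference genomes",
 "url": "https://example.org/datasets/1",
 "license": "https://creativecommons.org/licenses/by/4.0/"}
</script>
</head><body>Landing page text</body></html>
"""

datasets = harvest_datasets(PAGE)
print(datasets[0]["name"])
```

No API is involved on the provider side: the consuming resource only needs the page URL and an agreed vocabulary.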
A happy ending approaches
Endorsed by ELIXIR
First types -> Schema.org
Goldilocks
• Esp. good for small data providers
• Types & Profiles debates/explosion
• Domain ontology reuse challenges
• Elegance vs best for tools
• Trolls
Community based demonstration (Toxicology, Rare Disease)
Validation, mark-up & harvesting tools
A subset
of the
FAIR
Principles
Is your resource FAIR?
Is your data/workflow/model FAIR from first to last?
The FAIR Data Principles
vs
FAIR the Nice Intention
2014 - Lorentz workshop
2015 - BioHackathon
2016 - Published
Grassroots activity that has
become a top down one.
Many efforts before….
Scientific Data 3, 160018
(2016)
doi:10.1038/sdata.2016.18
2nd Story: FAIR.
Once Upon A Time…
https://www.incf.org/activities/standards-and-best-practices/what-is-fair
The FAIR
principles in
the paper…
actually only in a
break out box
https://www.incf.org/activities/standards-and-best-practices/what-is-fair
Machine- and human-readable
data formats and metadata
that comply with many
community standards, that
persist, and that tell you the
provenance of the data and
how it’s cross-linked
Data and metadata are
locatable and accessible by
GUIDs and standard access
protocols, and have the least
restrictive licenses
Access
Reproducibility
Automation
Policy
Practice
Proclamation
“enhancing the ability
of machines to
automatically find and
use data or any digital
object, and support its
reuse by individuals”
FAIR Principles
more than a fuzzy feeling
The message spread
across the lands….
Simple words are powerful
things that can be mangled.
Simple concepts are not so
simple to implement.
One size does not fit all.
Beware FAIR zealots and
vested interests.
We { are | will be | always have been } FAIR
Use our platform /technology to be FAIR.
Even if it’s not what FAIR meant
Only we control FAIR.
Our way is the right way.
We don’t know what it means to implement
FAIR but we want to measure and certify it.
“FAIR principles: interpretations and implementation considerations” J Data
Intelligence, coming soon in 2019…. which was still contentious
FAIRy tale -> Reality!
Principles are…
• An aspiration, a journey.
• A call for machine actionability of
data and metadata.
• Ambiguous.
• A spectrum.
• Domain respectful.
• Implementable with today’s
protocols and standards.
• A subset of indicators:
– ROI cost/benefit, impact, community
need, sustainability of repository,
quality of content/service….
• Work in progress.
Principles are not…
• A standard.
• Just about humans.
• Strict.
• Technology specific.
• Only for one domain.
• About inventing new
protocols.
• One size fits all.
• Anything to do with quality.
• Synonymous with open.
• Tablets of stone.
• Mons et al Cloudy, increasingly FAIR; Revisiting the FAIR Data guiding principles for the European Open Science Cloud. Information Services &
Use. 37. 1-8. 10.3233/ISU-170824.
• Dunning et al Are the FAIR Data Principles fair? IDCC17
FAIRy Stories about FAIR
• It’s not about Open
• It’s not about a resource’s
Quality or Impact
• It’s not actually about
Harmonising all metadata to
one schema.
The FAIR Hype
Clarity
Infrastructure
Methodologies
Incentives
Cost/benefit analysis
FAIR is a Journey….
Concepts for FAIR Implementation
FAIR Culture
FAIR Ecosystem
Skills for FAIR
Incentives and Metrics
Investment in FAIR
Turning FAIR into Reality, EC Report, 2018
Review Criteria for Endorsement of Standards and
Best Practices, 2018 DOI: 10.5281/zenodo.2535741
Subset of principles
applied to standards
and best practices
The INCF Commons
and its Resources
themselves?
“INCF supports the FAIR (Findable, Accessible, Interoperable,
Reusable) principles, and adherence to them is a requirement for
an INCF-endorsed standard or best practice.”
https://www.incf.org/activities/standards-and-best-practices
Defining and Implementing FAIR
Clarity
Metrics / Indicators
Maturity Models
Manual / Automated Assessment
FAIRification Methodologies
• At the first mile
• At the last mile
• For the legacy
Toolkits, Tools and Services
Compliance
Awareness
Expectation
setting
Self-evaluation
Reporting
Certification
Endorsement
Judgement
Regulation
Comparison
Monitoring
Review
Quality
Contract
By Providers,
Users &
Community
By Community By ???
https://fairshake.cloud/
http://blog.ukdataservice.ac.uk/fair-data-assessment-tool/
https://www.howfairismydata.com/
Unhappy ending
• Subjective
• Hard to interpret and compare
• Weak transparency
• Judgemental
• Drift to Quality review
• Independent of the community
• Occasionally barking mad
Community pushback
Dunkelziffer
“Not everything that can be
counted counts.
Not everything that counts can
be counted”
[William Bruce Cameron]
“FAIR is non-trivial, and
domain specific at anything
other than the most superficial
level” Wilkinson
Matrix of indicators
Maturity levels for
each
+
*The Metric Tide, https://responsiblemetrics.org/the-metric-tide/
A FAIR Assessment
Transparent
evaluation
What, Who, How
Objective evaluations
Narrative feedback on fails
Indicators
Robustness,
Humility,
Transparency,
Diversity,
Reflexivity*
Context
Community standards
Incremental
Cost/benefit
Not just a score
Non judgmental
Scope for novelty
Transparent evaluation Eat the Dog Food
Design-Build-Test-Learn
indicators and evaluation
Maturity Model
Value Based Assessment
Selection
Goal Setting
Process planning
Modelling
Transformation
Publishing
[Susheel Varma]
A FAIR Assessment
Capability Maturity Model
of entities & their capabilities
Indicators and metrics
measuring levels
Foundational
Components
FAIRification
Process
Awareness and Policy
Standards and Guidelines
People
Infrastructure
Value Based
Assessment
Selection
Goal Setting
Process planning
Modelling
Transformation
Publishing
Impl. Outcome:
Dataset
Persistent Identification
Data Set Discovery
Machine Readability
Data Access and Usage
Preservation and Sustainability
FAIR Data Maturity Model WG
A FAIR Assessment
[Oya Deniz Beyan, 2019]
Next meeting 12th September 2019
Sessions at Helsinki RDA Plenary October 2019
Licence
Metadata includes information about the licence under which the data can be reused (Mandatory)
Metadata includes licence information in the appropriate element of the metadata standard used
Metadata refers to a standard reuse licence (Recommended)
Metadata includes information about consent for reuse (e.g. personal data)
Metadata refers to a machine-understandable reuse licence (Optional)
FAIR Data Maturity Model WG
An “easy” indicator….
“R1.1. (meta)data are released with a clear and
accessible data usage license”
[Table: licence format (non-standard; standard; open standard & machine readable) against what it allows (human readable access; reuse; clear reuse criteria)]
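Even the “easy” licence indicator has levels. A minimal sketch of how they might be graded, assuming the metadata carries a single `license` string: a free-text statement meets the mandatory level, a recognised licence name the recommended level, and a resolvable licence URL the optional machine-understandable level. The mapping of levels to checks and the small `KNOWN_LICENCES` table are assumptions made for this example, not the working group's method.

```python
# Assumed, deliberately tiny table of standard licences: name -> URL.
KNOWN_LICENCES = {
    "CC-BY-4.0": "https://creativecommons.org/licenses/by/4.0/",
    "CC0-1.0": "https://creativecommons.org/publicdomain/zero/1.0/",
}


def licence_indicator(metadata):
    """Grade the licence information in a metadata record (illustrative only)."""
    value = str(metadata.get("license") or "").strip()
    if not value:
        return "mandatory not met: no licence information"
    if value in KNOWN_LICENCES.values():
        return "optional met: machine-understandable licence"
    if value in KNOWN_LICENCES:
        return "recommended met: standard reuse licence"
    return "mandatory met: licence stated"


# Hypothetical records, one per level.
records = [
    {"name": "A"},
    {"name": "B", "license": "Free for academic use, ask the authors"},
    {"name": "C", "license": "CC-BY-4.0"},
    {"name": "D", "license": "https://creativecommons.org/licenses/by/4.0/"},
]
for record in records:
    print(record["name"], "->", licence_indicator(record))
```

The sketch already shows where judgement creeps in: which licences count as “standard” is a community decision encoded in the table, not something the code can settle.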
A trickier indicator…
“R1.3. (meta)data meet domain-relevant
community standards”
FAIR Data Maturity Model WG
Mandatory
Recommended
Metadata complies with a community standard
Data complies with a community standard
Metadata is expressed in compliance with a machine-understandable community standard
Data is expressed in compliance with a machine-understandable community standard
Neuroshapes
Metadata
Portal
Reviewers
Suppose there isn’t a standard or it’s not up to it?
Indicators have to be community specific
Librarian’s view point vs Genomics view point?
How is it validated? JSON and SHACL validators.
How is it captured? Spreadsheets.
Interoperability is nearly always purpose specific
• Community governed “indicators” not metrics
• Automated objective scale up & out
• Sanity check put into practice
https://fairsharing.github.io/FAIR-Evaluator-FrontEnd/#!/
Community
creates
Maturity
Indicators
Registered,
Collections
Compliance
tests written,
registered
Resource
tested
from a
starting
identifier
Report,
(Registered)
Wilkinson et al “Evaluating FAIR Maturity Through a Scalable, Automated, Community-
Governed Framework” bioRxiv, https://doi.org/10.1101/649202 , 2019
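The flow above (a community defines maturity indicators, compliance tests are written for them, a resource is tested from a starting identifier, a report comes out) can be sketched in a few lines. This is an illustration of the pattern only, not the FAIR Evaluator's code or API; `MaturityIndicator`, `evaluate`, the two example tests and the stand-in resolver are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MaturityIndicator:
    """A community-defined indicator paired with its compliance test."""
    name: str
    principle: str
    test: Callable[[dict], bool]
    feedback: str  # narrative shown when the test fails


def evaluate(identifier, resolve, indicators):
    """Resolve the starting identifier to metadata, run every test, build a report."""
    metadata = resolve(identifier)
    results = []
    for indicator in indicators:
        passed = bool(indicator.test(metadata))
        results.append({
            "indicator": indicator.name,
            "principle": indicator.principle,
            "passed": passed,
            "feedback": "" if passed else indicator.feedback,
        })
    return {"identifier": identifier, "results": results}


# Two illustrative indicators; a real community would register its own.
INDICATORS = [
    MaturityIndicator(
        "identifier present", "F1",
        lambda m: bool(m.get("identifier")),
        "No identifier was found in the metadata."),
    MaturityIndicator(
        "licence stated", "R1.1",
        lambda m: bool(m.get("license")),
        "No licence was found; state how the data may be reused."),
]

# Stand-in for the web: maps an identifier to the metadata it resolves to.
FAKE_WEB = {
    "https://example.org/dataset/42": {
        "identifier": "https://example.org/dataset/42",
        "name": "Example dataset",
    }
}

report = evaluate("https://example.org/dataset/42",
                  lambda i: FAKE_WEB.get(i, {}), INDICATORS)
for row in report["results"]:
    print(row["principle"], row["passed"], row["feedback"])
```

Each result carries narrative feedback rather than only a score, and the set of indicators is data that a community can swap out without touching the evaluation loop.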
“FAIRification” (of legacy datasets)
the new magic wand word
• Need to do at the same
time as define indicators
• Needs experts
• BYODs
• ROI cost/benefit step
• Muddle with
harmonisation pipelines
(compliance to I and R)
• Non-trivial
• Upstream
• Turning into a business
https://fairplus-project.eu/
https://www.go-fair.org/fair-principles/fairification-process/
FAIR needs
to be at the
“first mile”,
embedded into
investigator
practice.
Mark Wilkinson
Just saying you are
FAIR doesn’t make
it true. It’s uneven
and multi-faceted.
Identifier use is
chaotic.
Separating
metadata and data
is problematic.
FAIRification is
non-trivial.
FAIR is a set of
behaviours
not a specific
technology
Commons for autonomous,
self-managing Sys Bio projects
Hubs for Projects,
People, Data, Models, SOPs,
Workflows, Samples
First Mile /
Last Mile
From the
infrastructure /
standard /
commons /
database / tool / *
To the actual
investigator
fair-dom.org, fairdomhub.org
I3: references between
(meta)data
Models + Data + Methods
Respect and bridge the ecosystem
federated catalogue, integrated context
Public database
Local store
National infrastructure
Secure store
Public model
repository
Github
Shared SOPs
Neylon, Knowledge Exchange Report: http://www.knowledge-exchange.info/event/ke-approach-open-scholarship
Respect and bridge the ecosystem
going the first mile, and the last mile*
A miracle of sweat
and tears here
different scales, different agendas, different incentives
Koureas, The ‘last mile’ challenge for European research e-infrastructures https://riojournal.com/article/9933/
New ELIXIR
Converge
project
[Maryann Martone]
The Tragedy of the FAIR Commons*
• A Commons is only as
FAIR as its tenants
• Project sovereignty
• Public good vs personal
burden
• Professional
Stewardship for Projects
• Community socialisation
and values
Nudging
*Mark Musen, https://ncip.nci.nih.gov/blog/face-new-tragedy-commons-remedy-better-metadata/
Based on Matt Spritzer / Brian Nosek figure, COS
More than just data
Software, models, workflows, SOPs, Lab Protocols….
4th (and Last story): FAIR Digital Objects
FAIR Workflows Commons
Workflow management
system (and registry) zoo*
*https://s.apache.org/existing-workflow-systems
FAIR Computational Workflows
The point of FAIR (meta)data was
to be machine actionable….. and
even better if machine generated.
• Operate in FAIR not proprietary
formats
• Support propagation of identifiers,
licenses, and AAI
• Mint FAIR identifiers, track data
provenance, license end products
Goble et al 2019 FAIR Computational Workflows https://doi.org/10.5281/zenodo.3268653
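The three bullets above (identifiers minted, provenance tracked, licences propagated) can be made concrete with a toy workflow step. This is a sketch under stated assumptions, not how any particular workflow system does it: `make_input` and `run_step` are hypothetical helpers, the provenance keys are loosely modelled on W3C PROV terms, a `urn:uuid:` value stands in for a properly registered persistent identifier, and the licence rule (carry a licence forward only when all inputs agree) is one simple policy among many.

```python
import hashlib
import uuid
from datetime import datetime, timezone


def make_input(data, identifier, licence):
    """Wrap a value with the identifier and licence it arrived with."""
    return {"id": identifier, "data": data, "license": licence}


def run_step(step_name, func, inputs):
    """Run one step and return its output with a new id, licence and provenance."""
    result = func(*[item["data"] for item in inputs])
    licences = {item["license"] for item in inputs if item.get("license")}
    return {
        # Placeholder identifier; a real system would mint a resolvable one.
        "id": "urn:uuid:" + str(uuid.uuid4()),
        "data": result,
        # Propagate only an unambiguous licence; otherwise leave it to a human.
        "license": licences.pop() if len(licences) == 1 else None,
        "provenance": {
            "wasGeneratedBy": step_name,
            "used": [item["id"] for item in inputs],
            "generatedAtTime": datetime.now(timezone.utc).isoformat(),
            "sha256": hashlib.sha256(repr(result).encode("utf-8")).hexdigest(),
        },
    }


# Hypothetical inputs; identifiers and licences are placeholders.
counts = make_input([3, 5, 8], "https://example.org/data/counts", "CC-BY-4.0")
scale = make_input(2, "https://example.org/data/scale", "CC-BY-4.0")

scaled = run_step("scale-counts",
                  lambda values, k: [v * k for v in values],
                  [counts, scale])
print(scaled["data"], scaled["license"], scaled["provenance"]["used"])
```

Because the metadata is generated by the step itself, nobody has to write it by hand afterwards, which is the sense in which machine-generated metadata is “even better”.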
FAIR workflows in their own right.
Like Software:
Principles stretched
Versioning
Software maturity, quality, maintainability,
documentation practices
Goble et al 2019 FAIR Computational Workflows https://doi.org/10.5281/zenodo.3268653
FAIR workflows in their own right.
Like Data:
We can give them machine
actionable metadata.
Goble et al 2019 FAIR Computational Workflows https://doi.org/10.5281/zenodo.3268653
Describes workflows to be
portable, scalable & interoperable
with different workflow systems and containerised tools
Bundles descriptions, references, files
Adds context, provenance, examples, data …
Relates to data collections, SOPs, lab protocols…
Links CWL descriptions with native workflows
Regulatory Practice
robust, safe exchange and reuse of HTS
computational analytical workflows
http://biocomputeobject.org
IEEE P2791
BioCompute Working Group
[Vahan Simonyan]
Alterovitz, Dean II, Goble, Crusoe, Soiland-Reyes et al “Enabling Precision Medicine via standard communication of NGS provenance, analysis,
and results” PLOS Biology 2018, https://doi.org/10.1371/journal.pbio.3000099
A happy ending?
• FAIR is work in progress!
• Keep grounded,
developer friendly and
community supported
• No-one reads specs.
Everyone copies
examples.
• Nipype CWL is coming!
MG-RAST/EBI MGnify
Design by workflow blocks
Pipeline versions comparison
Pipeline exchange
Recycling tool descriptions and
sub-workflows
What is FAIR, what should be FAIR and how to
implement it is not simple.
It’s not just Good Intentions
A social story, not a technical one.
Without incentives, cultural normalisation and long
term investment it will be just a story.
INCF’s FAIR Journey….
Acknowledgements
Ian Fore
Mark Wilkinson
Susanna Sansone
Stian Soiland-Reyes
Rob Grossman
Barend Mons
Nick Juty
Alasdair Gray
Rafael Jimenez
Michel Dumontier
Michael Crusoe
Ian Cottam
And all the projects and many more
FAIRy stories: tales from building the FAIR Research Commons

Mais conteúdo relacionado

Mais procurados

FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practice
Carole Goble
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Carole Goble
 
FAIR Data and Model Management for Systems Biology (and SOPs too!)
FAIR Data and Model Management for Systems Biology(and SOPs too!)FAIR Data and Model Management for Systems Biology(and SOPs too!)
FAIR Data and Model Management for Systems Biology (and SOPs too!)
Carole Goble
 
Research Shared: researchobject.org
Research Shared: researchobject.orgResearch Shared: researchobject.org
Research Shared: researchobject.org
Norman Morrison
 

Mais procurados (20)

Reproducible and citable data and models: an introduction.
Reproducible and citable data and models: an introduction.Reproducible and citable data and models: an introduction.
Reproducible and citable data and models: an introduction.
 
FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practice
 
Reflections on a (slightly unusual) multi-disciplinary academic career
Reflections on a (slightly unusual) multi-disciplinary academic careerReflections on a (slightly unusual) multi-disciplinary academic career
Reflections on a (slightly unusual) multi-disciplinary academic career
 
Report of the second FAIRDOM foundry
Report of the second FAIRDOM foundryReport of the second FAIRDOM foundry
Report of the second FAIRDOM foundry
 
The FAIRDOM Commons for Systems Biology
The FAIRDOM Commons for Systems BiologyThe FAIRDOM Commons for Systems Biology
The FAIRDOM Commons for Systems Biology
 
Better Software, Better Research
Better Software, Better ResearchBetter Software, Better Research
Better Software, Better Research
 
Open Access: Open Access Looking for ways to increase the reach and impact of...
Open Access: Open Access Looking for ways to increase the reach and impact of...Open Access: Open Access Looking for ways to increase the reach and impact of...
Open Access: Open Access Looking for ways to increase the reach and impact of...
 
FAIRer Research
FAIRer ResearchFAIRer Research
FAIRer Research
 
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
FAIR Software (and Data) Citation: Europe, Research Object Systems, Networks ...
 
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
 
Citing data in research articles: principles, implementation, challenges - an...
Citing data in research articles: principles, implementation, challenges - an...Citing data in research articles: principles, implementation, challenges - an...
Citing data in research articles: principles, implementation, challenges - an...
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
FAIR Data and Model Management for Systems Biology (and SOPs too!)
FAIR Data and Model Management for Systems Biology(and SOPs too!)FAIR Data and Model Management for Systems Biology(and SOPs too!)
FAIR Data and Model Management for Systems Biology (and SOPs too!)
 
FAIR History and the Future
FAIR History and the FutureFAIR History and the Future
FAIR History and the Future
 
Building the FAIR Research Commons: A Data Driven Society of Scientists
Building the FAIR Research Commons: A Data Driven Society of ScientistsBuilding the FAIR Research Commons: A Data Driven Society of Scientists
Building the FAIR Research Commons: A Data Driven Society of Scientists
 
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)What is Reproducibility? The R* brouhaha (and how Research Objects can help)
What is Reproducibility? The R* brouhaha (and how Research Objects can help)
 
Open Science: how to serve the needs of the researcher?
Open Science: how to serve the needs of the researcher? Open Science: how to serve the needs of the researcher?
Open Science: how to serve the needs of the researcher?
 
The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects
 
Research Shared: researchobject.org
Research Shared: researchobject.orgResearch Shared: researchobject.org
Research Shared: researchobject.org
 

Semelhante a FAIRy stories: tales from building the FAIR Research Commons

#ALAAC15 Linked Data Love
#ALAAC15 Linked Data Love #ALAAC15 Linked Data Love
#ALAAC15 Linked Data Love
Kristi Holmes
 

Semelhante a FAIRy stories: tales from building the FAIR Research Commons (20)

The future of FAIR
The future of FAIRThe future of FAIR
The future of FAIR
 
Application of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data ResourcesApplication of recently developed FAIR metrics to the ELIXIR Core Data Resources
Application of recently developed FAIR metrics to the ELIXIR Core Data Resources
 
FAIR play?
FAIR play? FAIR play?
FAIR play?
 
Data commons bonazzi bd2 k fundamentals of science feb 2017
Data commons bonazzi   bd2 k fundamentals of science feb 2017Data commons bonazzi   bd2 k fundamentals of science feb 2017
Data commons bonazzi bd2 k fundamentals of science feb 2017
 
FAIR Data-centric Information Architecture.pptx
FAIR Data-centric Information Architecture.pptxFAIR Data-centric Information Architecture.pptx
FAIR Data-centric Information Architecture.pptx
 
Introduction to APIs and Linked Data
Introduction to APIs and Linked DataIntroduction to APIs and Linked Data
Introduction to APIs and Linked Data
 
VODAN Africa IN.pptx
VODAN Africa IN.pptxVODAN Africa IN.pptx
VODAN Africa IN.pptx
 
FAIRsharing - ENVRI-FAIR Webinar
FAIRsharing - ENVRI-FAIR WebinarFAIRsharing - ENVRI-FAIR Webinar
FAIRsharing - ENVRI-FAIR Webinar
 
Making Data FAIR (Findable, Accessible, Interoperable, Reusable)
Making Data FAIR (Findable, Accessible, Interoperable, Reusable)Making Data FAIR (Findable, Accessible, Interoperable, Reusable)
Making Data FAIR (Findable, Accessible, Interoperable, Reusable)
 
Managing and sharing data: lessons from the European context
Managing and sharing data: lessons from the European contextManaging and sharing data: lessons from the European context
Managing and sharing data: lessons from the European context
 
FAIRsharing presentation at the Japan Science and Technology Agency
FAIRsharing presentation at the Japan Science and Technology AgencyFAIRsharing presentation at the Japan Science and Technology Agency
FAIRsharing presentation at the Japan Science and Technology Agency
 
Linked Open Data_mlanet13
Linked Open Data_mlanet13Linked Open Data_mlanet13
Linked Open Data_mlanet13
 
FAIR data: what it means, how we achieve it, and the role of RDA
FAIR data: what it means, how we achieve it, and the role of RDAFAIR data: what it means, how we achieve it, and the role of RDA
FAIR data: what it means, how we achieve it, and the role of RDA
 
Ready, Set, GO FAIR
Ready, Set, GO FAIRReady, Set, GO FAIR
Ready, Set, GO FAIR
 
#ALAAC15 Linked Data Love
#ALAAC15 Linked Data Love #ALAAC15 Linked Data Love
#ALAAC15 Linked Data Love
 
NFDI Physical Sciences Colloquium - FAIR
NFDI Physical Sciences Colloquium - FAIRNFDI Physical Sciences Colloquium - FAIR
NFDI Physical Sciences Colloquium - FAIR
 
Bioschemas Workshop
Bioschemas WorkshopBioschemas Workshop
Bioschemas Workshop
 
LIBER Webinar: Turning FAIR Data Into Reality
LIBER Webinar: Turning FAIR Data Into RealityLIBER Webinar: Turning FAIR Data Into Reality
LIBER Webinar: Turning FAIR Data Into Reality
 
Open Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon HodsonOpen Science Globally: Some Developments/Dr Simon Hodson
Open Science Globally: Some Developments/Dr Simon Hodson
 
What it means to be FAIR
What it means to be FAIRWhat it means to be FAIR
What it means to be FAIR
 

Mais de Carole Goble

RO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsRO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital Objects
Carole Goble
 
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Carole Goble
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
Carole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
Carole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
Carole Goble
 

Mais de Carole Goble (19)

The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
 
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
 
RO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsRO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital Objects
 
Research Software Sustainability takes a Village
Research Software Sustainability takes a VillageResearch Software Sustainability takes a Village
Research Software Sustainability takes a Village
 
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
Open Research: Manchester leading and learning
Open Research: Manchester leading and learningOpen Research: Manchester leading and learning
Open Research: Manchester leading and learning
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
EOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryEOSC-Life Workflow Collaboratory
EOSC-Life Workflow Collaboratory
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
FAIR Data Bridging from researcher data management to ELIXIR archives in the...
FAIR Data Bridging from researcher data management to ELIXIR archives in the...FAIR Data Bridging from researcher data management to ELIXIR archives in the...
FAIR Data Bridging from researcher data management to ELIXIR archives in the...
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout FAIR Workflows and Research Objects get a Workout
FAIR Workflows and Research Objects get a Workout
 
RO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research ObjectsRO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research Objects
 
What is Reproducibility? The R* brouhaha and how Research Objects can help
What is Reproducibility? The R* brouhaha and how Research Objects can helpWhat is Reproducibility? The R* brouhaha and how Research Objects can help
What is Reproducibility? The R* brouhaha and how Research Objects can help
 
ELIXIR UK Node presentation to the ELIXIR Board
ELIXIR UK Node presentation to the ELIXIR BoardELIXIR UK Node presentation to the ELIXIR Board
ELIXIR UK Node presentation to the ELIXIR Board
 
Research Object Community Update
Research Object Community UpdateResearch Object Community Update
Research Object Community Update
 
Being FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data ScienceBeing FAIR: Enabling Reproducible Data Science
Being FAIR: Enabling Reproducible Data Science
 


FAIRy stories: tales from building the FAIR Research Commons

  • 1. FAIRy stories Tales from building the FAIR Research Commons Carole Goble The University of Manchester, UK ELIXIR UK Head of Node FAIRDOM Coordinator Software Sustainability Institute UK carole.goble@manchester.ac.uk INCF Neuroinformatics 2019, Warsaw, September 1-2, 2019
  • 2. FAIR Guiding Principles for Scientific Data Management and Stewardship, Scientific Data 3, 160018 (2016) doi:10.1038/sdata.2016.18
  • 3. A Digital Object Research Commons organising DOs for a field and across fields. A “shared space” where investigators can store, share, access, connect and interact with digital objects generated from research, and use them. Not a Database or Data warehouse. repositories zoo registries zoo https://medium.com/@rgrossman1/a-proposed-end-to-end-principle-for-data-commons-5872f2fa8a47 [Bob Grossman, 2018]
  • 4. A Digital Object Research Commons organising DOs for a field and across fields. A “shared space” where investigators can store, share, access, connect and interact with digital objects generated from research, and do more data-intensive research. Not a Database or Data warehouse. Ecosystem of pooled community resources Federation with many entry points Collectively created, owned or shared by community Mixed degrees of control
  • 5. [Ian Fore, 2019] We are all trying to build A FAIR Research Commons
  • 6. We are all trying to build A FAIR Research Commons
  • 7. We are all trying to build A FAIR Research Commons
  • 8. An “ad hoc” commons “in the Wild” Using FAIR as a general principle Fragmented ecosystem of pooled community resources Distributed federation with many entry points and many providers Each has its APIs, Web interfaces, Data Submission, Tool deployment 23 countries 15 communities Including health Held together with standards, metadata mark-up, common identifiers, registries, workflows, shared vision, hard work, love and hope. National datasets Community, Public datasets http://elixir-europe.org
  • 9. Uber FAIR Life Science Commons Federation over an ecosystem of different fields Ecosystem of FAIR innovative tools Publish FAIR life science data A zoo of Catalogues of tools, data, workflows, computing resources …
  • 10. Our first FAIRy tale: Finding* stuff in a pre-existing ecosystem EOSC Dataset Minimum Information https://eosc-edmi.github.io/ Minimum information metadata guideline to find and access datasets reusing existing data models and interfaces. Conventions for using schema.org Find, Access and Index Google Dataset Search Small, Lightweight, Viral A little bit of Semantics everywhere *and a bit of provenance, licencing
  • 11. Our first FAIRy tale: Finding* stuff in a pre-existing ecosystem Structured data descriptors in web pages Low barrier universal mark-up Harvesting, indexing, search Exchange & register without API Automated curation A little bit of Semantics everywhere *and a bit of provenance, licencing
  • 12. Our first FAIRy tale: Finding* stuff in a pre-existing ecosystem A little bit of Semantics everywhere The Goldilocks Principle
  • 13. Scale out mark-up for a federation Dataset Properties 91 -> 5 + 8
  • 14. Data Exchange: Without an API MarRef → BioSamples https://github.com/EBIBioSamples/bioschemas_marref_demo/blob/master/Summary.md Bioschemas markup added to MarRef pages Markup crawled using BuzzBang Data included as a BioSample Curation
  • 15. A happy ending approaches Endorsed by ELIXIR First types -> Schema.org Goldilocks • Esp. good for small data providers • Types & Profiles debates/explosion • Domain ontology reuse challenges • Elegance vs best for tools • Trolls Community based demonstration (Toxicology, Rare Disease) Validation, mark-up & harvesting tools A subset of the FAIR Principles
  • 16. Is your resource FAIR? Is your data/workflow/model FAIR from first to last? The FAIR Data Principles vs FAIR the Nice Intention
  • 17. 2014 - Lorentz workshop 2015 - BioHackathon 2016 - Published Grassroots activity that has become a top down one. Many efforts before…. Scientific Data 3, 160018 (2016) doi:10.1038/sdata.2016.18 2nd Story: FAIR. Once Upon A Time…
  • 19. https://www.incf.org/activities/standards-and-best-practices/what-is-fair Machine and human readable data formats and metadata that are compliant with many community standards, that persist, and that tell you the provenance of the data and how it's cross-linked Data and metadata are locatable and accessible by GUIDs, standard access protocols and have the least restrictive licenses
  • 20. Access Reproducibility Automation Policy Practice Proclamation “enhancing the ability of machines to automatically find and use data or any digital object, and support its reuse by individuals” FAIR Principles more than a fuzzy feeling
  • 21. The message spread across the lands….
  • 22. The message spread across the lands….
  • 23. Simple words are powerful things that can be mangled. Simple concepts are not so simple to implement. One size does not fit all. Beware FAIR zealots and vested interests.
  • 24. We { are | will be | always have been } FAIR Use our platform /technology to be FAIR. Even if it's not what FAIR meant Only we control FAIR. Our way is the right way. We don’t know what it means to implement FAIR but we want to measure and certify it.
  • 25. “FAIR principles: interpretations and implementation considerations” J Data Intelligence, coming soon in 2019…. which was still contentious
  • 26. FAIRy tale -> Reality! Principles are… • An aspiration, a journey. • A call for machine actionability of data and metadata. • Ambiguous. • A spectrum. • Domain respectful. • Implementable with today's protocols and standards. • A subset of indicators: – ROI cost/benefit, impact, community need, sustainability of repository, quality of content/service…. • Work in progress. Principles are not… • A standard. • Just about humans. • Strict. • Technology specific. • Only for one domain. • About inventing new protocols. • One size fits all. • Anything to do with quality. • Synonymous with open. • Tablets of stone. • Mons et al Cloudy, increasingly FAIR; Revisiting the FAIR Data guiding principles for the European Open Science Cloud. Information Services & Use. 37. 1-8. 10.3233/ISU-170824. • Dunning et al Are the FAIR Data Principles fair? IDCC17
  • 27. FAIRy Stories about FAIR • It's not about Open • It's not about a resource’s Quality or Impact • It's not actually about Harmonising all metadata to one schema.
  • 29. FAIR is a Journey…. Concepts for FAIR Implementation FAIR Culture FAIR Ecosystem Skills for FAIR Incentives and Metrics Investment in FAIR Turning FAIR into Reality, EC Report, 2018
  • 30. Review Criteria for Endorsement of Standards and Best Practices, 2018 DOI: 10.5281/zenodo.2535741 Subset of principles applied to standards and best practices The INCF Commons and its Resources themselves? “INCF supports the FAIR (Findable, Accessible, Interoperable, Reusable) principles, and adherence to them is a requirement for an INCF-endorsed standard or best practice.” https://www.incf.org/activities/standards-and-best-practices
  • 31. Defining and Implementing FAIR Clarity Metrics / Indicators Maturity Models Manual / Automated Assessment FAIRification Methodologies • At the first mile • At the last mile • For the legacy Toolkits, Tools and Services
  • 33. https://fairshake.cloud/ http://blog.ukdataservice.ac.uk/fair-data-assessment-tool/ https://www.howfairismydata.com/ Unhappy ending • Subjective • Hard to interpret and compare • Weak transparency • Judgemental • Drift to Quality review • Independent of the community • Occasionally barking mad Community pushback
  • 34. https://fairshake.cloud/ http://blog.ukdataservice.ac.uk/fair-data-assessment-tool/ https://www.howfairismydata.com/ Dunkelziffer “Not everything that can be counted counts. Not everything that counts can be counted” [William Bruce Cameron] “FAIR is non-trivial, and domain specific at anything other than the most superficial level” Wilkinson
  • 35. Matrix of indicators Maturity levels for each + *The MetricTide, https://responsiblemetrics.org/the-metric-tide/ A FAIR Assessment Transparent evaluation What, Who, How Objective evaluations Narrative feedback on fails Indicators Robustness, Humility, Transparency, Diversity, Reflexivity* Context Community standards Incremental Cost/benefit Not just a score Non judgmental Scope for novelty Transparent evaluation Eat the Dog Food Design-Build-Test-Learn indicators and evaluation
  • 36. Maturity Model Value Based Assessment Selection Goal Setting Process planning Modelling Transformation Publishing [Susheel Varma] A FAIR Assessment
  • 37. Capability Maturity Model of entities & their capabilities Indicators and metrics measuring levels Foundational Components FAIRification Process Awareness and Policy Standards and Guidelines People Infrastructure Value Based Assessment Selection Goal Setting Process planning Modelling Transformation Publishing Impl. Outcome: Dataset Persistent Identification Data Set Discovery Machine Readability Data Access and Usage Preservation and Sustainability FAIR Data Maturity Model WG A FAIR Assessment [Oya Deniz Beyan, 2019]
  • 38. Next meeting 12th September 2019 Sessions at Helsinki RDA Plenary October 2019
  • 39. Licence Metadata includes information about the licence under which the data can be reused Mandatory Metadata includes licence information in the appropriate element of the metadata standard used Metadata refers to a standard reuse licence Recommended Metadata includes information about consent for reuse (e.g. personal data) Metadata refers to a machine-understandable reuse licence Optional FAIR Data Maturity Model WG An “easy” indicator…. “R1.1. (meta)data are released with a clear and accessible data usage license” Format Allows -- -- -- non-standard human readable access standard open standard reuse & machine readable clear reuse criteria “ “ “ “ “ “ “
  • 40. A trickier indicator… “R1.3. (meta)data meet domain-relevant community standards” FAIR Data Maturity Model WG Mandatory Recommended Metadata complies with a community standard Data complies with a community standard Metadata is expressed in compliance with a machine-understandable community standard Data is expressed in compliance with a machine-understandable community standard Neuroshapes Metadata Portal Reviewers Suppose there isn’t a standard or it's not up to it? Indicators have to be community specific Librarian’s view point vs Genomics view point? How is it validated? JSON and SHACL validators. How is it captured? Spreadsheets. Interoperability is nearly always purpose specific
  • 41. • Community governed “indicators” not metrics • Automated objective scale up & out • Sanity check put into practice https://fairsharing.github.io/FAIR-Evaluator-FrontEnd/#!/ Community creates Maturity Indicators Registered, Collections Compliance tests written, registered Resource tested from a starting identifier Report, (Registered) Wilkinson et al “Evaluating FAIR Maturity Through a Scalable, Automated, Community-Governed Framework” bioRxiv, https://doi.org/10.1101/649202, 2019
  • 42. “FAIRification” (of legacy datasets) the new magic wand word • Need to do at the same time as define indicators • Needs experts • BYODs • ROI cost/benefit step • Muddle with harmonisation pipelines (compliance to I and R) • Non-trivial • Upstream • Turning into a business https://fairplus-project.eu/ https://www.go-fair.org/fair-principles/fairification-process/
  • 43. FAIR needs to be at the “first mile”, embedded into investigator practice. Mark Wilkinson Just saying you are FAIR doesn’t make it true. It's uneven and multi-faceted. Identifier use is chaotic. Separating metadata and data is problematic. FAIRification is non-trivial. FAIR is a set of behaviours not a specific technology
  • 44. Commons for autonomous, self-managing Sys Bio projects Hubs for Projects, People, Data, Models, SOPs, Workflows, Samples First Mile / Last Mile From the infrastructure / standard / commons / database / tool / * To the actual investigator fair-dom.org, fairdomhub.org
  • 45. I3: references between (meta)data Models + Data + Methods Respect and bridge the ecosystem federated catalogue, integrated context
  • 46. Respect and bridge the ecosystem federated catalogue, integrated context Public database Local store National infrastructure Secure store Public model repository Github Shared SOPs
  • 47. Neylon, Knowledge Exchange Report: http://www.knowledge-exchange.info/event/ke-approach-open-scholarship Respect and bridge the ecosystem going the first mile, and the last mile* A miracle of sweat and tears here different scales, different agendas, different incentives Koureas, The ‘last mile’ challenge for European research e-infrastructures https://riojournal.com/article/9933/ New ELIXIR Converge project
  • 49. The Tragedy of the FAIR Commons* • A Commons is only as FAIR as its tenants • Project sovereignty • Public good vs personal burden • Professional Stewardship for Projects • Community socialisation and values Nudging *Mark Musen, https://ncip.nci.nih.gov/blog/face-new-tragedy-commons-remedy-better-metadata/ Based on Matt Spritzer / Brian Nosek figure, COS
  • 50. More than just data Software, models, workflows, SOPs, Lab Protocols…. 4th (and Last story): FAIR Digital Objects
  • 51. FAIR Workflows Commons Workflow management system (and registry) zoo* *https://s.apache.org/existing-workflow-systems
  • 52. FAIR Computational Workflows The point of FAIR (meta)data was to be machine actionable….. and even better if machine generated. • Operate in FAIR not proprietary formats • Support propagation of identifiers, licenses, and AAI • Mint FAIR identifiers, track data provenance, license end products Goble et al 2019 FAIR Computational Workflows https://doi.org/10.5281/zenodo.3268653
  • 53. FAIR workflows in their own right. Like Software: Principles stretched Versioning Software maturity, quality, maintainability, documentation practices Goble et al 2019 FAIR Computational Workflows https://doi.org/10.5281/zenodo.3268653
  • 54. FAIR workflows in their own right. Like Data: We can give them machine actionable metadata. Goble et al 2019 FAIR Computational Workflows https://doi.org/10.5281/zenodo.3268653 Describes workflows to be portable, scalable & interoperable with different workflow systems and containerised tools Bundles descriptions, references, files Adds context, provenance, examples, data … Relates to data collections, SOPs, lab protocols… Links CWL descriptions with native workflows
  • 55. Regulatory Practice robust, safe exchange and reuse of HTS computational analytical workflows http://biocomputeobject.org IEEE P2791 BioCompute Working Group [Vahan Simonyan] Alterovitz, Dean II, Goble, Crusoe, Soiland-Reyes et al “Enabling Precision Medicine via standard communication of NGS provenance, analysis, and results” PLOS Biology 2018, https://doi.org/10.1371/journal.pbio.3000099
  • 56. A happy ending? • FAIR is work in progress! • Keep grounded, developer friendly and community supported • No-one reads specs. Everyone copies examples. • Nipype CWL is coming! MG-RAST/EBI MGnify Design by workflow blocks Pipeline versions comparison Pipeline exchange Recycling tool descriptions and sub-workflows
  • 57. What is FAIR, what should be FAIR and how to implement it is not simple. It's not just Good Intentions A social story, not a technical one. Without incentives, cultural normalisation and long term investment it will be just a story. INCF’s FAIR Journey….
  • 58. Acknowledgements Ian Fore Mark Wilkinson Susanna Sansone Stian Soiland-Reyes Rob Grossman Barend Mons Nick Juty Alasdair Gray Rafael Jimenez Michel Dumontier Michael Crusoe Ian Cottam And all the projects and many more
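
Slides 11 and 14 describe structured data descriptors embedded in web pages, which a crawler can harvest so that data are exchanged without an API. The following is a minimal sketch of that harvesting step, assuming the markup is embedded as JSON-LD in a script tag. The sample page, its identifier and the function names are hypothetical illustrations; this is not the BuzzBang crawler or the BioSamples import pipeline.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed markup rather than failing the whole crawl


def extract_jsonld(html):
    """Return every JSON-LD object found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Hypothetical page standing in for a marked-up resource page.
SAMPLE_PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Dataset",
 "name": "Hypothetical marine sample record",
 "identifier": "example:0001",
 "license": "https://creativecommons.org/licenses/by/4.0/"}
</script></head><body>...</body></html>"""

datasets = [
    item for item in extract_jsonld(SAMPLE_PAGE)
    if isinstance(item, dict) and item.get("@type") == "Dataset"
]
print(datasets[0]["name"])
```

The sketch only needs the page itself, which is the low-barrier point the slides make: a small provider publishes mark-up, and any harvester can index or register it.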

Editor's Notes

  1. https://www.neuroinformatics2019.org Title: FAIRy stories: tales from building the FAIR Research Commons Findable Accessible Interoperable Reusable. The “FAIR Principles” for research data, software, computational workflows, scripts, or any kind of Research Object is a mantra; a method; a meme; a myth; a mystery. For the past 15 years I have been working on FAIR in a range of projects and initiatives in the Life Sciences as we try to build the FAIR Research Commons. Some are top-down like the European Research Infrastructures ELIXIR, ISBE and IBISBA, and the NIH Data Commons. Some are bottom-up, supporting FAIR for investigator-led projects (FAIRDOM), biodiversity analytics (BioVel), and FAIR drug discovery (Open PHACTS, FAIRplus). Some have become movements, like Bioschemas, the Common Workflow Language and Research Objects. Others focus on cross-cutting approaches in reproducibility, computational workflows, metadata representation and scholarly sharing & publication. In this talk I will relate a series of FAIRy tales. Some of them are Grimm. There are villains and heroes. Some have happy endings; all have morals.
  2. FAIR was on the opening slides of the meeting Maryann Martone is an author along with me
  3. “Cyberinfrastructure that collocates data, storage, and computing infrastructure with commonly used tools for analyzing and sharing data to create an interoperable resource for the research community.” (Open Commons Consortium) “An environment where participants make use of computing and communication technologies to access shared instruments and data, as well as to communicate with others” (Wikipedia) a database organizes data for a project; a data warehouse organizes data for an organization; and a data commons organizes data for a field or discipline. (Bob Grossman)
  4. https://www.humanbrainproject.eu/en/explore-the-brain/ And the HBP Collaboratory
  5. Incrementing Interop – services, standards, know-how Stuff is massive legacy No one governance
  6. 13 RIs Almost like a Meta-Commons
  7. 91 properties for dataset Bioschemas dataset Compliant with Google Dataset Profile 5 minimal properties 8 recommended properties Link to DataCatalog Link to DataDownload
  8. Bioschemas markup added to MarRef pages Markup crawled using BuzzBang Data included as a BioSample Curation Depicted by the External Links
  9. Villains and Heroes
  10. It is context dependent - fair for a library not for plant sciences. Though it all helps! Though links to other metadata help, but they may not be harmonised It's about identifying and describing stuff.
  11. Subset of the FAIR principles BIDS, NeuroML and PyNN are endorsed (https://www.incf.org/resources/incf-endorsed-standards-best-practices) https://www.incf.org/resources/other-standards-best-practices
  12. Beware… beauty is in the eye of the beholder What’s FAIR from a Cataloguer perspective may be useless from a biologist's viewpoint
  13. 50 shades of FAIR – Robert-John Schmidt
  14. FAIRsFAIR Open Consultation on FAIR Data Policies and Practices in Europe
  15. Bioschemas mark-up about licence?
  16. This group really tried this Scale up and scale out automation of indicators and their evaluation Mark volunteers to write compliance tests
  17. Cookbooks, BYODs, Tools A miracle occurs with very clever people Running at the same time as defining FAIR
  18. “50 Shades of FAIR” Identifier use is chaotic, for both data and metadata, and there is no clear way to point from one to another. Separating what is metadata and what is data given a URI is a problem. Content negotiation is NOT how you differentiate data from metadata. It's how you negotiate serialization of the identified thing. FAIR is a set of behaviours (use of tech and people) not a specific technology
  19. Born FAIR
  20. Hence stuff like ReproNim need Community engagement: The ‘last mile’ challenge for European research e-infrastructures Dimitrios Koureas, ed
  21. HIDDEN SLIDE
  22. From the opening talk
  23. HIDDEN SLIDE
  24. Villains mentioned: PIs and senior faculty Heroes: PhD students
  25. Computational and SOPs (here it's Computational)
  26. FAIR Software should facilitate making FAIR Data.
  27. Maintainability Testing Portability Contributor policy Identity Copyright Licenses Documentation Sustainability
  28. Join in! Like Data: many FAIR Data Principles apply Repositories (F) Standardising descriptions of workflow, provenance and components (I, R): CWL, PROV Metadata about, combining and referencing between components (I, R): Research Objects
  29. HIDDEN SLIDE The EOSC Life computational workflows stack
  30. Standardize exchange of HTS workflows for regulatory submissions between FDA, pharma, bioinformatics platform providers and researchers Inspect and replicate the computational analytical workflow to review and approve the bioinformatics
  31. HIDDEN SLIDE
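
Slide 39 lists licence indicators for principle R1.1, and note 15 asks whether Bioschemas mark-up could carry the licence. As a rough sketch of how such indicators might be tested automatically, the function below reports each check separately rather than as a single score, in the spirit of slide 35. The `license` key, the short list of licence URLs and the wording of each check are assumptions made for illustration. They are not the FAIR Data Maturity Model WG's indicator definitions, nor the FAIR Evaluator's compliance tests.

```python
# Hypothetical allow-list; a real community would maintain its own.
KNOWN_LICENCE_URLS = {
    "https://creativecommons.org/licenses/by/4.0/",
    "https://creativecommons.org/publicdomain/zero/1.0/",
}


def assess_licence(metadata):
    """Return one pass/fail result per illustrative licence check."""
    value = metadata.get("license")
    has_licence = isinstance(value, str) and value.strip() != ""
    # Treat a resolvable-looking URL as a stand-in for "machine-understandable".
    is_url = has_licence and value.startswith(("http://", "https://"))
    return {
        "licence information present": has_licence,
        "refers to a standard reuse licence": has_licence and value in KNOWN_LICENCE_URLS,
        "machine-understandable reference (URL)": is_url,
    }


report = assess_licence({"license": "https://creativecommons.org/licenses/by/4.0/"})
for check, passed in report.items():
    print(f"{check}: {'pass' if passed else 'fail'}")
```

Keeping the checks separate leaves room for narrative feedback on each failure, and the allow-list is the place where a community would encode its own standards.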