These are the slides for a lecture given at the University of York, in which I introduced the FAIR principles and FAIRsharing and put them in the context of ELIXIR and ELIXIR UK. FAIRsharing is an informative and educational resource on interlinked standards, databases and policies, three key elements of the FAIR ecosystem. It is adopted by funders, publishers and communities across all research disciplines, and it promotes the existence and value of these resources to aid data discovery, interoperability and sharing across all of our stakeholder groups. Here we discuss how FAIRsharing can be searched and updated by our user community, and how you can make the best use of it as part of a broader data management infrastructure.
A Whirlwind tour of the FAIR Principles, ELIXIR, and FAIRsharing in the context of research data management (RDM)
1. CC BY-SA 4.0 International
An ecosystem of research standards and databases
for effective RDM
University of York: 26 April 2022
@FAIRsharing_org
contact@fairsharing.org
10.25504/FAIRsharing.2abjs5
datareadiness.eng.ox.ac.uk
Allyson Lister, PhD
FAIRsharing - Content and Community Coordinator
Senior Knowledge Engineer
ORCiD: 0000-0002-7702-4495
2. CC BY-SA 4.0 International
Metadata is a love
note to the future
Jason Scott https://twitter.com/textfiles/status/119403173436850176 (2011)
3. CC BY-SA 4.0 International
The FAIR Principles:
What they are and why we’re
interested
4. CC BY-SA 4.0 International
Discoveries are made using shared data and this requires data that are:
• Retrievable and structured in standard format(s)
• Self-described so that third parties can make sense of it
The challenge
Forbes article on 2016 Data Scientist Report
https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#276a35e6f637
Data preparation accounts for about 80% of the work of data scientists
5. CC BY-SA 4.0 International
doi.org/10.2777/02999
The cost of not having FAIR research data
Impact on innovation
6. CC BY-SA 4.0 International
The scholarly publishing
ecosystem is changing
Data-related mandates by
funders and institutions are
growing
Researchers need
recognition and credit
theconversation.com/how-robots-can-help-us-embrace-a-more-human-view-of-disability-76815
Human-machine collaboration is the future
o 21% pharmacology data (doi.org/10.1038/nrd3439-c1)
o 11% cancer data (doi.org/10.1038/483531a)
o unsatisfactory in ML (openreview.net/pdf?id=By4l2PbQ-)
towardsdatascience.com/scientific-data-analysis-pipelines-and-reproducibility-75ff9df5b4c5
Reproducibility of published studies is still problematic
Responding to needs
7. CC BY-SA 4.0 International
“Most metadata field names and their values are not standardized
or controlled. Even simple binary or numeric fields are often
populated with inadequate values of different data types”
Standardised description - does it matter? YES
https://doi.org/10.1038/sdata.2019.21
8. CC BY-SA 4.0 International
A set of principles to enhance the
value of all digital resources and their
reuse by humans and machines
Data that is discoverable and reusable at scale
9. CC BY-SA 4.0 International
Findable Accessible Interoperable Reusable
The FAIR Principles in a nutshell
10. CC BY-SA 4.0 International
Globally unique,
resolvable, and
persistent identifiers
To retrieve and
connect data
Community-defined
descriptive metadata
To enhance
discoverability and
interpretability
Community-defined
terminologies
To use the same term
and mean the
same thing
Findable Accessible Interoperable Reusable
11. CC BY-SA 4.0 International
Detailed provenance
and workflows
To contextualize the
data and facilitate use in
applications
Terms of access “as
open as possible, as
closed as necessary”
To understand how data
can be accessed
Terms of use and clear
licenses
To enable innovation
and reuse, ensuring
credit as needed
Findable Accessible Interoperable Reusable
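The elements listed on these slides can be read as a checklist for a dataset description. The following is a minimal sketch in Python, not a FAIR assessment method: the field names (`identifier`, `metadata_standard`, `terminology`, `provenance`, `access_terms`, `licence`) are hypothetical and are not defined by the FAIR Principles or by FAIRsharing, and the example record is illustrative only.

```python
# Hypothetical field names, one per element named on the slides.
FAIR_ELEMENTS = {
    "identifier": "globally unique, resolvable and persistent identifier",
    "metadata_standard": "community-defined descriptive metadata",
    "terminology": "community-defined terminologies",
    "provenance": "detailed provenance and workflows",
    "access_terms": "terms of access",
    "licence": "terms of use and clear licence",
}


def missing_elements(record: dict) -> list:
    """Return the checklist fields that are absent or empty in `record`."""
    return [field for field in FAIR_ELEMENTS if not record.get(field)]


# Illustrative record: only three of the six fields are filled in.
example_record = {
    "identifier": "https://doi.org/10.25504/FAIRsharing.2abjs5",
    "access_terms": "open",
    "licence": "CC BY-SA 4.0",
}

for field in missing_elements(example_record):
    print(f"missing: {field} ({FAIR_ELEMENTS[field]})")
```

A check like this only says whether a field is filled in; it says nothing about whether the value is a good one, which is where community standards come in.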
12. CC BY-SA 4.0 International
Findable
Accessible
Interoperable
Reusable
Providing for a continuum of features, attributes
and behaviours
FAIR: aspirational guidance
13. CC BY-SA 4.0 International
FAIR: just principles, not practice
14. CC BY-SA 4.0 International
ELIXIR and ELIXIR-UK:
Bringing together life science resources
from across Europe into a single
infrastructure
15. CC BY-SA 4.0 International
ELIXIR –
a sustainable infrastructure for biological data
16. CC BY-SA 4.0 International
https://elixiruknode.org/welcome-to-elixir-uk/about/member-organisations-2/
17. CC BY-SA 4.0 International
Resources to make data FAIR
• ELIXIR’s Recommended Interoperability
Resources.
• These are resources that:
• Establish connections between resources
• Acquire and expose metadata of different
resources
• Create the infrastructure needed to build
integratable data collections
• They will help you translate your data
across databases or resources or find the
best standards for your data.
18. CC BY-SA 4.0 International
FAIR service framework (diagram labels)
Ontologies, formats, reporting guidelines
Identifier authorities
Metadata annotation & validation
Citation
Harvesting, indexing, search
Ontology mapping
Identifier mapping
Ontology lookup
Identifier resolution
Ontology management
Identifier minting
Standards & databases
Ontologies
Tools
Workflows
Identifiers
Registries
Type-specific mapping & resolution
Extract-Transform-Load
APIs
Standards
Services
Type-specific KBs, integration & aggregation
Knowledge
20. CC BY-SA 4.0 International
FAIRsharing in a nutshell:
scope and mission
21. CC BY-SA 4.0 International
Users, adopters, communities and working groups
Including:
An endorsed output of the
FAIRsharing WG
(since 2015):
A WG (since 2015) in:
Researchers in academia,
industry and government
Developers & curators of
resources and tools
Research data facilitators,
librarians, trainers
Society, unions
and community alliances
Journal publishers and
organisations with data policies
Funders and data
policy makers
A recommended resource in EOSC reports
Used by all stakeholder groups
https://fairsharing.org/communities
22. CC BY-SA 4.0 International
Guides consumers to discover, select and use these resources with confidence
Helps producers to make their resources more visible, more widely adopted and cited
Promoting use and value of databases, standards, policies
Total of
over 3584
resources
(Feb 2022)
repositories
standards
policies
23. CC BY-SA 4.0 International
Search by subject
Powered by our Subject Ontology of 436 terms
https://fairsharing.org/browse/subject
https://github.com/FAIRsharing/subject-ontology
https://www.ebi.ac.uk/ols/ontologies/srao
28. CC BY-SA 4.0 International
A repository record:
at-a-glance view
29. CC BY-SA 4.0 International
How to get an account and create/update a record with us:
https://fairsharing.gitbook.io/fairsharing/
30. CC BY-SA 4.0 International
Record page
and general
information
> 40 descriptors including:
● Subject(s) coverage
● Maintainer(s)
● Funder(s)
● Organization(s)
● Access
● Licence(s)
● Publication(s)
● Related resources
https://doi.org/10.25504/FAIRsharing.wkggtx
31. CC BY-SA 4.0 International
Maintainer
Actions
New login options
via our trusted
status with
and also via
36. CC BY-SA 4.0 International
All records have
relationship graphs
https://fairsharing.org/graph/1998
Dryad’s graph, for example, shows the
• standards it implements
• repositories it shares data or code
with
• policies that recommend it
• collections that contain it
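A relationship graph of this kind can be thought of as a set of typed links between records. The sketch below is one way to model it in Python; the record names, relation names and the `INVERSE` table are all hypothetical placeholders and do not reflect FAIRsharing's actual data model or the real content of any record.

```python
from collections import defaultdict

# Hypothetical typed links: (source record, relation, target record).
EDGES = [
    ("Example Repository", "implements", "Example Standard A"),
    ("Example Repository", "shares_data_with", "Example Repository B"),
    ("Example Policy", "recommends", "Example Repository"),
    ("Example Collection", "contains", "Example Repository"),
]

# Hypothetical names for a relation when read from the target's side.
INVERSE = {"recommends": "recommended_by", "contains": "contained_in"}


def neighbours(record: str, edges: list) -> dict:
    """Group every record linked to `record` by relation, in either direction."""
    grouped = defaultdict(list)
    for source, relation, target in edges:
        if source == record:
            grouped[relation].append(target)
        elif target == record:
            grouped[INVERSE.get(relation, "inverse_of_" + relation)].append(source)
    return dict(grouped)


for relation, records in neighbours("Example Repository", EDGES).items():
    print(relation, "->", records)
```

Reading links in both directions is what lets a single record page show the standards it implements as well as the policies and collections that point at it.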
37. CC BY-SA 4.0 International
Data
Conditions
and Access
42. CC BY-SA 4.0 International
Evolution is monitored and reasons provided
44. CC BY-SA 4.0 International
• Every edit is checked by the in-house
curation team
• Maintainers are notified via email when
their record is updated
In-house curation
and the ‘life cycle
status’ tags
Life cycle tags:
45. CC BY-SA 4.0 International
How do consumers and producers
of standards, repositories and policies
use FAIRsharing and benefit from it?
46. CC BY-SA 4.0 International
Research data facilitators,
librarians, trainers
Use FAIRsharing to provide a
foundation for lectures, training
and teaching material; to plug
into data management planning
tools and other FAIR-supporting
resources.
Funders and data
policy makers
Recommend FAIRsharing to
your awardees or community
to inform development of their
data management plan; select
the appropriate resources to
recommend in your data policy.
Researchers in academia,
industry, government
Use FAIRsharing to identify and
cite the resources that exist for
your discipline when creating a
data management plan,
releasing data or submitting a
manuscript to a journal.
A closer look at our community
47. CC BY-SA 4.0 International
Developers and curators
of resources
Make your standard, database
or repository discoverable by
adding or claiming it in
FAIRsharing; increase exposure
outside your community and
promote adoption.
Journal publishers and
data policy developers
Create personalised
inter-related lists of citable
resources relevant to your
authors, users or their
community; maintain and revise
your recommendation over time.
Learned societies, unions
and associations
Collaborate with FAIRsharing
to raise awareness of your
resource; mobilize your
community to take action to
promote registration, use and
citation of key resources.
A closer look at our community
48. CC BY-SA 4.0 International
Tailored views for education and promotion
Collections are branded pages that group selected standards and/or repositories
Initiatives and projects have created them for several purposes, e.g. to list resources:
URL: https://fairsharing.org/CrosswalkOfMostUsedMetadataSchemesAndGuidelines
Maintainers
Subjects
Mapped to each other
URL: https://fairsharing.org/RDACovid19WG
Maintainers
Subjects
Recommended by a community
URL: https://fairsharing.org/IVOA
Maintainers
Subjects
Developed by the community
URL: https://fairsharing.org/CDISC
Maintainers
Subjects
Developed by an SDO
49. CC BY-SA 4.0 International
To discover and search the 230 related standards that are part
of the specification developed by the
ISO Technical Committee on Biotechnology Processes
URL: https://fairsharing.org/ISO20691
Researchers: knowledge graphs to complement lists
URL: https://committee.iso.org/standard/68848.html
50. CC BY-SA 4.0 International
A growing number of FAIR-enabling services access the
FAIRsharing API and use it as a look-up and selection
service for standards and repositories for:
● data management plans and guidelines
● FAIR assessment
A new, in-development Data Discovery Service, part
of ELIXIR-driven EOSC-Life and BY-COVID projects:
● register repositories’ access methodologies, e.g.
schema.org, bioschemas, OAI-PMH
● enable (meta)data harvesting
Services: curated content to power 3rd party tools
ds-wizard.org
dmponline.dcc.ac.uk
w3id.org/AmIFAIR
openaire.eu
fairshake.cloud
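To make the idea of a registered access method concrete, here is a minimal sketch of how a harvester might build an OAI-PMH request once it knows a repository's endpoint. The base URL `https://repository.example.org/oai` is a hypothetical placeholder, and nothing here uses the FAIRsharing API; only the `ListRecords` verb and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc", set_spec: str = "") -> str:
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional selective harvesting by set.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, for illustration only.
print(list_records_url("https://repository.example.org/oai"))
```

A registry that records which repositories expose such an endpoint saves a harvester from having to discover this repository by repository.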
51. CC BY-SA 4.0 International
How FAIRsharing helps in FAIR evaluation and
RDM / DMP applications
52. CC BY-SA 4.0 International
1. For FAIR evaluation, DMP and other tools/services
it provides a look-up service (via the API) for standards and repositories
2. For developers and users of FAIRness tests and indicators
it provides a searchable registry to describe and discover them
3. For community, organizations and projects
it enables creation of profiles to declare and visualise the standards and repositories
they use
4. (future?) For standards, repositories and policies’ owners/maintainers
it could enable them to display a FAIRness level on their record page; along with
the name of the tool used to measure it, a time stamp, and improvement tracking
Ways in which FAIRsharing assists others with FAIRness
53. CC BY-SA 4.0 International
1. For FAIR evaluation, DMP and other
tools/services it provides a look-up service
(via the API) for standards: name, types and
other metadata
To help with questions like:
● what are the types of PID schemas?
● Is SpectralDM the right community
standard data model to describe the
structure of spectrophotometric
datasets?
As a look-up service for standards
Identifiers
Terminologies Guidelines
Formats
830
507
232
21
1590 data and metadata
standards and growing
54. CC BY-SA 4.0 International
➔ Be aware: checking ‘compliance’ against standards is not
easy, or even possible for many, due to their narrative
form (checklists) and/or absence of canonical validators;
these challenges are the focus of our pre-print
doi.org/10.5281/zenodo.5596465
56. CC BY-SA 4.0 International
1. For FAIR evaluation, DMP and other
tools/services it provides a look-up service (via
the API) for repositories: name, types and
other metadata
To help collect information (at data resource
level), such as the presence and type of:
● licence or terms of use
● datasets accessibility/openness
● accessibility mechanism
● policies (e.g. curation, preservation)
● PIDs schema
● data and metadata standards
● and many others…
As a look-up service for repositories
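As a rough illustration of how a DMP or FAIR evaluation tool might use such repository metadata, the sketch below filters a few in-memory records on the presence of selected fields. The records, field names (`licence`, `pid_schema`, `standards`) and the `select` helper are hypothetical; this does not call or mirror the FAIRsharing API, whose actual endpoints and response format should be taken from its own documentation.

```python
# Hypothetical repository records, standing in for look-up results.
REPOSITORIES = [
    {"name": "Example Repository A", "licence": "CC0", "pid_schema": "DOI",
     "standards": ["Example Format"]},
    {"name": "Example Repository B", "licence": None, "pid_schema": "DOI",
     "standards": []},
    {"name": "Example Repository C", "licence": "CC BY 4.0", "pid_schema": None,
     "standards": ["Example Format"]},
]


def select(records: list, required_fields: list) -> list:
    """Return the names of records where every required field is present and non-empty."""
    return [
        record["name"]
        for record in records
        if all(record.get(field) for field in required_fields)
    ]


# Which repositories declare both a licence and a PID schema?
print(select(REPOSITORIES, ["licence", "pid_schema"]))
```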
59. CC BY-SA 4.0 International
2. For developers and users of FAIRness tests
and indicators it provides a searchable
registry to describe and discover them
To complement their storage in e.g. GitHub,
and help make them:
● discoverable
● citable (via record’s DOI)
● unique
○ ensuring community ‘sees’ what
exists and adds/extends as needed
● usable in community profiles
As a registry for FAIRness tests and indicators
➔ This functionality is in a testing phase
60. CC BY-SA 4.0 International
3. For community, organizations and projects
it enables the creation of profiles to declare and
visualise standards and repositories they use
This is done through collections, connecting the
standards used to the FAIR indicators fulfilled
As a place to compare profiles, fostering FAIRness
Translational
Medicine
Clinical Developments
URL: https://fairsharing.org/PistoiaAllianceFIPs
(work in progress!)
A collaboration with the FAIR Implementation WG
➔ This is exploratory work, which may be
improved as the tests and indicators
mature and are globally agreed!
61. CC BY-SA 4.0 International
Disclaimer: These profiles speak for a limited community and do not represent any company standards
62. CC BY-SA 4.0 International
Comparing the use of standards and
promoting convergence
63. CC BY-SA 4.0 International
The FAIR Cookbook:
A collection of recipes that cover the operation
steps of FAIR data management
64. The FAIR Cookbook
What is it?
An online, ‘live’
resource for the life
sciences
A collection of
recipes that cover
the operation steps
of FAIR data
management
Who is it for?
Who developed it?
Researchers and
data management
professionals in the
life sciences, from
academia and
industry
Including ELIXIR
members
Email: fairplus-cookbook@elixir-europe.org
Online resource:
https://faircookbook.elixir-europe.org
65. Learning objectives
Learn how to improve FAIRness with exemplar datasets
Understand the levels and indicators of FAIRness
Discover open source technologies, tools and services
Find out the required skills
Acknowledge the challenges
67. Creators and contributors to date
+50 life sciences professionals, researchers and data managers
FAIRplus
partners
Industry
+
Academia
ELIXIR
Nodes
represented
68. CC BY-SA 4.0 International
Stakeholder Advisors
● Amye Kenall, VP of Publishing and Product, Research Square
● Adam Leary, Oxford University Press
● Catriona MacCallum, Hindawi
● Dagmar Meyer, European Research Council, Executive Agency
● Dominic Fripp, JISC, UK
● Emma Ganley, Protocols.io
● Geraldine Clement-Stoneham, Medical Research Council
● Helena Cousijn, DataCite
● Iain Hrynaszkiewicz, PLoS
● Imma Subirats, FAO of the United Nations
● Kiera McNiece, Cambridge University Press
● Luiz Olavo Bonino, GO-FAIR
● Marina Soares E Silva and Sarah Callaghan, Elsevier
● Michael Ball, Biotechnology and Biological Sciences Research Council
● Mike Huerta, NIH National Library of Medicine
● Molly Cranston and Guillaume Wright, F1000Research
● Nick Everitt and Matthew Cannon, Taylor and Francis
● Scott Edmunds, GigaScience, Oxford University Press
● Simon Hodson, CODATA
● Theo Bloom, British Medical Journal
● Thomas Lemberger, EMBO Press
● Wei-Mun Chan, eLife
● Sowmya Swaminathan, Springer Nature
Current Operational Team
● Allyson Lister, Content and Community Lead
● Milo Thurston, Technical Lead
● Ramon Granell, Data Enrichment & Quality Manager
● Delphine Dauga, Data Curator Manager
● Hiring in progress, Web Developer
● Dominique Batista, Research Software Engineer
● Philippe Rocca-Serra, Co-Founder
● Susanna-Assunta Sansone, PI and Founder
● and many collaborators and contributors!
Executive Advisors
● Varsha Khodiyar, HDRUK
● David Carr, Independent expert
● Chris Graf, Springer Nature
● Marta Teperek, Data Stewardship Coordinator, TUDelft
● Robert Hanisch, Director, NIST Office of Data & Informatics
● Peter McQuilton, FAIRsharing Founding Member, GSK
Early Adopter Community Curators
● Kyle Copas, GBIF
● Annie Elkjær Ørum-Kristensen, GBIF
● Lindsey Anderson, PNNL
● Joe Miller, GBIF
Thank you!
SAS and AL contributed to these slides.