FAIRsharing is an informative and educational resource on interlinked standards, databases and policies, three key elements of the FAIR ecosystem. FAIRsharing is adopted by funders, publishers and communities across all research disciplines. It promotes the existence and value of these resources to aid data sharing, and therefore requires a high standard of curation so that all of our stakeholder groups receive accurate and timely information. Here we discuss the methods employed and challenges faced during curation and maintenance of existing content, as well as during the introduction of new features. We describe how our curation team uses a blend of manual and semi-automated curation to work on individual records and across large subsets of the registry. We also discuss the benefits of both in-house curation and community-driven curation provided by our stakeholder groups.
FAIRsharing: curating an ecosystem of research standards and databases
1. CC BY-SA 4.0 International
datareadiness.eng.ox.ac.uk
Allyson Lister, PhD
FAIRsharing - Content and Community Coordinator,
Senior Knowledge Engineer,
ORCID: 0000-0002-7702-4495
@FAIRsharing_org
contact@fairsharing.org
10.25504/FAIRsharing.2abjs5
NLM Curation at Scale Workshop, 28-30 March 2022
Curating an ecosystem of research standards and databases
2. CC BY-SA 4.0 International
An informative and educational resource, and a service
FAIRsharing provides curated descriptions and relationship graphs of
standards, databases and policies in all disciplines
COMMUNITY STANDARDS
POLICIES
by funders, journals
and other organizations
DATABASES
including repositories
and knowledgebases
Identifiers
Terminologies Guidelines
Formats
3. CC BY-SA 4.0 International
Resource-level
curators are
jacks-of-all-trades,
with an eye for
relationships
Standards, databases
and policies, and the
connections among
them, are constantly
evolving
Scarcity, scope and
mission creep create
curation pressure points
Outreach,
attribution, partial
automation and
flexibility are vital
The challenges we
all face
The solutions we take
come from the
community we serve
Within FAIRsharing, Curation at Scale is the curation of relationships
across resources, organisations and domains
5. CC BY-SA 4.0 International
Rich curation is
available at all levels
of granularity
Expertise moves from the
depth of data-level curation
within a particular domain
to the width of
resource-level curation
across all research areas
and resource types
7. CC BY-SA 4.0 International
Providing relevant
information across the
entire research community
is a complex challenge
requiring a strong user
community
Relationships (among
resources and within the
user community) are at the
core of FAIRsharing’s
curation approaches
https://fairsharing.org/browse/subject
8. CC BY-SA 4.0 International
Tailored views for education and promotion
Collections are branded pages that group selected standards and/or repositories
Initiatives and projects have created them for several purposes, e.g. to list resources
(each example is shown with its maintainers and subjects):
● Mapped to each other: https://fairsharing.org/CrosswalkOfMostUsedMetadataSchemesAndGuidelines
● Recommended by a community: https://fairsharing.org/RDACovid19WG
● Developed by the community: https://fairsharing.org/IVOA
● Developed by an SDO: https://fairsharing.org/CDISC
9. CC BY-SA 4.0 International
URL: https://fairsharing.org/ISO20691, https://fairsharing.org/graph/3533
URL: https://committee.iso.org/standard/68848.html
Collections provide context
and connectivity
10. CC BY-SA 4.0 International
https://fairsharing.org/graph/3381
Relationships are central to
FAIRsharing, presenting both a
challenge and an opportunity.
They are a dynamic system of
interconnected resources across
research domains.
12. CC BY-SA 4.0 International
https://fairsharing.org/graph/1699
Such systems showcase
interoperability and improve
findability.
15. CC BY-SA 4.0 International
All manually curated repositories face the challenge of curator scarcity
Curator numbers do not follow Moore’s Law
>3600 resources (March 2022): repositories, standards and policies
16. CC BY-SA 4.0 International
https://fairassist.org
Resource-level metadata is vital for FAIR assessment
17. CC BY-SA 4.0 International
● https://guidelines.openaire.eu/en/latest/data/index.html
● https://www.federalregister.gov/documents/2020/01/17/2020-00689/request-for-public-comment-on-draft-desirable-characteristics-of-repositories-for-managing-and
● https://www.rd-alliance.org/groups/data-repository-attributes-wg
● Metadata Schema for the Description of Research Data Repositories
● Repository Features to Help Researchers: An invitation to a dialogue
● Identifying ELIXIR Core Data Resources
● Core Trust Seal
● Science Europe
● The TRUST Principles for digital repositories
● COAR Community Framework for Good Practices in Repositories
● NIH: Selecting a Repository for Data Resulting from NIH-Supported Research
● Standards
● OpenDOAR Repositories and Metadata Practices
● DCAT
● https://datascience.nih.gov/news/nih-office-of-data-science-strategy-announces-new-initiative-to-improve-data-access
● https://doi.org/10.48550/arXiv.2012.13117
● https://rd-alliance.org/group/fair-research-software-fair4rs-wg/outcomes/fair-principles-research-software-fair4rs-0
The complexity challenge: implementing community consensus
18. CC BY-SA 4.0 International
The complexity challenge: implementing community consensus
Transparency and Openness Promotion (TOP) guidelines: https://doi.org/10.1126/science.aab2374 and
https://www.cos.io/initiatives/top-guidelines (published in 2015)
http://doi.org/10.5334/dsj-2020-005 (published in 2020)
Published in 2022
Different focus but common goals?
1. Improve clarity and efficiency
2. Increase alignment and comparability
3. Provide better guidance to researchers on managing and sharing digital objects
20. CC BY-SA 4.0 International
Reducing scarcity through community
Adopters and collaborators include:
An endorsed output of the
FAIRsharing WG
(since 2015):
A WG (since 2015) in:
Researchers in academia,
industry and government
Developers & curators of
resources and tools
Research data facilitators,
librarians, trainers
Society, unions
and community alliances
Journal publishers and
organisations with data policies
Funders and data
policy makers
A recommended resource in EOSC reports
Used by all stakeholder groups
https://fairsharing.org/communities
21. CC BY-SA 4.0 International
Content & Community Curator Programme
● Gain on-the-job curation
expertise
● Engagement and networking with
a community of like-minded
people
● Become an expert on data and
metadata standards, repositories
and data policies in their area
● Influence new FAIRsharing
functionalities
● Attribution via ORCID, specialist
user profiles, and others
Curate – Influence – Gain Attribution – Engage – Learn
Watch this space for exciting news!
Early adopters
22. CC BY-SA 4.0 International
Linking associated data resources,
standards and policies
Organization and user
attribution
23. CC BY-SA 4.0 International
We are soliciting feedback on:
Design and functionality of
organisation pages
Microattribution via edit history
display
Recognition for Community Curators
and maintainers
Attribution via ORCID as a trusted
partner
Organization and user
attribution
24. CC BY-SA 4.0 International
https://fairsharing.org/graph/3776
EU-funded BiCIKL Consortium
is curating entries in
FAIRsharing to structure
landscaping analysis of
adjacent infrastructures
https://bicikl-project.eu/
25. CC BY-SA 4.0 International
https://fairsharing.org/graph/1206
MINSEQE
Traversal of the graph aids
resource discovery, gap
analysis and further targeted
outreach and engagement
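The kind of graph traversal described here can be sketched as a breadth-first search over relationship edges. This is an illustration only: apart from MINSEQE, every record name is a placeholder, and the edge list and the helper names `reachable` and `unconnected` are assumptions that do not reflect FAIRsharing's data model or API.

```python
from collections import deque

# Hypothetical relationship edges (record -> related records). Apart from
# MINSEQE, every name is a placeholder invented for illustration.
EDGES = {
    "MINSEQE": ["Example Format", "Example Database A"],
    "Example Format": ["Example Database B"],
    "Example Database A": ["Example Policy"],
    "Example Database B": [],
    "Example Policy": [],
    "Orphan Standard": [],
}


def reachable(graph, start, max_hops):
    """Breadth-first traversal returning {record: hop distance} from start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def unconnected(graph):
    """Gap analysis: records with no outgoing and no incoming relationships."""
    linked = {n for targets in graph.values() for n in targets}
    return sorted(n for n, targets in graph.items() if not targets and n not in linked)


found = reachable(EDGES, "MINSEQE", max_hops=2)
gaps = unconnected(EDGES)
print(found)  # records discoverable within two hops of MINSEQE
print(gaps)   # candidates for targeted outreach
```

Records reached within a few hops support resource discovery, while records with no relationships at all are natural candidates for gap analysis and outreach.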
26. CC BY-SA 4.0 International
Semi-automated curation:
Repository and knowledgebase mapping to EOSC
Identifier mapping via iterative
NLP techniques and curator checks
58% of FAIRsharing records are unique
Existing repository metadata
provided by OpenAIRE
Mappings regularly produced
by FAIRsharing to aid
integration within
Future work can build on standards and policy record metadata, unique to
27. CC BY-SA 4.0 International
Semi-automated curation:
Organisation mapping to ROR
Mapping via iterative
NLP techniques and curator checks
55% of FAIRsharing organisations are
unique
Retrieve organisations
present in ROR
Mappings regularly produced
by FAIRsharing to aid
identification and attribution
for our user community
ROR IDs are added, but our curation is retained. In future, where conflicts exist,
FAIRsharing metadata could be fed back to ROR.
100,000+ organisations
top-level entities only
3200+ organisations
Granularity determined by
our community
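A minimal sketch of how a semi-automated mapping with curator checks might be organised is shown below. The slides do not specify which NLP techniques are used, so plain string similarity from Python's `difflib` stands in for them here. The organisation names, the placeholder identifiers, the thresholds and the function names are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical registry extract: placeholder identifiers, not real ROR IDs.
ROR_LIKE = {
    "placeholder-id-1": "University of Exampleton",
    "placeholder-id-2": "Example Institute for Data Science",
}

# Hypothetical organisation names as they might appear in curated records.
NAMES = [
    "University of Exampleton",
    "Univ. of Exampleton",
    "Exampleton Community Curation Working Group",
]


def normalise(name):
    """Lowercase, replace punctuation with spaces and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())


def best_match(name, candidates):
    """Return (candidate_id, score) for the most similar candidate name."""
    target = normalise(name)
    scored = [
        (cand_id, SequenceMatcher(None, target, normalise(cand_name)).ratio())
        for cand_id, cand_name in candidates.items()
    ]
    return max(scored, key=lambda pair: pair[1])


def triage(names, candidates, accept=0.95, review=0.80):
    """Sort names into accepted matches, a curator review queue, and unmatched."""
    result = {"accepted": {}, "review": {}, "unmatched": []}
    for name in names:
        cand_id, score = best_match(name, candidates)
        if score >= accept:
            result["accepted"][name] = cand_id
        elif score >= review:
            result["review"][name] = cand_id  # a curator confirms or rejects
        else:
            result["unmatched"].append(name)  # likely unique to the registry
    return result


result = triage(NAMES, ROR_LIKE)
print(result)
```

Keeping a separate review queue reflects the iterative pattern on the slide: automated scoring proposes mappings, curators resolve the uncertain ones, and names left unmatched correspond to organisations that exist only at the finer granularity chosen by the community.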
28. CC BY-SA 4.0 International
Adding more
metadata descriptors
with our flexible data
model
● Refine database records
○ according to community
needs, e.g. the new RDA
Data Repository
Attributes WG, NIH GREI
Initiative
● Expand policy records
○ according to ongoing
work in RDA Funders IG /
Policy Standardisation IG
/ FAIRsharing WG
29. CC BY-SA 4.0 International
Working within RDA groups: common policy template
Objectives
● Develop and disseminate recommendations to improve funder-publisher policy
alignment; leveraging the template developed by the http://doi.org/10.5334/dsj-2020-005
● Consider how the recommendations could be used as a basis for continued alignment
● Implement the policy templates in FAIRsharing
Initial areas of focus
● Data availability statements (DASs) and data deposit requirements
https://www.rd-alliance.org/funder-publisher-research-data-policy-alignment
30. CC BY-SA 4.0 International
Working within RDA Data Repository Attribute WG: common
repository template
Objectives
● a list of common descriptive attributes of a data repository with
○ a definition of each attribute,
○ a rationale for the use and value of each attribute,
○ the feasibility of its implementation,
○ a gap analysis of its current availability from data repositories, and
● a selection of examples that illustrate the approaches currently being taken by repositories to
express and expose these attributes to users and user agents.
https://www.rd-alliance.org/groups/data-repository-attributes-wg
31. CC BY-SA 4.0 International
FAIRsharing can provide human- and machine-readable
implementations for community-agreed checklists/guidance
We can engage with
stakeholders, many of whom
are already our users
We can prototype and
implement a common
template for
policy/database/standard
attributes, as consensus is
reached
We provide registration to enable:
● Citability
● Discoverability
● Flexible and clearer descriptions
● Relationships
● Machine readability
● Comparability
Curating, describing, tagging and
classifying resources is not
trivial.
These activities need funding!
Can we reach
convergence of a common
system and template?
Keeping the descriptions
up to date and monitoring
the evolution of each policy
require continued
engagement and curation
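To illustrate what a machine-readable implementation of a common template could look like, the sketch below serialises a record to JSON and reports which template attributes are still empty. The attribute names, the `Record` class and the example values are hypothetical: they are neither a community-agreed checklist nor FAIRsharing's schema.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical template: these attribute names are placeholders only.
TEMPLATE_FIELDS = ("name", "record_type", "subjects", "licence", "related_records")


@dataclass
class Record:
    name: str
    record_type: str  # e.g. "standard", "database" or "policy"
    subjects: list = field(default_factory=list)
    licence: str = ""
    related_records: list = field(default_factory=list)


def missing_fields(record):
    """Template attributes that are empty in this record."""
    data = asdict(record)
    return [name for name in TEMPLATE_FIELDS if not data[name]]


def to_json(record):
    """Machine-readable view of the same record shown to human readers."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


policy = Record(
    name="Example Data Policy",
    record_type="policy",
    related_records=["Example Repository"],
)
print(to_json(policy))
print(missing_fields(policy))
```

Because every record is checked against the same attribute list, records become comparable, and the list of missing attributes gives curators a concrete queue of descriptions to complete.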
33. CC BY-SA 4.0 International
FAIRsharing content is diverse in both
breadth and depth. To help our users and
collaborators, we are working on:
● Educational Guidance
○ to educate users on the many
functionalities FAIRsharing offers via
visual storytelling, infographics and videos
● Resource Finder
○ to create routes/pathways for
stakeholders to use our content
○ via a decision tree, developed in phases,
that guides users to discover, filter and select
resources
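A decision tree of the kind described for the Resource Finder can be sketched as nested questions whose leaves are lists of resources. The questions, answers, resource names and the functions `walk` and `collect` are hypothetical; the actual Resource Finder design is not described in these slides.

```python
# Hypothetical tree: each inner node has a question and answer branches,
# and each leaf is a list of placeholder resource names.
TREE = {
    "question": "What do you want to find?",
    "branches": {
        "a repository": {
            "question": "Which subject area?",
            "branches": {
                "life science": ["Example Repository A"],
                "earth science": ["Example Repository B"],
            },
        },
        "a standard": ["Example Format", "Example Terminology"],
    },
}


def collect(node):
    """All resources under a node, i.e. what is still selectable so far."""
    if isinstance(node, list):
        return list(node)
    found = []
    for child in node["branches"].values():
        found.extend(collect(child))
    return found


def walk(tree, answers):
    """Follow the answers down the tree; return (resources, next_question)."""
    node = tree
    for answer in answers:
        if isinstance(node, list):
            break  # already at a leaf; extra answers are ignored
        node = node["branches"][answer]
    if isinstance(node, list):
        return node, None  # a leaf: nothing further to ask
    return collect(node), node["question"]


print(walk(TREE, []))
print(walk(TREE, ["a repository"]))
print(walk(TREE, ["a repository", "life science"]))
```

Returning the remaining resources together with the next question lets a user stop at any depth, which suits a tree that is developed in phases.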
Educational Content
34. CC BY-SA 4.0 International
Stakeholder Advisors
● Amye Kenall, VP of Publishing and Product, Research Square
● Adam Leary, Oxford University Press
● Catriona MacCallum, Hindawi
● Dagmar Meyer, European Research Council, Executive Agency
● Dominic Fripp, JISC, UK
● Emma Ganley, Protocols.io
● Geraldine Clement-Stoneham, Medical Research Council
● Helena Cousijn, DataCite
● Iain Hrynaszkiewicz, PLoS
● Imma Subirats, FAO of the United Nations
● Kiera McNiece, Cambridge University Press
● Luiz Olavo Bonino, GO-FAIR
● Marina Soares E Silva and Sarah Callaghan, Elsevier
● Michael Ball, Biotechnology and Biological Sciences Research Council
● Mike Huerta, NIH National Library of Medicine
● Molly Cranston and Guillaume Wright, F1000Research
● Nick Everitt and Matthew Cannon, Taylor and Francis
● Scott Edmunds, GigaScience, Oxford University Press
● Simon Hodson, CODATA
● Theo Bloom, British Medical Journal
● Thomas Lemberger, EMBO Press
● Wei-Mun Chan, eLife
● Sowmya Swaminathan, Springer Nature
Current Operational Team
● Allyson Lister, Content and Community Lead
● Milo Thurston, Technical Lead
● Ramon Granell, Data Enrichment & Quality Manager
● Delphine Dauga, Data Curator Manager
● Hiring in progress, Web Developer
● Dominique Batista, Research Software Engineer
● Philippe Rocca-Serra, Co-Founder
● Susanna-Assunta Sansone, PI and Founder
● and many collaborators and contributors!
Executive Advisors
● Varsha Khodiyar, HDRUK
● David Carr, Independent expert
● Chris Graf, Springer Nature
● Marta Teperek, Data Stewardship Coordinator, TUDelft
● Robert Hanisch, Director, NIST Office of Data & Informatics
● Peter McQuilton, FAIRsharing Founding Member, GSK
Early Adopter Community Curators
● Kyle Copas, GBIF
● Annie Elkjær Ørum-Kristensen, GBIF
● Lindsey Anderson, PNNL
● Joe Miller, GBIF
Thank you!