Overview of FAIR and the IMI FAIRplus project at the UK Conference of Bioinformatics and Computational Biology 2020: https://www.earlham.ac.uk/uk-conference-bioinformatics-and-computational-biology-2020
1. The FAIR Principles and FAIRplus:
tools & guidelines for making life science data FAIR
Susanna-Assunta Sansone
ORCiD: 0000-0001-5306-5690 | Twitter: @SusannaASansone
datareadiness.eng.ox.ac.uk
Associate Professor, Engineering Science
Associate Director, Oxford e-Research Centre
UK Conference of Bioinformatics and Computational Biology 2020, 29-30 Sept 2020
Slides: https://www.slideshare.net/SusannaSansone
2. A set of principles to enhance the
value of all digital resources and their
reuse by humans and machines
Data that is discoverable and reusable at scale
3. Findable
Accessible
Interoperable
Reusable
• Globally unique, resolvable, and persistent identifiers
▪ To retrieve and connect data
• Community defined descriptive metadata
▪ To enhance discoverability
• Common terminologies
▪ So that the same term means the same thing
• Detailed provenance
▪ To contextualize the data and facilitate reproducibility
• Terms of access
▪ Open as possible, closed as necessary
• Terms of use
▪ Clear licences, ideally to enable innovation and reuse
The FAIR Principles in a nutshell
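The six elements above can be read as a checklist for a dataset's metadata record. The following is a minimal sketch of such a check, not an official FAIR validator: the field names, the example record, and the DOI are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

# Hypothetical field names mapping to the six "nutshell" elements.
# They are assumptions for this sketch, not part of any FAIR specification.
REQUIRED_FIELDS = {
    "identifier": "globally unique, resolvable, persistent identifier",
    "metadata_standard": "community-defined descriptive metadata",
    "terminologies": "common terminologies",
    "provenance": "detailed provenance",
    "access_terms": "terms of access",
    "licence": "terms of use",
}


def missing_fair_elements(record: dict) -> list[str]:
    """Return the checklist fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Made-up record: provenance is empty and no licence has been stated.
example_record = {
    "identifier": "https://doi.org/10.1234/example",  # made-up DOI
    "metadata_standard": "MIAME",
    "terminologies": ["EFO"],
    "provenance": "",
    "access_terms": "open",
    "licence": None,
}

for field in missing_fair_elements(example_record):
    print(f"missing: {field} ({REQUIRED_FIELDS[field]})")
```

A check like this only tests that a field is present; whether an identifier actually resolves or a licence is actually clear still needs human or service-level verification.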
5. The scholarly publishing
ecosystem is changing
Data-related mandates by
funders and institutions are
growing
Researchers need
recognition and credit
theconversation.com/how-robots-can-help-us-embrace-a-more-human-view-of-disability-76815
Human-machine collaboration is the future
o 21% pharmacology data (doi.org/10.1038/nrd3439-c1)
o 11% cancer data (doi.org/10.1038/483531a)
o unsatisfactory in ML (openreview.net/pdf?id=By4l2PbQ-)
towardsdatascience.com/scientific-data-analysis-pipelines-and-reproducibility-75ff9df5b4c5
Reproducibility of published studies is still problematic
Responding to needs and crisis
6. Findable
Accessible
Interoperable
Reusable
Is NOT a standard but a set of guiding
principles that provide for a continuum of
features, attributes and behaviours, via
many different implementations
The FAIR Principles are aspirational
7. Depends upon several stakeholders actively playing their parts to:
• deliver research infrastructures and tools
• use and harmonize the (meta)data standards
• address policies, education and training
• overcome technical, social and cultural challenges
• identify motivators, credit and rewards mechanisms
Making FAIR a reality in the research ecosystem
10. • Biopharma R&D productivity can be improved
by implementing the FAIR Principles
• FAIR enables powerful new AI analytics to
access data for machine learning and prediction
• Requirements
▪ financial, technical, training
• Challenges
▪ change the culture, show business value,
achieve the ‘FAIR enough’
Credit to:
Ian Harrow,
FAIR & OM projects
FAIR as enabler for the digital transformation
11. Funded from January 2019 to June 2022
23 participants:
• 3 SMEs
• 7 Pharmas
• 13 Academics
including ELIXIR-UK members:
o University of Oxford (SA. Sansone, P. Rocca-Serra)
o University of Manchester (C. Goble, N. Juty)
o Heriot-Watt University (A. Gray)
Project Coordinator: ELIXIR
Project Leader: Janssen
www.fairplus-project.eu
12. Project types:
• completed
• ongoing
• new
Data types:
• molecular
• clinical
IMI projects:
Objectives and outputs
13. Project types:
• completed
• ongoing
• new
Data types:
• molecular
• clinical
• non-clinical
IMI projects:
1. Create the FAIR cookbook with
best practices on data
FAIRification and FAIR data
management
2. Identify FAIRification processes
and tools that work in the real
world
3. Increase FAIR levels of at least
20 IMI projects and internal
EFPIA datasets
4. Networking events for SMEs
5. FAIR Fellowship Programme -
FAIR data training
6. Change data management
culture - FAIR from the start,
with explicit plans built into IMI
projects
15. Address questions/issues, rather than perform technical duties
Prioritization of the work based on pharma's needs
The FAIRification process
18. 1. Ontologies
2. Standards
3. Versioning
4. Identifiers
5. Licensing
Driver use cases:
• Update data to new ontology versions
• Capture ontology annotation provenance
• Ontologies as a service
• Ontology recommendations
• Ontology annotation recommendations
Top needs and challenges
19. ● To measure the FAIRness level of data
○ For use in the FAIRification processes to define initial/final level of data
FAIRness => leveraging on
● To measure the capability and performance of an organization for
FAIR data generation and management
○ For use at the strategy level to identify investment areas, monitor
processes
FAIR indicators and capability maturity model
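To illustrate how an initial and a final FAIRness level might be compared during FAIRification, here is a minimal sketch that scores a dataset as the fraction of indicators it satisfies. The indicator names and the equal weighting are assumptions made for this example; the slide does not list the project's actual indicators.

```python
def fairness_score(indicators: dict) -> float:
    """Return the fraction of indicators that are satisfied (0.0 to 1.0)."""
    if not indicators:
        raise ValueError("at least one indicator is required")
    satisfied = sum(1 for ok in indicators.values() if ok)
    return satisfied / len(indicators)


# Hypothetical indicators, assessed before and after FAIRification.
initial = {
    "has_persistent_id": False,
    "uses_ontology_terms": False,
    "has_licence": True,
    "has_provenance": False,
}
final = {
    "has_persistent_id": True,
    "uses_ontology_terms": True,
    "has_licence": True,
    "has_provenance": False,
}

print(f"initial: {fairness_score(initial):.2f}")
print(f"final:   {fairness_score(final):.2f}")
```

A real assessment would likely weight indicators and group them into levels; the flat ratio is used here only to keep the before/after comparison easy to read.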
21. ● A comprehensive resource collating ‘recipes’ for making different
types of data FAIR
● Examples of published recipes:
● Converting Excel files to a machine-readable Frictionless Data Package
● Request terms to be added to a public ontology
What is it?
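As an illustration of the first recipe, the sketch below builds a Frictionless Data Package descriptor (the content of a datapackage.json file) for a small table. It is not the cookbook's own recipe: it assumes the Excel sheet has already been exported to CSV, uses only the Python standard library, and the sample data, package name, file path, and the simple type inference are all hypothetical.

```python
import csv
import io
import json


def infer_type(values):
    """Rough type guess for a column: 'integer', 'number' or 'string'."""
    def all_parse(cast):
        try:
            for value in values:
                cast(value)
            return True
        except ValueError:
            return False

    if values and all_parse(int):
        return "integer"
    if values and all_parse(float):
        return "number"
    return "string"


def build_descriptor(csv_text, package_name, resource_path):
    """Build a minimal Data Package descriptor for one CSV resource."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    fields = reader.fieldnames or []
    schema = {
        "fields": [
            {"name": name, "type": infer_type([row[name] for row in rows])}
            for name in fields
        ]
    }
    return {
        "name": package_name,
        "resources": [
            {"name": package_name, "path": resource_path, "schema": schema}
        ],
    }


# Made-up table standing in for a sheet exported from Excel.
SAMPLE_CSV = "sample_id,dose_mg,tissue\nS1,10,liver\nS2,2.5,kidney\n"

descriptor = build_descriptor(SAMPLE_CSV, "assay-results", "assay_results.csv")
print(json.dumps(descriptor, indent=2))
```

Writing the printed JSON to a file named datapackage.json next to the CSV gives a package that generic tooling can read; a fuller recipe would also record a licence, a title, and richer field descriptions.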
22. ● How to FAIRify or improve the FAIRness of exemplar datasets
● Which are the levels and indicators of FAIRness
● Which open source technologies, tools and services are available
● What skills are required
● Awareness of known challenges
Learning outcomes
26. Technical infrastructure
• The Carpentries
• Galaxy Training Community
• The Alan Turing Institute’s Turing Way book
• ELIXIR Training Platform
Content
• The NIH Common Fund Data Ecosystem
• Pistoia Alliance’s FAIR Toolkit
• IMI EHDEN for OMOP documentation
Engaging and outreach