FAIR Data: principles and practices
A growing worldwide movement for reproducible research encourages making data, along with the experimental details, available according to the FAIR principles of Findability, Accessibility, Interoperability and Reusability (see http://www.nature.com/articles/sdata201618). Several data management and sharing policies and plans have emerged and, in parallel, a growing number of community-based groups are developing hundreds of standards to harmonize the reporting of different experiments. Community mobilization is also evident in the number of efforts and alliances, as well as in the data journals and data centres being launched.
1. Associate Professor, Associate Director
Susanna-Assunta Sansone, PhD
@SusannaASansone
Oxford OpenCon Oxford, 1st Dec 2017 – Slides at: https://www.slideshare.net/SusannaSansone
Consultant and Honorary Academic Editor
3. We need to do better science more efficiently
• Achieve research data transparency
- Standards for annotation and interoperability
• Meet ethics and public expectations
- Safe use of data
• Maximise the use of e-infrastructure
- Secure, distributed and scalable
• Engage the innovation ecosystem
- Academia, industry and government
• Invest in people, skills and methods
- Connect existing silo-ed disciplines
5. A set of principles for those wishing to enhance the value of their data holdings
Designed and endorsed by a diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers.
https://www.force11.org/group/fairgroup/fairprinciples
6. These put emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals
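To make the idea of machine-findable data concrete, here is a minimal sketch in Python. It assumes a dataset is described with a small JSON-LD record using schema.org `Dataset` properties. The record values, the placeholder identifier and URLs, the `REQUIRED` mapping of fields to FAIR principles and the `missing_fields` helper are all hypothetical illustrations, not part of the FAIR principles or of any tool mentioned in the talk.

```python
import json

# Hypothetical metadata record; every value below is a placeholder.
record = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example dataset",
    "identifier": "https://doi.org/10.1234/example",  # placeholder identifier
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "distribution": {
        "@type": "DataDownload",
        "contentUrl": "https://example.org/data.csv",  # placeholder URL
        "encodingFormat": "text/csv",
    },
}

# Illustrative (assumed) mapping of FAIR principles to metadata fields.
REQUIRED = {
    "Findable": ["identifier", "name"],
    "Accessible": ["distribution"],
    "Interoperable": ["@context", "@type"],
    "Reusable": ["license"],
}


def missing_fields(rec):
    """Return {principle: [missing keys]} for every principle with a gap."""
    gaps = {}
    for principle, keys in REQUIRED.items():
        absent = [key for key in keys if not rec.get(key)]
        if absent:
            gaps[principle] = absent
    return gaps


# Serialise to JSON so that software, not only people, can read the record.
serialized = json.dumps(record, indent=2)
print(serialized)
print(missing_fields(json.loads(serialized)))
```

A record with no `license` entry would be reported under "Reusable". A real assessment would rely on community-agreed metadata standards rather than this ad hoc field list.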
9. Wider adoption by many biomedical research infrastructure programmes in EU and USA, e.g.
• €19 million, 2015 - 2019
• €3.3 billion, 2014 - 2020
• $95.5 million, 2017 - 2020
13. Big Life Science Company: Yesterday, Today, Tomorrow
• Innovation model: Innovation inside (yesterday); Searching for innovation (today); Heterogeneity of collaborations, part of the wider ecosystem (tomorrow)
• IT: Internal apps & data (yesterday); Struggling with change, security and trust (today); Cloud, services (tomorrow)
• Data: Mostly inside (yesterday); In and out (today); Distributed (tomorrow)
• Portfolio: Internally driven and owned (yesterday); Partially shared (today); Shared portfolio (tomorrow)
Credit to:
The rise of public-private-partnerships
• Big Life Science Company
• Proprietary content provider
• Public content provider
• Academic group
• Software vendor
• CRO
• Service provider
• Regulatory authorities
22. Domain-specific metadata standards for datasets
• Guidelines: MIAME, MIRIAM, MIQAS, MIX, MIGEN, ARRIVE, MIAPE, MIASE, MIQE, MISFISHIE, …, REMARK, CONSORT
• Formats: SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML, …, GELML, ISA, CML, MITAB
• Terminologies: AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO, …, TEDDY, PRO, XAO, DO, VO
• De jure (standard organizations) and de facto (grass-roots groups)
• Formats, Terminologies, Guidelines: 220+, 115+, 548+; ~1000
24. Map of the landscape, monitoring development and evolution of data and metadata standards, their use in databases and the adoption of both in data policies
28. FAIR data - roles and responsibilities
• Data has to become an integral part of scholarly communications
• Responsibilities lie across several stakeholder groups: researchers, data centers, librarians, funding agencies and publishers
• But publishers occupy a “leverage point” in this process
29. FAIR data – the value of data articles/journals
• Incentive, credit for sharing
- Big and small data
- Unpublished data
- Long tail of data
- Curated aggregation
• Peer review of data
• Value of data vs. analysis
• Discoverability and reusability
- Complementing community databases