FAIR and metadata standards - FAIRsharing and Neuroscience

My presentation at Neuroinformatics 2017 (http://neuroinformatics2017.org; Kuala Lumpur, Malaysia) on FAIR and FAIRsharing (previously BioSharing): metadata standards, their implementation by databases/repositories, and their adoption in journals' and funders' data policies.

  1. 1. FAIR digital research assets: beyond the acronym Susanna-Assunta Sansone, PhD @SusannaASansone ORCiD 0000-0001-5306-5690 Consultant, Founding Academic Editor Associate Director, Principal Investigator Neuroinformatics, Kuala Lumpur, 20-21 August, 2017
  2. 2. To do better science, more efficiently, we need data that are: • Available in a public repository • Findable through some sort of search facility • Retrievable in a standard format • Self-described so that third parties can make sense of it • Intended to outlive the experiment for which they were collected
  3. 3. A set of principles, for those wishing to enhance the value of their data holdings
  4. 4. Wider adoption of the FAIR principles, by research infrastructure programmes, e.g.
  5. 5. Defining FAIRness
  6. 6. Defining a framework for evaluating FAIRness By the fairmetrics.org Working Group
  7. 7. Interoperability standards – the pillars of FAIR • The Principles put emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals • NOTE: The Principles are high-level; they do not suggest any specific technology, standard, or implementation solution
  8. 8. The invisible machinery • Identifiers and metadata to be implemented by technical experts in tools, registries, catalogues, databases, services • It is essential to make standards ‘invisible’ to lay users, who often have little or no familiarity with them
  9. 9. http://nometadata.org/logo
  10. 10. Metadata standards – fundamentals • Descriptors for a digital object that help to understand what it is, where to find it, how to access it etc. • The type of metadata also depends on the type of digital object (e.g. software, dataset) • The depth and breadth of metadata vary according to their purpose § e.g. reproducibility requires richer metadata than citation
  11. 11. Metadata standards - datasets • Domain-level descriptors that are essential for interpretation, verification and reproducibility of datasets • The depth and breadth of descriptors vary according to the domain, broadly covering the what, who, when, how and why
  12. 12. Metadata standards - datasets • Domain-level descriptors that are essential for interpretation, verification and reproducibility of datasets • The depth and breadth of descriptors vary according to the domain, broadly covering the what, who, when, how and why, allowing: § experimental components (e.g., design, conditions, parameters), § fundamental biological entities (e.g., samples, genes, cells), § complex concepts (such as bioprocesses, tissues and diseases), § analytical processes and mathematical models, and § their instantiation in computational simulations (from the molecular level through to whole populations of individuals) to be harmonized with respect to structure, format and annotation
  13. 13. Metadata for discovery
  14. 14. Metadata for discovery, but not only • model and related formats
  15. 15. …..
  16. 16. Domain-specific metadata standards for datasets • Guidelines: MIAME, MIRIAM, MIQAS, MIX, MIGEN, ARRIVE, MIAPE, MIASE, MIQE, MISFISHIE, …, REMARK, CONSORT • Formats: SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML, …, GELML, ISA, CML, MITAB • Terminologies: AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO, …, TEDDY, PRO, XAO, DO, VO • Developed by de jure standard organizations and de facto grass-roots groups • Counts shown on the slide: 220+, 115+, 548+, ~1000
  17. 17. https://doi.org/10.6084/m9.figshare.3795816.v2 https://doi.org/10.6084/m9.figshare.4055496.v1
  18. 18. A complex landscape • Perspective and focus vary, ranging: § from standards with a specific biological or clinical domain of study (e.g. neuroscience) or significance (e.g. model processes) § to the technology used (e.g. imaging modality) • Motivation is different, spanning: § creation of new standards (to fill a gap) § mapping and harmonization of complementary or contrasting efforts § extensions and repurposing of existing standards • Stakeholders are diverse, including those: § involved in managing, serving, curating, preserving, publishing or regulating data and/or other digital objects § from academia, industry, governmental sectors, and funding agencies § who are producers but also consumers of the standards, as domain (and not just technical) expertise is a must
  19. 19. Standards’ life cycle • Formulation § use cases, scope, prioritization and expertise • Development § iterations, tests, feedback and evaluation § harmonization of different perspectives and available options • Maintenance § (exemplar) implementations, technical documentation, education material, metrics § sustainability, evolution (versions) and conversion modules
  20. 20. Fragmentation, duplications and gaps [Figure: technologically-delineated views of the world (arrays, scanning, columns, gels, MS, FTIR, NMR, …) set against biologically-delineated views of the world (transcriptomics, proteomics, metabolomics, plant biology, epidemiology, neuroscience), with generic features (‘common core’): description of source biomaterial and experimental design components]
  21. 21. Modularization to combine and validate [Figure: technology modules (arrays, scanning, columns, gels, MS, FTIR, NMR, …) combined with biological domains (transcriptomics, proteomics, metabolomics, plant biology, epidemiology, neuroscience), e.g. proteomics-based investigations of neurodegenerative diseases, and proteomics and metabolomics-based investigations of neurodegenerative diseases]
  22. 22. Working in/across multiple domains is challenging • Requires § Mapping between/among heterogeneous representations § Conceptual modelling framework to encompass the domain specific metadata standards § Tools to handle customizable annotation, multiple conversions and validation
  23. 23. Technical and social engineering required • Pain points include § Fragmentation § Coordination, harmonization, extensions § Credit, incentives for contributors § Governance, ownership § Indicators and evaluation methods § Outreach and engagement with all stakeholders § Synergies between basic and clinical/medical areas § Implementations: infrastructures, tools, services § Education, documentation and training § Funding streams § Business models for sustainability
  24. 24. Too many cooks in the standards’ kitchen?
  25. 25. Standards fusion…anyone?
  26. 26. Doing my fair share • doi:10.1126/science.1180598 • doi:10.1038/nbt1346 • doi:10.1038/nbt.1411 • OBO • Portal and Foundry
  27. 27. Improving discoverability of standards • Consumers: § How do I find the standards appropriate for my case? • Producers: § How do I make my standards visible to others?
  28. 28. Monitors the development and evolution of standards, their use in databases and the adoption of both in data policies, to inform and educate the user community
  29. 29. Working with and for producers and consumers, including: • Standard-developing groups • Journals and publishers • Cross-links and data exchange • Societies and organisations • Institutional RDM services • Projects and programmes
  30. 30. Interlink standards among themselves and with repositories [Diagram: metadata standards (formats, terminologies, guidelines); databases/data repositories; data policies by funders, journals and other organizations]
  31. 31. …and to indicate ‘adoption’ [Diagram: metadata standards (formats, terminologies, guidelines); databases/data repositories; data policies by funders, journals and other organizations]
  32. 32. Assign ‘indicators’ to describe their status… • Ready for use, implementation, or recommendation • In development • Status uncertain • Deprecated as subsumed or superseded • All records are manually curated in-house and verified by the community behind each resource • Paper in preparation, preliminary information as of July 2017 [Chart values as transcribed, without their labels: 270 48 23 2 97 87 4 204 9 6 8]
  33. 33. Help us map the neuroscience standards landscape
  34. 34. Activating the decision-making chain • Journal Recommendations: number of standards recommended by 68 journal/publisher policies (top standard in parentheses) § Models/Formats: 6 out of 223 (ISA-Tab) § Reporting Guidelines: 26 out of 118 (MIAME) § Terminology Artifacts: 8 out of 343 (NCBI Tax) • Paper in preparation, preliminary information as of July 2017
  35. 35. Activating the decision-making chain • Journal Recommendations: number of standards recommended by 68 journal/publisher policies (top standard in parentheses) § Models/Formats: 6 out of 223 (ISA-Tab) § Reporting Guidelines: 26 out of 118 (MIAME) § Terminology Artifacts: 8 out of 343 (NCBI Tax) • Database Implementations: number of standards implemented by 544 databases/repositories (top standard in parentheses) § Reporting Guidelines: 59 out of 116 (MIAME) § Models/Formats: 146 out of 223 (FASTA) § Terminology Artifacts: 121 out of 343 (GO) • Paper in preparation, preliminary information as of July 2017
  36. 36. • Philippe Rocca-Serra, PhD, Senior Research Lecturer • Alejandra Gonzalez-Beltran, PhD, Research Lecturer • Milo Thurston, DPhil, Research Software Engineer • Massimiliano Izzo, PhD, Research Software Engineer • Peter McQuilton, PhD, Knowledge Engineer • Allyson Lister, PhD, Knowledge Engineer • Eamonn Maguire, DPhil, Contractor • David Johnson, PhD, Research Software Engineer • Melanie Adekale, PhD, Biocurator Contractor • Delphine Dauga, PhD, Biocurator Contractor • Susanna-Assunta Sansone, PhD, Principal Investigator, Associate Director
  37. 37. The (long) road to FAIR • Interoperability standards are digital objects in their own right, with their associated research, development and educational activities
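
The "N out of M" figures on slides 34 and 35 are easier to compare as percentages. The following is a minimal sketch of that arithmetic, not part of the original presentation: the numbers are copied from the slides (preliminary, as of July 2017), while the tuple layout and the names `adoption_share` and `summarize` are assumptions made for illustration.

```python
# Figures transcribed from slides 34 and 35 (preliminary, as of July 2017).
# Tuple layout is an assumption of this sketch:
# (category, top standard, standards adopted, total standards in category).
JOURNAL_RECOMMENDATIONS = [
    ("Models/Formats", "ISA-Tab", 6, 223),
    ("Reporting Guidelines", "MIAME", 26, 118),
    ("Terminology Artifacts", "NCBI Tax", 8, 343),
]

DATABASE_IMPLEMENTATIONS = [
    ("Reporting Guidelines", "MIAME", 59, 116),
    ("Models/Formats", "FASTA", 146, 223),
    ("Terminology Artifacts", "GO", 121, 343),
]


def adoption_share(adopted: int, total: int) -> float:
    """Return the adopted fraction as a percentage, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= adopted <= total:
        raise ValueError("adopted must lie between 0 and total")
    return round(100 * adopted / total, 1)


def summarize(rows):
    """Map each category to its adoption share in percent."""
    return {
        category: adoption_share(adopted, total)
        for category, _top_standard, adopted, total in rows
    }


if __name__ == "__main__":
    for label, rows in (
        ("Journal recommendations", JOURNAL_RECOMMENDATIONS),
        ("Database implementations", DATABASE_IMPLEMENTATIONS),
    ):
        print(label)
        for category, percent in summarize(rows).items():
            print(f"  {category}: {percent}%")
```

Read this way, the slides' figures suggest that databases implement a far larger share of the registered standards than journal policies recommend, for example 146 of 223 models/formats (about 65.5%) against 6 of 223 (about 2.7%).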
