AMIA Webinar - BioSharing - Mapping the landscape of standards in the life sciences


A 45-minute webinar presented to the AMIA (American Medical Informatics Association, www.amia.org) in May 2016 on BioSharing, a curated, searchable portal of inter-related data standards, databases, and policies in the life, environmental and biomedical sciences. We cover how we describe standards, how to search using our simple, advanced and faceted search, how our wizard can guide you, and how our recommendations from journal data policies can help you select metadata standards and repositories for your data.


  1. 1. BioSharing.org – mapping the landscape of standards in the life sciences Peter McQuilton, PhD (@drosophilic) BioSharing content lead On behalf of the BioSharing team (@biosharing)
  2. 2. Outline
      • Standards, databases and policies in the life sciences
      • BioSharing – an informative and educational resource
      • What it is
      • How to use it
      • How it can help you
  3. 3. A growth in data, a growth in databases Number of databases in the NAR database issue, up to 2015 (from @AlexBateman1)
  4. 4. Better data = better science (Credit: https://projects.ac/blog/five-top-reasons-to-protect-your-data-and-practise-safe-science/, 2014)
  5. 5. The FAIR Principles
  6. 6. But in all fairness, not all data is FAIR!
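  7. 7. Sharing starts with good metadata…
      Spreadsheet columns A to E, file name S1Sh.cuo:

      | Row |           | Group1 | Group2 |
      |-----|-----------|--------|--------|
      | 2   | Day 0     |        |        |
      | 3   | Sodium    | 139    | 142    |
      | 4   | Potassium | 3.3    | 4.8    |
      | 5   | Chloride  | 100    | 108    |
      | 6   | BUN       | 18     | 18     |
      | 7   | Creatine  | 1.2    | 1.2    |
      | 8   | Uric acid | 5.5*   | 6.2*   |
      | 9   | Day 7     |        |        |
      | 10  | Sodium    | 140    | 146    |
      | 11  | Potassium | 3.4    | 5.1    |
      | 12  | Chloride  | 97     | 108    |

      Credit to: Iain Hrynaszkiewicz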
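  8. 8. …. which this isn’t...
      Spreadsheet columns A to E, file name S1Sh.cuo:

      | Row |           | Group1 | Group2 |
      |-----|-----------|--------|--------|
      | 2   | Day 0     |        |        |
      | 3   | Sodium    | 139    | 142    |
      | 4   | Potassium | 3.3    | 4.8    |
      | 5   | Chloride  | 100    | 108    |
      | 6   | BUN       | 18     | 18     |
      | 7   | Creatine  | 1.2    | 1.2    |
      | 8   | Uric acid | 5.5*   | 6.2*   |
      | 9   | Day 7     |        |        |
      | 10  | Sodium    | 140    | 146    |
      | 11  | Potassium | 3.4    | 5.1    |
      | 12  | Chloride  | 97     | 108    |

      Problems highlighted on the slide:
      • Meaningless column titles
      • Special characters can cause text mining errors
      • No units
      • Unhelpful document name
      • Undefined abbreviation
      • Formatting for information that should be in metadata

      Credit to: Iain Hrynaszkiewicz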
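  9. 9. …. This is much clearer!
      Spreadsheet columns A to F, file name Table_S1_Shanghai_blood.xls:

      | Row | Parameter | Day | Control | Treated | Units | P    |
      |-----|-----------|-----|---------|---------|-------|------|
      | 2   | Sodium    | 0   | 139     | 142     | mEq/l | 0.82 |
      | 3   | Sodium    | 7   | 140     | 146     | mEq/l | 0.70 |
      | 4   | Sodium    | 14  | 140     | 158     | mEq/l | 0.03 |
      | 5   | Sodium    | 21  | 143     | 160     | mEq/l | 0.02 |
      | 6   | Potassium | 0   | 3.3     | 4.8     | mEq/l | 0.06 |
      | 7   | Potassium | 7   | 3.4     | 5.1     | mEq/l | 0.07 |
      | 8   | Potassium | 14  | 3.7     | 4.7     | mEq/l | 0.10 |
      | 9   | Potassium | 21  | 3.1     | 3.6     | mEq/l | 0.52 |
      | 10  | Chloride  | 0   | 100     | 108     | mEq/l | 0.56 |
      | 11  | Chloride  | 7   | 97      | 108     | mEq/l | 0.68 |
      | 12  | Chloride  | 14  | 101     | 106     | mEq/l | 0.79 |

      Credit to: Iain Hrynaszkiewicz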
  10. 10. Seven week old C57BL/6N mice were treated with low-fat diet. Liver was dissected out, hepatocytes prepared… From natural language to structured data
  11. 11. From natural language to structured data
      Seven week old C57BL/6N mice were treated with low-fat diet. Liver was dissected out, hepatocytes prepared …
      Annotation labels on the slide:
      • Age value
      • Unit
      • Strain name
      • Subject of the experiment
      • Type of diet and experimental condition
      • Anatomy part
      • Type of protocol – sample treatment
      • Type of protocol – liver preparation
      • Type of protocol – cell preparation
  12. 12. Data has to be structured for sharing – we need standards
      Data/content standards:
      • Structure, enrich and report the description of the datasets and the experimental context under which they were produced
      • Facilitate the discovery, sharing, understanding and reuse of datasets
  13. 13. Community mobilisation to develop content standards
      Diagram labels: de jure; de facto; grass-roots groups; standard organizations; Nanotechnology Working Group; Formats; Terminologies; Guidelines
  14. 14. Enablers: to better describe, share and query data (Formats, Terminologies, Guidelines)
      • Guidelines = Minimum information reporting requirements, checklists
          o Report the same core, essential information
          o e.g. ARRIVE guidelines
      • Terminologies = Controlled vocabularies, taxonomies, thesauri, ontologies etc.
          o Use the same word and refer to the same ‘thing’
          o e.g. Gene Ontology
      • Models/Formats = Conceptual model, conceptual schema, exchange formats
          o Allow data to flow from one system to another
          o e.g. FASTA
  15. 15. There are over 600 standards in the life sciences (Formats, Terminologies, Guidelines; counts shown on the slide: 193, 85, 346)
      • miame MIAPA MIRIAM MIQAS MIX MIGEN ARRIVE MIAPE MIASE MIQE MISFISHIE…. REMARK CONSORT
      • MAGE-Tab GCDML SRAxml SOFT FASTA DICOM MzML SBRML SEDML… GELML ISA-Tab CML MITAB
      • AAO CHEBI OBI PATO ENVO MOD BTO IDO… TEDDY PRO XAO DO VO
  16. 16. A complex and evolving landscape
      • Data policies (30)
      • Databases (763)
      • Data/metadata standards (652): Formats, Terminologies, Guidelines
  17. 17. Helping people make the right decision
      • Is there a database, implementing standards, where I can deposit my metagenomics dataset?
      • My funder’s data sharing policy recommends the use of established standards, but which ones are widely endorsed and applicable to my toxicological and clinical data?
      • Am I using the most up-to-date version of this terminology to annotate cell-based assays?
      • I understand this format has been deprecated; what has it been replaced by, and who is leading the work?
      • Are there databases implementing this exchange format, whose development we have funded?
      • What are the mature standards and standards-compliant databases we should recommend to our authors?
  18. 18. Introducing BioSharing
  19. 19. What is BioSharing? Mapping the landscape of ‘standards’ in the life, environmental and biomedical sciences. A web-based, curated and searchable portal that monitors the development and evolution of standards, their use in databases and the adoption of both in data policies, to inform and educate the user community. 1,400 records and growing.
  20. 20. What is BioSharing? Mapping the landscape of ‘standards’ in the life, environmental and biomedical sciences
      • Launched in 2011, as an evolution of the MIBBI portal (2008-2011)
      • Manually curated
      • Community driven
      • Growing userbase and visibility
      • 1,400 records and growing
  21. 21. The BioSharing community: Run at […]; also operates as a WG in […]; is also an […] Resource that […]. 1,400 records and growing.
  22. 22. How do we describe standards?
  23. 23. Criteria for evaluating standards
  24. 24. Linking standards, databases and policies
  25. 25. Cross-linking standards to standards and databases: “Model/format formalizing reporting guideline” in one direction, “Reporting guideline used by model/format” in the other
  26. 26. Cross-linking standards to standards and databases: “Model/format formalizing reporting guideline” in one direction, “Reporting guideline used by model/format” in the other
  27. 27. Indicators of life cycle status (manually curated, approved by the community)
      • Ready for use, implementation, or recommendation
      • In development
      • Status uncertain
      • Deprecated as subsumed or superseded
  28. 28. An informative and educational resource Simple and advanced searches, ask our wizard or view journal recommendations
  29. 29. Search, filter, and refine using our faceted search (slide footer: The International Conference on Systems Biology (ICSB), 22-28 August, 2008, Susanna-Assunta Sansone, www.ebi.ac.uk/net-project)
  30. 30. Collections group together one or more types of resource by domain, project or organization. Recommendations are a core set of resources that are selected and recommended by a funder or journal data policy.
  31. 31. Standards and databases recommended by journal data policies
  32. 32. The wizard:
      • Guides users through the data
      • Will grow in functionality and complexity, based on user feedback
      • Powered by curated descriptions of each standard and database, and their relations
  33. 33. Helping people make the right decision
      • Is there a database, implementing standards, where I can deposit my metagenomics dataset?
      • My funder’s data sharing policy recommends the use of established standards, but which ones are widely endorsed and applicable to my toxicological and clinical data?
      • Am I using the most up-to-date version of this terminology to annotate cell-based assays?
      • I understand this format has been deprecated; what has it been replaced by, and who is leading the work?
      • Are there databases implementing this exchange format, whose development we have funded?
      • What are the mature standards and standards-compliant databases we should recommend to our authors?
  34. 34. BioSharing – what we do
      • Inform – what’s out there, which databases use which standards. Map the landscape.
      • Educate – what databases are recommended by your funder, or journal of choice, which standards should you be using, which standards and databases should you recommend? Explore the landscape.
  35. 35. Acknowledgements
      • Eamonn Maguire, DPhil, Software Engineer (contractor)
      • Philippe Rocca-Serra, PhD, Senior Research Lecturer
      • Alejandra Gonzalez-Beltran, PhD, Research Lecturer
      • Milo Thurston, DPhil, Research SW Engineer
      • Massimiliano Izzo, PhD, Research SW Engineer
      • Peter McQuilton, PhD, Knowledge Engineer
      • Allyson Lister, PhD, Knowledge Engineer
      • David Johnson, PhD, Research SW Engineer
      • Susanna-Assunta Sansone, PhD, Centre’s Associate Director, Principal Investigator and Springer Nature’s Consultant for Scientific Data
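
Slides 7 to 9 contrast a grouped spreadsheet with a clearer one that has one row per parameter and day, named columns, and explicit units. The move from the first layout to the second can be sketched in Python roughly as follows. This is only an illustration, not the presenter's method: the function names `to_long_rows` and `to_csv` and the dictionary layout are hypothetical, and treating Group1 as Control and Group2 as Treated is an assumption based on the matching values on slides 7 and 9.

```python
import csv
import io

# Wide layout similar to slide 7: values grouped under "Day" headings.
# Assumption: each tuple is (Group1, Group2), read here as (Control, Treated).
wide = {
    0: {"Sodium": (139, 142), "Potassium": (3.3, 4.8), "Chloride": (100, 108)},
    7: {"Sodium": (140, 146), "Potassium": (3.4, 5.1), "Chloride": (97, 108)},
}
UNITS = "mEq/l"  # units as listed on slide 9
COLUMNS = ["Parameter", "Day", "Control", "Treated", "Units"]


def to_long_rows(wide_table, units):
    """Return one dict per (day, parameter) with explicit column names."""
    rows = []
    for day in sorted(wide_table):
        for parameter, (control, treated) in wide_table[day].items():
            rows.append({
                "Parameter": parameter,
                "Day": day,
                "Control": control,
                "Treated": treated,
                "Units": units,
            })
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = to_long_rows(wide, UNITS)
print(to_csv(rows))
```

Keeping the day and the units in their own columns, rather than in section headings or formatting, is what lets a script or a database ingest the file without manual clean-up.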
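
Slides 10 and 11 annotate a free-text methods sentence with labels such as age value, unit, strain name and protocol type. One way to picture the structured result is a small record like the one below. It is a sketch under assumptions: the field names, the `missing_fields` helper, and the pairing of each label with a phrase are my reading of the slide, not a BioSharing schema or any published standard.

```python
import json

sentence = (
    "Seven week old C57BL/6N mice were treated with low-fat diet. "
    "Liver was dissected out, hepatocytes prepared"
)

# Hypothetical field names; the label-to-phrase pairing is an assumption.
record = {
    "subject": "mice",                    # Subject of the experiment
    "strain_name": "C57BL/6N",            # Strain name
    "age": {"value": 7, "unit": "week"},  # Age value and Unit
    "diet": "low-fat diet",               # Type of diet and experimental condition
    "anatomy_part": "Liver",              # Anatomy part
    "protocols": [
        {"type": "sample treatment", "text": "treated with low-fat diet"},
        {"type": "liver preparation", "text": "dissected out"},
        {"type": "cell preparation", "text": "hepatocytes prepared"},
    ],
}

REQUIRED = ("subject", "strain_name", "age", "diet", "anatomy_part")


def missing_fields(rec, required=REQUIRED):
    """List required fields that are absent or empty in a record."""
    return [name for name in required if rec.get(name) in (None, "")]


print(json.dumps(record, indent=2))
print(missing_fields({"subject": "mice"}))
```

A check like `missing_fields` is the kind of thing a minimum-information guideline makes possible: once the fields are named, completeness can be tested automatically.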
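
Slide 14 names FASTA as an example of an exchange format that lets data flow from one system to another. To make that concrete, here is a minimal FASTA reader. It assumes the common convention of a `>` header line followed by one or more sequence lines; the function name `parse_fasta` and the example sequences are invented for illustration, and real-world files may need a more tolerant parser.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Invented example data, not taken from the slides.
example = """>seq1 hypothetical example
ACGTAC
GTTA
>seq2
GGCC
"""

for name, sequence in parse_fasta(example):
    print(name, len(sequence))
```

Because the format is this simple and widely agreed, a file written by one tool can be read by another with a few lines of code, which is the point the slide makes about formats.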
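
Slides 25 to 27 describe records that are cross-linked in both directions (a format formalizes a guideline, a guideline is used by a format) and that carry a life cycle status. A rough sketch of such a model follows. The four status values are taken from slide 27; everything else, including the record names, the relation names `formalizes` and `implements`, and the helper functions, is hypothetical and does not describe how BioSharing is implemented.

```python
from enum import Enum


class Status(Enum):
    """Life cycle indicators as worded on slide 27."""
    READY = "Ready for use, implementation, or recommendation"
    IN_DEVELOPMENT = "In development"
    UNCERTAIN = "Status uncertain"
    DEPRECATED = "Deprecated as subsumed or superseded"


# Hypothetical records: name -> (record type, status).
records = {
    "Guideline Y": ("reporting guideline", Status.READY),
    "Format X": ("model/format", Status.READY),
    "Format W": ("model/format", Status.DEPRECATED),
    "Database Z": ("database", Status.READY),
}

# Hypothetical links, stored once as (subject, relation, object).
links = [
    ("Format X", "formalizes", "Guideline Y"),
    ("Format W", "formalizes", "Guideline Y"),
    ("Database Z", "implements", "Format X"),
]


def linked_from(name, relation):
    """Follow a link forwards: what does `name` point to?"""
    return sorted(o for s, r, o in links if s == name and r == relation)


def linked_to(name, relation):
    """Follow a link backwards: what points at `name`?"""
    return sorted(s for s, r, o in links if o == name and r == relation)


def ready_formats_for(guideline):
    """Formats that formalize a guideline and are marked ready for use."""
    return [f for f in linked_to(guideline, "formalizes")
            if records[f][1] is Status.READY]


print(linked_from("Format X", "formalizes"))
print(linked_to("Guideline Y", "formalizes"))
print(ready_formats_for("Guideline Y"))
print(linked_to("Format X", "implements"))
```

Storing each link once and reading it in either direction is one simple way to answer questions like those on slide 17, for example which databases implement a given format.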
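
Slide 29 mentions searching, filtering and refining with a faceted search. The general idea of facets, sketched below, is to filter records on exact field values and to show counts per value so a user can see how to narrow further. This is a generic illustration with invented records and hypothetical function names (`filter_records`, `facet_counts`); it says nothing about how the BioSharing search is actually built.

```python
from collections import Counter

# Invented records, for illustration only.
catalogue = [
    {"name": "Record A", "type": "standard", "domain": "genomics", "status": "ready"},
    {"name": "Record B", "type": "database", "domain": "genomics", "status": "ready"},
    {"name": "Record C", "type": "standard", "domain": "metabolomics", "status": "in development"},
]


def filter_records(items, **facets):
    """Keep the records whose fields match every requested facet value."""
    return [r for r in items if all(r.get(k) == v for k, v in facets.items())]


def facet_counts(items, facet):
    """Count how many records fall under each value of one facet."""
    return Counter(r[facet] for r in items)


genomics = filter_records(catalogue, domain="genomics")
print([r["name"] for r in genomics])
print(facet_counts(genomics, "type"))
```

Recomputing the counts on the filtered subset, as in the last line, is what allows each refinement step to show only the choices that still have results.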
