Bioschemas: Datasets and Data Catalogs

  1. Bioschemas: Datasets and Data Catalogs. Alasdair J G Gray (ELIXIR-UK, Heriot-Watt University), Carole Goble (University of Manchester), Rafael C Jimenez (ELIXIR-Hub)
  2. Markup for web pages: RDFa, JSON-LD, Microdata. With markup:
       <div itemscope itemtype="http://schema.org/Recipe">
         <h1 itemprop="name">Classic potato salad</h1>
         <div itemprop="nutrition" itemscope itemtype="http://schema.org/NutritionInformation">
           Nutrition facts: <span itemprop="calories">144 kcal</span>,
         </div>
         Ingredients:
         - <span itemprop="recipeIngredient">800g small new potato</span>
         - <span itemprop="recipeIngredient">3 shallot</span>
       </div>
     4 Oct 2017 @bioschemas
  3. (no transcribed text)
  4. Schema for Datasets and Catalogs. Schema definitions:
     • Dataset: A body of structured information describing some topic(s) of interest. http://schema.org/Dataset – 91 properties
     • DataCatalog: A collection of datasets. http://schema.org/DataCatalog – 91 properties
  5. Schema for Datasets and Catalogs. Google Profile:
     • Dataset: 9 basic properties
     • DataCatalog: 1 property
     • DataDownload: 2 properties
     • Many more
  6. Bioschemas
     • Schema.org for life sciences
       – Introduce life sciences types
     • Use case driven
       – Finding data
       – Presenting search results
       – Metadata exchange
     • Minimum properties – 6
     • Link to domain ontologies
     Specification on top of schema.org: layer of constraints + documentation + extensions
     Specification: Data model, Minimum information, Controlled vocabularies, Cardinality, Documentation, Examples, New (properties | types)
  7. Bioschemas Process: Use cases, Mapping, Specification, Mockup, Testing, Adoption, Application
  8. Bioschemas Specifications
  9. UniProt
     • Name
     • Description
     • License
     • Release
     • Citation
     • Metrics
     • Tools
     • …
     Figure labels: Tools, Citation, Identifiers, Metrics, Release, Interfaces
  10. Bioschemas DataCatalog: a collection of datasets, e.g. catalogs, repositories, registries, … http://bioschemas.org/specifications/
  11. Bioschemas Dataset: a body of structured information describing some topic(s) of interest. http://bioschemas.org/specifications/
  12. Bioschemas Dataset Deployment: Reactome dataset
      • Status: in production
      • Available from: view-source: http://reactome.org/content/detail/R-HSA-74160
      • Use case: discovery
      • Documentation: http://reactome.org/ContentService/#!/discover/eventDiscoveryUsingGET
  13. Deployment: Training Material
  14. Benefit: Discovering Training Material
  15. bioschemas.org Acknowledgements: Haydee Artaza, Terri Atwood, Phil Barker, Dominique Batista, Niall Beard, Raoul Bonnal, Cath Brooksbank, Tony Burdett, Guillermo Calderon Mantilla, Ethy Cannon, Justin Clark-Casey, Martin Cook, Manuel Corpas, Michael R Crusoe, Pavel Dallakian, Luc Deltombe, Stephen Ficklin, Leyla Garcia, Carole Goble, Alejandra Gonzalez-Beltran, Alasdair Gray, Jeffrey Grethe, Henning Hermjakob, Richard Holland, Carlos Horro, Jon Ison, Christa Janko, Andy Jenkinson, Rafael C Jimenez, Claire Johnson, Simon Jupp, Nick Juty, Lee Larcombe, Nicolas Le Novère, Mikael Linden, Audald Lloret, Federico López Gómez, Ronald Margolis, Maria Martin, Michaela Th. Mayrhofer, Peter McQuilton, Sarah Morgan, Chris Mungall, Aleksandra Nenadic, Helen Parkinson, Roberto Preste, Giuseppe Profiti, Philippe Rocca-Serra, Gabriella Rustici, Susanna A Sansone, Vicky Schneider, Serena Scollen, Chris Taylor, Milo Thurston, Dan Timmons, John Van Horn, Susheel Varma, Sameer Velankar, Premysl Velek, Andra Waagmeester, Liz Williams, Sarala Wimalaratne, Anil Wipat, Olga Ximena Giraldo, Anita de Waard, Peter van Heusden, + others to be added
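
The slides name JSON-LD as one of the markup syntaxes and describe a profile with a small set of minimum properties on top of the schema.org Dataset and DataCatalog types. The following is a minimal Python sketch of that idea, not the Bioschemas specification itself. It builds a Dataset description linked to its DataCatalog and reports which required properties are missing. The tuple `MINIMUM_PROPERTIES` is a hypothetical stand-in (the slides say there are 6 minimum properties but do not list them), the helper names are invented for illustration, and all example values and `example.org` URLs are made up.

```python
import json

# Hypothetical stand-in for a profile's minimum properties. The slides state
# that there are 6 but do not name them, so this list is an assumption.
MINIMUM_PROPERTIES = ("name", "description", "url", "identifier", "keywords", "license")


def make_dataset(name, description, url, catalog_name, catalog_url, **extra):
    """Build a schema.org Dataset description as a JSON-LD-ready dict."""
    dataset = {
        "@context": "http://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        # Link the dataset to the catalog that contains it.
        "includedInDataCatalog": {
            "@type": "DataCatalog",
            "name": catalog_name,
            "url": catalog_url,
        },
    }
    dataset.update(extra)
    return dataset


def missing_properties(record, required=MINIMUM_PROPERTIES):
    """Return the required properties that are absent or empty in record."""
    return [prop for prop in required if not record.get(prop)]


def to_script_tag(record):
    """Serialize the record as a JSON-LD script element for a web page."""
    body = json.dumps(record, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Invented example record; none of these values come from the slides.
example = make_dataset(
    name="Example pathway dataset",
    description="Invented record used only to illustrate the markup.",
    url="http://example.org/datasets/1",
    catalog_name="Example catalog",
    catalog_url="http://example.org/",
    license="http://example.org/license",
)

print(missing_properties(example))  # ['identifier', 'keywords']
print(to_script_tag(example))
```

Keeping the required-property list as data rather than code means a different profile, with its own minimum set, can be checked by passing another tuple as `required`.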

Editor's Notes

  • Adoption meeting: 12 catalogs marked up
    Define use case
    Metadata crosswalk and mapping to schema.org
    Metadata providers
    Metadata registries
    Standards defining metadata
    Bioschemas specification
    Define minimum properties based on “finding” use cases
    Define cardinality and suggested controlled vocabularies
    Test with existing entries
    Adoption by data repositories and registries
    Applications
  • Beacon
    Data Catalog
    Dataset
    Event
    Laboratory Protocol
    Organization
    Person
    Phenotype
    Protein
    Protein Annotation
    Protein Structure
    Sample
    Standard
    Tool
    Training Material
  • For example, let's look at the landing page of the UniProt data repository. It shows a description of the repository, a link to the latest release, citation information, licensing information, etc. A person can understand the text, but a software agent cannot easily extract this information automatically. So in our data repository specification, we aim to encode this information as metadata that a software agent can extract automatically.
  • 6 minimal
    7 recommended
    2 optional
  • More details in tutorial tomorrow
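
The note on the UniProt landing page says the goal is metadata that a software agent can extract automatically. As a rough sketch of what such an agent might do, the Python below uses only the standard library to pull JSON-LD blocks out of a page. The class and function names are invented for illustration, the sample page is made up (it is not UniProt's actual markup), and malformed JSON is not handled: `json.loads` would simply raise.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.records = []

    def handle_starttag(self, tag, attrs):
        # Start buffering only when a JSON-LD script element opens.
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Parse the buffered text when the JSON-LD script element closes.
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            text = "".join(self._buffer).strip()
            if text:
                self.records.append(json.loads(text))


def extract_jsonld(html):
    """Return every JSON-LD record embedded in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.records


# Invented sample page standing in for a repository landing page.
SAMPLE_PAGE = """
<html><head><title>Example repository</title>
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "DataCatalog",
 "name": "Example repository", "license": "http://example.org/license"}
</script>
</head><body><h1>Example repository</h1></body></html>
"""

for record in extract_jsonld(SAMPLE_PAGE):
    print(record["@type"], record["name"])  # DataCatalog Example repository
```

The same text a person reads in the page heading is available to the agent as a typed record, which is the contrast the note draws.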
