Bio2RDF cloud of
Virtuoso SPARQL endpoints


 Life Science
Raw Data Now


François Belleau, Marc-Alexandre Nolin,
    Peter Ansell, Michel Dumontier

          30th April 2009
W3C-HCLS F2F Meeting, Cambridge, MA
Agenda

● Why did we do Bio2RDF?
● How did we do it?
● What is known about hexokinase?
● Where are we going?
The problem

According to the NAR 2009 Database
collection, 1170 public databases
exist.

How can they be integrated to behave
like a single coherent global resource?
Public map of 1744 namespaces according to
  BioMoby, NAR, SRS, GO, NCBI, UniProt
Bio2RDF vision in 2007

Johanne Luciano's vision for knowledge integration in 2005

W3C vision of the semantic web in 2006
Bio2RDF Mouse and Human Atlas map
in 2008: 65 million triples
Bio2RDF current contribution to the Linked Data cloud

Linked Data cloud in 2007

Linked Data cloud in March 2009

http://linkeddata.org/
http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets/Statistics
Bio2RDF cloud map of
2.3 billion triples in 2009
Why do it?
Not to replace HTML or XML with yet another new
format (RDF and OWL), but to answer scientific
questions by submitting SPARQL queries to the
Life Science cloud of SPARQL endpoints, a global
knowledge base accessible through the Internet.
Solution


The Bio2RDF approach to the data integration
problem in bioinformatics:
apply the semantic web approach based
on RDF, OWL and SPARQL technologies.
How did we do it?
Bio2RDF architecture
Our design principles

http://www.w3.org/DesignIssues/LinkedData

http://bio2rdf.wiki.sourceforge.net/Banff%20Manifesto
YeastHub design in 2005

● Conversion of datasets to RDF
● Use of the Sesame triplestore
● SeRQL query interface

http://www.ncbi.nlm.nih.gov/pubmed/15961502
Bio2RDF at ISMB 2005: the beginning

Thanks to Kei Cheung, Johanne Luciano, Eric Neumann and
Christopher Baker: they drew the lines.
Bio2RDF realtime rdfiser in 2007
Current architecture

● Offline rdfising process
● Network of Virtuoso SPARQL endpoints
● Namespace resolution through DNS subdomains
Main REST services

● Describe a resource by a dereferenceable URI
    ● http://bio2rdf.org/ns:id
● Global services over federated endpoints
    ● http://bio2rdf.org/links/ns:id
    ● http://bio2rdf.org/search/searchedTerm
● Targeted services to a specific endpoint
    ● http://bio2rdf.org/linksns/ns2/ns1:id
    ● http://bio2rdf.org/searchns/ns/searchedTerm
● Other services are available.
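
The URL patterns listed on this slide can be sketched as a few small helper functions. This is only an illustration: the function names and the example values are hypothetical, and only the URL shapes come from the slide.

```python
from urllib.parse import quote

# Base URL as shown on the slide.
BASE = "http://bio2rdf.org"


def describe_uri(ns: str, record_id: str) -> str:
    """Dereferenceable URI describing one resource: /ns:id"""
    return f"{BASE}/{ns}:{quote(record_id, safe='')}"


def links_uri(ns: str, record_id: str) -> str:
    """Global links service over the federated endpoints: /links/ns:id"""
    return f"{BASE}/links/{ns}:{quote(record_id, safe='')}"


def search_uri(term: str) -> str:
    """Global search service: /search/searchedTerm"""
    return f"{BASE}/search/{quote(term, safe='')}"


def linksns_uri(ns2: str, ns1: str, record_id: str) -> str:
    """Targeted links service: /linksns/ns2/ns1:id"""
    return f"{BASE}/linksns/{ns2}/{ns1}:{quote(record_id, safe='')}"


def searchns_uri(ns: str, term: str) -> str:
    """Targeted search service: /searchns/ns/searchedTerm"""
    return f"{BASE}/searchns/{ns}/{quote(term, safe='')}"


if __name__ == "__main__":
    # "geneid" and "15275" are placeholder values, not taken from the slides.
    print(describe_uri("geneid", "15275"))
    print(search_uri("hexokinase"))
```

Percent-encoding the identifier and the search term is a design choice of this sketch; the slides do not say how special characters are handled.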
Describe service implementation

● http://bio2rdf.org/ns:id
● Corresponding SPARQL query:

        CONSTRUCT {
          ?s ?p ?o .
        }
        WHERE {
          ?s ?p ?o .
          FILTER(?s = <http://bio2rdf.org/ns:id>).
        }

● Submitted at this URL:
    ● http://ns.bio2rdf.org/sparql?query=...
        – Based on the DNS subdomain resolution service
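
As a rough sketch of the describe service, the following code turns a namespace and an identifier into the CONSTRUCT query and the per-namespace endpoint URL shown on the slide. The function names and example values are hypothetical, and the real JSP server may build its requests differently.

```python
from urllib.parse import urlencode


def describe_query(ns: str, record_id: str) -> str:
    """Build the CONSTRUCT query from the slide for http://bio2rdf.org/ns:id."""
    uri = f"http://bio2rdf.org/{ns}:{record_id}"
    return (
        "CONSTRUCT { ?s ?p ?o . } "
        "WHERE { ?s ?p ?o . "
        f"FILTER(?s = <{uri}>) . }}"
    )


def endpoint_url(ns: str, record_id: str) -> str:
    """URL-encode the query and address it to the namespace's subdomain.

    The http://ns.bio2rdf.org/sparql?query=... shape is taken from the slide;
    using the namespace directly as the subdomain is an assumption.
    """
    params = urlencode({"query": describe_query(ns, record_id)})
    return f"http://{ns}.bio2rdf.org/sparql?{params}"


if __name__ == "__main__":
    # "geneid" and "15275" are placeholder values, not taken from the slides.
    print(describe_query("geneid", "15275"))
    print(endpoint_url("geneid", "15275"))
```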
Bio2RDF JSP server software
http://sourceforge.net/projects/bio2rdf/
Peter Ansell is writing the Bio2RDF JSP server

● The software transforms Bio2RDF URIs into SPARQL
  queries in real time.
● Its aim is to access normalised RDF information
  located in multiple endpoints, using the concept of
  Public Namespaces and Private Record Identifiers together
  with distributed SPARQL queries that are matched to the
  content of each endpoint.
● Each of the following databases has normalisation
  rules that map its URIs back to bio2rdf.org
  URIs: Dbpedia, Drugbank, LinkedCT, HCLS
  KB/Neurocommons, Diseasome, Dailymed, Bioguid
  DOI.
Bio2RDF.war package future

● Provide more pipes to perform integrated actions, without
  having to put HTTP SPARQL requests into a workflow
  system when a URI resolution can perform the query more
  efficiently in a distributed and normalised manner.
● Bring together the current distributed efforts to provide a
  complete HTML redirection registry, so that a large
  percentage of Bio2RDF namespaces can be redirected
  with http://bio2rdf.org/html/namespace:identifier
● Form ontologies describing the query type, provider, RDF
  normalisation rule and namespace paradigm.
● Integrate http://rdf.myexperiment.org/sparql and similar
  workflow RDF endpoints so that scientific workflows can
  be linked to their data cleanly.
Bio2RDF.owl




http://quebec.bio2rdf.org/download/bio2rdf-2008.owl
Michel Dumontier will design the next version
of the Bio2RDF.owl ontology
What is known about hexokinase?
Submit your query...

● To a web search engine;
● To an existing public web site offering data
  integration services;
● Using Bio2RDF SPARQL endpoints:
    ● Submitting a SPARQL query;
    ● Using the facet browser interface of the Virtuoso 6.0
      server;
    ● Dereferencing a Bio2RDF search URI;
    ● Using a Taverna workflow composed of SPARQL
      queries to obtain federated results from KEGG,
      Entrez Gene and GO.
The usual non-semantic way
Existing integrated search services

● EBI/EB-eye
● NCBI/Entrez
● KEGG/DBGET
● GoPubmed
By submitting a SPARQL query
   http://atlas.bio2rdf.org/sparql
What is known about "hexokinase"
with semantics?
select ?t1 ?p2 count(*)
where {
    ?s1 ?p1 ?o1 .
    FILTER( bif:contains(?o1, "hexokinase")) .
    ?s1 a ?t1 .
    ?s1 ?p2 ?o2 .
}
ORDER BY ?t1 ?p2
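
To make the intent of this query concrete, here is a small in-memory imitation of it: for every subject that has a value containing the search term, count rows per (type, predicate) pair. The triples are made up for illustration, `rdf:type` stands in for the `a` keyword, and a plain substring test is a simplification of Virtuoso's `bif:contains` full-text match.

```python
from collections import Counter
from typing import List, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"  # what the SPARQL keyword `a` abbreviates

# Hypothetical triples, invented for this sketch only.
TRIPLES: List[Triple] = [
    ("ex:gene1", "rdf:type", "ex:Gene"),
    ("ex:gene1", "rdfs:label", "hexokinase 1"),
    ("ex:gene1", "ex:xref", "ex:pathway1"),
    ("ex:pathway1", "rdf:type", "ex:Pathway"),
    ("ex:pathway1", "rdfs:label", "glycolysis"),
]


def count_by_type_and_predicate(triples: List[Triple], term: str):
    """Mimic: select ?t1 ?p2 count(*) ... ORDER BY ?t1 ?p2."""
    counts: Counter = Counter()
    for s1, _p1, o1 in triples:                  # ?s1 ?p1 ?o1
        if term.lower() not in o1.lower():       # simplified bif:contains
            continue
        for s, p, t1 in triples:                 # ?s1 a ?t1
            if s != s1 or p != RDF_TYPE:
                continue
            for s2, p2, _o2 in triples:          # ?s1 ?p2 ?o2
                if s2 == s1:
                    counts[(t1, p2)] += 1
    return sorted(counts.items())                # ORDER BY ?t1 ?p2


if __name__ == "__main__":
    for (rdf_type, predicate), n in count_by_type_and_predicate(TRIPLES, "hexokinase"):
        print(rdf_type, predicate, n)
```

The result is a profile of which types of records mention the term and which predicates those records carry, which is a useful first look at an unfamiliar endpoint.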
Use Virtuoso 6.0 facet browser
    http://lod.openlinksw.com/
Dereferencing search URL
http://bio2rdf.org/search/hexokinase
How can we submit a complex
query over the network of SPARQL
endpoints?
By building a mashup with Taverna
1) Write your complex SPARQL query as if a
  global graph were available
2) Identify the needed namespaces and split the
  query to fetch each data source separately
3) Build a mashup using a Taverna workflow that
  instantiates a local triplestore
4) Execute your complex query locally on the
  mashup
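
The four steps can be sketched without Taverna as follows. Everything here is a stand-in: the fetch functions return hard-coded, hypothetical triples instead of calling SPARQL endpoints, a Python set plays the role of the local triplestore, and the `ex:` predicates are invented for the example.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


# Step 2: one fetch per namespace. Real code would query each endpoint;
# these stubs return hypothetical data so the sketch runs offline.
def fetch_kegg() -> List[Triple]:
    return [
        ("kegg:gene_a", "ex:inPathway", "kegg:path_1"),
        ("kegg:gene_a", "ex:xGeneID", "geneid:1"),
    ]


def fetch_geneid() -> List[Triple]:
    return [("geneid:1", "ex:label", "example gene")]


def fetch_go() -> List[Triple]:
    return [("geneid:1", "ex:annotatedWith", "go:term_1")]


SOURCES: Dict[str, Callable[[], List[Triple]]] = {
    "kegg": fetch_kegg,
    "geneid": fetch_geneid,
    "go": fetch_go,
}


# Step 3: merge every source into one local store (the mashup).
def build_mashup(sources: Dict[str, Callable[[], List[Triple]]]) -> Set[Triple]:
    store: Set[Triple] = set()
    for fetch in sources.values():
        store.update(fetch())
    return store


def match(store: Set[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return triples matching a pattern; None acts as a wildcard."""
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# Steps 1 and 4: the query is written as if one global graph existed,
# then executed locally on the mashup.
def local_query(store: Set[Triple]) -> List[Triple]:
    rows = []
    for kegg_gene, _, gene in match(store, p="ex:xGeneID"):
        for _, _, pathway in match(store, s=kegg_gene, p="ex:inPathway"):
            for _, _, term in match(store, s=gene, p="ex:annotatedWith"):
                rows.append((gene, pathway, term))
    return sorted(rows)


if __name__ == "__main__":
    print(local_query(build_mashup(SOURCES)))
```

The point of the design is that the join across KEGG, GeneID and GO happens only after the data is local, so no single endpoint has to hold all three datasets.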
The SPARQL query needed
 (don't try this at home, do it on the web!)
Get the list of genes
from the KEGG pathways of a specified taxon

● Clear the graph
● Get the list of KEGG pathways for a
  specific taxon
● For each pathway, get its gene
  list and import the instances
● Count the number of genes
  found

http://www.myexperiment.org/workflows/747
Insert GeneID genes and KEGG pathways
into the local triplestore

● Get the list of genes
● Get the list of pathways
● Insert each corresponding graph
  into the local triplestore

http://www.myexperiment.org/workflows/748
Insert the needed GO annotations
into the local triplestore

● Get the GO annotations for
  each gene
Finally, the needed query merging
KEGG, Entrez Gene and GO together
Bio2RDF resources
Bio2RDF's mirrors
http://quebec.bio2rdf.org/
  http://qut.bio2rdf.org/
Bio2RDF SPARQL endpoints
http://www.freebase.com/view/user/bio2rdf/public/sparql
Life Science Raw Data Now
http://quebec.bio2rdf.org/download
Visit our Wiki rdfiser cookbook
http://bio2rdf.wiki.sourceforge.net/
Bio2RDF news

http://bio2rdf.blogspot.com/
http://www.slideshare.net/search/slideshow?q=bio2rdf
http://scholar.google.com/scholar?q=bio2rdf
http://groups.google.ca/group/bio2rdf
Our 2009 objectives

● Get approval from data providers to distribute
  RDF dumps and publish SPARQL endpoints
  (UniProt, BioCyc, Pathway Commons and Bind are
  in);
● Start using a Virtuoso 6 cluster;
● Design more services accessible through the REST
  protocol via our JSP package;
● Recruit mirror servers;
● Develop new rdfiser programs as a community
  effort.
Thanks
Jean Morissette, Nicole Tourigny

● The Bio2RDF community
● Centre de recherche du CHUL
● Université Laval
● Dumontier Lab
● QUT eResearch Center
● Openlink Virtuoso

 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

Bio2RDF cloud of Virtuoso SPARQL endpoints

  • 1. Bio2RDF cloud of Virtuoso SPARQL endpoints: Life Science Raw Data Now. François Belleau, Marc-Alexandre Nolin, Peter Ansell, Michel Dumontier. 30th April 2009, W3C-HCLS F2F Meeting, Cambridge, MA
  • 2. Agenda: Why did we do Bio2RDF? How did we do it? What is known about hexokinase? Where are we going?
  • 3. The problem: According to the NAR 2009 Database collection, 1170 public databases exist. How can they be integrated to behave like a single, coherent global resource?
  • 4. Public map of 1744 namespaces according to BioMoby, NAR, SRS, GO, NCBI, UniProt
  • 5. Bio2RDF vision in 2007; Johanne Luciano's vision for knowledge integration in 2005; W3C vision of the semantic web in 2006
  • 6. Bio2RDF Mouse and Human Atlas map in 2008: 65 million triples
  • 7. Bio2RDF's current contribution to the Linked Data cloud: Linked Data cloud in 2007; Linked Data cloud in March 2009. http://linkeddata.org/ http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets/Statistics
  • 8. Bio2RDF cloud map of 2.3 billion triples in 2009
  • 9. Why do it? Not to replace HTML or XML with yet another new format (RDF and OWL), but to answer scientific questions by submitting SPARQL queries over the global knowledge base, accessible through the Internet, to the cloud of Life Science SPARQL endpoints.
  • 10. Solution: The Bio2RDF approach to the data integration problem in bioinformatics is to apply the semantic web approach based on RDF, OWL and SPARQL technologies.
  • 11. How did we do it? Bio2RDF architecture
  • 12. Our design principles: http://www.w3.org/DesignIssues/LinkedData http://bio2rdf.wiki.sourceforge.net/Banff%20Manifesto
  • 13. YeastHub design in 2005: conversion of datasets to RDF; use of the Sesame triplestore; SeRQL query interface. http://www.ncbi.nlm.nih.gov/pubmed/15961502
  • 14. Bio2RDF at ISMB 2005, the beginning. Thanks to Kei Cheung, Johanne Luciano, Eric Neumann and Christopher Baker: they drew the lines.
  • 16. Current architecture: offline rdfising process; network of Virtuoso SPARQL endpoints; namespace resolution through DNS subdomains
  • 17. Main REST services. Describe a resource by a dereferenceable URI (http://bio2rdf.org/ns:id). Global services over federated endpoints (http://bio2rdf.org/links/ns:id and http://bio2rdf.org/search/searchedTerm). Targeted services to a specific endpoint (http://bio2rdf.org/linksns/ns2/ns1:id and http://bio2rdf.org/searchns/ns/searchedTerm). Other services are available.
  • 18. Describe service implementation for http://bio2rdf.org/ns:id. Corresponding SPARQL query: CONSTRUCT { ?s ?p ?o . } WHERE { ?s ?p ?o . FILTER(?s = <http://bio2rdf.org/ns:id>) . } Submitted at this URL, based on the DNS subdomain resolution service: http://ns.bio2rdf.org/sparql?query=...
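The mapping described in slide 18 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Bio2RDF JSP server code: the function name `describe_request` is hypothetical, and the sketch assumes the endpoint host is obtained simply by prefixing the namespace to `bio2rdf.org`, as the slide's URL pattern suggests.

```python
from urllib.parse import urlencode, urlsplit


def describe_request(uri: str) -> str:
    """Turn a Bio2RDF URI (http://bio2rdf.org/ns:id) into a SPARQL GET URL.

    Hypothetical helper: it follows the URL pattern shown on the slide.
    """
    parts = urlsplit(uri)
    # The path looks like "/ns:id"; the namespace precedes the first colon.
    namespace, sep, identifier = parts.path.lstrip("/").partition(":")
    if parts.netloc != "bio2rdf.org" or not sep or not namespace or not identifier:
        raise ValueError(f"not a Bio2RDF ns:id URI: {uri!r}")

    # Same CONSTRUCT query as on the slide, with the URI filled in.
    query = (
        "CONSTRUCT { ?s ?p ?o . } "
        "WHERE { ?s ?p ?o . "
        f"FILTER(?s = <{uri}>) . }}"
    )
    # Assumption: the namespace doubles as the DNS subdomain of the endpoint.
    return f"http://{namespace}.bio2rdf.org/sparql?" + urlencode({"query": query})


if __name__ == "__main__":
    print(describe_request("http://bio2rdf.org/ns:id"))
```

The query string is percent-encoded by `urlencode`, so the resulting URL can be passed directly to an HTTP client.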
  • 19. Bio2RDF JSP server software http://sourceforge.net/projects/bio2rdf/
  • 20. Peter Ansell is writing the Bio2RDF JSP server. The software transforms Bio2RDF URIs into SPARQL queries in real time. Its aim is to access normalised RDF information located in multiple endpoints, using the concept of Public Namespaces and Private Record Identifiers together with distributed SPARQL queries that are matched to the content of each endpoint. Each of the following databases has normalisation rules which map its URIs back to bio2rdf.org URIs: DBpedia, DrugBank, LinkedCT, HCLS KB/Neurocommons, Diseasome, DailyMed, Bioguid DOI.
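One way to picture the normalisation rules mentioned in slide 20 is as a table of prefix rewrites. The sketch below is an assumption for illustration only: the rule table, the `dbpedia` namespace label and the function names are hypothetical and are not taken from the Bio2RDF server.

```python
# Hypothetical rule table: (foreign URI prefix, Bio2RDF namespace label).
# The real server's rules are not shown in the slides.
RULES = [
    ("http://dbpedia.org/resource/", "dbpedia"),
    ("http://example.org/other-provider/", "other"),  # placeholder rule
]


def normalise(uri: str) -> str:
    """Rewrite a provider URI to the http://bio2rdf.org/ns:id form, if a rule matches."""
    for prefix, namespace in RULES:
        if uri.startswith(prefix):
            return f"http://bio2rdf.org/{namespace}:{uri[len(prefix):]}"
    return uri  # no rule applies: leave the URI untouched


def normalise_triples(triples):
    """Apply the rewrite to the subject and object of every (s, p, o) triple."""
    return [(normalise(s), p, normalise(o)) for s, p, o in triples]


if __name__ == "__main__":
    print(normalise("http://dbpedia.org/resource/Hexokinase"))
```

Applying the same rewrite to the results of every endpoint is what would let triples from different providers join on identical subject URIs.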
  • 21. Future of the Bio2RDF.war package. Provide more pipes to perform integrated actions, without having to put HTTP SPARQL requests into a workflow system when a URI resolution can perform the query more efficiently in a distributed and normalised manner. Bring together the current distributed efforts to provide a complete HTML redirection registry, so that a large percentage of Bio2RDF namespaces can be redirected (http://bio2rdf.org/html/namespace:identifier). Form ontologies describing the query type, provider, RDF normalisation rule and namespace paradigm. Integrate http://rdf.myexperiment.org/sparql and similar workflow RDF endpoints, so that scientific workflows can be linked cleanly to their data.
  • 23. Michel Dumontier will design the next version of the Bio2RDF.owl ontology
  • 24. What is known about hexokinase?
  • 25. Submit your query... to a web search engine; to existing public web sites offering data integration services; using Bio2RDF SPARQL endpoints, by: submitting a SPARQL query; using the facet browser interface of the Virtuoso 6.0 server; dereferencing a Bio2RDF search URI; using a Taverna workflow composed of SPARQL queries to obtain federated results from KEGG, Entrez Gene and GO.
  • 27. Existing integrated search services: EBI/EB-eye, NCBI/Entrez, KEGG/DBGET, GoPubMed
  • 28. By submitting a SPARQL query: http://atlas.bio2rdf.org/sparql
  • 29. What is known about "hexokinase" with semantics? select ?t1 ?p2 count(*) where { ?s1 ?p1 ?o1 . FILTER( bif:contains(?o1, "hexokinase")) . ?s1 a ?t1 . ?s1 ?p2 ?o2 . } ORDER BY ?t1 ?p2
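To make clear what the query in slide 29 counts, here is a small in-memory sketch in Python. It is an approximation under stated assumptions: triples are plain string tuples, a case-insensitive substring test stands in for Virtuoso's `bif:contains` full-text match, and the sample identifiers (`ex:rec1`, `ex:Gene`, ...) are placeholders, not Bio2RDF data.

```python
from collections import Counter

RDF_TYPE = "rdf:type"


def count_by_type_and_predicate(triples, term):
    """Count (type, predicate) pairs for subjects with a value containing `term`.

    Mirrors the slide's pattern: ?s1 ?p1 ?o1 (text match), ?s1 a ?t1, ?s1 ?p2 ?o2.
    """
    term = term.lower()
    counts = Counter()
    for s1, _p1, o1 in triples:
        # Stand-in for bif:contains: a plain substring test on the object value.
        if term not in str(o1).lower():
            continue
        types = [o for s, p, o in triples if s == s1 and p == RDF_TYPE]
        for t1 in types:
            for s, p2, _o2 in triples:
                if s == s1:
                    counts[(t1, p2)] += 1
    # Sorted like ORDER BY ?t1 ?p2.
    return sorted(counts.items())


# Placeholder triples, for illustration only.
SAMPLE = [
    ("ex:rec1", RDF_TYPE, "ex:Gene"),
    ("ex:rec1", "rdfs:label", "hexokinase 1"),
    ("ex:rec1", "ex:xref", "ex:rec2"),
    ("ex:rec2", RDF_TYPE, "ex:Protein"),
    ("ex:rec2", "rdfs:label", "glucokinase"),
]

if __name__ == "__main__":
    for (rdf_type, predicate), n in count_by_type_and_predicate(SAMPLE, "hexokinase"):
        print(rdf_type, predicate, n)
```

The result is a profile of the matching records: for each record type, which predicates are available and how often, which is a useful first overview before writing a more specific query.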
  • 30. Use the Virtuoso 6.0 facet browser: http://lod.openlinksw.com/
  • 32. How can we submit a complex query over the network of SPARQL endpoints?
  • 33. By building a mashup with Taverna: 1) Write your complex SPARQL query as if a global graph were available. 2) Identify the needed namespaces and split the query to fetch each data source separately. 3) Build a mashup using a Taverna workflow that instantiates a local triplestore. 4) Execute your complex query locally on the mashup.
  • 34. The SPARQL query needed (don't try this at home, do it on the web!)
  • 35. Get the list of genes from the KEGG pathways of a specified taxon: clear the graph; get the KEGG pathways list for a specific taxon; for each pathway, get the genes list and import the instances; count the number of genes found. http://www.myexperiment.org/workflows/747
  • 36. Insert GeneID genes and KEGG pathways into the local triplestore: get the list of genes; get the list of pathways; insert each corresponding graph into the local triplestore. http://www.myexperiment.org/workflows/748
  • 37. Insert the needed GO annotations into the local triplestore: get the GO annotations for each gene
  • 38. Finally, the needed query merging KEGG, Entrez Gene and GO together
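The mashup steps of slides 33 to 38 (fetch each source separately, load everything into a local store, then run the merged query locally) can be sketched without Taverna. The Python below is a simplified stand-in under stated assumptions: the fetch functions are stubs returning placeholder triples where a real workflow would call each SPARQL endpoint, and the predicates `ex:hasGene` and `ex:hasGoTerm` are hypothetical names.

```python
def fetch_kegg():
    # Stub: a real workflow would query the KEGG endpoint for a taxon's pathways.
    return [
        ("kegg:PATHWAY_1", "ex:hasGene", "geneid:GENE_1"),
        ("kegg:PATHWAY_1", "ex:hasGene", "geneid:GENE_2"),
    ]


def fetch_geneid():
    # Stub standing in for the Entrez Gene (GeneID) endpoint.
    return [("geneid:GENE_1", "rdfs:label", "placeholder gene 1")]


def fetch_go():
    # Stub standing in for the GO annotation lookup of each gene.
    return [("geneid:GENE_1", "ex:hasGoTerm", "go:TERM_1")]


SOURCES = {"kegg": fetch_kegg, "geneid": fetch_geneid, "go": fetch_go}


def build_local_store(sources):
    """Step 3: merge the triples fetched from every source into one local set."""
    store = set()
    for fetch in sources.values():
        store.update(fetch())
    return store


def pathway_gene_terms(store):
    """Step 4: join pathway -> gene -> GO term on the local store."""
    gene_terms = {}
    for s, p, o in store:
        if p == "ex:hasGoTerm":
            gene_terms.setdefault(s, set()).add(o)
    return sorted(
        (pathway, gene, term)
        for pathway, p, gene in store
        if p == "ex:hasGene"
        for term in gene_terms.get(gene, ())
    )


if __name__ == "__main__":
    print(pathway_gene_terms(build_local_store(SOURCES)))
```

The join only works because every source uses the same gene identifiers, which is the role URI normalisation plays in the real network of endpoints.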
  • 42. Life Science Raw Data Now: http://quebec.bio2rdf.org/download
  • 43. Visit our wiki for the rdfiser cookbook: http://bio2rdf.wiki.sourceforge.net/
  • 44. Bio2RDF news http://bio2rdf.blogspot.com/ http://www.slideshare.net/search/slideshow?q=bio2rdf http://scholar.google.com/scholar?q=bio2rdf http://groups.google.ca/group/bio2rdf
  • 45. Our 2009 objectives: get approval from data providers to distribute RDF dumps and publish SPARQL endpoints (UniProt, BioCyc, Pathway Commons and Bind are in); start using a Virtuoso 6 cluster; design more services accessible with the REST protocol via our JSP package; recruit mirror servers; develop new rdfiser programs in a community effort.
  • 46. Thanks: Jean Morissette, Nicole Tourigny; the Bio2RDF community; Centre de recherche du CHUL; Université Laval; Dumontier Lab; QUT eResearch Center; OpenLink Virtuoso