3. A Typical Question? Select drugs related to asthma that are linked to a curated molecular interaction in the literature where the protein is known to cause inflammatory response… 2009/10/08 Bio-IT World, Hannover
5. A Typical Question? Select all human genes that code for proteins with known molecular interactions and that are analyzed with molecular techniques such as ‘Transfection’; restrict the results to genes or proteins that are known drug targets…
16. RDF Technology
[Figure: RDF graph built around the ERBB2 gene and its proteins.]
Node types: Protein, Gene, Gene Ontology Term
Identifiers: Q4H1F1, Q4H1F2, ENSG00000141736, 2064, GO:0005023, GO:0004872, GO:0005006
Labels: ERBB2, HER2, CD340, EGF receptor activity, receptor activity, peroxisome receptor
Edge labels: type, label, is_a, database cross-reference, hasProtein, hasGene
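To make the graph on this slide concrete, here is a minimal sketch of RDF-style statements held as plain (subject, predicate, object) tuples, with a wildcard lookup. The identifiers, labels and edge names are taken from the slide; the `gene:` and `protein:` prefixes, the `match` helper, and the choice of which node each edge attaches to are assumptions made for illustration, since the flattened diagram does not show them.

```python
# Statements as (subject, predicate, object) tuples.
# Names come from the slide; which node each edge connects is an assumption.
TRIPLES = {
    ("gene:2064", "type", "Gene"),
    ("gene:2064", "label", "ERBB2"),
    ("gene:2064", "database cross-reference", "ENSG00000141736"),
    ("gene:2064", "hasProtein", "protein:Q4H1F1"),
    ("gene:2064", "hasProtein", "protein:Q4H1F2"),
    ("protein:Q4H1F1", "type", "Protein"),
    ("protein:Q4H1F1", "label", "HER2"),
    ("protein:Q4H1F1", "hasGene", "gene:2064"),
    ("GO:0005006", "type", "Gene Ontology Term"),
    ("GO:0005006", "label", "EGF receptor activity"),
    ("GO:0005006", "is_a", "GO:0004872"),
    ("GO:0004872", "type", "Gene Ontology Term"),
    ("GO:0004872", "label", "receptor activity"),
}


def match(pattern, triples=TRIPLES):
    """Return the sorted triples matching a pattern; None is a wildcard."""
    return sorted(
        t for t in triples
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


if __name__ == "__main__":
    # Which proteins does the gene point to?
    for triple in match(("gene:2064", "hasProtein", None)):
        print(triple)
```

Because every fact has the same three-part shape, adding a new data source only adds more tuples; no schema change is needed.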
18. LLD Integration Process
Data Source Identification
Source formats: Flat files, OBO files, XML, RDBMS, RDF
Converters: Special tailored transformer, OBO to SKOS converter, Custom XSLT, RDBMS to RDF formatter
RDF warehouse
Reasoner
Instance Mappings
Semantic Annotations
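As an illustration of one conversion step named on this slide, the following is a rough sketch of an OBO to SKOS conversion. It is not the converter used in the project: the function name `obo_to_skos` and the mapping (`name` to `skos:prefLabel`, `is_a` to `skos:broader`) are assumptions, and only `[Term]` stanzas with `id`, `name` and `is_a` tags are handled.

```python
def obo_to_skos(obo_text):
    """Turn OBO [Term] stanzas into SKOS-style triples (assumed mapping)."""
    triples = []
    term_id = None
    in_term = False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are converted.
            in_term = line == "[Term]"
            term_id = None
            continue
        if not in_term or ":" not in line:
            continue
        tag, _, value = line.partition(":")
        # Drop a trailing OBO comment such as "! receptor activity".
        value = value.split("!")[0].strip()
        if tag == "id":
            term_id = value
            triples.append((term_id, "rdf:type", "skos:Concept"))
        elif term_id and tag == "name":
            triples.append((term_id, "skos:prefLabel", value))
        elif term_id and tag == "is_a":
            triples.append((term_id, "skos:broader", value))
    return triples


SAMPLE = """[Term]
id: GO:0005006
name: EGF receptor activity
is_a: GO:0004872 ! receptor activity
"""

if __name__ == "__main__":
    for triple in obo_to_skos(SAMPLE):
        print(triple)
```

The other source formats would each need their own converter, but all of them would emit the same kind of triples for loading into the RDF warehouse.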
19. Over 20 Different Sources
Number of statements: 4,792,035,475
Number of explicit statements: 2,218,239,691
Number of entities: 370,230,951

Data source | Description | RDF statements
Disease Ontology | Disease Ontology is a controlled | 446,066
Human Phenotype Ontology | The human phenotype ontology (HPO) intends | 70,911
Symptom Ontology | The symptom ontology was designed around | 4,163
DrugBank | The DrugBank database is a unique bioinformatics | 493,794
Diseasome | The diseasome website is a disease relationships | 69,546
DailyMed | DailyMed provides high quality information about | 116,992
SIDER | SIDER contains information on marketed medicines | 96,272
BioGRID | The Biological General Repository for Interaction Datasets | 1,892,897
INOH | INOH (Integrating Network Objects with Hierarchies) | 432,456
CellMap | The Cancer Cell Map contains selected | 173,914
HPRD | The Human Protein Reference Database | 18,05,651
HumanCYC | HumanCyc is a bioinformatics database that describes | 341,225
IMID | General Repository for Interaction Datasets. | 154,408
IntAct | IntAct provides a freely available, open source database | 11,005,555
Reactome | Reactome is a free, online, open-source, curated resource | 2,538,793
NCI-Nature | Nature pathway interaction database. | 333,415
KEGG | KEGG PATHWAY is a collection of manually drawn | 18,128,735
Entrez-Gene | Entrez Gene is a searchable database of genes | 107,193,308
PubMed | PubMed is a service of the U.S. National Library of Medicine | 807,851,455
UniProt | Major resource for protein sequences | 1,252,667,885
UMLS Metathesaurus | Database that contains information about biomedical | 12,420,882
UMLS Semantic network | Semantic categorization of terminology | 1,368
22. Semantic Annotations
[Figure: patterns for linking an entity X in one source to an entity Y in another.]
Patterns: Namespace mapping, Reference node, Mismatched identifiers, Value dereference, Transitive link
Diagram labels: ns-x:id, ns-y:id, db, id, db:id, accession, db:accession, term, name, text to describe
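The namespace mapping pattern can be sketched as follows, assuming two sources use different prefixes (`ns-x`, `ns-y`, as on the slide) for the same local identifier. The prefix table, the helper names and the `sameAs` link label are hypothetical; the other patterns on the slide are not covered here.

```python
# Hypothetical prefix table: source-specific namespaces share one canonical one.
CANONICAL_NS = {"ns-x": "db", "ns-y": "db"}


def canonical(identifier):
    """Rewrite 'prefix:local' to its canonical namespace, if one is known."""
    ns, sep, local = identifier.partition(":")
    if not sep:
        return identifier  # no namespace prefix, leave unchanged
    return f"{CANONICAL_NS.get(ns, ns)}:{local}"


def same_as_links(ids_x, ids_y):
    """Link identifiers from two sources that agree after canonicalisation."""
    by_key = {canonical(i): i for i in ids_x}
    return sorted(
        (by_key[canonical(j)], "sameAs", j)
        for j in ids_y
        if canonical(j) in by_key
    )


if __name__ == "__main__":
    print(same_as_links(["ns-x:2064", "ns-x:1956"], ["ns-y:2064", "ns-y:7157"]))
```

Only identifiers that share both the canonical namespace and the local part are linked, so unrelated entries from either source are left alone.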
23. Semantic Annotations
[Figure: a document linked by Natural Language Processing to concepts in a broader/narrower hierarchy.]
Concepts: COPD, Chronic Obstructive Airway Diseases, Bronchial Diseases, Respiration Disorders
Identifiers: umls:C0035204, umls:C0006261
Edge labels: broader, broaderTransitive, hasDocumentText, mentions
Document text: "This is an example text of a document that mentions COPD disease"
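A small sketch of how the `broader` and `broaderTransitive` edges on this slide might be used together with a `mentions` annotation. The hierarchy (COPD under Bronchial Diseases under Respiration Disorders) is an assumed reading of the diagram, the function names are hypothetical, and the substring match is only a stand-in for real Natural Language Processing.

```python
# Assumed reading of the diagram: each concept maps to its direct broader concepts.
BROADER = {
    "COPD": ["Bronchial Diseases"],
    "Bronchial Diseases": ["Respiration Disorders"],
}


def broader_transitive(concept, broader=BROADER):
    """Collect every ancestor reachable by following 'broader' edges."""
    seen = []
    stack = list(broader.get(concept, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.append(parent)
            stack.extend(broader.get(parent, []))
    return seen


def mentions(text, labels):
    """Naive stand-in for NLP: case-insensitive substring match on labels."""
    lowered = text.lower()
    return [label for label in labels if label.lower() in lowered]


if __name__ == "__main__":
    doc = "This is an example text of a document that mentions COPD disease"
    for concept in mentions(doc, ["COPD", "Bronchial Diseases"]):
        print(concept, "->", broader_transitive(concept))
```

With the transitive closure available, a search for documents about Respiration Disorders could also return a document that only mentions COPD.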