Semantic (Web) Technologies for Translational Research in Life Sciences Ohio State University, June 16, 2011 Amit P. Sheth Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) amit.sheth@wright.edu Thanks to Kno.e.sis team (Satya, Priti, Rama, and Ajith); Collaborators at CTEGD UGA (Dr. Tarleton, Brent Weatherly), NLM (Olivier Bodenreider), CCRC, UGA (Will York), NCBO/Stanford, CITAR/WSU
Kno.e.sis: Ohio Center of Excellence in Knowledge-enabled Computing
Web of people - social networks, user-created casual content Web of resources - data, service, data, mashups Web of databases - dynamically generated pages - web query interfaces Web of pages - text, manually created links - extensive navigation Evolution of Web & Semantic Computing Tech assimilated in life Web of Sensors, Devices/IoT - 40 billion sensors, 5 billion mobile connections 2007 Situations, Events Web 3.0 Semantic Technology Used Objects Web 2.0 Patterns Keywords 1997 Web 1.0
Outline Semantic Web – very brief intro Scenarios to demonstrate the applications and benefit of semantic web technologies Health Care Biomedical Research Translational
Biomedical Informatics... Biomedical Informatics PubMed ClinicalTrials.gov ...needs a connection Hypothesis Validation Experiment design Predictions Personalized medicine Semantic Web research aims at providing this connection! Etiology Pathogenesis Clinical findings Diagnosis Prognosis Treatment Genome Transcriptome Proteome Metabolome Physiome ...ome More advanced capabilities for search, integration, analysis, linking to new insights and discoveries! Genbank Uniprot Medical Informatics Bioinformatics
Decision Making, Insights, Innovations Human Performance Data and Facts Knowledge and Understanding Health & Performance Cognitive Science, Psychology Neuroscience Anatomy, Physiology Cellular biology Molecular Biology ACATATGGGTACTATTTACTATTCATGGGTACTATTTATGGCATATGGCGTACTATTCTAATCCTATATCCGTCTAATCTATTTACTATTATCTATTACTATACCTTTTGGGGAAAAAAATTCTATACCGTCTAATCCTATAAATCAAGCCG Biochemistry
Semantic Web standards @ W3C Semantic Web is built in a layered manner Not everybody needs all the layers … Queries: SPARQL, Rules: RIF Semantic Web Rich ontologies: OWL Simple data models & taxonomies: RDF Schema Uniform metamodel: RDF + URI Encoding structure: XML Encoding characters: Unicode
Linked Data: Semantic Web “diluted” Achieve for data what Web did to documents Relationship with the original Semantic Web vision: no AI, no agents, no autonomy Interoperability is still very important interoperability of formats interoperability of semantics Enables interchange of large data sets (thus very useful in, say, collaborative research) Semantic Web vision is largely predicated on the availability of data Linked Data is a movement that gets us there Thanks – Ora Lassila
Opportunity: exploiting clinical and biomedical data text Health  Information  Services Elsevier  iConsult Scientific  Literature PubMed 300 Documents  Published Online  each day User-contributed  Content (Informal) GeneRifs WikiGene NCBI  Public Datasets Genome,  Protein DBs new sequences daily Laboratory  Data Lab tests,  RTPCR, Mass spec Clinical Data Personal  health history Search, browsing, complex query, integration, workflow, analysis, hypothesis validation, decision support.
Major Community Efforts W3C Semantic Web Health Care & Life Sciences Interest Group: http://www.w3.org/2001/sw/hcls/ Clinical Observations Interoperability: EMR + Clinical Trials: http://esw.w3.org/HCLS/ClinicalObservationsInteroperability National Center for Biomedical Ontologies: http://bioportal.bioontology.org/
Major SW Projects OpenPHACTS: A knowledge management project of the Innovative Medicines Initiative (IMI), a unique partnership between the European Community and the European Federation of Pharmaceutical Industries and Associations (EFPIA). http://www.openphacts.org/ LarKC: develop the Large Knowledge Collider, a platform for massive distributed incomplete reasoning that will remove the scalability barriers of currently existing reasoning systems for the Semantic Web. http://www.larkc.eu/ NCBO: contribute to collaborative science and translational research. http://bioportal.bioontology.org/
Semantic Web Enablers and Techniques Ontology: Agreement with Common Vocabulary & Domain Knowledge; Schema + Knowledge base Semantic Annotation (metadata extraction): Manual, Semi-automatic (automatic with human verification), Automatic Semantic Computation: semantics-enabled search, integration, complex queries, analysis (paths, subgraph), pattern finding, mining, inferencing, reasoning, hypothesis validation, discovery, visualization
Drug Ontology Hierarchy (showing is-a relationships) owl:thing prescription_drug_brand_name brandname_undeclared brandname_composite prescription_drug monograph_ix_class cpnum_group prescription_drug_property indication_property formulary_property non_drug_reactant interaction_property property formulary brandname_individual interaction_with_prescription_drug interaction indication generic_individual prescription_drug_generic generic_composite interaction_with_monograph_ix_class interaction_with_non_drug_reactant
N-glycan_beta_GlcNAc_9 N-glycan_alpha_man_4 GNT-V attaches GlcNAc at position 6 N-acetyl-glucosaminyl_transferase_V UDP-N-acetyl-D-glucosamine + alpha-D-Mannosyl-1,3-(R1)-beta-D-mannosyl-R2 <=> UDP + N-Acetyl-$beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,3-(R1)-beta-D-mannosyl-$R2 UDP-N-acetyl-D-glucosamine + G00020 <=> UDP + G00021 N-Glycosylation metabolic pathway GNT-I attaches GlcNAc at position 2
Maturing capabilities and ongoing research Ontology Creation Semantic Annotation & Text mining: Entity recognition, Relationship extraction Semantic Integration & Provenance: Integrating all types of data used in biomedical research: text, experimental data, curated/structured/public and multimedia Semantic search, browsing, analysis Clinical and Scientific Workflows with semantic web services Semantic Exploration of scientific literature, Undiscovered public knowledge
Project 1: ASEMR Why: Improve Quality of Care and Decision Making without loss of Efficiency in active Cardiology practice. What: Use of semantic Web technologies for clinical decision support Where: Athens Heart Center & its partners and labs Status: In use continuously since 01/2006
Operational since January 2006 Details: http://knoesis.org/library/resource.php?id=00004
Active Semantic EMR Annotate ICD9s Annotate Doctors Lexical Annotation Insurance Formulary Level 3 Drug Interaction Drug Allergy Demo at: http://knoesis.org/library/demos/
Project 2: Glycomics Why: To help in the treatment of certain kinds of cancer and Parkinson's Disease. What: Semantic Annotation of Experiment Data Where: Complex Carbohydrate Research Center, UGA Status: Research prototype in use Workflow with Semantic Annotation of Experimental Data already in use
N-Glycosylation Process (NGP) Cell Culture extract Glycoprotein Fraction proteolysis Glycopeptides Fraction 1 Separation technique I n Glycopeptides Fraction PNGase n Peptide Fraction Separation technique II n*m Peptide Fraction Mass spectrometry ms data ms/ms data Data reduction Data reduction ms peaklist ms/ms peaklist binning Peptide identification Glycopeptide identification and quantification Peptide list N-dimensional array Data correlation Signal integration
Agent  Agent  Agent  Agent  Biological Sample  Analysis by MS/MS Raw Data to Standard Format Data Pre- process DB Search (Mascot/Sequest) Results Post-process (ProValt) O I O I O I O I O Storage Standard Format Data Raw Data Filtered Data Search Results Final Output Biological Information Scientific workflow for proteome analysis Semantic Annotation Applications
Semantic Annotation of Experimental Data
Mass Spectrometry (MS) Data: ms/ms peaklist data
parent ion m/z, parent ion abundance, parent ion charge:
830.9570    194.9604    2
fragment ion m/z, fragment ion abundance:
580.2985     0.3592
688.3214     0.2526
779.4759    38.4939
784.3607    21.7736
1543.7476     1.3822
1544.7595     2.9977
1562.8113    37.4790
1660.7776   476.5043
Semantic Annotation of Experimental Data
Semantically Annotated MS Data (attributes are Ontological Concepts)
<ms-ms_peak_list>
  <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" mode="ms-ms"/>
  <parent_ion m-z="830.9570" abundance="194.9604" z="2"/>
  <fragment_ion m-z="580.2985" abundance="0.3592"/>
  <fragment_ion m-z="688.3214" abundance="0.2526"/>
  <fragment_ion m-z="779.4759" abundance="38.4939"/>
  <fragment_ion m-z="784.3607" abundance="21.7736"/>
  <fragment_ion m-z="1543.7476" abundance="1.3822"/>
  <fragment_ion m-z="1544.7595" abundance="2.9977"/>
  <fragment_ion m-z="1562.8113" abundance="37.4790"/>
  <fragment_ion m-z="1660.7776" abundance="476.5043"/>
</ms-ms_peak_list>
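Once the peak list carries named elements and attributes, software can read it without knowing the column order of the raw file. The following is a minimal sketch of that idea using Python's standard XML parser; the function name `read_peak_list` and the abbreviated two-fragment document are assumptions made for illustration, not part of the original workflow.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the annotated peak list from the slide
# (only two of the eight fragment ions are kept for brevity).
ANNOTATED = """
<ms-ms_peak_list>
  <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" mode="ms-ms"/>
  <parent_ion m-z="830.9570" abundance="194.9604" z="2"/>
  <fragment_ion m-z="580.2985" abundance="0.3592"/>
  <fragment_ion m-z="1660.7776" abundance="476.5043"/>
</ms-ms_peak_list>
"""


def read_peak_list(xml_text):
    """Return the parent ion and fragment ions as plain Python values."""
    root = ET.fromstring(xml_text)
    parent = root.find("parent_ion")
    fragments = [
        (float(f.get("m-z")), float(f.get("abundance")))
        for f in root.findall("fragment_ion")
    ]
    return {
        "instrument": root.find("parameter").get("instrument"),
        "parent_mz": float(parent.get("m-z")),
        "parent_charge": int(parent.get("z")),
        "fragments": fragments,
    }


peaks = read_peak_list(ANNOTATED)
# Fragment with the highest abundance, selected by attribute name.
most_abundant = max(peaks["fragments"], key=lambda p: p[1])
```

The raw peak list would need position-based parsing to answer the same question; here the attribute names carry the meaning.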
Project 3: Why: To associate genotype and phenotype information for knowledge discovery What: Integrated data sources to run complex queries Enriching data with ontologies for integration, querying, and automation Ontologies beyond vocabularies: the power of relationships Where: NCRR (NIH) Status: Completed
Use data to test hypothesis Gene name GO Interactions gene Sequence PubMed OMIM Link between glycosyltransferase activity and congenital muscular dystrophy? Glycosyltransferase Congenital muscular dystrophy Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07
In a Web pages world… (GeneID: 9215) has_associated_disease Congenital muscular dystrophy, type 1D has_molecular_function Acetylglucosaminyl-transferase activity Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07
With the semantically enhanced data glycosyltransferase GO:0016757 isa GO:0008194 GO:0016758 acetylglucosaminyl-transferase GO:0008375 has_molecular_function acetylglucosaminyl-transferase GO:0008375 EG:9215 LARGE Muscular dystrophy, congenital, type 1D MIM:608840 has_associated_phenotype
SELECT DISTINCT ?t ?g ?d {
    ?t is_a GO:0016757 .
    ?g has_molecular_function ?t .
    ?g has_associated_phenotype ?b2 .
    ?b2 has_textual_description ?d .
    FILTER regex(?d, "muscular dystrophy", "i") .
    FILTER regex(?d, "congenital", "i")
}
From medinfo paper. Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07
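The query on this slide joins four triple patterns and filters on the phenotype description. A rough sketch of the same join in plain Python, over a tiny in-memory triple set, may make the mechanics concrete. The identifiers come from the slide, but the exact is_a edges between the GO terms are assumed for illustration, and the helper names (`descendants`, `genes_linking`) are hypothetical. A real deployment would use an RDF store and a SPARQL engine instead.

```python
import re

# Mini knowledge base. Identifiers are from the slide; the chain of
# is_a edges between the GO terms is an assumption for illustration.
TRIPLES = {
    ("GO:0008375", "is_a", "GO:0008194"),
    ("GO:0008194", "is_a", "GO:0016758"),
    ("GO:0016758", "is_a", "GO:0016757"),
    ("EG:9215", "has_molecular_function", "GO:0008375"),
    ("EG:9215", "has_associated_phenotype", "MIM:608840"),
    ("MIM:608840", "has_textual_description",
     "Muscular dystrophy, congenital, type 1D"),
}


def objects(subject, predicate):
    return [o for (s, p, o) in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    return [s for (s, p, o) in TRIPLES if p == predicate and o == obj]


def descendants(term):
    """All terms that reach `term` through one or more is_a edges."""
    found, frontier = set(), {term}
    while frontier:
        frontier = {s for t in frontier for s in subjects("is_a", t)} - found
        found |= frontier
    return found


def genes_linking(function_root, *keywords):
    """Rows (term, gene, description) matching the slide's query shape."""
    rows = set()
    for term in descendants(function_root):
        for gene in subjects("has_molecular_function", term):
            for phenotype in objects(gene, "has_associated_phenotype"):
                for text in objects(phenotype, "has_textual_description"):
                    if all(re.search(k, text, re.IGNORECASE) for k in keywords):
                        rows.add((term, gene, text))
    return sorted(rows)


matches = genes_linking("GO:0016757", "muscular dystrophy", "congenital")
```

Note that the sketch follows is_a transitively, which the link from glycosyltransferase down to acetylglucosaminyl-transferase requires; a plain single-hop triple pattern would not find it unless the store materializes the inferred edges.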
Project 4: Nicotine Dependence Why: For understanding the genetic basis of nicotine dependence. What: Integrate gene and pathway information and show how three complex biological queries can be answered by the integrated knowledge base. How: Semantic Web technologies (especially RDF, OWL, and SPARQL) support information integration and make it easy to create semantic mashups (semantically integrated resources). Where: NLM (NIH) Status: Completed research
Motivation NIDA study on nicotine dependency List of candidate genes in humans Analysis objectives include:
Identification of active genes – maximum number of pathways
Identification of genes based on anatomical locations
Requires integration of genome and biological pathway information
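The second objective, finding the genes that take part in the largest number of pathways, reduces to a count once gene and pathway records are integrated. Below is a minimal sketch with placeholder gene and pathway names; none of the values are study data, and the function names are hypothetical.

```python
from collections import defaultdict

# Placeholder gene-to-pathway memberships, NOT data from the study.
MEMBERSHIPS = [
    ("geneA", "pathway1"), ("geneA", "pathway2"), ("geneA", "pathway3"),
    ("geneB", "pathway1"),
    ("geneC", "pathway2"), ("geneC", "pathway3"),
]


def pathways_per_gene(memberships):
    """Group pathway names by gene, ignoring duplicate rows."""
    by_gene = defaultdict(set)
    for gene, pathway in memberships:
        by_gene[gene].add(pathway)
    return by_gene


def hub_genes(memberships):
    """Genes that participate in the maximum number of pathways."""
    by_gene = pathways_per_gene(memberships)
    top = max(len(paths) for paths in by_gene.values())
    return sorted(g for g, paths in by_gene.items() if len(paths) == top)


hubs = hub_genes(MEMBERSHIPS)
```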
Genome and pathway information integration [Figure: integration of KEGG, Reactome, HumanCyc, Entrez Gene, GeneOntology and HomoloGene; linking fields shown include pathway, protein, pmid and HomoloGene ID]
Entrez Knowledge Model (EKoM) BioPAX ontology
Results: Gene Pathway network and Hub Genes involved with Nicotine Dependence
Project 5: T. cruzi SPSE Why: For Integrative Parasite Research to help expedite knowledge discovery What: Semantics and Services Enabled Problem Solving Environment (PSE) for Trypanosoma cruzi Where: Center for Tropical and Emerging Global Diseases (CTEGD), UGA Who: Kno.e.sis, UGA, NCBO (Stanford) Status: Research prototype – in regular lab use
Project Outline Data Sources Gene Knockout Strain Creation Microarray Proteome Ontological Infrastructure
Parasite Experiment Query processing Results
Provenance in Parasite Research Gene Name Sequence Extraction Gene Knockout and Strain Creation* Related Queries from Biologists List all groups in the lab that used a Target Region Plasmid. Which researcher created a new strain of the parasite (with ID = 66)? An experiment was not successful – has this experiment been conducted earlier? What were the results? 3' & 5' Region Drug Resistant Plasmid Gene Name Plasmid Construction Knockout Construct Plasmid T. cruzi sample ? Transfection Transfected Sample Drug Selection Cloned Sample Selected Sample Cell Cloning Cloned Sample *T. cruzi Semantic Problem Solving Environment Project, Courtesy of D.B. Weatherly and Flora Logan, Tarleton Lab, University of Georgia
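The biologists' questions above become simple lookups once each protocol step is recorded with its agent, inputs and outputs. The sketch below assumes a hypothetical record layout (`ProvenanceRecord`) and placeholder researchers, groups and identifiers; it is not the SPSE provenance model, only an illustration of the kind of query that model supports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    process: str    # protocol step, named as on the slide
    agent: str      # placeholder researcher name
    group: str      # placeholder lab group
    inputs: tuple   # materials consumed by the step
    outputs: tuple  # materials produced by the step


# Hypothetical log; names and the "strain:66" identifier are illustrative.
LOG = [
    ProvenanceRecord("Plasmid Construction", "researcher_1", "group_A",
                     ("Target Region Plasmid", "Drug Resistant Plasmid"),
                     ("Knockout Construct Plasmid",)),
    ProvenanceRecord("Transfection", "researcher_2", "group_B",
                     ("Knockout Construct Plasmid", "T. cruzi sample"),
                     ("Transfected Sample",)),
    ProvenanceRecord("Cell Cloning", "researcher_2", "group_B",
                     ("Selected Sample",), ("strain:66",)),
]


def groups_using(log, material):
    """Which groups used a given material as an input?"""
    return sorted({r.group for r in log if material in r.inputs})


def creators_of(log, artifact):
    """Which researchers ran the step that produced a given artifact?"""
    return sorted({r.agent for r in log if artifact in r.outputs})


plasmid_groups = groups_using(LOG, "Target Region Plasmid")
strain_creators = creators_of(LOG, "strain:66")
```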
Research Accomplishments SPSE
Developed semantic provenance framework and influenced the W3C community
SPSE supports complex biological queries that help find gene knockout, drug and/or vaccination targets.  For example:
Show me proteins that are downregulated in the epimastigote stage and exist in a single metabolic pathway.
Give me the gene knockout summaries, both for plasmid construction and strain creation, for all gene knockout targets that are 2-fold upregulated in amastigotes at the transcript level and that have orthologs in Leishmania but not in Trypanosoma brucei.
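The first example query combines an expression condition with a pathway-membership condition. A minimal sketch of that combination over already integrated records follows; the record fields and all values are placeholders chosen for illustration, not SPSE data or its schema.

```python
# Placeholder integrated records, NOT data from the project.
PROTEINS = [
    {"id": "protein_1", "stage": "epimastigote", "regulation": "down",
     "pathways": {"pathway_x"}},
    {"id": "protein_2", "stage": "epimastigote", "regulation": "down",
     "pathways": {"pathway_x", "pathway_y"}},
    {"id": "protein_3", "stage": "amastigote", "regulation": "down",
     "pathways": {"pathway_y"}},
    {"id": "protein_4", "stage": "epimastigote", "regulation": "up",
     "pathways": {"pathway_z"}},
]


def downregulated_single_pathway(records, stage):
    """Proteins downregulated in `stage` that occur in exactly one pathway."""
    return [
        r["id"] for r in records
        if r["stage"] == stage
        and r["regulation"] == "down"
        and len(r["pathways"]) == 1
    ]


hits = downregulated_single_pathway(PROTEINS, "epimastigote")
```

The filter itself is trivial; the work the environment does is bringing expression and pathway data into one record so that such a filter can be written at all.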
 Focused KB Work Flow  (Use case: HPCO) HPC keywords Doozer: Base Hierarchy from Wikipedia Focused Pattern based extraction SenseLab Neuroscience Ontologies Initial KB Creation Meta Knowledgebase PubMed Abstracts Knoesis: Parsing based NLP Triples   Enrich Knowledge Base NLM: Rule based BKR Triples Final Knowledge Base
 Triple Extraction Approaches Open Extraction  No fixed number of predetermined entities and predicates At  Knoesis – NLP (parsing and dependency trees) Supervised Extraction Predetermined set of entities and predicates At  Knoesis – Pattern based extraction to connect entities in the base hierarchy using statistical techniques At NLM – NLP and rule based approaches
Mapping Triples to Base Hierarchy Entities in both subject and object must contain at least one concept from the hierarchy to be mapped to the KB Preliminary synonyms based on anchor labels and page redirects in Wikipedia Prolactostatin redirects to Dopamine Predicates (verbs) and entities are subjected to stemming using WordNet
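The mapping rule above can be sketched in a few lines: resolve synonyms through redirects, then keep a triple only if both its subject and its object contain a hierarchy concept. In this sketch the two-concept hierarchy and the sample triples are assumptions for illustration, concepts are restricted to single words, and the stemming step is left out because it needs a WordNet library outside the standard library.

```python
# Illustrative base hierarchy (single-word concepts only) and the
# redirect mentioned on the slide.
HIERARCHY = {"dopamine", "prolactin"}
REDIRECTS = {"prolactostatin": "dopamine"}


def concepts_in(entity):
    """Hierarchy concepts mentioned in an entity string, after redirects."""
    words = (REDIRECTS.get(w, w) for w in entity.lower().split())
    return {w for w in words if w in HIERARCHY}


def map_triple(subject, predicate, obj):
    """Map a triple to the KB, or return None if either side has no concept."""
    s, o = concepts_in(subject), concepts_in(obj)
    if not s or not o:
        return None
    return (sorted(s), predicate, sorted(o))


mapped = map_triple("Prolactostatin", "inhibits", "prolactin secretion")
rejected = map_triple("Prolactostatin", "inhibits", "something else")
```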
Scooner:  Full Architecture
Scooner Features Knowledge-based browsing: Relations window, inverse relations, creating trails Persistent projects: Work bench, browsing history, comments, filtering Collaboration: comments, dashboard, exporting (sub)projects, importing projects
Scooner Screenshot

Mais conteúdo relacionado

Mais procurados

FAIR Agronomy, where are we? The KnetMiner Use Case
FAIR Agronomy, where are we? The KnetMiner Use CaseFAIR Agronomy, where are we? The KnetMiner Use Case
FAIR Agronomy, where are we? The KnetMiner Use CaseRothamsted Research, UK
 
Ontomaton icbo2013-alternative order-t_wv3
Ontomaton icbo2013-alternative order-t_wv3Ontomaton icbo2013-alternative order-t_wv3
Ontomaton icbo2013-alternative order-t_wv3Philippe Rocca-Serra
 
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)Michel Dumontier
 
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
2016 ACS Semantic Approaches for Biochemical Knowledge DiscoveryMichel Dumontier
 
Metagenomic Data Provenance and Management using the ISA infrastructure --- o...
Metagenomic Data Provenance and Management using the ISA infrastructure --- o...Metagenomic Data Provenance and Management using the ISA infrastructure --- o...
Metagenomic Data Provenance and Management using the ISA infrastructure --- o...Alejandra Gonzalez-Beltran
 
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksCarole Goble
 
High-performance web services for gene and variant annotations
High-performance web services for gene and variant annotationsHigh-performance web services for gene and variant annotations
High-performance web services for gene and variant annotationsChunlei Wu
 
RARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsRARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsCarole Goble
 
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...Michel Dumontier
 
Generating Biomedical Hypotheses Using Semantic Web Technologies
Generating Biomedical Hypotheses Using Semantic Web TechnologiesGenerating Biomedical Hypotheses Using Semantic Web Technologies
Generating Biomedical Hypotheses Using Semantic Web TechnologiesMichel Dumontier
 
2011-11-28 Open PHACTS at RSC CICAG
2011-11-28 Open PHACTS at RSC CICAG2011-11-28 Open PHACTS at RSC CICAG
2011-11-28 Open PHACTS at RSC CICAGopen_phacts
 
BioAssay Research Database Presentation at the Chem Axon UGM 2013
BioAssay Research Database Presentation at the Chem Axon UGM 2013BioAssay Research Database Presentation at the Chem Axon UGM 2013
BioAssay Research Database Presentation at the Chem Axon UGM 2013Andrea de Souza
 
2015-04-28 Open PHACTS at Swedish Linked Data Network Meet-up
2015-04-28 Open PHACTS at Swedish Linked Data Network Meet-up2015-04-28 Open PHACTS at Swedish Linked Data Network Meet-up
2015-04-28 Open PHACTS at Swedish Linked Data Network Meet-upopen_phacts
 
2015 balti-and-bioinformatics
2015 balti-and-bioinformatics2015 balti-and-bioinformatics
2015 balti-and-bioinformaticsc.titus.brown
 
2011-10-11 Open PHACTS at BioIT World Europe
2011-10-11 Open PHACTS at BioIT World Europe2011-10-11 Open PHACTS at BioIT World Europe
2011-10-11 Open PHACTS at BioIT World Europeopen_phacts
 
Zmasek TOPSAN Biohackathon 2011
Zmasek TOPSAN Biohackathon 2011Zmasek TOPSAN Biohackathon 2011
Zmasek TOPSAN Biohackathon 2011cmzmasek
 
dkNET Poster Experimental Biology 2019
dkNET Poster Experimental Biology 2019dkNET Poster Experimental Biology 2019
dkNET Poster Experimental Biology 2019dkNET
 

Mais procurados (20)

FAIR Agronomy, where are we? The KnetMiner Use Case
FAIR Agronomy, where are we? The KnetMiner Use CaseFAIR Agronomy, where are we? The KnetMiner Use Case
FAIR Agronomy, where are we? The KnetMiner Use Case
 
Ontomaton icbo2013-alternative order-t_wv3
Ontomaton icbo2013-alternative order-t_wv3Ontomaton icbo2013-alternative order-t_wv3
Ontomaton icbo2013-alternative order-t_wv3
 
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
Powering Scientific Discovery with the Semantic Web (VanBUG 2014)
 
Drug Discovery- ELRIG -2012
Drug Discovery- ELRIG -2012Drug Discovery- ELRIG -2012
Drug Discovery- ELRIG -2012
 
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
 
Metagenomic Data Provenance and Management using the ISA infrastructure --- o...
Metagenomic Data Provenance and Management using the ISA infrastructure --- o...Metagenomic Data Provenance and Management using the ISA infrastructure --- o...
Metagenomic Data Provenance and Management using the ISA infrastructure --- o...
 
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object FrameworksResults Vary: The Pragmatics of Reproducibility and Research Object Frameworks
Results Vary: The Pragmatics of Reproducibility and Research Object Frameworks
 
Enriching Scholarship Personal Genomics presentation
Enriching Scholarship Personal Genomics presentationEnriching Scholarship Personal Genomics presentation
Enriching Scholarship Personal Genomics presentation
 
High-performance web services for gene and variant annotations
High-performance web services for gene and variant annotationsHigh-performance web services for gene and variant annotations
High-performance web services for gene and variant annotations
 
RARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research ObjectsRARE and FAIR Science: Reproducibility and Research Objects
RARE and FAIR Science: Reproducibility and Research Objects
 
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
Semantic approaches for biomedical knowledge discovery - Discovery Science 20...
 
Generating Biomedical Hypotheses Using Semantic Web Technologies
Generating Biomedical Hypotheses Using Semantic Web TechnologiesGenerating Biomedical Hypotheses Using Semantic Web Technologies
Generating Biomedical Hypotheses Using Semantic Web Technologies
 
2011-11-28 Open PHACTS at RSC CICAG
2011-11-28 Open PHACTS at RSC CICAG2011-11-28 Open PHACTS at RSC CICAG
2011-11-28 Open PHACTS at RSC CICAG
 
BioAssay Research Database Presentation at the Chem Axon UGM 2013
BioAssay Research Database Presentation at the Chem Axon UGM 2013BioAssay Research Database Presentation at the Chem Axon UGM 2013
BioAssay Research Database Presentation at the Chem Axon UGM 2013
 
2015-04-28 Open PHACTS at Swedish Linked Data Network Meet-up
2015-04-28 Open PHACTS at Swedish Linked Data Network Meet-up2015-04-28 Open PHACTS at Swedish Linked Data Network Meet-up
2015-04-28 Open PHACTS at Swedish Linked Data Network Meet-up
 
2015 balti-and-bioinformatics
2015 balti-and-bioinformatics2015 balti-and-bioinformatics
2015 balti-and-bioinformatics
 
A biologist in e-Science
A biologist in e-ScienceA biologist in e-Science
A biologist in e-Science
 
2011-10-11 Open PHACTS at BioIT World Europe
2011-10-11 Open PHACTS at BioIT World Europe2011-10-11 Open PHACTS at BioIT World Europe
2011-10-11 Open PHACTS at BioIT World Europe
 
Zmasek TOPSAN Biohackathon 2011
Zmasek TOPSAN Biohackathon 2011Zmasek TOPSAN Biohackathon 2011
Zmasek TOPSAN Biohackathon 2011
 
dkNET Poster Experimental Biology 2019
dkNET Poster Experimental Biology 2019dkNET Poster Experimental Biology 2019
dkNET Poster Experimental Biology 2019
 

Semelhante a Semantic (Web) Technologies for Translational Research in Life Sciences

Semantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical InformaticsSemantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical InformaticsAmit Sheth
 
2013 nas-ehs-data-integration-dc
2013 nas-ehs-data-integration-dc2013 nas-ehs-data-integration-dc
2013 nas-ehs-data-integration-dcc.titus.brown
 
Data analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsData analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsmikaelhuss
 
Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...
Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...
Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...Artificial Intelligence Institute at UofSC
 
Investigating plant systems using data integration and network analysis
Investigating plant systems using data integration and network analysisInvestigating plant systems using data integration and network analysis
Investigating plant systems using data integration and network analysisCatherine Canevet
 
Dynamic Semantic Metadata in Biomedical Communications
Dynamic Semantic Metadata in Biomedical CommunicationsDynamic Semantic Metadata in Biomedical Communications
Dynamic Semantic Metadata in Biomedical CommunicationsTim Clark
 
The Roots: Linked data and the foundations of successful Agriculture Data
The Roots: Linked data and the foundations of successful Agriculture DataThe Roots: Linked data and the foundations of successful Agriculture Data
The Roots: Linked data and the foundations of successful Agriculture DataPaul Groth
 
Contractor-Borner-SNA-SAC
Contractor-Borner-SNA-SACContractor-Borner-SNA-SAC
Contractor-Borner-SNA-SACwebuploader
 
WikiPathways: how open source and open data can make omics technology more us...
WikiPathways: how open source and open data can make omics technology more us...WikiPathways: how open source and open data can make omics technology more us...
WikiPathways: how open source and open data can make omics technology more us...Chris Evelo
 
Bioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future PerspectivesBioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future PerspectivesUniversity of Malaya
 
is there life between standards? Data interoperability for AI.
is there life between standards? Data interoperability for AI.is there life between standards? Data interoperability for AI.
is there life between standards? Data interoperability for AI.Chris Evelo
 
Thesis def
Thesis defThesis def
Thesis defJay Vyas
 
Stephen Friend Dana Farber Cancer Institute 2011-10-24
Stephen Friend Dana Farber Cancer Institute 2011-10-24Stephen Friend Dana Farber Cancer Institute 2011-10-24
Stephen Friend Dana Farber Cancer Institute 2011-10-24Sage Base
 
Bioinformatics data mining
Bioinformatics data miningBioinformatics data mining
Bioinformatics data miningSangeeta Das
 

Semelhante a Semantic (Web) Technologies for Translational Research in Life Sciences (20)

Semantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical InformaticsSemantic Web for Health Care and Biomedical Informatics
Semantic Web for Health Care and Biomedical Informatics
 
D1803012022
D1803012022D1803012022
D1803012022
 
2013 nas-ehs-data-integration-dc
2013 nas-ehs-data-integration-dc2013 nas-ehs-data-integration-dc
2013 nas-ehs-data-integration-dc
 
Data analysis & integration challenges in genomics
Data analysis & integration challenges in genomicsData analysis & integration challenges in genomics
Data analysis & integration challenges in genomics
 
Practical semantics in the pharmaceutical industry - the Open PHACTS project
Practical semantics in the pharmaceutical industry - the Open PHACTS projectPractical semantics in the pharmaceutical industry - the Open PHACTS project
Practical semantics in the pharmaceutical industry - the Open PHACTS project
 
Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...
Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...
Inauguration Function - Ohio Center of Excellence in Knowledge-Enabled Comput...
 
Investigating plant systems using data integration and network analysis
Investigating plant systems using data integration and network analysisInvestigating plant systems using data integration and network analysis
Investigating plant systems using data integration and network analysis
 
Dynamic Semantic Metadata in Biomedical Communications
Dynamic Semantic Metadata in Biomedical CommunicationsDynamic Semantic Metadata in Biomedical Communications
Dynamic Semantic Metadata in Biomedical Communications
 
DCC Keynote 2007
DCC Keynote 2007DCC Keynote 2007
DCC Keynote 2007
 
The Roots: Linked data and the foundations of successful Agriculture Data
The Roots: Linked data and the foundations of successful Agriculture DataThe Roots: Linked data and the foundations of successful Agriculture Data
The Roots: Linked data and the foundations of successful Agriculture Data
 
Contractor-Borner-SNA-SAC
Contractor-Borner-SNA-SACContractor-Borner-SNA-SAC
Contractor-Borner-SNA-SAC
 
WikiPathways: how open source and open data can make omics technology more us...
WikiPathways: how open source and open data can make omics technology more us...WikiPathways: how open source and open data can make omics technology more us...
WikiPathways: how open source and open data can make omics technology more us...
 
Bioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future PerspectivesBioinformatics databases: Current Trends and Future Perspectives
Bioinformatics databases: Current Trends and Future Perspectives
 
Online Resources to Support Open Drug Discovery Systems
Online Resources to Support Open Drug Discovery SystemsOnline Resources to Support Open Drug Discovery Systems
Online Resources to Support Open Drug Discovery Systems
 
is there life between standards? Data interoperability for AI.
is there life between standards? Data interoperability for AI.is there life between standards? Data interoperability for AI.
is there life between standards? Data interoperability for AI.
 
'A PAL's Life' for OMII-UK Board, May 2008
'A PAL's Life' for OMII-UK Board, May 2008'A PAL's Life' for OMII-UK Board, May 2008
'A PAL's Life' for OMII-UK Board, May 2008
 
Thesis def
Thesis defThesis def
Thesis def
 
Improving online chemistry one structure at a time
Improving online chemistry one structure at a timeImproving online chemistry one structure at a time
Improving online chemistry one structure at a time
 
Stephen Friend Dana Farber Cancer Institute 2011-10-24
Stephen Friend Dana Farber Cancer Institute 2011-10-24Stephen Friend Dana Farber Cancer Institute 2011-10-24
Stephen Friend Dana Farber Cancer Institute 2011-10-24
 
Bioinformatics data mining
Bioinformatics data miningBioinformatics data mining
Bioinformatics data mining
 

Semantic (Web) Technologies for Translational Research in Life Sciences

  • 1. Semantic (Web) Technologies for Translational Research in Life Sciences Ohio State University, June 16, 2011 Amit P. Sheth Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) amit.sheth@wright.edu Thanks to Kno.e.sis team (Satya, Priti, Rama, and Ajith); Collaborators at CTEGD UGA (Dr. Tarleton, Brent Weatherly), NLM (Olivier Bodenreider), CCRC, UGA (Will York), NCBO/Stanford, CITAR/WSU
  • 2. Kno.e.sis: Ohio Center of Excellence in Knowledge-enabled Computing
  • 3. Web of people - social networks, user-created casual content Web of resources - data, service, data, mashups Web of databases - dynamically generated pages - web query interfaces Web of pages - text, manually created links - extensive navigation Evolution of Web & Semantic Computing Tech assimilated in life Web of Sensors, Devices/IoT - 40 billion sensors, 5 billion mobile connections 2007 Situations, Events Web 3.0 Semantic Technology Used Objects Web 2.0 Patterns Keywords 1997 Web 1.0
  • 4. Outline Semantic Web – very brief intro Scenarios to demonstrate the applications and benefit of semantic web technologies: Health Care, Biomedical Research, Translational
  • 5. Biomedical Informatics... Biomedical Informatics PubMed ClinicalTrials.gov ...needs a connection Hypothesis Validation Experiment design Predictions Personalized medicine Semantic Web research aims at providing this connection! Etiology Pathogenesis Clinical findings Diagnosis Prognosis Treatment Genome Transcriptome Proteome Metabolome Physiome ...ome More advanced capabilities for search, integration, analysis, linking to new insights and discoveries! GenBank UniProt Medical Informatics Bioinformatics
  • 6. Decision Making, Insights, Innovations Human Performance Data and Facts Knowledge and Understanding Health & Performance Cognitive Science, Psychology Neuroscience Anatomy, Physiology Cellular biology Molecular Biology ACATATGGGTACTATTTACTATTCATGGGTACTATTTATGGCATATGGCGTACTATTCTAATCCTATATCCGTCTAATCTATTTACTATTATCTATTACTATACCTTTTGGGGAAAAAAATTCTATACCGTCTAATCCTATAAATCAAGCCG Biochemistry
  • 7. Semantic Web standards @ W3C Semantic Web is built in a layered manner Not everybody needs all the layers … Queries: SPARQL, Rules: RIF Semantic Web Rich ontologies: OWL Simple data models & taxonomies: RDF Schema Uniform metamodel: RDF + URI Encoding structure: XML Encoding characters: Unicode
  • 8. Linked Data: Semantic Web “diluted” Achieve for data what Web did to documents Relationship with the original Semantic Web vision: no AI, no agents, no autonomy Interoperability is still very important: interoperability of formats, interoperability of semantics Enables interchange of large data sets (thus very useful in, say, collaborative research) Semantic Web vision is largely predicated on the availability of data Linked Data is a movement that gets us there Thanks – Ora Lassila
  • 9. Opportunity: exploiting clinical and biomedical data text Health Information Services Elsevier iConsult Scientific Literature PubMed 300 Documents Published Online each day User-contributed Content (Informal) GeneRifs WikiGene NCBI Public Datasets Genome, Protein DBs new sequences daily Laboratory Data Lab tests, RTPCR, Mass spec Clinical Data Personal health history Search, browsing, complex query, integration, workflow, analysis, hypothesis validation, decision support.
  • 10. Major Community Efforts W3C Semantic Web Health Care & Life Sciences Interest Group: http://www.w3.org/2001/sw/hcls/ Clinical Observations Interoperability: EMR + Clinical Trials: http://esw.w3.org/HCLS/ClinicalObservationsInteroperability National Center for Biomedical Ontologies: http://bioportal.bioontology.org/
  • 11. Major SW Projects OpenPHACTS: A knowledge management project of the Innovative Medicines Initiative (IMI), a unique partnership between the European Community and the European Federation of Pharmaceutical Industries and Associations (EFPIA). http://www.openphacts.org/ LarKC: develop the Large Knowledge Collider, a platform for massive distributed incomplete reasoning that will remove the scalability barriers of currently existing reasoning systems for the Semantic Web. http://www.larkc.eu/ NCBO: contribute to collaborative science and translational research. http://bioportal.bioontology.org/
  • 12. Semantic Web Enablers and Techniques Ontology: Agreement with Common Vocabulary & Domain Knowledge; Schema + Knowledge base Semantic Annotation (metadata extraction): Manual, Semi-automatic (automatic with human verification), Automatic Semantic Computation: semantics-enabled search, integration, complex queries, analysis (paths, subgraph), pattern finding, mining, inferencing, reasoning, hypothesis validation, discovery, visualization
  • 13. Drug Ontology Hierarchy (showing is-a relationships) owl:thing prescription_drug_brand_name brandname_undeclared brandname_composite prescription_drug monograph_ix_class cpnum_group prescription_drug_property indication_property formulary_property non_drug_reactant interaction_property property formulary brandname_individual interaction_with_prescription_drug interaction indication generic_individual prescription_drug_generic generic_composite interaction_with_monograph_ix_class interaction_with_non_drug_reactant
  • 14. N-glycan_beta_GlcNAc_9 N-glycan_alpha_man_4 GNT-V attaches GlcNAc at position 6 N-acetyl-glucosaminyl_transferase_V UDP-N-acetyl-D-glucosamine + alpha-D-Mannosyl-1,3-(R1)-beta-D-mannosyl-R2 <=> UDP + N-Acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,3-(R1)-beta-D-mannosyl-R2 UDP-N-acetyl-D-glucosamine + G00020 <=> UDP + G00021 N-Glycosylation metabolic pathway GNT-I attaches GlcNAc at position 2
  • 15. Maturing capabilities and ongoing research Ontology Creation Semantic Annotation & Text mining: Entity recognition, Relationship extraction Semantic Integration & Provenance: Integrating all types of data used in biomedical research: text, experimental data, curated/structured/public and multimedia Semantic search, browsing, analysis Clinical and Scientific Workflows with semantic web services Semantic Exploration of scientific literature, Undiscovered public knowledge
  • 16. Project 1: ASEMR Why: Improve Quality of Care and Decision Making without loss of Efficiency in active Cardiology practice. What: Use of semantic Web technologies for clinical decision support Where: Athens Heart Center & its partners and labs Status: In use continuously since 01/2006
  • 17. Operational since January 2006 Details: http://knoesis.org/library/resource.php?id=00004
  • 18. Active Semantic EMR Annotate ICD9s Annotate Doctors Lexical Annotation Insurance Formulary Level 3 Drug Interaction Drug Allergy Demo at: http://knoesis.org/library/demos/
  • 19. Project 2: Glycomics Why: To help in the treatment of certain kinds of cancer and Parkinson's Disease. What: Semantic Annotation of Experiment Data Where: Complex Carbohydrate Research Center, UGA Status: Research prototype in use Workflow with Semantic Annotation of Experimental Data already in use
  • 20. N-Glycosylation Process (NGP) Cell Culture extract Glycoprotein Fraction proteolysis Glycopeptides Fraction 1 Separation technique I n Glycopeptides Fraction PNGase n Peptide Fraction Separation technique II n*m Peptide Fraction Mass spectrometry ms data ms/ms data Data reduction Data reduction ms peaklist ms/ms peaklist binning Peptide identification Glycopeptide identification and quantification Peptide list N-dimensional array Data correlation Signal integration
  • 21. Scientific workflow for proteome analysis. Steps: Biological Sample, Analysis by MS/MS, Raw Data to Standard Format, Data Pre-process, DB Search (Mascot/Sequest), Results Post-process (ProValt), Biological Information. Agents with input (I) and output (O) connections link the steps. Storage: Raw Data, Standard Format Data, Filtered Data, Search Results, Final Output. Semantic Annotation Applications
  • 22. Semantic Annotation of Experimental Data Mass Spectrometry (MS) Data: ms/ms peaklist data. Parent ion: m/z 830.9570, abundance 194.9604, charge 2. Fragment ions (m/z, abundance): 580.2985, 0.3592; 688.3214, 0.2526; 779.4759, 38.4939; 784.3607, 21.7736; 1543.7476, 1.3822; 1544.7595, 2.9977; 1562.8113, 37.4790; 1660.7776, 476.5043
  • 23. Semantic Annotation of Experimental Data <ms-ms_peak_list> <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" mode="ms-ms"/> <parent_ion m-z="830.9570" abundance="194.9604" z="2"/> <fragment_ion m-z="580.2985" abundance="0.3592"/> <fragment_ion m-z="688.3214" abundance="0.2526"/> <fragment_ion m-z="779.4759" abundance="38.4939"/> <fragment_ion m-z="784.3607" abundance="21.7736"/> <fragment_ion m-z="1543.7476" abundance="1.3822"/> <fragment_ion m-z="1544.7595" abundance="2.9977"/> <fragment_ion m-z="1562.8113" abundance="37.4790"/> <fragment_ion m-z="1660.7776" abundance="476.5043"/> </ms-ms_peak_list> Ontological Concepts Semantically Annotated MS Data
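Once the peak list carries named elements and attributes, a downstream tool can read it without knowing the column order of the raw file. A minimal sketch in Python, assuming the XML is retyped with straight quotes and a space between each element name and its first attribute; the function name `read_peak_list` and the dictionary keys are illustrative and not part of the original workflow:

```python
import xml.etree.ElementTree as ET

# Peak list from the slide, retyped with straight quotes so it parses as XML.
PEAK_LIST_XML = """
<ms-ms_peak_list>
  <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" mode="ms-ms"/>
  <parent_ion m-z="830.9570" abundance="194.9604" z="2"/>
  <fragment_ion m-z="580.2985" abundance="0.3592"/>
  <fragment_ion m-z="688.3214" abundance="0.2526"/>
  <fragment_ion m-z="779.4759" abundance="38.4939"/>
  <fragment_ion m-z="784.3607" abundance="21.7736"/>
  <fragment_ion m-z="1543.7476" abundance="1.3822"/>
  <fragment_ion m-z="1544.7595" abundance="2.9977"/>
  <fragment_ion m-z="1562.8113" abundance="37.4790"/>
  <fragment_ion m-z="1660.7776" abundance="476.5043"/>
</ms-ms_peak_list>
"""


def read_peak_list(xml_text):
    """Parse one annotated ms/ms peak list into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    parent = root.find("parent_ion")
    return {
        "instrument": root.find("parameter").get("instrument"),
        "parent_mz": float(parent.get("m-z")),
        "parent_abundance": float(parent.get("abundance")),
        "parent_charge": int(parent.get("z")),
        "fragments": [
            (float(ion.get("m-z")), float(ion.get("abundance")))
            for ion in root.findall("fragment_ion")
        ],
    }


peaks = read_peak_list(PEAK_LIST_XML)
# Pick the fragment ion with the highest abundance.
strongest = max(peaks["fragments"], key=lambda ion: ion[1])
print(peaks["parent_mz"], peaks["parent_charge"], strongest)
```

The same lookup against the unannotated peak list would depend on remembering which column holds m/z and which holds abundance.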
  • 24. Project 3: Why: To associate genotype and phenotype information for knowledge discovery What: Integrated data sources to run complex queries Enriching data with ontologies for integration, querying, and automation Ontologies beyond vocabularies: the power of relationships Where: NCRR (NIH) Status: Completed
  • 25. Use data to test hypothesis Gene name GO Interactions gene Sequence PubMed OMIM Link between glycosyltransferase activity and congenital muscular dystrophy? Glycosyltransferase Congenital muscular dystrophy Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07
  • 26. In a Web pages world… (GeneID: 9215) has_associated_disease Congenital muscular dystrophy, type 1D has_molecular_function Acetylglucosaminyl-transferase activity Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07
  • 27. With the semantically enhanced data glycosyltransferase GO:0016757 is_a GO:0008194 GO:0016758 acetylglucosaminyl-transferase GO:0008375 has_molecular_function acetylglucosaminyl-transferase GO:0008375 EG:9215 LARGE Muscular dystrophy, congenital, type 1D MIM:608840 has_associated_phenotype SELECT DISTINCT ?t ?g ?d { ?t is_a GO:0016757 . ?g has_molecular_function ?t . ?g has_associated_phenotype ?b2 . ?b2 has_textual_description ?d . FILTER regex(?d, "muscular dystrophy", "i") . FILTER regex(?d, "congenital", "i") } From the MEDINFO paper. Adapted from: Olivier Bodenreider, presentation at HCLS Workshop, WWW07
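The query on this slide can be traced by hand over the handful of triples shown in the diagram. The sketch below is a plain-Python stand-in for a triple store, not the system used in the project. Two things are assumptions: the intermediate `is_a` edges are read off the diagram layout, and the `is_a` pattern is followed transitively, whereas the slide's query matches a single `is_a` step and presumably relies on the store holding inferred triples.

```python
import re

# Triples transcribed from the slide. The intermediate is_a edges are an
# assumption read off the diagram layout.
TRIPLES = {
    ("GO:0008194", "is_a", "GO:0016757"),
    ("GO:0016758", "is_a", "GO:0016757"),
    ("GO:0008375", "is_a", "GO:0008194"),
    ("GO:0008375", "is_a", "GO:0016758"),
    ("EG:9215", "has_molecular_function", "GO:0008375"),
    ("EG:9215", "has_associated_phenotype", "MIM:608840"),
    ("MIM:608840", "has_textual_description",
     "Muscular dystrophy, congenital, type 1D"),
}


def objects(subject, predicate):
    """Objects of all triples with the given subject and predicate."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def subjects(predicate, obj):
    """Subjects of all triples with the given predicate and object."""
    return {s for s, p, o in TRIPLES if p == predicate and o == obj}


def descendants(term):
    """All terms that reach `term` through one or more is_a edges."""
    found, frontier = set(), [term]
    while frontier:
        current = frontier.pop()
        for child in subjects("is_a", current):
            if child not in found:
                found.add(child)
                frontier.append(child)
    return found


def genes_linking(function_root, *keywords):
    """(function, gene, description) rows matching every keyword, case-insensitively."""
    rows = set()
    for term in descendants(function_root):
        for gene in subjects("has_molecular_function", term):
            for phenotype in objects(gene, "has_associated_phenotype"):
                for text in objects(phenotype, "has_textual_description"):
                    if all(re.search(k, text, re.IGNORECASE) for k in keywords):
                        rows.add((term, gene, text))
    return rows


hits = genes_linking("GO:0016757", "muscular dystrophy", "congenital")
print(hits)
```

The two keyword checks play the role of the two `FILTER` clauses, and the nested loops correspond to the four triple patterns in the query body.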
  • 28. Project 4: Nicotine Dependence Why: For understanding the genetic basis of nicotine dependence. What: Integrate gene and pathway information and show how three complex biological queries can be answered by the integrated knowledge base. How: Semantic Web technologies (especially RDF, OWL, and SPARQL) support information integration and make it easy to create semantic mashups (semantically integrated resources). Where: NLM (NIH) Status: Completed research
  • 30. Identification of active genes – maximum number of pathways
  • 31. Identification of genes based on anatomical locations. Requires integration of genome and biological pathway information
  • 36. pmid
  • 41. Entrez Knowledge Model (EKoM) BioPAX ontology
  • 42. Results: Gene Pathway network and Hub Genes involved with Nicotine Dependence
  • 43. Project 5: T. cruzi SPSE Why: For Integrative Parasite Research to help expedite knowledge discovery What: Semantics and Services Enabled Problem Solving Environment (PSE) for Trypanosoma cruzi Where: Center for Tropical and Emerging Global Diseases (CTEGD), UGA Who: Kno.e.sis, UGA, NCBO (Stanford) Status: Research prototype – in regular lab use
  • 46. Provenance in Parasite Research. Gene Knockout and Strain Creation*: Gene Name → Sequence Extraction → 3' & 5' Region; 3' & 5' Region + Drug Resistant Plasmid → Plasmid Construction → Knockout Construct Plasmid; Knockout Construct Plasmid + T. cruzi sample → Transfection → Transfected Sample → Drug Selection → Selected Sample → Cell Cloning → Cloned Sample. Related Queries from Biologists: List all groups in the lab that used a Target Region Plasmid. Which researcher created a new strain of the parasite (with ID = 66)? An experiment was not successful: has this experiment been conducted earlier? What were the results? *T. cruzi Semantic Problem Solving Environment Project, Courtesy of D.B. Weatherly and Flora Logan, Tarleton Lab, University of Georgia
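The biologists' questions on this slide become simple lookups once each workflow step is logged with its inputs, outputs, and the person who ran it. The following is a minimal sketch under stated assumptions: the step names follow the slide, while every researcher, group, and artifact identifier (`researcher_A`, `strain_66`, `gene_X`, and so on) is hypothetical, and the flat list of records stands in for whatever provenance store the project actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessRun:
    process: str      # workflow step name, taken from the slide
    researcher: str   # hypothetical identifier
    group: str        # hypothetical identifier
    inputs: tuple
    outputs: tuple


# Hypothetical provenance log; only the step names follow the slide.
RUNS = [
    ProcessRun("sequence_extraction", "researcher_A", "group_1",
               ("gene_X",), ("region_gene_X",)),
    ProcessRun("plasmid_construction", "researcher_A", "group_1",
               ("region_gene_X", "drug_resistant_plasmid_2"),
               ("knockout_plasmid_12",)),
    ProcessRun("transfection", "researcher_B", "group_2",
               ("knockout_plasmid_12", "tcruzi_sample_3"),
               ("transfected_sample_40",)),
    ProcessRun("drug_selection", "researcher_B", "group_2",
               ("transfected_sample_40",), ("selected_sample_51",)),
    ProcessRun("cell_cloning", "researcher_C", "group_2",
               ("selected_sample_51",), ("strain_66",)),
]


def producers(artifact):
    """Runs that list the artifact among their outputs."""
    return [run for run in RUNS if artifact in run.outputs]


def created_by(artifact):
    """Who produced this artifact?"""
    return {run.researcher for run in producers(artifact)}


def groups_using(artifact):
    """Which groups consumed this artifact as an input?"""
    return {run.group for run in RUNS if artifact in run.inputs}


def upstream(artifact):
    """Everything the artifact was derived from, directly or indirectly."""
    seen, frontier = set(), [artifact]
    while frontier:
        current = frontier.pop()
        for run in producers(current):
            for source in run.inputs:
                if source not in seen:
                    seen.add(source)
                    frontier.append(source)
    return seen


print(created_by("strain_66"))
print(groups_using("knockout_plasmid_12"))
print("gene_X" in upstream("strain_66"))
```

The `upstream` traversal is what answers the reverse question of which gene was knocked out to obtain a given cloned sample.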
  • 48. Developed a semantic provenance framework and influenced the W3C community
  • 49. SPSE supports complex biological queries that help find gene knockout, drug and/or vaccination targets. For example:
  • 50. Show me proteins that are downregulated in the epimastigote stage and exist in a single metabolic pathway.
  • 52. Focused KB Work Flow (Use case: HPCO) HPC keywords Doozer: Base Hierarchy from Wikipedia Focused Pattern based extraction SenseLab Neuroscience Ontologies Initial KB Creation Meta Knowledgebase PubMed Abstracts Knoesis: Parsing based NLP Triples Enrich Knowledge Base NLM: Rule based BKR Triples Final Knowledge Base
  • 53. Triple Extraction Approaches Open Extraction No fixed number of predetermined entities and predicates At Knoesis – NLP (parsing and dependency trees) Supervised Extraction Predetermined set of entities and predicates At Knoesis – Pattern based extraction to connect entities in the base hierarchy using statistical techniques At NLM – NLP and rule based approaches
  • 54. Mapping Triples to Base Hierarchy Entities in both subject and object must contain at least one concept from the hierarchy to be mapped to the KB Preliminary synonyms based on anchor labels and page redirects in Wikipedia: Prolactostatin redirects to Dopamine Predicates (verbs) and entities are subjected to stemming using WordNet
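The mapping rule on this slide can be sketched in a few lines. This is an illustration under assumptions rather than the Doozer or Scooner implementation: the hierarchy contents and the example triples are hypothetical, only the Prolactostatin redirect comes from the slide, and the WordNet stemming step is replaced by plain lowercasing to keep the sketch dependency-free.

```python
# Hypothetical base hierarchy; the real one is derived from Wikipedia.
HIERARCHY = {"dopamine", "prolactin"}

# Synonyms from page redirects. This single entry is the slide's example.
REDIRECTS = {"prolactostatin": "dopamine"}


def concepts_in(phrase):
    """Hierarchy concepts mentioned in a phrase, after resolving redirects."""
    found = set()
    for token in phrase.lower().split():
        token = REDIRECTS.get(token, token)
        if token in HIERARCHY:
            found.add(token)
    return found


def map_triple(subject, predicate, obj):
    """Map a triple to the KB only if both ends contain a known concept."""
    subject_concepts = concepts_in(subject)
    object_concepts = concepts_in(obj)
    if subject_concepts and object_concepts:
        return (sorted(subject_concepts), predicate, sorted(object_concepts))
    return None  # at least one end has no concept from the hierarchy


# Hypothetical extracted triples, used only to exercise the rule.
mapped = map_triple("Prolactostatin", "inhibits", "prolactin secretion")
rejected = map_triple("Prolactostatin", "inhibits", "cell growth")
print(mapped, rejected)
```

The second triple is dropped because its object mentions no concept from the hierarchy, which is the filter the slide describes.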
  • 55. Scooner: Full Architecture
  • 56. Scooner Features Knowledge-based browsing: Relations window, inverse relations, creating trails Persistent projects: Work bench, browsing history, comments, filtering Collaboration: comments, dashboard, exporting (sub)projects, importing projects
  • 58. New Knowledge/hypothesis Example Three triples from different abstracts VIP Peptide – increases – Catecholamine Biosynthesis Catecholamines – induce – β-adrenergic receptor activity β-adrenergic receptors – are involved – fear conditioning New implicit knowledge VIP Peptide – affects – fear conditioning Caveat: Each triple above was observed in a different organism (cows, mice, humans), but still interesting hypothesis. Scooner’s contextual browsing makes this clear to the user.
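The implicit link in this example falls out of a path search once the three triples share entity identifiers. A minimal sketch, assuming a hand-written normalization table (`CONCEPT`, hypothetical) that equates "Catecholamine Biosynthesis" with "Catecholamines" and "β-adrenergic receptor activity" with "β-adrenergic receptors"; that loose matching is exactly why the result is a hypothesis to inspect, not an established fact.

```python
from collections import defaultdict

# The three triples from the slide, each from a different abstract.
TRIPLES = [
    ("VIP Peptide", "increases", "Catecholamine Biosynthesis"),
    ("Catecholamines", "induce", "β-adrenergic receptor activity"),
    ("β-adrenergic receptors", "are involved", "fear conditioning"),
]

# Hypothetical normalization of surface phrases to shared concepts.
CONCEPT = {
    "VIP Peptide": "vip_peptide",
    "Catecholamine Biosynthesis": "catecholamine",
    "Catecholamines": "catecholamine",
    "β-adrenergic receptor activity": "beta_adrenergic_receptor",
    "β-adrenergic receptors": "beta_adrenergic_receptor",
    "fear conditioning": "fear_conditioning",
}


def find_chains(start, goal):
    """All cycle-free chains of triples leading from start to goal."""
    graph = defaultdict(list)
    for subject, predicate, obj in TRIPLES:
        graph[CONCEPT[subject]].append((predicate, CONCEPT[obj]))

    chains, stack = [], [(start, [])]
    while stack:
        node, path = stack.pop()
        if node == goal and path:
            chains.append(path)
            continue
        visited = {start} | {step[2] for step in path}
        for predicate, nxt in graph[node]:
            if nxt not in visited:
                stack.append((nxt, path + [(node, predicate, nxt)]))
    return chains


chains = find_chains("vip_peptide", "fear_conditioning")
for chain in chains:
    print(" ; ".join(f"{s} {p} {o}" for s, p, o in chain))
```

Keeping the full chain, and not only its two endpoints, lets a reader trace each step back to its source abstract and organism, which is the caveat the slide raises.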
  • 59. Project 7: Drug Abuse Why: To study social trends in pharmaceutical opioid abuse What: Describe drug users' knowledge, attitudes, and behaviors related to illicit use of OxyContin®; describe temporal patterns of non-medical use of OxyContin® tablets as discussed on Web-based forums Where: CITAR (Center for Interventions, Treatment and Addictions Research) at Wright State Univ. Status: In progress (recently funded by NIDA)
  • 61. Project 8: NMR Why: Streamline the NMR data processing tasks. Processing NMR experimental data is complex and time consuming. What: Providing biologists with tools to effectively process and manage Nuclear Magnetic Resonance (NMR) experimental data. How: Use Domain Specific Languages (DSL) to create scientist-friendly abstractions for complex statistical workflows. Use semantics based techniques to store and manage data. Where: Air Force Research Lab Status: In progress
  • 63. A complex NMR spectrum, marked with chemical compound identifiers by human observers.
  • 65. Use a DSL to provide abstractions for the operators (named SCALE)
  • 67. Future Interoperability Challenge: 360-degree health Insurance, Financial Aspects Clinical Care Follow up, Lifestyle Genetic Tests… Profiles Clinical Trials Social Media
  • 68. For each component in 360-degree health care, we have data, processes, knowledge and experience. Interoperability solutions need to encompass all these! Possibly the largest growth in data will be in sensors (e.g., Body Area Networks, Biosensors) and social content. Extensive use of mobile phones. Credit: ece.virginia.edu
  • 69. Summary Semantic Web is an “interoperability technology” Semantic Web provides the needed interoperability, and can accommodate all necessary “points of view” Linked Data as a way of sharing data is highly promising Many examples of viable usage of Semantic Web technologies Words of warning about deployment Significant research challenges remain as Health presents the most complex domain
  • 70. Representative References A. Sheth, S. Agrawal, J. Lathem, N. Oldham, H. Wingate, P. Yadav, and K. Gallagher, "Active Semantic Electronic Medical Record", Intl Semantic Web Conference, 2006. Satya Sahoo, Olivier Bodenreider, Kelly Zeng, and Amit Sheth, "An Experiment in Integrating Large Biomedical Knowledge Resources with RDF: Application to Associating Genotype and Phenotype Information", WWW2007 HCLS Workshop, May 2007. Satya S. Sahoo, Kelly Zeng, Olivier Bodenreider, and Amit Sheth, "From 'Glycosyltransferase' to 'Congenital Muscular Dystrophy': Integrating Knowledge from NCBI Entrez Gene and the Gene Ontology", Amsterdam: IOS, August 2007, PMID: 17911917, pp. 1260-4. Satya S. Sahoo, Olivier Bodenreider, Joni L. Rutter, Karen J. Skinner, Amit P. Sheth, "An ontology-driven semantic mash-up of gene and biological pathway information: Application to the domain of nicotine dependence", Journal of Biomedical Informatics, 2008. Cartic Ramakrishnan, Krzysztof J. Kochut, and Amit Sheth, "A Framework for Schema-Driven Relationship Discovery from Unstructured Text", Intl Semantic Web Conference, 2006, pp. 583-596. Satya S. Sahoo, Christopher Thomas, Amit Sheth, William S. York, and Samir Tartir, "Knowledge Modeling and Its Application in Life Sciences: A Tale of Two Ontologies", 15th International World Wide Web Conference (WWW2006), Edinburgh, Scotland, May 23-26, 2006. Satya S. Sahoo, Olivier Bodenreider, Pascal Hitzler, Amit Sheth and Krishnaprasad Thirunarayan, "Provenance Context Entity (PaCE): Scalable provenance tracking for scientific RDF data", SSDBM, Heidelberg, Germany, 2010. Papers: http://knoesis.org/library Demos at: http://knoesis.wright.edu/library/demos/

Editor's Notes

  1. Cognitive model, cognitive behavioral model
  2. In parasite research, scientists create new strains of a parasite by knocking out specific genes. So, given a cloned sample, we may need to know which gene(s) were knocked out. Both these scenarios are real-world examples of the importance of provenance. There are many research issues in provenance management. This presentation addresses (1) provenance modeling, specifically provenance interoperability, consistent modeling, and reduction of terminological heterogeneity, and (2) provenance query.
  3. References: http://www.armman.org/projecthero http://www.armman.org/mmitra