k-BOOM
A Bayesian approach to ontology structure inference,
with applications in disease ontology construction
Chris Mungall
Lawrence Berkeley Laboratory
PhenoDay 2016
@monarchinit
@chrismungall
Building a cohesive, complete disease
ontology
Objective
• Combine existing disease
classifications and lists into
unified cohesive
framework
• Best of all worlds
• Integrate data from multiple
resources
Challenges
• Current resources
developed independently,
different perspectives
• Mappings are imprecise
[Figure: OMIM, Orphanet, DO, MESH, NCIT, Decipher, ICD and SNOMED feed into a combined, coherent view]
Disease classifications and why
mappings are not enough
• Given N disease lists
– Where each provides cross-references (xrefs) to up to N-1 others
– Up to N^2 - N sets of mappings
– Even more with 3rd party mappings
• These mappings are frequently
– Inconsistent (directly or indirectly)
– Different in meaning and level of specificity
– Incomplete
– Stale
– Difficult to computationally verify
• Fundamental issue
– Xrefs lack semantics
– Explicit semantics would enable computational checks
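The N^2 - N figure counts ordered (source, target) pairs, since each resource can publish its own xrefs to every other one. A minimal Python sketch of that count follows, using the eight resource names from the earlier slide purely as sample input; the function name is hypothetical.

```python
from itertools import permutations

# Resource names taken from the slide; only the count matters here.
resources = ["OMIM", "Orphanet", "DO", "MESH", "NCIT", "Decipher", "ICD", "SNOMED"]


def directed_mapping_sets(names):
    """Return every ordered (source, target) pair that could carry its own xref set."""
    return list(permutations(names, 2))


pairs = directed_mapping_sets(resources)
n = len(resources)

# Both numbers agree: the pair count equals N^2 - N.
print(len(pairs), n * n - n)
```

Third-party mappings would add further sets on top of these pairs.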
[Figure: six ontologies (Ont1 to Ont6) linked to one another by mappings]
[Figure: 4 disease resources plus mappings, shown for hemolytic anemia. Nodes: DOID (blue), OMIM (brown), MESH (grey), ORDO/Orphanet (yellow). Edges: SubClassOf (solid line), Xref (dashed grey line)]
Objective: Coherent OWL Ontology
Merging (OOM)
• Criteria for OOM
– Merged
• Combines multiple lists and classifications (terminologies
and lists are treated as ‘degenerate’ ontologies), presented as a
single ontology
• Equivalent classes merged
– Logically Connected
• OWL/Description Logic constructs
– e.g. SubClassOf, EquivalentClass, SomeValuesFrom
• Not xrefs
– Coherent
• Logically coherent: no unsatisfiable classes
• Biologically coherent: makes biological and clinical sense
Our previous approach, applied to
phenotypes: L-DOOM
Logical Definition based OWL Ontology Merging
Mungall, C. J., Gkoutos, G., Smith, C., Haendel, M., Lewis, S., & Ashburner, M. (2010). Integrating phenotype ontologies across multiple species. Genome Biology, 11(1), R2.
doi:10.1186/gb-2010-11-1-r2
Köhler, S., Doelken, S. C., Ruef, B. J., Bauer, S., Washington, N., Westerfield, M., … Mungall, C. J. (2013). Construction and accessibility of a cross-species phenotype ontology
along with gene annotations for biomedical research. F1000Research, 1–12. doi:10.3410/f1000research.2-30.v1
Application to diseases?
• Works well for compositional classes (e.g. many cancer terms)
• Less well for genetic diseases, complex syndromes
1. Assign Logical Definitions (OWL equivalence axioms) to classes in each ontology
• Can be assigned manually or semi-automatically (Obol)
• HP:0002180 Neurodegeneration Equiv degenerate AND inheres-in SOME neuron (CL:0000540)
• MP:0000876 Purkinje cell degeneration Equiv degenerate AND inheres-in SOME Purkinje cell (CL:0000121)
2. Use reasoning to infer logical axioms
• Purkinje cell SubClassOf neuron, so the reasoner infers: Purkinje cell degeneration SubClassOf Neurodegeneration
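The two steps above can be illustrated with a toy structural check in Python. This is a sketch only, not an OWL reasoner and not the L-DOOM implementation: the dictionaries and function names are hypothetical, and the single asserted link (Purkinje cell is a kind of neuron) is an assumption supplied for the example.

```python
# Toy asserted hierarchy among the filler classes (assumed for illustration:
# CL:0000121 Purkinje cell is a kind of CL:0000540 neuron).
SUBCLASS_OF = {"CL:0000121": {"CL:0000540"}}

# Logical definitions: class -> (quality, filler of "inheres-in SOME ...").
LOGICAL_DEFS = {
    "HP:0002180": ("degenerate", "CL:0000540"),  # Neurodegeneration
    "MP:0000876": ("degenerate", "CL:0000121"),  # Purkinje cell degeneration
}


def ancestors(cls):
    """Reflexive-transitive closure of the asserted SubClassOf links."""
    seen = {cls}
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def infer_subclass_axioms(defs):
    """A is under B when both share the quality and A's filler is under B's filler."""
    inferred = []
    for sub, (sub_quality, sub_filler) in defs.items():
        for sup, (sup_quality, sup_filler) in defs.items():
            if sub != sup and sub_quality == sup_quality and sup_filler in ancestors(sub_filler):
                inferred.append((sub, sup))
    return inferred


inferred = infer_subclass_axioms(LOGICAL_DEFS)
print(inferred)
```

A real pipeline would hand the equivalence axioms to a reasoner rather than compare definitions by hand; the sketch only shows why the cross-ontology SubClassOf link follows from the definitions.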
BOOM: Bayes OWL Ontology Merging
Probabilistic ontology OP = <A,H>
Finds the set of hypothetical axioms that maximises P(OP)
[Figure: BOOM workflow. Ontology 1 … Ontology n pass through a mapping tool and mapping curation to give Inter-Ontology Mappings; the Axiom Weight Estimator (with Weight Curation) turns these into Hypothetical Logical Axioms plus Weights (H); BOOM with the Elk Reasoner produces the Merged Coherent OWL Ontology; equivalent classes are merged before the next iteration]
Generating hypothetical logical axioms
[Figure: Inter-Ontology Mappings → Axiom Weight Estimator → Hypothetical Logical Axioms plus Weights (H)]
Axiom Weight Estimator uses domain rules (lexical, structural, …)
E.g.:
OMIM:123 xref DOID:987
Pr(OMIM:123 ≡ DOID:987) = 0.3
Pr(OMIM:123 ⊂ DOID:987) = 0.4
Pr(OMIM:123 ⊃ DOID:987) = 0.1
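One way to hold the weighted alternatives for a single xref is sketched below in Python. The class and field names are hypothetical. Note that the three probabilities on the slide sum to 0.8; the sketch assumes the remaining 0.2 is the probability that none of the three relationships holds, which the slide does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HypotheticalAxioms:
    """Weighted candidate interpretations of one xref between two classes."""
    left: str
    right: str
    p_equivalent: float
    p_subclass: float
    p_superclass: float

    @property
    def p_none(self) -> float:
        # Assumption: leftover probability mass means "no such relationship".
        return 1.0 - (self.p_equivalent + self.p_subclass + self.p_superclass)

    def ranked(self):
        """Candidate interpretations, most probable first."""
        options = {
            "EquivalentTo": self.p_equivalent,
            "SubClassOf": self.p_subclass,
            "SuperClassOf": self.p_superclass,
            "None": self.p_none,
        }
        return sorted(options.items(), key=lambda item: item[1], reverse=True)


xref = HypotheticalAxioms("OMIM:123", "DOID:987", 0.3, 0.4, 0.1)
for relation, probability in xref.ranked():
    print(f"{xref.left} {relation} {xref.right}: {probability:.1f}")
```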
K-BOOM: algorithm for finding the most likely merged ontology
Find the values for H that maximise P.
– h_i: boolean representing the truth value of hypothetical axiom H_i
– Problem: 2^N candidate ontologies
1. Factorize the calculation by dividing the combined axioms into k modules (k-BOOM)
Algorithm:
i. Assert all hypothetical axioms to be true
ii. Make a module from each equivalence clique
2. Use a greedy algorithm; start with the most likely hypothetical axioms in O_k
3. Test each configuration for satisfiability using an OWL reasoner (Elk) (unsat => Pr=0); calculate the posterior probability
4. Repeat until the number of tests exceeds a threshold
5. Return the most likely configuration for O_k
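Steps 2 to 5 can be sketched for a single module as a best-first search over truth assignments. This is a simplified illustration, not the kboom implementation: it treats the hypothetical axioms as independent, returns the unnormalised prior rather than a normalised posterior, and replaces the Elk satisfiability test with a hypothetical `is_satisfiable` callback. The three axioms and the toy rule in the example are invented for illustration.

```python
import heapq
from math import prod


def prior(config, probs):
    """Joint prior of a truth assignment, treating the axioms as independent."""
    return prod(p if h else 1.0 - p for h, p in zip(config, probs))


def most_likely_configuration(probs, is_satisfiable, max_tests=100):
    """Search truth assignments h_1..h_n for one module, most probable first.

    Starts from the individually most likely assignment and expands
    single-axiom flips in order of decreasing prior. Unsatisfiable
    configurations count as probability 0 and are never returned.
    Returns (config, unnormalised score), or (None, 0.0) if the test
    budget runs out before a satisfiable configuration is found.
    """
    start = tuple(p >= 0.5 for p in probs)
    frontier = [(-prior(start, probs), start)]
    seen = {start}
    tests = 0
    while frontier and tests < max_tests:
        negative_score, config = heapq.heappop(frontier)
        tests += 1
        if is_satisfiable(config):
            return config, -negative_score
        for i in range(len(config)):
            flipped = config[:i] + (not config[i],) + config[i + 1:]
            if flipped not in seen:
                seen.add(flipped)
                heapq.heappush(frontier, (-prior(flipped, probs), flipped))
    return None, 0.0


# Toy module with three hypothetical equivalence axioms (illustrative only):
#   H0: OMIM:1 = DOID:1 (0.9), H1: OMIM:1 = DOID:2 (0.7), H2: OMIM:2 = DOID:2 (0.6)
probs = [0.9, 0.7, 0.6]


def is_satisfiable(config):
    """Stand-in for the reasoner: forbid merging two classes of one source."""
    h0, h1, h2 = config
    return not (h0 and h1) and not (h1 and h2)


best, score = most_likely_configuration(probs, is_satisfiable)
print(best, round(score, 3))
```

In this toy case the all-true assignment is rejected, and the search settles on accepting H0 and H2 while rejecting H1.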
Probability guided curator workflow:
A little knowledge goes a long way
• Run cycle
• Examine results for modules
with:
– low posterior probability
– low confidence (top ranked
solution has similar P to next
ranked)
– Pr(H_i = true) << threshold
• Apply biological/clinical
knowledge
• Override auto-generated
hypothetical axiom weights with
curated ones
– Feedback issues to source
ontologies
• Repeat
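The first two triage criteria above (low posterior, and a top solution barely ahead of the runner-up) could be turned into a review queue roughly as follows. The function name, thresholds and sample posteriors are all hypothetical, and the third criterion, on individual Pr(H_i = true) values, is not covered.

```python
def review_queue(modules, min_posterior=0.5, min_margin=2.0):
    """Return ids of modules needing curator attention, least certain first.

    modules maps a module id to the posterior probabilities of its
    candidate configurations. A module is flagged when its best
    posterior is low, or when the best solution is less than
    min_margin times as probable as the next ranked one.
    """
    flagged = []
    for module_id, posteriors in modules.items():
        ranked = sorted(posteriors, reverse=True)
        best = ranked[0]
        runner_up = ranked[1] if len(ranked) > 1 else 0.0
        margin = best / runner_up if runner_up > 0 else float("inf")
        if best < min_posterior or margin < min_margin:
            flagged.append((best, margin, module_id))
    return [module_id for _, _, module_id in sorted(flagged)]


# Made-up posteriors for three modules.
modules = {
    "m1": [0.95, 0.03, 0.02],  # clear winner, not flagged
    "m2": [0.40, 0.38, 0.22],  # low posterior and close runner-up
    "m3": [0.55, 0.45],        # acceptable posterior but close runner-up
}
queue = review_queue(modules)
print(queue)
```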
[Figure: dialog between the MonDO curator and the external ontology curator]
Application: merging diseases into
MonDO
https://github.com/monarch-initiative/monarch-disease-ontology
Ontology        Classes (before)  Classes (after merge)  SubClass axioms  Xrefs
Inputs:
DOID            6878              6012                   7082             36656
MESH (D)        11314             4152                   19036
OMIM (D)        7783              7783                   0                31242
Orphanet (D)    8740              4683                   15182            20326
OMIA            4833              4833                   3120             355
DC              209               208                    310              316
Medic           0                                        8630             3435
Output:
MonDO           39757             27617                  44837
Held back: NCIT, SNOMED, ICD9, GARD
Example Module Resolution: ITM2B
amyloidosis
Example failed resolution – due to
ontology error
https://github.com/monarch-initiative/monarch-disease-ontology/issues/99
https://github.com/DiseaseOntology/HumanDiseaseOntology/issues/164
Example failed resolution – due to
MESH duplicates
https://github.com/monarch-initiative/monarch-disease-ontology/issues/81
Evaluating results of disease merger
• No gold standard for multiple ontology merger
– Partial evaluation using held-back Orphanet NTBT/E calls:
• 6977/7986 (87% agreement)
• Ad-hoc evaluation by curator
– Approach: use posterior probabilities to rank modules requiring
attention
– This is the killer-app feature
– Iteratively refine curated probabilities
• https://github.com/monarch-initiative/monarch-disease-ontology/issues/
• Results
– Manual inspection and use of MonDO
– Detection of errors in source ontologies
• E.g. duplicates in MESH
• Incorrect xrefs in DO, e.g.
– https://github.com/DiseaseOntology/HumanDiseaseOntology/issues - issues #164, #163,
#156, #154, #151, #150, #149, #140, #135
Next Steps
• Integrate hypothetical axiom weight estimation into
Bayesian model
• Apply Markov Chain Monte Carlo (MCMC) methods for
estimating most likely graph
– E.g. Metropolis-Hastings
• Integrate other knowledge
– Logical Definitions (Phenotypes)
– Molecular knowledge
• Improve Evaluation
– Test k-BOOM on task where we have gold standard, e.g.
neuroanatomy/uberon
– Formal comparison with EFO, MedGen, …
Discussion
• Retrospective merging vs prospective
development
– Better to work together from outset (OBO model)
– However, current state of affairs is such that
expert knowledge is distributed across resources
– We want to preserve that rather than reinvent
– Coherent merging of molecular knowledge with
classical top-down knowledge will be required
moving forward
Implementation/Availability
• Software
– https://github.com/monarch-initiative/kboom
• Paper
– https://github.com/cmungall/kboom-paper
– http://biorxiv.org/content/early/2016/04/15/048843
• MonDO
– https://github.com/monarch-initiative/monarch-disease-ontology
– Both OWL ontology and axiom weight rules
Acknowledgments
k-BOOM
• Ian Holmes
• Sebastian Köhler
• Jim Balhoff
• Peter Robinson
• Melissa Haendel
Curation
• Nicole Vasilevsky (MonDO, DC)
• Sue Bello (DC)
• Elvira Mitraka (DO)
• Lynn Schriml (DO)
FUNDING: NIH Office of Director: 1R24OD011883; NIH-UDP:
HHSN268201300036C

Kboom phenoday-2016

  • 1. k-BOOM A Bayesian approach to ontology structure inference, with applications in disease ontology construction Chris Mungall Lawrence Berkeley Laboratory PhenoDay 2016 @monarchinit @chrismungall
  • 2. Building a cohesive, complete disease ontology Objective • Combine existing disease classifications and lists into unified cohesive framework • Best of all worlds • Integrate data from multiple resources Challenges • Current resources developed independently, different perspectives • Mappings are imprecise OMIM Orphanet DO MESH NCIT Decipher ICD SNOMED Combined, coherent view
  • 3. Disease classifications and why mappings are not enough • Given N disease lists – Where each provides cross-references (xrefs) to up to N-1 others – Up to (N^2)-N sets of mappings • Even more with 3rd party mappings – These are frequently • Inconsistent (directly or indirectly) • Different meanings and levels of specificity • Incomplete • Stale • Difficult to computationally verify • Fundamental issue – Xrefs lack semantics – Explicit semantics would enable computational checks Ont1 Ont2 Ont3 Ont4 Ont5 Ont6
  • 5.
  • 6.
  • 7. Objective: Coherent OWL Ontology Merging (OOM) • Criteria for OOM – Merged • Combines multiple lists and classifications (terminologies and lists treated as ‘degenerate’ ontologies), Presented as a single ontology • Equivalent classes merged – Logically Connected • OWL/Description Logic constructs – e.g. SubClassOf, EquivalentClass, SomeValuesFrom • Not xrefs – Coherent • Logically coherent: no unsatisfiable classes • Biologically coherent: makes biological and clinical sense
  • 8. Our previous approach, applied to phenotypes: L-DOOM Logical Definition based OWL Ontology Merging Mungall, C. J., Gkoutos, G., Smith, C., Haendel, M., Lewis, S., & Ashburner, M. (2010). Integrating phenotype ontologies across multiple species. Genome Biology, 11(1), R2. doi:10.1186/gb-2010-11-1-r2 Köhler, S., Doelken, S. C., Ruef, B. J., Bauer, S., Washington, N., Westerfield, M., … Mungall, C. J. (2013). Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research. F1000Research, 1–12. doi:10.3410/f1000research.2-30.v1 Application to diseases? • Works well for compositional classes (e.g. many cancer terms) • Less well for genetic diseases, complex syndromes 1. Assign Logical Definitions (OWL equivalence axioms) to classes in each ontology • Can be assigned manually or semi-automatically (Obol) HP:0002180 Neurodegeneration MP:0000876 Purkinje cell degeneration Equiv CL:0000540 neuron CL:0000121 Purkinje cell Equiv degenerate AND inheres-in SOME neuron degenerate AND inheres-in SOME Purkinje cell 2. Using reasoning to infer logical axioms SubClassOf
  • 9. Probabilistic Ontology OP = <A,H> BOOM Bayes OWL Ontology Merging: Finds the set of hypothetical axioms that maximises P(OP) Merged Coherent OWL Ontology Elk Reasoner Ontology 1 Inter-Ontology Mappings mapping tool Ontology 2 Ontology .. Ontology n Hypothetical Logical Axioms plus Weights (H) mapping curation Axiom Weight Estimator Weight Curation Next iteration Merge equivalent classes
  • 10. Generating hypothetical logical axioms Inter-Ontology Mappings Hypothetical Logical Axioms plus Weights (H) Axiom Weight Estimator E.g: OMIM:123 xref DOID:987 Pr(OMIM:123 ≡ DOID:987) = 0.3 Pr(OMIM:123 ⊂ DOID:987) = 0.4 Pr(OMIM:123 ⊃ DOID:987) = 0.1 Domain rules (lexical, structural, …):
  • 11. k-BOOM: algorithm for finding the most likely merged ontology. Goal: find values for H that maximise P, where h_i is a boolean representing the truth value of hypothetical axiom H_i. Problem: 2^N ontologies. 1. Factorize the calculation by dividing the combined axioms into k modules (k-BOOM): i. assert all hypothetical axioms to be true; ii. make a module from each equivalence clique. 2. Use a greedy algorithm, starting with the most likely hypothetical axioms in O_k. 3. Test each configuration for satisfiability using an OWL reasoner (Elk) (unsat => Pr = 0) and calculate its posterior probability. 4. Repeat until the number of tests exceeds a threshold. 5. Return the most likely configuration for O_k.
  • 12. Probability guided curator workflow: A little knowledge goes a long way • Run cycle • Examine results for modules with: – low posterior probability – low confidence (top ranked solution has similar P to next ranked) – Pr(H_i = true) << threshold • Apply biological/clinical knowledge • Override auto-generated hypothetical axiom weights with curated ones – Feedback issues to source ontologies • Repeat dialog Mondo curator External ontology curator
  • 13. Application: merging diseases into MonDO https://github.com/monarch-initiative/monarch-disease-ontology Columns: “Ontology”, Classes (before → after merge), SubClass axioms, Xrefs. Inputs: DOID 6878 → 6012, 7082, 36656; MESH (D) 11314 → 4152, 19036; OMIM (D) 7783 → 7783, 0, 31242; Orphanet (D) 8740 → 4683, 15182, 20326; OMIA 4833 → 4833, 3120, 355; DC 209 → 208, 310, 316; Medic 0, 8630, 3435. Output: MonDO 39757 → 27617, 44837. Held back: NCIT, SNOMED, ICD9, GARD
  • 14. Example Module Resolution: ITM2B amyloidosis
  • 15. Example failed resolution – due to ontology error https://github.com/monarch-initiative/monarch-disease-ontology/issues/99 https://github.com/DiseaseOntology/HumanDiseaseOntology/issues/164
  • 16. Example failed resolution – due to mesh duplicates https://github.com/monarch-initiative/monarch-disease-ontology/issues/81
  • 17. Evaluating results of disease merger • No gold standard for multiple ontology merger – Partial evaluation using held-back Orphanet NTBT/E calls: • 6977/7986 (87% agreement) • Ad-hoc evaluation by curator – Approach: use posterior probabilities to rank modules requiring attention – This is the killer-app feature – Iteratively refine curated probabilities • https://github.com/monarch-initiative/monarch-disease-ontology/issues/ • Results – Manual inspection and use of mondo – Detection of errors in source ontologies • E.g. duplicates in MESH • Incorrect xrefs in DO, e.g. – https://github.com/DiseaseOntology/HumanDiseaseOntology/issues - issues #164, #163, #156, #154, #151, #150, #149, #140, #135
  • 18. Next Steps • Integrate hypothetical axiom weight estimation into Bayesian model • Apply Markov Chain Monte Carlo (MCMC) methods for estimating most likely graph – e.g. Metropolis-Hastings • Integrate other knowledge – Logical Definitions (Phenotypes) – Molecular knowledge • Improve Evaluation – Test k-BOOM on a task where we have a gold standard, e.g. neuroanatomy/uberon – Formal comparison with EFO, MedGen, …
  • 19. Discussion • Retrospective merging vs prospective development – Better to work together from outset (OBO model) – However, current state of affairs is such that expert knowledge is distributed across resources – We want to preserve that rather than reinvent – Coherent merging of molecular knowledge with classical top-down knowledge will be required moving forward
  • 20. Implementation/Availability • Software – https://github.com/monarch-initiative/kboom • Paper – https://github.com/cmungall/kboom-paper – http://biorxiv.org/content/early/2016/04/15/048843 • MonDO – https://github.com/monarch-initiative/monarch-disease-ontology – Both OWL ontology and axiom weight rules
  • 21. Acknowledgments k-BOOM • Ian Holmes • Sebastian Köhler • Jim Balhoff • Peter Robinson • Melissa Haendel Curation • Nicole Vasilevsky (MonDO, DC) • Sue Bello (DC) • Elvira Mitraka (DO) • Lynn Schriml (DO) FUNDING: NIH Office of Director: 1R24OD011883; NIH-UDP: HHSN268201300036C
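
The search described on slides 10 and 11 can be sketched for a single module as follows. This is a minimal illustration under stated assumptions, not the kboom implementation: `is_coherent` is a hypothetical callback standing in for the Elk satisfiability test, the "none" state (no logical relation) is assumed to take the probability mass left over by the three weights on slide 10, the posterior is assumed to be the prior renormalised over coherent configurations, and the search is exhaustive rather than greedy with a test threshold, which is only practical for very small modules.

```python
from itertools import product
from math import prod

# Weights from slide 10 for the mapping "OMIM:123 xref DOID:987".
WEIGHTS = {
    ("OMIM:123", "DOID:987"): {"equiv": 0.3, "sub": 0.4, "super": 0.1},
}

def with_none(dist):
    """Copy dist and give the leftover mass to 'none' (an assumption)."""
    full = dict(dist)
    full["none"] = max(0.0, 1.0 - sum(dist.values()))
    return full

def best_configuration(weights, is_coherent):
    """Try every joint choice of relation per mapping.

    Returns (config, posterior) for the most probable coherent
    configuration, or (None, 0.0) if none is coherent.
    """
    pairs = list(weights)
    dists = [with_none(weights[p]) for p in pairs]
    best, best_p, total = None, 0.0, 0.0
    for choice in product(*(d.keys() for d in dists)):
        config = dict(zip(pairs, choice))
        if not is_coherent(config):  # unsatisfiable => Pr = 0
            continue
        p = prod(d[r] for d, r in zip(dists, choice))
        total += p
        if p > best_p:
            best, best_p = config, p
    return best, (best_p / total if total else 0.0)
```

Calling `best_configuration(WEIGHTS, lambda config: True)` treats every configuration as coherent, so the SubClassOf reading with weight 0.4 is selected. Passing a callback that rejects some configurations shows how an unsatisfiable result shifts the probability mass to the remaining ones.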

Editor's Notes

  1. 20 minutes. Sat July 9. 9.40am
  2. TODO: Make data integration
  3. https://github.com/monarch-initiative/monarch-disease-ontology/issues/90 Note the two subgraphs; little overlap in the upper areas
  4. Note Typical (top left) and Atypical are connected
  5. Note Typical (top left) and Atypical are connected
  6. We treat every resource as an ontology, even the degenerate case where it’s a flat list (e.g. OMIM). Pink = novel
  7. Heuristic/ad-hoc
  8. Fig. 2. Module resolution graph exported by kBOOM; Initial input is nodes plus solid arrows (SubClassOf axioms in ORDO). Dotted lines are supplied mappings (no logical interpretation). Figure shows inferred most likely configuration. equivalence=red, subclass=blue, with prior probabilities written as edge labels (thick lines more probable). Enclosing boxes denote equivalence cliques, which can be merged to a single class, yielding a grouping class with two children.
  9. TODO: Example of dupes in MESH Highlight flipping example
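
Note 8 describes equivalence cliques being merged into a single class. A minimal sketch of that grouping step is given here, assuming the accepted equivalence axioms are available as pairs of class IDs. The function name and the IDs in the usage note are made up for illustration, and kboom itself may group classes differently.

```python
def equivalence_cliques(equiv_pairs):
    """Group class IDs connected by equivalence axioms (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Union the two sides of every equivalence axiom.
    for a, b in equiv_pairs:
        parent[find(a)] = find(b)

    # Collect the members of each group under its root.
    cliques = {}
    for node in list(parent):
        cliques.setdefault(find(node), set()).add(node)
    return sorted(sorted(c) for c in cliques.values())
```

For example, the pairs `("OMIM:1", "DOID:1")` and `("DOID:1", "ORDO:1")` end up in one group because equivalence is transitive, and each resulting group could then be collapsed to a single merged class.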