Giovanni Bocci, Cristian Bologa, Daniel Byrd, Jayme Holmes, Stephen
Mathias, Oleg Ursu, Anna Waller, Jeremy Yang & Tudor Oprea
03/15/2019
INBRE-NMBIST Symposium
Santa Fe, NM
Funding: NIH U24 CA224370 & NIH U24 TR002278
ILLUMINATING THE DRUGGABLE
GENOME WITH KNOWLEDGE
ENGINEERING AND MACHINE
LEARNING
datascience.unm.edu pharos.nih.gov/idg/ druggablegenome.net
75% of protein research is still
focused on the 10% of genes known
before the human genome was mapped
AM Edwards et al, Nature, 2011
This prompted NIH to start the
Illuminating the Druggable Genome
Initiative (U54, Common Fund)
HGP
3
"If I have seen further it is by
standing on the shoulders of
Giants." - Isaac Newton, ~1675
Organization of this talk:
1. Shoulders development (knowledge engineering).
2. Seeing further efforts (machine learning).
4
IDG: AN INTERNATIONAL CONSORTIUM
5
http://hscnews.unm.edu/news/shedding-light-on-the-dark-genome
6
pharos.nih.gov/idg/ Omics
Dimensions
ML READY: PHAROS
11/13/18 revision
https://pharos.nih.gov/idg/targets/GRIN2A
7
ML READY: DRUGCENTRAL
10/16/18 revision
http://drugcentral.org/drugcard/1679
8
ML READY: HARMONIZOME
9
https://amp.pharm.mssm.edu/Harmonizome/
COMPONENTS OF IDG
https://druggablegenome.net/
DRGC
RDOC
IT
KMC
RFA-RM-16-026 (DRGC)
GPCR, U24 DK116195:
Bryan Roth, M.D., Ph.D. (UNC)
Brian Shoichet, Ph.D. (UCSF)
Ion Channel, U24 DK116214:
Lily Jan, Ph.D. (UCSF)
Michael T. McManus, Ph.D. (UCSF)
Kinase, U24 DK116204:
Gary L. Johnson, Ph.D. (UNC)
RFA-RM-16-025 (RDOC)
U24 TR002278:
Stephan C. Schürer, Ph.D. (UMiami)
Dusica Vidovic, Ph.D. (UMiami)
Tudor Oprea, M.D., Ph.D. (UNM)
Larry A. Sklar, Ph.D. (UNM)
RFA-RM-16-024 (KMC)
U24 CA224260:
Avi Ma’ayan, Ph.D. (ISMMS)
U24 CA224370:
Tudor Oprea, M.D., Ph.D. (UNM)
RFA-RM-18-011 (CEIT)
Awards starting date March 2019
Further information
Email: idg.rdoc@gmail.com
Follow: @DruggableGenome
URLs:
https://druggablegenome.net/
https://commonfund.nih.gov/idg/
IDG Knowledge User-Interface
Email: pharos@mail.nih.gov
Follow: @IDG_Pharos
URL: https://pharos.nih.gov/
10
TARGET DEVELOPMENT LEVEL (TDL)
▪ Most protein classification schemes are
based on structural and functional criteria.
▪ For therapeutic development, it is useful to
understand how much and what types of
data are available for a given protein,
thereby highlighting well-studied and
understudied targets.
▪ Tclin: Proteins annotated as drug targets
▪ Tchem: Proteins for which potent small
molecules are known
▪ Tbio: Proteins for which biology is better
understood
▪ Tdark: These proteins lack antibodies,
publications or Gene RIFs
3/23/18 revision
T. Oprea et al., Nature Rev. Drug Discov. 2018,
https://www.nature.com/articles/nrd.2018.14
11
TDL LEVELS: Tclin and Tchem
▪ Tclin proteins are associated
with drug Mechanism of Action
(MoA) – NRDD 2017
▪ Tchem proteins have
bioactivities in ChEMBL and
DrugCentral, + human curation
for some targets
▪ Kinases: <= 30nM
▪ GPCRs: <= 100nM
▪ Nuclear Receptors: <= 100nM
▪ Ion Channels: <= 10μM
▪ Non-IDG Family Targets: <= 1μM
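The family-specific potency cutoffs above can be written as a simple lookup. This is a minimal sketch, assuming activities have already been converted to nanomolar; the dictionary and function names are hypothetical and are not part of TCRD, Pharos, ChEMBL or DrugCentral.

```python
# Potency cutoffs from the slide, in nM (10 uM = 10,000 nM; 1 uM = 1,000 nM).
TCHEM_CUTOFF_NM = {
    "Kinase": 30.0,
    "GPCR": 100.0,
    "Nuclear Receptor": 100.0,
    "Ion Channel": 10_000.0,
}
DEFAULT_CUTOFF_NM = 1_000.0  # non-IDG family targets


def meets_tchem_cutoff(family: str, activity_nm: float) -> bool:
    """Return True if the activity is at or below the cutoff for the family."""
    cutoff = TCHEM_CUTOFF_NM.get(family, DEFAULT_CUTOFF_NM)
    return activity_nm <= cutoff


print(meets_tchem_cutoff("Kinase", 25.0))
print(meets_tchem_cutoff("GPCR", 250.0))
```

Any family not listed in the dictionary falls back to the 1 μM cutoff, mirroring the "Non-IDG Family Targets" bullet.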
10/19/16 revision
Bioactivities of approved drugs (by Target class)
ChEMBL: database of bioactive chemicals
https://www.ebi.ac.uk/chembl/
DrugCentral: online drug compendium
http://drugcentral.org/
R. Santos et al., Nature Rev. Drug Discov. 2017, https://www.nature.com/articles/nrd.2016.230
12
TDL LEVELS: Tbio and Tdark
▪ Tbio proteins lack small molecule annotation cf. Tchem criteria,
and satisfy one of these criteria:
▪ protein is above the cutoff criteria for Tdark
▪ protein is annotated with a GO Molecular Function or Biological Process
leaf term(s) with an Experimental Evidence code
▪ protein has confirmed OMIM phenotype(s)
▪ Tdark (“ignorome”) proteins have little information available, and satisfy
these criteria:
▪ PubMed text-mining score from Jensen Lab < 5
▪ <= 3 Gene RIFs
▪ <= 50 Antibodies available according to antibodypedia.com
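The four-level assignment described on the TDL slides can be sketched as a small decision function. This is an illustration only, not the TCRD implementation: the `ProteinEvidence` fields and the `required_dark` parameter are hypothetical names, and the default of requiring all three Tdark criteria is an assumption based on a literal reading of the slide.

```python
from dataclasses import dataclass


@dataclass
class ProteinEvidence:
    is_drug_moa_target: bool        # annotated as a drug mechanism-of-action target
    has_potent_compound: bool       # passes the Tchem potency cutoff
    pubmed_score: float             # Jensen Lab PubMed text-mining score
    gene_rifs: int                  # number of Gene RIFs
    antibodies: int                 # antibody count from antibodypedia.com
    has_experimental_go_leaf: bool  # GO MF/BP leaf term with experimental evidence
    has_omim_phenotype: bool        # confirmed OMIM phenotype


def dark_criteria_met(p: ProteinEvidence) -> int:
    """Count how many of the three Tdark criteria the protein satisfies."""
    return sum([p.pubmed_score < 5, p.gene_rifs <= 3, p.antibodies <= 50])


def assign_tdl(p: ProteinEvidence, required_dark: int = 3) -> str:
    """Assign a TDL; `required_dark` (an assumption) sets how many dark criteria are needed."""
    if p.is_drug_moa_target:
        return "Tclin"
    if p.has_potent_compound:
        return "Tchem"
    above_dark_cutoff = dark_criteria_met(p) < required_dark
    if above_dark_cutoff or p.has_experimental_go_leaf or p.has_omim_phenotype:
        return "Tbio"
    return "Tdark"


example = ProteinEvidence(False, False, 1.2, 0, 12, False, False)
print(assign_tdl(example))
```

Checking Tclin first and Tchem second reflects the ordering implied by the slides, where Tbio is defined as lacking small-molecule annotation.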
13
TDL: EXTERNAL VALIDATION
Tdark parameters differ from the other TDLs across the 4 external
metrics cf. Kruskal-Wallis post-hoc pairwise Dunn tests
2/23/18 revision
T. Oprea et al., Nature Rev. Drug Discov. 2018,
https://www.nature.com/articles/nrd.2018.14
14
WHY FUND TDARK RESEARCH?
2/23/18 revision
T. Oprea et al., Nature Rev. Drug Discov. 2018,
https://www.nature.com/articles/nrd.2018.14
Typically, it takes 15-20 years for a Tdark protein to become druggable
15
IMPC BOLDLY GOES WHERE NO ONE
HAS GONE BEFORE
95% of eligible IDG genes
(339/356) have plans,
attempts, or models
384 genes were prioritized
by IDG KMC (2014-2016) 17
168306 Tbio genes
90 Tdark genes
42 Tchem genes
11/29/17 revision
Slide from Steve Murray, Jackson Lab
16
TAKE HOME MESSAGE:
THERE IS A
KNOWLEDGE DEFICIT
3/12/18 revision
~35% of the proteins remain
poorly described (Tdark)
~11% of the Proteome (Tclin & Tchem) are currently targeted
by small molecule probes
Choosing to work on dark genes is a high-risk endeavor
(Funders are less likely to award grants for Tdark)
CHALLENGE: RANKING & SCORING
PROTEIN-DISEASE ASSOCIATIONS
https://pharos-beta.ncats.io/targets/GRIN2A
The IDG KMC tracks more than ~10 information
channels for protein-disease associations,
accessible via the Pharos portal.
Our challenge is to harmonize disease
concepts, and to enable computational
use: e.g., GRIN2A with GRIN1 form the
Glutamate NMDA receptor, MoA drug
target for memantine (Alzheimer’s).
The challenge for ML & AI: How to
prioritize targets? i.e., which
protein-disease associations are clinically
actionable?
10/07/18 revision
18
WHAT DO WE KNOW ABOUT
DISEASES?
▪ There are between 9,000 and 25,000 disease concepts
▪ Pharos/TCRD tracks ~11,000 diseases via Disease
Ontology, and ~10,500 rare diseases via eRAM,
OrphaNet and the Monarch Initiative MONDO system
19
PROTEIN KNOWLEDGE GRAPHS
▪ IDG KMC2 seeks knowledge gaps
across the five branches of the
“knowledge tree”:
▪ Genotype; Phenotype; Interactions
& Pathways; Structure & Function;
and Expression, respectively.
▪ We can use biological systems
network modeling to infer novel
relationships based on available
evidence, and infer new “function”
and “role in disease” data based
on other layers of evidence
▪ Primary focus on Tdark & Tbio
O. Ursu, T Oprea et al., IDG2 KMC 2/01/18 revision
20
THE METAPATH-ML APPROACH
▪ A metapath is a sequence of
relations defined between
different object types.
▪ Our metapaths encode
type-specific network topology
between the source node (Protein)
and the destination node
(Disease/Phenotype).
▪ This approach enables the
transformation of
assertions/evidence chains of
heterogeneous biological data
types into a ML ready format.
SOME REFS: G. Fu et al., BMC Bioinformatics 2016. D Himmelstein & S Baranzini, PLOS Comp Bio, 2015.
Similar assertions or evidence form metapaths (white).
Instances of metapath (paths) are used to determine the strength of the evidence linking a
gene to disease/phenotype/function.
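To make the distinction between a metapath and its path instances concrete, here is a minimal sketch on a toy heterogeneous graph. All node names, relations and the `path_instances` helper are hypothetical and illustrative; this is not the MetapathML code.

```python
from collections import defaultdict

# Toy heterogeneous graph; node names and relations are illustrative only.
NODE_TYPE = {
    "P1": "Protein", "P2": "Protein", "P3": "Protein",
    "PW1": "Pathway", "D1": "Disease",
}
EDGES = [
    ("P1", "member_of", "PW1"),
    ("P2", "member_of", "PW1"),
    ("P3", "member_of", "PW1"),
    ("P2", "associated_with", "D1"),
    ("P3", "associated_with", "D1"),
]

adjacency = defaultdict(list)
for src, rel, dst in EDGES:
    adjacency[(src, rel)].append(dst)
    adjacency[(dst, "inv_" + rel)].append(src)  # allow traversal against edge direction


def path_instances(source, metapath):
    """Enumerate all paths from `source` that follow the metapath.

    `metapath` is a list of (relation, expected_node_type) steps.
    """
    paths = [[source]]
    for relation, node_type in metapath:
        extended = []
        for path in paths:
            for nxt in adjacency[(path[-1], relation)]:
                # Keep only neighbors of the expected type; do not revisit nodes.
                if NODE_TYPE[nxt] == node_type and nxt not in path:
                    extended.append(path + [nxt])
        paths = extended
    return paths


# Protein -(member_of)-> Pathway <-(member_of)- Protein -(associated_with)-> Disease
METAPATH = [
    ("member_of", "Pathway"),
    ("inv_member_of", "Protein"),
    ("associated_with", "Disease"),
]

instances = path_instances("P1", METAPATH)
print(len(instances), instances)
```

The number of instances found for a given protein and metapath is one plausible way to quantify the strength of evidence; how MetapathML actually scores instances is not specified in these slides.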
21
22
SOME EARLY ACKNOWLEDGMENTS ...
Abstract: We hear a lot about machine learning and its role in health
care, but these methods require large amounts of training data. Using
these and other related methods to study rare diseases poses
substantial challenges: how can we get tens of thousands of training
examples when there are tens or hundreds of people with a disease?
(Abstract from this
conference)
Scarce training data our
problem too. But with
genes instead of people.
Our MetapathML method
similar to
Himmelstein-Baranzini.
(Daniel is now post-doc in
Greene Lab.)
METAPATH-ML DATA SOURCES
O. Ursu et al., manuscript in preparation
Data source        Data type                                     Data points
CCLE               Gene expression                               19,006,134
GTEx               Gene expression                               2,612,227
Protein Atlas      Gene & Protein expression                     949,199
Reactome           Biological pathways                           303,681
KEGG               Biological pathways                           27,683
StringDB           Protein-Protein interactions                  5,080,023
Gene ontology      Biological pathways & Gene function           434,317
InterPro           Protein structure and function                467,163
ClinVar            Human Gene - Disease/Phenotype associations   881,357
GWAS               Gene - Disease/Phenotype associations         54,360
OMIM               Human Gene - Disease/Phenotype associations   25,557
UniProt Disease    Human Gene - Disease/Phenotype associations   5,365
JensenLab DISEASE  Gene - Disease associations from text mining  44,829
NCBI Homology      Homology mapping of human/mouse/rat genes     70,922
IMPC               Mouse Gene - Phenotype associations           2,153,999
RGD                Rat Gene - Phenotype associations             117,606
LINCS              Drug induced gene signatures                  230,111,315
We developed automated
methods for data collection
(TCRD), visualization (Pharos)
and data aggregation.
 
These aggregated datasets
were used to build machine
learning models for 20+
diseases and 73 mouse
phenotypes.
Each knowledge graph
contains ~22,000 metapaths
and 284 million path instances.
10/07/18 revision
23
METAPATH-ML WORKFLOW
▪ A meta-path encodes type-specific network topology between the source node
(e.g., Protein target) and the destination node (e.g., Disease or Function)
▪ Target –– (member of) → PPI Network ← (member of) –– Protein –– (associated
with) → Disease
▪ Target –– (expressed in) → Tissue ← (localized in) –– Disease
O. Ursu, T Oprea et al., IDG2 KMC 2/01/18 revision
24
METAPATH-ML @ UNM
one protein-disease
association at a time
O. Ursu, T Oprea et al., IDG2 KMC 2/01/18 revision
Genes associated with a disease/phenotype are positive examples, whereas genes lacking the same
association are negative examples. The Metapath approach transforms assertions/evidence chains into
classification problems that can be solved using suitably designed machine learning algorithms.
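The labeling scheme described above can be sketched as follows: each gene becomes a row of metapath-derived features, with label 1 if the gene is associated with the disease and 0 otherwise. The gene names, metapath names and counts are invented for illustration, and `build_dataset` is a hypothetical helper, not part of MetapathML.

```python
# Hypothetical path-instance counts per gene and metapath (illustrative values only).
path_counts = {
    "GENE_A": {"PPI->Disease": 4, "Tissue->Disease": 1},
    "GENE_B": {"PPI->Disease": 0, "Tissue->Disease": 2},
    "GENE_C": {"Tissue->Disease": 3},
}
known_disease_genes = {"GENE_A"}  # positives; every other gene is treated as negative


def build_dataset(path_counts, positives):
    """Return (gene order, feature names, X rows, y labels) for a binary classifier."""
    genes = sorted(path_counts)
    features = sorted({m for counts in path_counts.values() for m in counts})
    # Missing metapaths are filled with 0, i.e. no supporting path instance.
    X = [[path_counts[g].get(m, 0) for m in features] for g in genes]
    y = [1 if g in positives else 0 for g in genes]
    return genes, features, X, y


genes, features, X, y = build_dataset(path_counts, known_disease_genes)
for gene, row, label in zip(genes, X, y):
    print(gene, row, label)
```

Note that treating unlabeled genes as negatives is an assumption of this setup: some "negatives" may be undiscovered true associations.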
25
Use of XGBoost
(XGBoost = eXtreme Gradient Boosting)
● https://xgboost.ai/
● GitHub
● Documentation
● R package
● Exceptional interpretability
MetapathML employs XGBoost via the R package API. The inputs to XGBoost
are datasets specific to each disease or phenotype. For each disease/phenotype
some known associated genes correspond with the positive Y labels in the
dataset. XGBoost parameters are optimized via grid search, i.e. iterative testing
over discrete parameter value combinations.
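Grid search as described above can be sketched in a few lines. The source uses XGBoost through its R package; the sketch below is in Python and deliberately substitutes a placeholder `toy_evaluate` function for model training, so it shows only the search loop. The parameter names `max_depth` and `eta` are real XGBoost parameters, but the grid values are illustrative assumptions, since the values actually searched are not given.

```python
from itertools import product

# Illustrative grid; the values searched by MetapathML are not given in the source.
PARAM_GRID = {
    "max_depth": [3, 6, 9],
    "eta": [0.05, 0.1, 0.3],
}


def grid_search(param_grid, evaluate):
    """Try every parameter combination and return the best (params, score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_evaluate(params):
    """Stand-in for cross-validated model performance; NOT a real model."""
    return -abs(params["max_depth"] - 6) - abs(params["eta"] - 0.1)


best_params, best_score = grid_search(PARAM_GRID, toy_evaluate)
print(best_params, best_score)
```

In practice `evaluate` would train a model with the given parameters and return a cross-validated metric; the cost grows multiplicatively with the number of values per parameter.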
27
ALZHEIMER’S DISEASE (AD) METAPATH
ML MODEL
▪ Build data matrix from “Alzheimer’s disease” in
TCRD subset
▪ protein knowledge graph along metapaths:
▪ Protein – Protein Interactions
▪ Pathways
▪ GO terms
▪ Gene expression
▪ Etc.
▪ Training set: 53 genes associated with
Alzheimer’s disease (positives); 3,952 genes
associated with other pathologies from OMIM
were assumed to be negative
▪ Test set: 23 genes associated with Alzheimer's
(positives) and 200 genes not associated with
Alzheimer's (negatives) ← from Text Mining
▪ “Complete forest” binary classifier using
XGBoost & 5-fold cross-validation.
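With only 53 positives among roughly 4,000 training genes, how the five folds are drawn matters. The sketch below assigns examples to folds per class so each fold receives a similar share of positives. Stratification is an assumption here; the slides state only that 5-fold cross-validation was used, and `stratified_folds` is a hypothetical helper.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign each example index to one of k folds, balancing classes across folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal indices of this class round-robin so fold sizes differ by at most one.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# Class sizes taken from the training set described above.
labels = [1] * 53 + [0] * 3952
folds = stratified_folds(labels)
for held_out in folds:
    n_pos = sum(labels[i] for i in held_out)
    print(len(held_out), n_pos)
```

Each fold is held out once for evaluation while the remaining four are used for training.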
2/14/18 revision
ML work by Oleg Ursu
Test-set confusion matrix (rows: actual, columns: predicted)
                Predicted
                Pos   Neg
Actual   Pos     20     3
         Neg     41   159
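Standard performance metrics follow directly from these four counts. The sketch below assumes rows are actual classes and columns are predicted classes, which is consistent with the stated test set of 23 positives and 200 negatives.

```python
# Counts from the test-set confusion matrix (rows = actual, columns = predicted).
tp, fn = 20, 3    # actual positives: predicted positive / predicted negative
fp, tn = 41, 159  # actual negatives: predicted positive / predicted negative

sensitivity = tp / (tp + fn)                # recall on known AD genes
specificity = tn / (tn + fp)                # fraction of negatives correctly rejected
precision = tp / (tp + fp)                  # fraction of predicted positives that are known AD genes
accuracy = (tp + tn) / (tp + fn + fp + tn)

print(f"sensitivity={sensitivity:.3f}")
print(f"specificity={specificity:.3f}")
print(f"precision={precision:.3f}")
print(f"accuracy={accuracy:.3f}")
```

The low precision relative to sensitivity is expected in this setting: the 41 "false positives" are genes not previously linked to the disease, which is exactly the pool from which novel candidates are drawn.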
29
AD XGBOOST CLASSIFIER:
VARIABLE IMPORTANCE PLOT
▪ The top most important features are interactions with
proteins mediating inflammatory processes
(JAK2/Tclin, IL10 & IL2 / Tchem), response to oxidative
stress (GSTP1/Tchem), nervous system development
(BDNF/Tbio) and glycolysis (GAPDH/Tchem).
▪ LINCS drug-induced gene expression perturbations are
the largest category of features for these predictions.
▪ Brain cortex expression is a necessary requirement.
▪ One Reactome pathway (AU-rich mRNA elements binding
proteins) is also important.
▪ A weighted approach showed better performance in the
test set for Alzheimer's Disease, Schizophrenia, and Dilated
Cardiomyopathy.
4/23/18 revision
ML work by Oleg Ursu
30
EXPERIMENTAL VALIDATION: AD
▪ SHSY5Ys pTau siRNA test
▪ Measured pTau levels after knock-down of gene expression
• Human iPSNs qPCR
▪ Measuring endogenous gene expression levels, AD vs Ctrl
▪ Western blot or ICC to characterize AD phenotype versus control
• Human Tissue qPCR
▪ Measuring endogenous gene expression levels, AD vs Ctrl
▪ Western blot or ICC to characterize AD phenotype versus control
11/14/18 revision
AD validation work by Jessica Binder & Kiran Bhaskar (UNM), funded by U24CA224370-S2 supplement
31
2/14/19 revision
AD validation work by Jessica Binder & Kiran Bhaskar (UNM), funded by U24CA224370-S2 supplement
▪ Validation on the 20 predicted genes: AKNA, BCO2, CCNY,
CRTAM, FAM92B, FOXP4, FRRS1, GRIN2C, IL17REL, LILRA3, LMO4,
NDRG2, PIBF1, RAB40A, SCGB3A1, SLC44A2, SPOP, STARD3,
TMEFF2, TXNDC12
▪ The most obvious effects, based on the combined Cellomics &
qPCR of iPSNs & autopsy brains, suggest that AKNA, LILRA3,
NDRG2 and TXNDC12 significantly increased pTau (as tracked
by two different antibodies for T180, S202 and S205)
▪ For now, it appears that machine learning models may have
identified between 4 and 7 new genes that have previously not
been associated with Alzheimer’s Disease
32
EXPERIMENTAL VALIDATION: AD
33
EXPERIMENTAL VALIDATION: MORE
DISEASES AND COLLABORATORS
Disease: Experimental collaboration
Prostate cancer: Work by Art Cherkasov, Kriti Singh & Mike Hsing (UBC, Vancouver). Of
the top 50 ML predicted genes, 19 are commonly upregulated in the YZ Wang
Transdifferentiation PDX model and the Beltran dataset 2016.
Ovarian cancer: Spheroid tumor & patient-derived xenograft (PDX) work by Mara
Steinkamp (UNM). Of the top 63 ML predicted genes, 12 show
significant changes in cancer cells.
NEXT STEPS:
● In vivo experiments.
● More diseases and phenotypes.
ML LEARNINGS IN TARGET AND
DRUG DISCOVERY
1. Model quality is limited by data quality. Good data → good models.
2. ML can identify hidden patterns in big data. For example, the
central node(s) in PPI network(s) that play a critical role in
disease pathology.
3. Deep learning not so applicable to our task (better for tall datasets,
well defined good solutions, less need for interpretability).
4. XGBoost (decision tree algorithm) excels in performance &
interpretability.
5. Shows real promise in Target Repurposing.
34
35
IN CLOSING...
● IDG platform for knowledge
discovery about the "dark genome."
● ML provides new insights by
integrating multi-omics knowledge
graphs.
● Hard questions should be directed to
Tudor Oprea!

Canonicalized systematic nomenclature in cheminformaticsJeremy Yang
 
Molecular scaffolds poster
Molecular scaffolds posterMolecular scaffolds poster
Molecular scaffolds posterJeremy Yang
 
Molecular scaffolds are special and useful guides to discovery
Molecular scaffolds are special and useful guides to discoveryMolecular scaffolds are special and useful guides to discovery
Molecular scaffolds are special and useful guides to discoveryJeremy Yang
 
The BADAPPLE promiscuity plugin for BARD
The BADAPPLE promiscuity plugin for BARDThe BADAPPLE promiscuity plugin for BARD
The BADAPPLE promiscuity plugin for BARDJeremy Yang
 
Cheminformatics Software Development: Case Studies
Cheminformatics Software Development: Case StudiesCheminformatics Software Development: Case Studies
Cheminformatics Software Development: Case StudiesJeremy Yang
 
How am I supposed to organize a protein database when I can't even organize m...
How am I supposed to organize a protein database when I can't even organize m...How am I supposed to organize a protein database when I can't even organize m...
How am I supposed to organize a protein database when I can't even organize m...Jeremy Yang
 
UNM Division of Biocomputing public web applications
UNM Division of Biocomputing public web applicationsUNM Division of Biocomputing public web applications
UNM Division of Biocomputing public web applicationsJeremy Yang
 
Cyberinfrastructure Day 2010: Applications in Biocomputing
Cyberinfrastructure Day 2010: Applications in BiocomputingCyberinfrastructure Day 2010: Applications in Biocomputing
Cyberinfrastructure Day 2010: Applications in BiocomputingJeremy Yang
 
Promiscuous patterns and perils in PubChem and the MLSCN
Promiscuous patterns and perils in PubChem and the MLSCNPromiscuous patterns and perils in PubChem and the MLSCN
Promiscuous patterns and perils in PubChem and the MLSCNJeremy Yang
 

Mais de Jeremy Yang (20)

TIGA: Target Illumination GWAS Analytics
TIGA: Target Illumination GWAS AnalyticsTIGA: Target Illumination GWAS Analytics
TIGA: Target Illumination GWAS Analytics
 
DrugCentralDb and BioClients: Dockerized PostgreSql with Python API-tizer
DrugCentralDb and BioClients: Dockerized PostgreSql with Python API-tizerDrugCentralDb and BioClients: Dockerized PostgreSql with Python API-tizer
DrugCentralDb and BioClients: Dockerized PostgreSql with Python API-tizer
 
Mining ClinicalTrials.gov via CTTI AACT for drug target hypotheses
Mining ClinicalTrials.gov via CTTI AACT for drug target hypothesesMining ClinicalTrials.gov via CTTI AACT for drug target hypotheses
Mining ClinicalTrials.gov via CTTI AACT for drug target hypotheses
 
TIN-X v2: modernized architecture with REST API
TIN-X v2: modernized architecture with REST APITIN-X v2: modernized architecture with REST API
TIN-X v2: modernized architecture with REST API
 
Ex-files: Sex-Specific Gene Expression Profiles Explorer
Ex-files: Sex-Specific Gene Expression Profiles ExplorerEx-files: Sex-Specific Gene Expression Profiles Explorer
Ex-files: Sex-Specific Gene Expression Profiles Explorer
 
Open Phenotypic Drug Discovery Resource poster
Open Phenotypic Drug Discovery Resource posterOpen Phenotypic Drug Discovery Resource poster
Open Phenotypic Drug Discovery Resource poster
 
Badapple: promiscuity patterns from noisy evidence (poster)
Badapple: promiscuity patterns from noisy evidence (poster)Badapple: promiscuity patterns from noisy evidence (poster)
Badapple: promiscuity patterns from noisy evidence (poster)
 
Bibliological data science and drug discovery
Bibliological data science and drug discoveryBibliological data science and drug discovery
Bibliological data science and drug discovery
 
BioMISS: Language Diversity of Computing
BioMISS: Language Diversity of ComputingBioMISS: Language Diversity of Computing
BioMISS: Language Diversity of Computing
 
The Language Diversity of Computing
The Language Diversity of ComputingThe Language Diversity of Computing
The Language Diversity of Computing
 
RMSD: routine measure stirs doubts
RMSD: routine measure stirs doubtsRMSD: routine measure stirs doubts
RMSD: routine measure stirs doubts
 
Canonicalized systematic nomenclature in cheminformatics
Canonicalized systematic nomenclature in cheminformaticsCanonicalized systematic nomenclature in cheminformatics
Canonicalized systematic nomenclature in cheminformatics
 
Molecular scaffolds poster
Molecular scaffolds posterMolecular scaffolds poster
Molecular scaffolds poster
 
Molecular scaffolds are special and useful guides to discovery
Molecular scaffolds are special and useful guides to discoveryMolecular scaffolds are special and useful guides to discovery
Molecular scaffolds are special and useful guides to discovery
 
The BADAPPLE promiscuity plugin for BARD
The BADAPPLE promiscuity plugin for BARDThe BADAPPLE promiscuity plugin for BARD
The BADAPPLE promiscuity plugin for BARD
 
Cheminformatics Software Development: Case Studies
Cheminformatics Software Development: Case StudiesCheminformatics Software Development: Case Studies
Cheminformatics Software Development: Case Studies
 
How am I supposed to organize a protein database when I can't even organize m...
How am I supposed to organize a protein database when I can't even organize m...How am I supposed to organize a protein database when I can't even organize m...
How am I supposed to organize a protein database when I can't even organize m...
 
UNM Division of Biocomputing public web applications
UNM Division of Biocomputing public web applicationsUNM Division of Biocomputing public web applications
UNM Division of Biocomputing public web applications
 
Cyberinfrastructure Day 2010: Applications in Biocomputing
Cyberinfrastructure Day 2010: Applications in BiocomputingCyberinfrastructure Day 2010: Applications in Biocomputing
Cyberinfrastructure Day 2010: Applications in Biocomputing
 
Promiscuous patterns and perils in PubChem and the MLSCN
Promiscuous patterns and perils in PubChem and the MLSCNPromiscuous patterns and perils in PubChem and the MLSCN
Promiscuous patterns and perils in PubChem and the MLSCN
 

Último

Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxSwapnil Therkar
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Types of different blotting techniques.pptx
Types of different blotting techniques.pptxTypes of different blotting techniques.pptx
Types of different blotting techniques.pptxkhadijarafiq2012
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 

Último (20)

Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Types of different blotting techniques.pptx
Types of different blotting techniques.pptxTypes of different blotting techniques.pptx
Types of different blotting techniques.pptx
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 

Illuminating the Druggable Genome with Knowledge Engineering and Machine Learning

  • 1. ILLUMINATING THE DRUGGABLE GENOME WITH KNOWLEDGE ENGINEERING AND MACHINE LEARNING. Giovanni Bocci, Cristian Bologa, Daniel Byrd, Jayme Holmes, Stephen Mathias, Oleg Ursu, Anna Waller, Jeremy Yang & Tudor Oprea. 03/15/2019, INBRE-NMBIST Symposium, Santa Fe, NM. Funding: NIH U24 CA224370 & NIH U24 TR002278. datascience.unm.edu, pharos.nih.gov/idg/, druggablegenome.net
  • 2. 75% of protein research is still focused on the 10% of genes known before the human genome was mapped (AM Edwards et al., Nature, 2011). This prompted NIH to start the Illuminating the Druggable Genome Initiative (U54, Common Fund). Figure label: HGP.
  • 3. "If I have seen further it is by standing on the shoulders of Giants." - Isaac Newton, ~1675. Organization of this talk: 1. Shoulders development (knowledge engineering). 2. Seeing further efforts (machine learning).
  • 7. ML READY: PHAROS. https://pharos.nih.gov/idg/targets/GRIN2A (11/13/18 revision)
  • 8. ML READY: DRUGCENTRAL. http://drugcentral.org/drugcard/1679 (10/16/18 revision)
  • 10. COMPONENTS OF IDG (https://druggablegenome.net/): DRGC, RDOC, IT, KMC
      ▪ RFA-RM-16-026 (DRGC), GPCR, U24 DK116195: Bryan Roth, M.D., Ph.D. (UNC); Brian Shoichet, Ph.D. (UCSF)
      ▪ RFA-RM-16-026 (DRGC), Ion Channel, U24 DK116214: Lily Jan, Ph.D. (UCSF); Michael T. McManus, Ph.D. (UCSF)
      ▪ RFA-RM-16-026 (DRGC), Kinase, U24 DK116204: Gary L. Johnson, Ph.D. (UNC)
      ▪ RFA-RM-16-025 (RDOC), U24 TR002278: Stephan C. Schürer, Ph.D. (UMiami); Dusica Vidovic, Ph.D. (UMiami); Tudor Oprea, M.D., Ph.D. (UNM); Larry A. Sklar, Ph.D. (UNM)
      ▪ RFA-RM-16-024 (KMC), U24 CA224260: Avi Ma’ayan, Ph.D. (ISMMS)
      ▪ RFA-RM-16-024 (KMC), U24 CA224370: Tudor Oprea, M.D., Ph.D. (UNM)
      ▪ RFA-RM-18-011 (CEIT): awards starting date March 2019
      ▪ Further information: Email: idg.rdoc@gmail.com; Follow: @DruggableGenome; URLs: https://druggablegenome.net/ and https://commonfund.nih.gov/idg/
      ▪ IDG Knowledge User-Interface: Email: pharos@mail.nih.gov; Follow: @IDG_Pharos; URL: https://pharos.nih.gov/
  • 11. TARGET DEVELOPMENT LEVEL (TDL) (3/23/18 revision)
      ▪ Most protein classification schemes are based on structural and functional criteria.
      ▪ For therapeutic development, it is useful to understand how much and what types of data are available for a given protein, thereby highlighting well-studied and understudied targets.
      ▪ Tclin: Proteins annotated as drug targets
      ▪ Tchem: Proteins for which potent small molecules are known
      ▪ Tbio: Proteins for which biology is better understood
      ▪ Tdark: These proteins lack antibodies, publications or Gene RIFs
      T. Oprea et al., Nature Rev. Drug Discov. 2018, https://www.nature.com/articles/nrd.2018.14
  • 12. TDL LEVELS: Tclin and Tchem (10/19/16 revision)
      ▪ Tclin proteins are associated with drug Mechanism of Action (MoA); see NRDD 2017
      ▪ Tchem proteins have bioactivities in ChEMBL and DrugCentral below a family-specific potency cutoff, plus human curation for some targets:
      ▪ Kinases: <= 30 nM
      ▪ GPCRs: <= 100 nM
      ▪ Nuclear Receptors: <= 100 nM
      ▪ Ion Channels: <= 10 μM
      ▪ Non-IDG Family Targets: <= 1 μM
      Figure: Bioactivities of approved drugs (by target class).
      ChEMBL: database of bioactive chemicals, https://www.ebi.ac.uk/chembl/
      DrugCentral: online drug compendium, http://drugcentral.org/
      R. Santos et al., Nature Rev. Drug Discov. 2017, https://www.nature.com/articles/nrd.2016.230
  • 13. TDL LEVELS: Tbio and Tdark
      ▪ Tbio proteins lack small molecule annotation cf. Tchem criteria, and satisfy one of these criteria:
      ▪ protein is above the cutoff criteria for Tdark
      ▪ protein is annotated with a GO Molecular Function or Biological Process leaf term(s) with an Experimental Evidence code
      ▪ protein has confirmed OMIM phenotype(s)
      ▪ Tdark (“ignorome”) proteins have little information available, and satisfy these criteria:
      ▪ PubMed text-mining score from Jensen Lab < 5
      ▪ <= 3 Gene RIFs
      ▪ <= 50 Antibodies available according to antibodypedia.com
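The TDL rules on slides 11 to 13 read naturally as a decision cascade, so a minimal Python sketch may help make the order of the checks concrete. It is an illustration under stated assumptions, not the TCRD/Pharos implementation: the `Protein` record and its field names are hypothetical, and because the slide does not say whether all three Tdark conditions must hold, the sketch exposes that choice as `min_dark_hits` and defaults to requiring all three.

```python
from dataclasses import dataclass
from typing import Optional

# Tchem potency cutoffs from the slide, in nanomolar (10 uM = 10,000 nM).
TCHEM_CUTOFF_NM = {
    "Kinase": 30.0,
    "GPCR": 100.0,
    "Nuclear Receptor": 100.0,
    "Ion Channel": 10_000.0,
}
DEFAULT_CUTOFF_NM = 1_000.0  # non-IDG family targets: <= 1 uM


@dataclass
class Protein:
    """Hypothetical record for illustration; not the TCRD schema."""
    family: str
    has_drug_moa: bool = False                 # annotated drug mechanism of action
    best_activity_nm: Optional[float] = None   # most potent known bioactivity
    pubmed_score: float = 0.0                  # Jensen Lab text-mining score
    gene_rifs: int = 0
    antibodies: int = 0
    has_experimental_go_leaf: bool = False
    has_omim_phenotype: bool = False


def dark_hits(p: Protein) -> int:
    """Count how many of the three Tdark conditions the protein meets."""
    return sum([p.pubmed_score < 5, p.gene_rifs <= 3, p.antibodies <= 50])


def assign_tdl(p: Protein, min_dark_hits: int = 3) -> str:
    """Check Tclin, then Tchem, then Tbio; whatever remains is Tdark.

    min_dark_hits is an assumption of this sketch (see the lead-in).
    """
    if p.has_drug_moa:
        return "Tclin"
    cutoff = TCHEM_CUTOFF_NM.get(p.family, DEFAULT_CUTOFF_NM)
    if p.best_activity_nm is not None and p.best_activity_nm <= cutoff:
        return "Tchem"
    above_dark_cutoff = dark_hits(p) < min_dark_hits
    if above_dark_cutoff or p.has_experimental_go_leaf or p.has_omim_phenotype:
        return "Tbio"
    return "Tdark"
```

For example, `assign_tdl(Protein(family="Kinase", best_activity_nm=25.0))` returns `"Tchem"`, while the same kinase with a best activity of 50 nM falls through to the Tbio/Tdark checks.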
  • 14. TDL: EXTERNAL VALIDATION (2/23/18 revision). Tdark parameters differ from the other TDLs across the 4 external metrics, according to Kruskal-Wallis tests with post-hoc pairwise Dunn tests. T. Oprea et al., Nature Rev. Drug Discov. 2018, https://www.nature.com/articles/nrd.2018.14
  • 15. WHY FUND TDARK RESEARCH? (2/23/18 revision). Typically, it takes 15-20 years for a Tdark protein to become druggable. T. Oprea et al., Nature Rev. Drug Discov. 2018, https://www.nature.com/articles/nrd.2018.14
  • 16. IMPC BOLDLY GOES WHERE NO ONE HAS GONE BEFORE (11/29/17 revision). 384 genes were prioritized by IDG KMC (2014-2016). 95% of eligible IDG genes (339/356) have plans, attempts, or models. [Chart: prioritized genes by TDL (Tbio, Tdark, Tchem), including 90 Tdark genes and 42 Tchem genes.] Slide from Steve Murray, Jackson Lab.
  • 17. TAKE HOME MESSAGE: THERE IS A KNOWLEDGE DEFICIT (3/12/18 revision)
      ▪ ~35% of the proteins remain poorly described (Tdark)
      ▪ ~11% of the proteome (Tclin & Tchem) is currently targeted by small molecule probes
      ▪ Choosing to work on dark genes is a high-risk endeavor (funders are less likely to award grants for Tdark)
  • 18. CHALLENGE: RANKING & SCORING PROTEIN-DISEASE ASSOCIATIONS (10/07/18 revision). https://pharos-beta.ncats.io/targets/GRIN2A
      ▪ The IDG KMC tracks more than ~10 information channels for protein-disease associations, accessible via the Pharos portal.
      ▪ Our challenge is to harmonize disease concepts and to enable computational use: e.g., GRIN2A and GRIN1 together form the glutamate NMDA receptor, the MoA drug target for memantine (Alzheimer’s).
      ▪ The challenge for ML & AI: how to prioritize targets? I.e., which protein-disease associations are clinically actionable?
  • 19. WHAT DO WE KNOW ABOUT DISEASES?
      ▪ There are between 9,000 and 25,000 disease concepts
      ▪ Pharos/TCRD tracks ~11,000 diseases via Disease Ontology, and ~10,500 rare diseases via eRAM, OrphaNet and the Monarch Initiative MONDO system
  • 20. PROTEIN KNOWLEDGE GRAPHS (2/01/18 revision)
      ▪ IDG KMC2 seeks knowledge gaps across the five branches of the “knowledge tree”: Genotype; Phenotype; Interactions & Pathways; Structure & Function; and Expression.
      ▪ We can use biological systems network modeling to infer novel relationships based on available evidence, and infer new “function” and “role in disease” data based on other layers of evidence
      ▪ Primary focus on Tdark & Tbio
      O. Ursu, T. Oprea et al., IDG2 KMC
  • 21. THE METAPATH-ML APPROACH
      ▪ A metapath is a sequence of relations defined between different object types.
      ▪ Our metapaths encode type-specific network topology between the source node (Protein) and the destination node (Disease/Phenotype).
      ▪ This approach enables the transformation of assertions/evidence chains of heterogeneous biological data types into an ML-ready format.
      Figure: Similar assertions or evidence form metapaths (white). Instances of a metapath (paths) are used to determine the strength of the evidence linking a gene to a disease/phenotype/function.
      SOME REFS: G. Fu et al., BMC Bioinformatics 2016. D. Himmelstein & S. Baranzini, PLOS Comp Bio, 2015.
  • 22. SOME EARLY ACKNOWLEDGMENTS ... Abstract: We hear a lot about machine learning and its role in health care, but these methods require large amounts of training data. Using these and other related methods to study rare diseases poses substantial challenges: how can we get tens of thousands of training examples when there are tens or hundreds of people with a disease? (Abstract from this conference.) Scarce training data is our problem too, but with genes instead of people. Our MetapathML method is similar to Himmelstein-Baranzini. (Daniel is now a postdoc in the Greene Lab.)
  • 23. METAPATH-ML DATA SOURCES (10/07/18 revision). O. Ursu et al., manuscript in preparation
      Data source: data type; data points
      ▪ CCLE: Gene expression; 19,006,134
      ▪ GTEx: Gene expression; 2,612,227
      ▪ Protein Atlas: Gene & Protein expression; 949,199
      ▪ Reactome: Biological pathways; 303,681
      ▪ KEGG: Biological pathways; 27,683
      ▪ StringDB: Protein-Protein interactions; 5,080,023
      ▪ Gene ontology: Biological pathways & Gene function; 434,317
      ▪ InterPro: Protein structure and function; 467,163
      ▪ ClinVar: Human Gene - Disease/Phenotype associations; 881,357
      ▪ GWAS: Gene - Disease/Phenotype associations; 54,360
      ▪ OMIM: Human Gene - Disease/Phenotype associations; 25,557
      ▪ UniProt Disease: Human Gene - Disease/Phenotype associations; 5,365
      ▪ JensenLab DISEASE: Gene - Disease associations from text mining; 44,829
      ▪ NCBI Homology: Homology mapping of human/mouse/rat genes; 70,922
      ▪ IMPC: Mouse Gene - Phenotype associations; 2,153,999
      ▪ RGD: Rat Gene - Phenotype associations; 117,606
      ▪ LINCS: Drug induced gene signatures; 230,111,315
      We developed automated methods for data collection (TCRD), visualization (Pharos) and data aggregation. These aggregated datasets were used to build machine learning models for 20+ diseases and 73 mouse phenotypes. Each knowledge graph contains ~22,000 metapaths and 284 million path instances.
  • 24. METAPATH-ML WORKFLOW (2/01/18 revision)
      ▪ A metapath encodes type-specific network topology between the source node (e.g., Protein target) and the destination node (e.g., Disease or Function)
      ▪ Target -- (member of) → PPI Network ← (member of) -- Protein -- (associated with) → Disease
      ▪ Target -- (expressed in) → Tissue ← (localized in) -- Disease
      O. Ursu, T. Oprea et al., IDG2 KMC
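To make the idea of "path instances along a metapath" concrete, here is a small Python sketch that counts them on a toy graph. Everything in it is hypothetical: the node names, the edge list, and the choice of a raw path count as the feature value are assumptions for illustration, and the slides do not describe how MetapathML actually weights or aggregates paths.

```python
from collections import defaultdict

# Toy typed edges (head, relation, tail); all names are invented.
EDGES = [
    ("T1", "member_of", "PPI_A"),
    ("P1", "member_of", "PPI_A"),
    ("P2", "member_of", "PPI_A"),
    ("P1", "associated_with", "DiseaseX"),
    ("P2", "associated_with", "DiseaseX"),
    ("T1", "expressed_in", "Cortex"),
    ("DiseaseX", "localized_in", "Cortex"),
]

# A metapath is a sequence of (relation, direction) steps.
PPI_METAPATH = [("member_of", "->"), ("member_of", "<-"), ("associated_with", "->")]
TISSUE_METAPATH = [("expressed_in", "->"), ("localized_in", "<-")]


def build_index(edges):
    """Index edges by (node, relation) in both directions."""
    fwd, rev = defaultdict(set), defaultdict(set)
    for head, rel, tail in edges:
        fwd[(head, rel)].add(tail)
        rev[(tail, rel)].add(head)
    return fwd, rev


def count_path_instances(edges, source, metapath, target):
    """Count walks from source to target that follow the metapath.

    Walks may revisit a node (e.g. step back to the source); a real
    pipeline might exclude those.
    """
    fwd, rev = build_index(edges)
    frontier = {source: 1}  # node -> number of partial walks ending there
    for rel, direction in metapath:
        index = fwd if direction == "->" else rev
        nxt = defaultdict(int)
        for node, n_paths in frontier.items():
            for neighbour in index.get((node, rel), ()):
                nxt[neighbour] += n_paths
        frontier = nxt
    return frontier.get(target, 0)


# One feature per metapath for the (T1, DiseaseX) pair.
features = [
    count_path_instances(EDGES, "T1", PPI_METAPATH, "DiseaseX"),
    count_path_instances(EDGES, "T1", TISSUE_METAPATH, "DiseaseX"),
]
print(features)  # [2, 1]
```

In this toy graph T1 reaches DiseaseX twice through its PPI network (via P1 and P2) and once through shared cortex expression, so each metapath contributes one column to the feature row for that target.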
  • 25. METAPATH-ML @ UNM: one protein-disease association at a time (2/01/18 revision). Genes associated with a disease/phenotype are positive examples, whereas genes lacking the same association are negative examples. The Metapath approach transforms assertions/evidence chains into classification problems that can be solved using suitably designed machine learning algorithms. O. Ursu, T. Oprea et al., IDG2 KMC
  • 26. Use of XGBoost (XGBoost = eXtreme Gradient Boosting)
      ● https://xgboost.ai/
      ● GitHub
      ● Documentation
      ● R package
      ● Exceptional interpretability
      MetapathML employs XGBoost via the R package API. The inputs to XGBoost are datasets specific to each disease or phenotype. For each disease/phenotype, the known associated genes correspond to the positive Y labels in the dataset. XGBoost parameters are optimized via grid search, i.e., iterative testing over discrete parameter value combinations.
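Grid search as described on this slide is simply an exhaustive loop over the Cartesian product of candidate values. The Python sketch below shows that loop in isolation. It is not the MetapathML code (which uses the R package): the grid values are invented, `max_depth` and `eta` are named only because they are standard XGBoost parameters (the slides do not say which parameters were tuned), and `toy_score` is a stand-in for what would normally be a cross-validated model score.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Try every combination in param_grid; return the best one and its score.

    param_grid: dict mapping parameter name -> list of candidate values.
    evaluate:   callable taking a params dict and returning a score
                where higher is better.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical grid; the values are illustrative only.
GRID = {"max_depth": [2, 4, 6], "eta": [0.05, 0.1, 0.3]}


def toy_score(params):
    """Stand-in for a cross-validated metric; peaks at max_depth=4, eta=0.1."""
    return -abs(4 - params["max_depth"]) - 10 * abs(0.1 - params["eta"])


best, score = grid_search(GRID, toy_score)
print(best)  # {'eta': 0.1, 'max_depth': 4}
```

The cost grows multiplicatively with each added parameter (here 3 x 3 = 9 evaluations), which is why grids are usually kept coarse.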
  • 29. ALZHEIMER’S DISEASE (AD) METAPATH ML MODEL (2/14/18 revision). ML work by Oleg Ursu
      ▪ Build data matrix from “Alzheimer’s disease” in TCRD subset
      ▪ protein knowledge graph along metapaths:
      ▪ Protein – Protein Interactions
      ▪ Pathways
      ▪ GO terms
      ▪ Gene expression
      ▪ Etc.
      ▪ Training set: 53 genes associated with Alzheimer’s disease (positives); 3,952 genes associated with other pathologies from OMIM were assumed to be negative
      ▪ Test set: 23 genes associated with Alzheimer's (positives) and 200 genes not associated with Alzheimer's (negatives), from text mining
      ▪ “Complete forest” binary classifier using XGBoost & 5-fold cross-validation.
      Confusion matrix (rows = actual, columns = predicted):
      ▪ Actual Pos: 20 predicted Pos, 3 predicted Neg
      ▪ Actual Neg: 41 predicted Pos, 159 predicted Neg
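The slide reports only the raw confusion matrix, so a short worked calculation of the usual summary metrics may be useful. The sketch below assumes rows are actual classes and columns are predicted classes, which is the reading consistent with the stated test set (20 + 3 = 23 positives, 41 + 159 = 200 negatives); the metrics themselves are not quoted on the slide.

```python
def classification_metrics(tp, fn, fp, tn):
    """Standard binary-classification summaries from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # recall on true positives
        "specificity": tn / (tn + fp),            # recall on true negatives
        "precision": tp / (tp + fp),              # fraction of predicted positives that are true
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# Counts from the AD test set on this slide.
metrics = classification_metrics(tp=20, fn=3, fp=41, tn=159)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# sensitivity: 0.870
# specificity: 0.795
# precision: 0.328
# accuracy: 0.803
```

Under that reading, the classifier recovers 20 of the 23 known AD genes while flagging 41 of the 200 genes that were labeled as negatives.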
  • 30. AD XGBOOST CLASSIFIER: VARIABLE IMPORTANCE PLOT (4/23/18 revision). ML work by Oleg Ursu
      ▪ The most important features are interactions with proteins mediating inflammatory processes (JAK2/Tclin, IL10 & IL2/Tchem), response to oxidative stress (GSTP1/Tchem), nervous system development (BDNF/Tbio) and glycolysis (GAPDH/Tchem).
      ▪ LINCS drug-induced gene expression perturbations are the largest category of features for these predictions.
      ▪ Brain cortex expression is a necessary requirement.
      ▪ One Reactome pathway (AU-rich mRNA elements binding proteins) is also important.
      ▪ The weighted approach showed better performance in the test set for Alzheimer's Disease, Schizophrenia, and Dilated Cardiomyopathy.
  • 31. EXPERIMENTAL VALIDATION: AD (11/14/18 revision)
      ▪ SHSY5Ys pTau siRNA test: measured pTau levels after knock-down of gene expression
      ▪ Human iPSNs qPCR: measuring endogenous gene expression levels, AD vs Ctrl; Western blot or ICC to characterize AD phenotype versus control
      ▪ Human tissue qPCR: measuring endogenous gene expression levels, AD vs Ctrl; Western blot or ICC to characterize AD phenotype versus control
      AD validation work by Jessica Binder & Kiran Bhaskar (UNM), funded by U24CA224370-S2 supplement
  • 32. EXPERIMENTAL VALIDATION: AD (2/14/19 revision)
      ▪ Validation on the 20 predicted genes: AKNA, BCO2, CCNY, CRTAM, FAM92B, FOXP4, FRRS1, GRIN2C, IL17REL, LILRA3, LMO4, NDRG2, PIBF1, RAB40A, SCGB3A1, SLC44A2, SPOP, STARD3, TMEFF2, TXNDC12
      ▪ The most obvious effects, based on the combined Cellomics & qPCR of iPSNs & autopsy brains, suggest that AKNA, LILRA3, NDRG2 and TXNDC12 significantly increased pTau (as tracked by two different antibodies for T180, S202 and S205)
      ▪ For now, it appears that machine learning models may have identified between 4 and 7 new genes that have not previously been associated with Alzheimer’s Disease
      AD validation work by Jessica Binder & Kiran Bhaskar (UNM), funded by U24CA224370-S2 supplement
  • 33. EXPERIMENTAL VALIDATION: MORE DISEASES AND COLLABORATORS
      Disease: experimental collaboration
      ▪ Prostate cancer: Work by Art Cherkasov, Kriti Singh & Mike Hsing (UBC, Vancouver). Of the top 50 ML predicted genes, 19 are commonly upregulated in the YZ Wang Transdifferentiation PDX model and the Beltran dataset 2016.
      ▪ Ovarian cancer: Spheroid tumor & patient-derived xenograft (PDX) work by Mara Steinkamp (UNM). Of the top 63 ML predicted genes, 12 show significant changes in cancer cells.
      NEXT STEPS:
      ● In vivo experiments.
      ● More diseases and phenotypes.
  • 34. ML LEARNINGS IN TARGET AND DRUG DISCOVERY
      1. Model quality is limited by data quality. Good data → good models.
      2. ML can identify hidden patterns in big data. For example, the central node(s) in PPI network(s) that are playing a critical role in disease pathology.
      3. Deep learning is not so applicable to our task (better for tall datasets, well-defined good solutions, less need for interpretability).
      4. XGBoost (decision tree algorithm) excels in performance & interpretability.
      5. Shows real promise in Target Repurposing.
  • 35. IN CLOSING...
      ● IDG platform for knowledge discovery about the "dark genome."
      ● ML provides new insights by integrating multi-omics knowledge graphs.
      ● Hard questions should be directed to Tudor Oprea!