SlideShare uma empresa Scribd logo
1 de 37
Baixar para ler offline
MOLIERE
Automatic Biomedical Hypothesis Generation System
Justin Sybrandt Michael Shtutman Ilya Safro
Clemson University
School of Computing
University of South Carolina
College of Pharmacy
Clemson University
School of Computing
UNDISCOVERED PUBLIC KNOWLEDGE
• Proposed by Swanson in 1986
• The set of public knowledge is too large
• Contains implicit connections
2
HUMAN LIMITS
• No person can read everything
• 2,000 – 4,000 new papers daily
• People Specialize
• Limits knowledge sharing across disciplines
• Bias and Inconsistent Assumptions
3
WHAT IS A HYPOTHESIS?
4
DRUG DISEASE
?Inhibits Causes
• An idea or explanation for something that is based on
known facts but has not yet been proven
• (Definition of “hypothesis” from the Cambridge Academic Content Dictionary ©
Cambridge University Press)
AUTOMATIC
HYPOTHESIS GENERATION
5
DISEASEDRUG
Network of Medical Objects
Connections NotWell Studied
AUTOMATIC
HYPOTHESIS GENERATION
6
DISEASEDRUG
Network of Medical Objects
Connections NotWell Studied
AUTOMATIC
HYPOTHESIS GENERATION
7
DISEASEDRUG
Network of Medical Objects
Connections NotWell Studied
8
Methodology Highlights
ARROWSMITH • One of the first hypothesis generation systems.
• Found link between Fish Oil and Raynaud’s Disease.
DiseaseConnect • Finds connections between genes and diseases.
• Displays information in an interactive prompt.
BioLDA • Constructs high quality topic models aided by domain-specific information.
• Identified link betweenVenlafaxine and HTR1A.
RELATED APPROACHES
9
Methodology Limitation Reference
ARROWSMITH Limited Document Set Neil R Smalheiser and Don R Swanson. 1998.
DiseaseConnect Limited Document Set Chun-Chi Liu et al. 2014.
BioLDA LimitedVocabulary Huijun Wang et al. 2011.
RELATED APPROACHES
NEW APPROACH
10
• Construct a large network
• Identify meaningful paths
• Extend paths to neighboring nodes
• Mine neighborhoods for important topics
NETWORK CONSTRUCTION
11
QUERY PROCESS
NETWORK CONSTRUCTION
• National Library of Medicine (NLM)
• National Center for Biotechnology Information (NCBI)
• MEDLINE
• 25 Million Documents
• Titles and Abstracts
12
• SPECALIST NLPTOOLSET
• Natural LanguageToolKit (NLTK)
13
some example text that
hopefully more understandable
than typical medical abstract
This is some example text that is
hopefully more understandable
than the typical medical abstract.
Raw Data CleanedText
NETWORK CONSTRUCTION
NETWORK CONSTRUCTION
• Topical Pattern Mining
• Groups together common phrases
• Creates 2,3,…,n-grams
14
example text hopefully
more understandable typical
medical abstract
some example text that
hopefully more understandable
than typical medical abstract
CleanedText Phrases
NETWORK CONSTRUCTION
• FastText: Projects phrases into real valued vector space
• Long word embedding composed from subwords
15
example text hopefully
more understandable typical
medical abstract
Phrases
Point Cloud
16
Point Cloud Centroid
NETWORK CONSTRUCTION
• Embed documents by averaging over point clouds
NETWORK CONSTRUCTION
• Construct KNN
• Fast Library for Approximate Nearest Neighbors
17
All Centroids Network
NETWORK CONSTRUCTION
18
• UMLS Metathesarus
• Curated keyword network
• 2 Million Nodes
• Superset of MESH
NETWORK CONSTRUCTION
19
• Edge weight ~ Distance
• Inv.TF-IDF cross-layer edges
• Edges normalized [0,1]
NETWORK CONSTRUCTION
20
QUERY PROCESS
QUERY PROCESS
21
22
• User selects two nodes
• Restrained to keywords
• Can generalize to two sets
QUERY PROCESS
QUERY PROCESS
23
• Identify shortest path
between query sets
QUERY PROCESS
24
• N: Abstracts close to those in original path
• C: Abstracts which share path-adjacent keywords
QUERY PROCESS
25
• PLDA+: Identifies topics present in a set of text.
Topic
…
…
…
Topic
…
…
…
Topic
…
…
…
• Topic patterns shed light on results
QUERY PROCESS
• Hypothesis represented as a topic model
• Shown: Venlaflaxine vs. HTR1A
26
TOPIC: 0
antidepressant_drugs
milnacipran
org
selected
ht
TOPIC: 1
increase
reduced
treatment
dorsal_raphe_nucleus
effect
TOPIC: 2
rats
sert
ht_receptor
escitalopram
potency
RESULTS
27
VENLAFAXINE
DRUG
REPURPOSING
?
VENLAFAXINE EXAMPLE
• Venlafaxine:
• Treats depression / anxiety
• HTR[12]A:
• Linked to depression / anxiety
• No paper linked these concepts
28
Venlafaxine
HTR1A
HTR2A
Depression
Anxiety
VENLAFAXINE RESULTS
29
0
2
4
6
8
10
12
14
16
18
20
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
KeywordCount
Topic Number
Depression Related Keywords PerTopic
HTR1A HTR2A
0
0.5
1
1.5
2
2.5
3
3.5
4
4.5
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
KeywordCount
Topic Number
Anxiety Related Keywords PerTopic
HTR1A HTR2A
• Hypothesis represented as a topic model
• Shown: Venlaflaxine vs. HTR1A
30
TOPIC: 0
antidepressant_drugs
milnacipran
org
selected
ht
TOPIC: 1
increase
reduced
treatment
dorsal_raphe_nucleus
effect
TOPIC: 2
rats
sert
ht_receptor
escitalopram
potency
antidepressant
produces
serotonin
“Sertraline”
antidepressant controlled by
HTR1A
antidepressant
VENLAFAXINE RESULTS
DRUG REPURPOSING EXAMPLE
31
• Drugs can be modified to treat new diseases
• Decreases drug development time and costs
HIVDRUG
?
Cancer
DDX3
DRUG REPURPOSING EXPERIMENT
32
• Ran nearly 10,000 queries involving DDX3:
Signal
Transduction
Transcription
Adhesion
Cancer
Development
Translation
RNA
DDX3
DRUG REPURPOSING EXPERIMENT
33
• Ran nearly 10,000 queries involving DDX3:
• Expecting:
WNT Signaling
Pathways
Cell – Cell
Adhesion
Cell – Matrix
Adhesion
DRUG REPURPOSING RESULTS
34
WNT Signaling
Pathways
Cell – Cell
Adhesion
Cell – Matrix
Adhesion
substrate adhesion
RGD cell adhesion domain
cell adhesion factor
focal adhesion kinase
cell-cell adhesion
regulation of cell-cell adhesion
cell-adhesion molecules
signal-transduction associated kinases
cell adhesion kinase
APPLICATIONS
35
• Drug Repurposing
• Extensions to new domains
• Patents, Economics, etc.
• Coping with Deadlines
OPEN RESEARCH QUESTIONS
• Result Interpretation
• SystemVerification
• Automatic Network Tuning
• Streaming Network Reconstruction
• Inclusion of Additional Data Sources
36
THANK YOU
J. Sybrandt, M. Shtutman, I. Safro “MOLIERE:Automatic Biomedical Hypothesis Generation System”
Code and Data: https://people.cs.clemson.edu/~isafro/software.html
Email: JSYBRAN@CLEMSON.EDU

Mais conteúdo relacionado

Semelhante a MOLIERE: Automatic Biomedical Hypothesis Generation System

dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...
dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...
dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...
dkNET
 
How to Conduct a Literature Review
How to Conduct a Literature ReviewHow to Conduct a Literature Review
How to Conduct a Literature Review
Robin Featherstone
 
Practicing up-to-date-medicine
Practicing up-to-date-medicinePracticing up-to-date-medicine
Practicing up-to-date-medicine
impamilne
 
Serving the medicinal chemistry community with Royal Society of Chemistry che...
Serving the medicinal chemistry community with Royal Society of Chemistry che...Serving the medicinal chemistry community with Royal Society of Chemistry che...
Serving the medicinal chemistry community with Royal Society of Chemistry che...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Drug Discovery and Development Using AI
Drug Discovery and Development Using AIDrug Discovery and Development Using AI
Drug Discovery and Development Using AI
Databricks
 
(1) George Fink - Stress_ Neuroendocrinology and Neurobiology_ Handbook of St...
(1) George Fink - Stress_ Neuroendocrinology and Neurobiology_ Handbook of St...(1) George Fink - Stress_ Neuroendocrinology and Neurobiology_ Handbook of St...
(1) George Fink - Stress_ Neuroendocrinology and Neurobiology_ Handbook of St...
EricksonArthur
 

Semelhante a MOLIERE: Automatic Biomedical Hypothesis Generation System (20)

Paul Groth
Paul GrothPaul Groth
Paul Groth
 
Evidence based psychiatry
Evidence based psychiatryEvidence based psychiatry
Evidence based psychiatry
 
BigData'18: Validation and Analysis of Hypothesis Generation Systems
BigData'18: Validation and Analysis of Hypothesis Generation SystemsBigData'18: Validation and Analysis of Hypothesis Generation Systems
BigData'18: Validation and Analysis of Hypothesis Generation Systems
 
Mobilizing informational resources webinar
Mobilizing informational resources   webinarMobilizing informational resources   webinar
Mobilizing informational resources webinar
 
NCompass Live: PubMed, PubMed Central, MEDLINE, MedlinePlus...What's the diff...
NCompass Live: PubMed, PubMed Central, MEDLINE, MedlinePlus...What's the diff...NCompass Live: PubMed, PubMed Central, MEDLINE, MedlinePlus...What's the diff...
NCompass Live: PubMed, PubMed Central, MEDLINE, MedlinePlus...What's the diff...
 
Presentation at Rare Disease conference in San-Antonio
Presentation at Rare Disease conference in San-AntonioPresentation at Rare Disease conference in San-Antonio
Presentation at Rare Disease conference in San-Antonio
 
dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...
dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...
dkNET Webinar: Choosing Sample Sizes for Multilevel and Longitudinal Studies ...
 
Biomarkers brain regions
Biomarkers brain regionsBiomarkers brain regions
Biomarkers brain regions
 
How to Conduct a Literature Review
How to Conduct a Literature ReviewHow to Conduct a Literature Review
How to Conduct a Literature Review
 
Literature searching the professional way
Literature searching the professional wayLiterature searching the professional way
Literature searching the professional way
 
A Prototype Knowledge Base and SMART App to Facilitate Organization of Patien...
A Prototype Knowledge Base and SMART App to Facilitate Organization of Patien...A Prototype Knowledge Base and SMART App to Facilitate Organization of Patien...
A Prototype Knowledge Base and SMART App to Facilitate Organization of Patien...
 
Practicing up-to-date-medicine
Practicing up-to-date-medicinePracticing up-to-date-medicine
Practicing up-to-date-medicine
 
Serving the medicinal chemistry community with Royal Society of Chemistry che...
Serving the medicinal chemistry community with Royal Society of Chemistry che...Serving the medicinal chemistry community with Royal Society of Chemistry che...
Serving the medicinal chemistry community with Royal Society of Chemistry che...
 
Drug Discovery and Development Using AI
Drug Discovery and Development Using AIDrug Discovery and Development Using AI
Drug Discovery and Development Using AI
 
2015 EMS 3.0
2015 EMS 3.02015 EMS 3.0
2015 EMS 3.0
 
Mini manual-database-instruction-sanders
Mini manual-database-instruction-sandersMini manual-database-instruction-sanders
Mini manual-database-instruction-sanders
 
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
 
(1) George Fink - Stress_ Neuroendocrinology and Neurobiology_ Handbook of St...
(1) George Fink - Stress_ Neuroendocrinology and Neurobiology_ Handbook of St...(1) George Fink - Stress_ Neuroendocrinology and Neurobiology_ Handbook of St...
(1) George Fink - Stress_ Neuroendocrinology and Neurobiology_ Handbook of St...
 
Krithara meetup may18_final (1)
Krithara meetup may18_final (1)Krithara meetup may18_final (1)
Krithara meetup may18_final (1)
 
NLP tutorial at AIME 2020
NLP tutorial at AIME 2020NLP tutorial at AIME 2020
NLP tutorial at AIME 2020
 

Último

Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
Sérgio Sacani
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
PirithiRaju
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Sérgio Sacani
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
anilsa9823
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
University of Hertfordshire
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
RohitNehra6
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Lokesh Kothari
 

Último (20)

Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomology
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 

MOLIERE: Automatic Biomedical Hypothesis Generation System

  • 1. MOLIERE Automatic Biomedical Hypothesis Generation System Justin Sybrandt Michael Shtutman Ilya Safro Clemson University School of Computing University of South Carolina College of Pharmacy Clemson University School of Computing
  • 2. UNDISCOVERED PUBLIC KNOWLEDGE • Proposed by Swanson in 1986 • The set of public knowledge is too large • Contains implicit connections 2
  • 3. HUMAN LIMITS • No person can read everything • 2,000 – 4,000 new papers daily • People Specialize • Limits knowledge sharing across disciplines • Bias and Inconsistent Assumptions 3
  • 4. WHAT IS A HYPOTHESIS? 4 DRUG DISEASE ?Inhibits Causes • An idea or explanation for something that is based on known facts but has not yet been proven • (Definition of “hypothesis” from the Cambridge Academic Content Dictionary © Cambridge University Press)
  • 5. AUTOMATIC HYPOTHESIS GENERATION 5 DISEASEDRUG Network of Medical Objects Connections NotWell Studied
  • 6. AUTOMATIC HYPOTHESIS GENERATION 6 DISEASEDRUG Network of Medical Objects Connections NotWell Studied
  • 7. AUTOMATIC HYPOTHESIS GENERATION 7 DISEASEDRUG Network of Medical Objects Connections NotWell Studied
  • 8. 8 Methodology Highlights ARROWSMITH • One of the first hypothesis generation systems. • Found link between Fish Oil and Raynaud’s Disease. DiseaseConnect • Finds connections between genes and diseases. • Displays information in an interactive prompt. BioLDA • Constructs high quality topic models aided by domain-specific information. • Identified link betweenVenlafaxine and HTR1A. RELATED APPROACHES
  • 9. 9 Methodology Limitation Reference ARROWSMITH Limited Document Set Neil R Smalheiser and Don R Swanson. 1998. DiseaseConnect Limited Document Set Chun-Chi Liu et al. 2014. BioLDA LimitedVocabulary Huijun Wang et al. 2011. RELATED APPROACHES
  • 10. NEW APPROACH 10 • Construct a large network • Identify meaningful paths • Extend paths to neighboring nodes • Mine neighborhoods for important topics
  • 12. NETWORK CONSTRUCTION • National Library of Medicine (NLM) • National Center for Biotechnology Information (NCBI) • MEDLINE • 25 Million Documents • Titles and Abstracts 12
  • 13. • SPECALIST NLPTOOLSET • Natural LanguageToolKit (NLTK) 13 some example text that hopefully more understandable than typical medical abstract This is some example text that is hopefully more understandable than the typical medical abstract. Raw Data CleanedText NETWORK CONSTRUCTION
  • 14. NETWORK CONSTRUCTION • Topical Pattern Mining • Groups together common phrases • Creates 2,3,…,n-grams 14 example text hopefully more understandable typical medical abstract some example text that hopefully more understandable than typical medical abstract CleanedText Phrases
  • 15. NETWORK CONSTRUCTION • FastText: Projects phrases into real valued vector space • Long word embedding composed from subwords 15 example text hopefully more understandable typical medical abstract Phrases Point Cloud
  • 16. 16 Point Cloud Centroid NETWORK CONSTRUCTION • Embed documents by averaging over point clouds
  • 17. NETWORK CONSTRUCTION • Construct KNN • Fast Library for Approximate Nearest Neighbors 17 All Centroids Network
  • 18. NETWORK CONSTRUCTION 18 • UMLS Metathesarus • Curated keyword network • 2 Million Nodes • Superset of MESH
  • 19. NETWORK CONSTRUCTION 19 • Edge weight ~ Distance • Inv.TF-IDF cross-layer edges • Edges normalized [0,1]
  • 22. 22 • User selects two nodes • Restrained to keywords • Can generalize to two sets QUERY PROCESS
  • 23. QUERY PROCESS 23 • Identify shortest path between query sets
  • 24. QUERY PROCESS 24 • N: Abstracts close to those in original path • C: Abstracts which share path-adjacent keywords
  • 25. QUERY PROCESS 25 • PLDA+: Identifies topics present in a set of text. Topic … … … Topic … … … Topic … … … • Topic patterns shed light on results
  • 26. QUERY PROCESS • Hypothesis represented as a topic model • Shown: Venlaflaxine vs. HTR1A 26 TOPIC: 0 antidepressant_drugs milnacipran org selected ht TOPIC: 1 increase reduced treatment dorsal_raphe_nucleus effect TOPIC: 2 rats sert ht_receptor escitalopram potency
  • 28. ? VENLAFAXINE EXAMPLE • Venlafaxine: • Treats depression / anxiety • HTR[12]A: • Linked to depression / anxiety • No paper linked these concepts 28 Venlafaxine HTR1A HTR2A Depression Anxiety
  • 29. VENLAFAXINE RESULTS 29 0 2 4 6 8 10 12 14 16 18 20 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 KeywordCount Topic Number Depression Related Keywords PerTopic HTR1A HTR2A 0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 KeywordCount Topic Number Anxiety Related Keywords PerTopic HTR1A HTR2A
  • 30. • Hypothesis represented as a topic model • Shown: Venlaflaxine vs. HTR1A 30 TOPIC: 0 antidepressant_drugs milnacipran org selected ht TOPIC: 1 increase reduced treatment dorsal_raphe_nucleus effect TOPIC: 2 rats sert ht_receptor escitalopram potency antidepressant produces serotonin “Sertraline” antidepressant controlled by HTR1A antidepressant VENLAFAXINE RESULTS
  • 31. DRUG REPURPOSING EXAMPLE 31 • Drugs can be modified to treat new diseases • Decreases drug development time and costs HIVDRUG ? Cancer DDX3
  • 32. DRUG REPURPOSING EXPERIMENT 32 • Ran nearly 10,000 queries involving DDX3: Signal Transduction Transcription Adhesion Cancer Development Translation RNA DDX3
  • 33. DRUG REPURPOSING EXPERIMENT 33 • Ran nearly 10,000 queries involving DDX3: • Expecting: WNT Signaling Pathways Cell – Cell Adhesion Cell – Matrix Adhesion
  • 34. DRUG REPURPOSING RESULTS 34 WNT Signaling Pathways Cell – Cell Adhesion Cell – Matrix Adhesion substrate adhesion RGD cell adhesion domain cell adhesion factor focal adhesion kinase cell-cell adhesion regulation of cell-cell adhesion cell-adhesion molecules signal-transduction associated kinases cell adhesion kinase
  • 35. APPLICATIONS 35 • Drug Repurposing • Extensions to new domains • Patents, Economics, etc. • Coping with Deadlines
  • 36. OPEN RESEARCH QUESTIONS • Result Interpretation • SystemVerification • Automatic Network Tuning • Streaming Network Reconstruction • Inclusion of Additional Data Sources 36
  • 37. THANK YOU J. Sybrandt, M. Shtutman, I. Safro “MOLIERE:Automatic Biomedical Hypothesis Generation System” Code and Data: https://people.cs.clemson.edu/~isafro/software.html Email: JSYBRAN@CLEMSON.EDU