Presented at the 2016 ACS Fall Meeting in Philadelphia, session "Effectively Harnessing the World's Literature to Inform Rational Compound Design", on 8/21/16.
Bibliological data science and drug discovery
1. Bibliological data science and drug discovery
Knowing the knowns*
Effectively Harnessing the World’s Literature To Inform Rational Compound Design - ACS National Meeting, Philadelphia, Aug 21-24, 2016
Jeremy J Yang
Translational Informatics Division
School of Medicine
University of New Mexico
Integrative Data Science Lab
School of Informatics & Computing
Indiana University
*phrase borrowed from Edgar Jacoby, Janssen.
2. In science, luck favors the prepared.
- Louis Pasteur
The main thing was not to . . . "foul up."
- The Right Stuff, by Tom Wolfe, about John Glenn.
3. Overview of talk
● Formulation of problem
● Resources and examples:
TIN-X, Target Importance and Novelty Explorer (&IDG)
Chem2Bio2RDF
OPDDR, Open Phenotypic Drug Discovery Resource
DrugCentral
4. Formulation of problem
● "World's Literature" redefined by online revolution
● Rational Compound Design = improving our odds
● For given research question, what are the known knowns?
● Connect the dots and weigh the evidence from global
knowledge graph.
6. TIN-X
Target Importance & Novelty Explorer
● Bibliometric application developed for Illuminating the
Druggable Genome (IDG) project
● Text mining from Novo Nordisk Center for Protein
Research (U. Copenhagen) lab of Lars Juhl Jensen.
● Algorithm and client developed at UNM (Cristian Bologa,
Daniel Cannon)
● Disease Ontology (DO) classification
● Drug Target Ontology (DTO) protein classification
11. Target Novelty:
$F_k = 1 / T_k$
● $T_k$ = # targets in paper (k)
● $F_k$ = fractional score of paper (k)
● for papers where $T_k > 0$
$N_i = 1 / \sum_k F_k$
● $N_i$ = novelty, target (i)
● sum over papers where target (i) mentioned
Target-Disease Importance:
$F_k = 1 / (T_k \cdot D_k)$
● $T_k$ = # targets in paper (k)
● $D_k$ = # diseases in paper (k)
● $F_k$ = fractional score of paper (k)
$I_{ij} = \sum_k F_k$
● $I_{ij}$ = importance, target (i) for disease (j)
● sum over papers where both mentioned
Target Importance and Novelty Explorer (TIN-X), Daniel Cannon, Jeremy Yang, Stephen Mathias, Oleg Ursu, Subramani
Mani, Anna Waller, Stephan Schürer, Lars Juhl Jensen, Larry Sklar, Cristian Bologa, and Tudor Oprea (manuscript in
preparation).
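The two scores above can be illustrated with a minimal Python sketch. This is not the TIN-X implementation: the function name `tinx_scores`, the input layout (one pair of target and disease sets per paper), and the toy corpus with placeholder names `targetA`, `targetB`, `diseaseX`, `diseaseY` are all assumptions made for illustration. Papers with no targets are skipped for novelty, and papers lacking either a target or a disease contribute nothing to importance.

```python
from collections import defaultdict


def tinx_scores(papers):
    """Compute novelty N_i and importance I_ij from per-paper mentions.

    papers: iterable of (targets, diseases) pairs, each a set of names
    mentioned in one paper (hypothetical input layout).
    """
    target_sum = defaultdict(float)   # running sum of F_k = 1/T_k per target
    importance = defaultdict(float)   # running sum of F_k = 1/(T_k*D_k) per pair
    for targets, diseases in papers:
        t, d = len(targets), len(diseases)
        if t == 0:
            continue  # novelty is defined only over papers where T_k > 0
        for i in targets:
            target_sum[i] += 1.0 / t
        if d > 0:
            for i in targets:
                for j in diseases:
                    importance[(i, j)] += 1.0 / (t * d)
    # N_i = 1 / sum(F_k): heavily published targets get low novelty
    novelty = {i: 1.0 / s for i, s in target_sum.items()}
    return novelty, dict(importance)


# Toy corpus (made-up placeholder names, not real data)
papers = [
    ({"targetA"}, {"diseaseX"}),
    ({"targetA", "targetB"}, {"diseaseX", "diseaseY"}),
    ({"targetA"}, set()),
]
novelty, importance = tinx_scores(papers)
print(novelty)
print(importance)
```

In this toy corpus `targetB` appears in a single two-target paper, so its fractional sum is small and its novelty is higher than that of `targetA`, which is mentioned in all three papers.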
12. TIN-X
Target Importance & Novelty Explorer
● Text mining is a valuable tool for monitoring literature,
filtering and ranking, and detecting trends.
● Automation can infer patterns regarding community
trends and consensus.
● Interactive visualization tools help navigate big data.
● Good big data text miners care about small data too!
20. ● Data semantics essential for integration of
heterogeneous sources
● Strong evidence requires strong semantics
● Semantic Web technologies: a common framework
enabling -- but not assuring -- community progress
● Chem2Bio2RDF v2.0 to leverage major community
advances (esp. Open PHACTS)
● Data ecosystems, co-opetition & prisoner's dilemma
29. OPDDR
● OPDDR phenotypic assays have been linked and
integrated via community semantics to both
phenotypic (cell lines) and molecular
(genomic/protein targets) entities
● New phenotypic knowledge domain offers additional
value in drug discovery and pharmacological
informatics
● Open PHACTS is an excellent, well-suited platform
31. DrugCentral
● DrugCentral is a free, open, curated resource about
approved drugs, designed for research
● Compounds, products, labels, targets, IDs, names
● DrugCentral developed over several years at UNM
● DrugCentral recently released with new interface
● License: CC-BY-SA
http://drugcentral.org
34. DrugCentral
● Free, open, accurate, comprehensive drug reference
for biomolecular and biomedical informatics research
Compounds 4444
Products 84787
Synonyms 20522
Structures 4231
Targets 3651
Bioactivities 15620
MoA 3484
SNOMED 45349
35. "DrugCentral: online drug compendium", Oleg Ursu, Jayme Holmes, Jeffrey Knockel, Cristian Bologa,
Jeremy Yang, Stephen Mathias, Stuart Nelson, Tudor Oprea (manuscript submitted).
36. In Conclusion
● New resources continue to emerge and evolve, providing
opportunities for knowledge driven drug discovery
● Community standards → more intelligent web
● Adapt to new data environment for success
● Private + public data must be integrated to
○ Be prepared (like Pasteur)
○ Not "foul up" (like Glenn)