This document summarizes a webinar presentation about the Illuminating the Druggable Genome (IDG) Knowledge Management Center (KMC) and its Pharos interface. The presentation describes how Pharos provides access to integrated data on over 20,000 protein targets, helping to characterize both well-studied and poorly studied targets. It highlights how Pharos analyzes target data using methods such as knowledge availability scoring and similarity in knowledge space to help prioritize "dark" targets: targets that lack extensive research but may have therapeutic potential. The long-term vision is for Pharos to generate customized, semi-natural language summaries of target data, acting as a biological dashboard for users.
32. KAS vs. Other measures
• Best correlation with PubMed count
• As expected, data for Tdark is noisier
• Of interest are those targets with higher values of knowledge availability but small values of another metric
• In particular, the Jensen PubMed Score seems to lead to such targets
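[Figure: Knowledge Availability Score (axis ticks 0 to 60) plotted against JensenPubmedScore (axis ticks 1, 100, 10000), with points grouped by target development level: Tbio, Tchem, Tclin, Tdark]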
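The selection described on this slide (high knowledge availability, low value on another metric) can be sketched as a simple threshold filter. This is only an illustration: the field names `kas` and `pubmed_score`, the thresholds, and the four records are made-up placeholders, not values or an API from Pharos.

```python
def knowledge_rich_but_understudied(targets, min_kas, max_other):
    """Return targets with KAS >= min_kas and pubmed_score <= max_other,
    ordered from highest to lowest KAS."""
    hits = [
        t for t in targets
        if t["kas"] >= min_kas and t["pubmed_score"] <= max_other
    ]
    return sorted(hits, key=lambda t: t["kas"], reverse=True)


# Hypothetical records; the numbers are invented for illustration only.
example_targets = [
    {"name": "T1", "tdl": "Tdark", "kas": 35.0, "pubmed_score": 4.0},
    {"name": "T2", "tdl": "Tclin", "kas": 55.0, "pubmed_score": 9000.0},
    {"name": "T3", "tdl": "Tdark", "kas": 5.0, "pubmed_score": 2.0},
    {"name": "T4", "tdl": "Tbio", "kas": 42.0, "pubmed_score": 8.5},
]

hits = knowledge_rich_but_understudied(example_targets, min_kas=30, max_other=10)
hit_names = [t["name"] for t in hits]
print(hit_names)
```

Fixed cutoffs are the simplest choice here. Because the Tdark data is noisier, a rank-based or per-class cutoff may be more appropriate in practice.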
38. Next Steps - Target Knowledge Vectors
• Based on sparse vector representation of data availability, applied to 20K targets
• A target is treated as a document: a mixture of discrete and continuous variable descriptors
• Set of facet values/terms and frequencies
• Amino acid sequence length and individual AA residue profiles
• Counts of related publications, ligands, crystal structures (Xtals), diseases, protein-protein interactions, etc.
• Similar to TF-IDF, facet value frequencies are inversely weighted by popularity
• The similarity is calculated as generalized Tanimoto
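As a rough illustration of the two ideas on this slide, the sketch below applies an inverse-popularity weight to sparse facet-frequency vectors and then compares targets with a generalized Tanimoto coefficient. The slides do not give the exact formulas, so two assumptions are made: the weight is the common log(N / document frequency) form, and the Tanimoto is the dot-product form a·b / (|a|² + |b|² − a·b). The target names and facet terms are hypothetical.

```python
import math
from collections import Counter


def idf_weights(targets):
    """Weight per facet term: log(N / number of targets that carry the term)."""
    n = len(targets)
    doc_freq = Counter(term for vec in targets.values() for term in vec)
    return {term: math.log(n / count) for term, count in doc_freq.items()}


def weight(vec, idf):
    """Scale raw frequencies by the inverse-popularity weight.
    Terms present in every target get weight 0 and are dropped."""
    return {t: f * idf[t] for t, f in vec.items() if idf[t] > 0}


def tanimoto(a, b):
    """Generalized Tanimoto on sparse vectors stored as dicts."""
    dot = sum(v * b[t] for t, v in a.items() if t in b)
    denom = sum(v * v for v in a.values()) + sum(v * v for v in b.values()) - dot
    return dot / denom if denom else 0.0


# Hypothetical sparse knowledge vectors: facet term -> frequency.
targets = {
    "TARGET_A": {"disease:asthma": 2, "family:GPCR": 1, "ligands": 3},
    "TARGET_B": {"disease:asthma": 1, "family:GPCR": 1},
    "TARGET_C": {"family:kinase": 1, "ligands": 5},
}

idf = idf_weights(targets)
weighted = {name: weight(vec, idf) for name, vec in targets.items()}

for x, y in [("TARGET_A", "TARGET_B"), ("TARGET_A", "TARGET_C"), ("TARGET_B", "TARGET_C")]:
    print(x, y, round(tanimoto(weighted[x], weighted[y]), 3))
```

Storing each vector as a dict keeps the representation sparse: only facets a target actually has take up space, which matters when most of the 20K targets have little data.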
42. The Long Term Vision
• Incorporate dependencies between data types to support inference and sophisticated filters
• From presentation to summarization
• Use explicit links & computational inference to generate (semi-) natural language summary using all known data
• Influenced by the query
• The result is a biological dashboard, customized for the user and the query
Example summary: Target X has been implicated in 3 diseases related to skeletal, urological and nervous systems. It has been investigated in 5 in vitro assays and 2 in vivo assays. There are 4 compounds active against this target, 3 of which are in clinical trials.
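One simple way to produce a summary of this kind is template filling over a record of counts. The sketch below is an assumption about how such text could be assembled, not a description of how Pharos does or will do it. The record field names (`disease_count`, `in_vitro_assays`, and so on) are hypothetical, and the values are taken from the Target X example on the slide.

```python
def count_noun(n, noun):
    """Format a count with a naively pluralized noun: '1 assay', '2 assays'."""
    return f"{n} {noun}" if n == 1 else f"{n} {noun}s"


def join_list(items):
    """Join strings as 'a', 'a and b', or 'a, b and c'."""
    if len(items) < 2:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]


def summarize(rec):
    """Build a short text summary, skipping sentences whose data is missing."""
    sentences = []
    if rec.get("disease_count"):
        s = f"{rec['name']} has been implicated in {count_noun(rec['disease_count'], 'disease')}"
        if rec.get("disease_systems"):
            s += f" related to {join_list(rec['disease_systems'])} systems"
        sentences.append(s + ".")
    assay_fields = (("in_vitro_assays", "in vitro assay"), ("in_vivo_assays", "in vivo assay"))
    assays = [count_noun(rec[key], label) for key, label in assay_fields if rec.get(key)]
    if assays:
        sentences.append(f"It has been investigated in {join_list(assays)}.")
    if rec.get("active_compounds"):
        n = rec["active_compounds"]
        verb = "is" if n == 1 else "are"
        s = f"There {verb} {count_noun(n, 'compound')} active against this target"
        if rec.get("compounds_in_trials"):
            s += f", {rec['compounds_in_trials']} of which are in clinical trials"
        sentences.append(s + ".")
    return " ".join(sentences)


# Hypothetical record mirroring the Target X example.
target_x = {
    "name": "Target X",
    "disease_count": 3,
    "disease_systems": ["skeletal", "urological", "nervous"],
    "in_vitro_assays": 5,
    "in_vivo_assays": 2,
    "active_compounds": 4,
    "compounds_in_trials": 3,
}

summary = summarize(target_x)
print(summary)
```

A query-aware version could reorder or drop sentences depending on what the user searched for, in line with the "influenced by the query" point. That step is left out of this sketch.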