1. 3D Virtual Screening of PknB Inhibitors using data fusion methods
Abhik Seal 1, Perumal Yogeeswari 2, Dharmaranjan Sriram 2, David J Wild 1, OSDD Consortium 3
1 School of Informatics and Computing, Indiana University Bloomington, USA
2 Department of Pharmacy, Birla Institute of Technology, Hyderabad Campus, Shameerpet, Hyderabad-500078, India
3 Open Source Drug Discovery, Council of Scientific and Industrial Research, India
Mycobacterium tuberculosis encodes 11 putative serine-threonine protein kinases (STPKs), which regulate transcription, cell development and interaction with the host cells. Of the 11 STPKs, three kinases, namely PknA, PknB and PknG, have been related to mycobacterial growth. PknB sequence identity is less than 27%, but the structure showed a very low RMSD of 1.36 Å and 1.72 Å with eukaryotic kinases. When developing the pharmacophore we found that the new compounds in the pipeline (Figure 1) resemble a typical kinase Class I type pharmacophore.

• The data fusion algorithms used here confirm identification of active compounds “early” in a virtual screening process. In this work reciprocal rank had the best performance.
• Data fusion reduces the dependency on a single tool for virtual screening.
• The reciprocal rank algorithm was able to select most of the active compounds early in a virtual screening process, with a very high BEDROC score.
• Optimization of the E-pharmacophore is crucial to identify the most important sites of a pharmacophore.
• Random forest models based on MACCS keys, 2D pharmacophore fingerprints and CATS descriptors were tested on the Asinex dataset. None of these methods was able to select the compounds from the 3D screening sets, showing a possible chance of lead hopping in 3D methods.
• A list of 45 compounds was finally selected for experimental validation.

Datasets: 62 available inhibitors collected from the literature, the PknB protein (PDB ID: 2FUM) and a 1000-decoy dataset available from http://www.schrodinger.com/glide_decoy_set. A validation dataset of 35 actives from the 62 and 1000 decoys was prepared.

Tools used: Glide (docking), E-pharmacophore (Glide XP + Phase), ROCS (shape similarity), enrichVS (R package)

Pharmacophore: Glide XP descriptors were used for E-pharmacophore generation. E-pharmacophores I and II are derived from compound I and compound VIII respectively, because these compounds ranked in the top 2 in the docking program.

ROCS: 1000 conformations were generated for the validation dataset using a low energy cut-off of 5 kcal/mol and an RMSD of 0.6 Å for duplicate removal, as suggested by Bostrom et al. Compound VIII was taken as the query for ROCS-based virtual screening, and compounds were scored and ranked based on the Tanimoto Combo score.

Glide Docking: 2FUM was prepared using Maestro Prime with water molecules removed. A grid box of 12 Å was used for docking.

E-Pharmacophores: E-pharmacophore II was optimized to e-pharmacophore III based on the %yield of actives, specificity and GH score.

Table 1 Showing the results of the different e-pharmacophores
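The pharmacophore selection measures named in this poster (%yield of actives, specificity and GH score) can all be computed from four counts of a hit list. The sketch below assumes the commonly used Güner-Henry definition of the GH score; the poster does not state which formula was used, and the function name, argument names and example counts are hypothetical.

```python
def hit_list_metrics(n_total, n_actives, n_hits, n_active_hits):
    """Return %yield of actives, specificity and GH score for one hit list.

    n_total       -- compounds in the screened set (D)
    n_actives     -- known actives in the screened set (A)
    n_hits        -- compounds retrieved in the hit list (Ht)
    n_active_hits -- actives among the retrieved compounds (Ha)
    """
    if not 0 < n_actives < n_total:
        raise ValueError("need at least one active and one decoy")
    if not 0 < n_hits <= n_total:
        raise ValueError("hit list must be non-empty and within the set")
    if not 0 <= n_active_hits <= min(n_hits, n_actives):
        raise ValueError("inconsistent number of active hits")

    n_decoys = n_total - n_actives
    false_positives = n_hits - n_active_hits
    if false_positives > n_decoys:
        raise ValueError("more decoy hits than decoys")

    yield_pct = 100.0 * n_active_hits / n_hits
    specificity = (n_decoys - false_positives) / n_decoys
    # Assumed Guner-Henry form:
    # GH = [Ha(3A + Ht) / (4 Ht A)] * [1 - (Ht - Ha) / (D - A)]
    gh = (n_active_hits * (3 * n_actives + n_hits)) / (4.0 * n_hits * n_actives)
    gh *= 1.0 - false_positives / n_decoys
    return {"yield_pct": yield_pct, "specificity": specificity, "gh": gh}


# Hypothetical counts for a set of 35 actives plus 1000 decoys;
# these are NOT results from the poster.
example = hit_list_metrics(n_total=1035, n_actives=35, n_hits=50, n_active_hits=30)
print(example)
```

Under this definition a hit list that contains every active and no decoys scores GH = 1, so the GH score rewards both recall of actives and purity of the hit list.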
The selection of possible pharmacophores was based on the enrichment results, %yield of actives, specificity and Goodness of Hit list (GH score).

Table 2 Showing the statistical measures of structure-based, ligand-based and data fusion methods.

Fusion Algorithms:
• Sum score - The normalized scores of a compound from each method are summed to get the fused score of the compound.
• Sum rank - The ranks of a compound from each method are summed to get the fused rank of the compound.
• Reciprocal rank - Reciprocal rank combines the rank positions of a compound in each method based on Equation 1.
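The three fusion rules listed above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow used for the poster: min-max scaling is an assumed normalization, ties are broken by input order, every compound is assumed to be scored by every method, and all names and example scores are hypothetical. The reciprocal rank score follows Equation 1, r(C_i) = 1 / Σ_j 1/pos(C_ij), where pos(C_ij) is the position of compound i in the ranking of method j, so a lower value means a better fused rank.

```python
def positions(scores, higher_is_better=True):
    """Map each compound to its 1-based position in one method's ranking."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {cid: pos for pos, cid in enumerate(ordered, start=1)}


def min_max(scores, higher_is_better=True):
    """Scale scores to [0, 1] so that 1 is always the best (assumed scheme)."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    scaled = {}
    for cid, value in scores.items():
        x = (value - lo) / span if span else 0.0
        scaled[cid] = x if higher_is_better else 1.0 - x
    return scaled


def sum_score(normalized_lists):
    """Fused score: sum of normalized scores (higher is better)."""
    return {c: sum(s[c] for s in normalized_lists) for c in normalized_lists[0]}


def sum_rank(position_lists):
    """Fused rank: sum of positions (lower is better)."""
    return {c: sum(p[c] for p in position_lists) for c in position_lists[0]}


def reciprocal_rank(position_lists):
    """Equation 1: r(C_i) = 1 / sum_j(1 / pos(C_ij)) (lower is better)."""
    return {c: 1.0 / sum(1.0 / p[c] for p in position_lists)
            for c in position_lists[0]}


# Hypothetical scores for three compounds from three methods.
docking = {"c1": -9.1, "c2": -7.4, "c3": -8.0}  # docking score, lower is better
shape = {"c1": 1.2, "c2": 1.5, "c3": 0.9}       # shape similarity, higher is better
pharm = {"c1": 0.8, "c2": 0.6, "c3": 0.7}       # pharmacophore fit, higher is better

pos_lists = [positions(docking, higher_is_better=False),
             positions(shape), positions(pharm)]
norm_lists = [min_max(docking, higher_is_better=False),
              min_max(shape), min_max(pharm)]

print(sum_score(norm_lists))
print(sum_rank(pos_lists))
print(reciprocal_rank(pos_lists))
```

Because the reciprocal rank score is dominated by the best position a compound reaches in any single method, it favours compounds that at least one method ranks very highly, whereas sum rank weighs all methods equally.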
\[ r(C_i) = \frac{1}{\sum_{j} \frac{1}{\mathrm{pos}(C_{ij})}} \]
Equation 1. Reciprocal rank fusion score

Another objective of our screening was to determine how early in a virtual screening run the program can identify active compounds. We used the BEDROC and RIE metrics to determine this.

• Salam et al. J. Chem. Inf. Model. 2009, 49, 2356–2368.
• Truchon et al. J. Chem. Inf. Model. 2007, 47, 488–508.
• Svensson et al. J. Chem. Inf. Model. 2012, 52 (1), 225–232.
• Nuray, R. et al. Information Processing and Management 2006, 42, 595–614.
• Zuccotto et al. J. Med. Chem. 2010, 53, 2681–2694.
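The two early-recognition metrics used in this work, RIE and BEDROC, can be sketched as follows, using the definitions of Truchon et al. (2007). This is an illustrative implementation only: the poster reports using the enrichVS R package, the weighting parameter alpha is not stated in the poster (alpha = 20 is assumed here), and the function names and example ranks are hypothetical.

```python
import math


def rie(active_ranks, n_total, alpha=20.0):
    """Robust initial enhancement for 1-based ranks of the actives."""
    n_actives = len(active_ranks)
    if not 0 < n_actives < n_total:
        raise ValueError("need at least one active and one inactive")
    # Mean exponential weight of the actives in the observed ranking.
    observed = sum(math.exp(-alpha * r / n_total) for r in active_ranks) / n_actives
    # Expected mean weight if the actives were spread uniformly over the list.
    expected = (1.0 - math.exp(-alpha)) / (n_total * (math.exp(alpha / n_total) - 1.0))
    return observed / expected


def bedroc(active_ranks, n_total, alpha=20.0):
    """BEDROC: RIE rescaled to [0, 1] between the worst and best rankings."""
    ratio = len(active_ranks) / n_total  # fraction of actives, R_a
    rie_min = (1.0 - math.exp(alpha * ratio)) / (ratio * (1.0 - math.exp(alpha)))
    rie_max = (1.0 - math.exp(-alpha * ratio)) / (ratio * (1.0 - math.exp(-alpha)))
    return (rie(active_ranks, n_total, alpha) - rie_min) / (rie_max - rie_min)


# Hypothetical ranking of 100 compounds with 3 actives; not data from the poster.
print(bedroc([1, 2, 3], n_total=100))     # actives ranked first
print(bedroc([98, 99, 100], n_total=100))  # actives ranked last
print(bedroc([2, 10, 40], n_total=100))
```

A larger alpha concentrates the weight on the very top of the ranked list, so BEDROC values are only comparable when computed with the same alpha.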
In this work we used pharmacophore, shape-based screening and docking scores and ranks as input to data fusion ranking algorithms, namely sum rank, sum score and reciprocal rank. We found that the reciprocal rank algorithm performs best in selecting compounds "early" in a virtual screening process. We also screened the Asinex database of 400K compounds with the reciprocal rank algorithm to select 45 potential hits for PknB.

Figure 3 a) Showing the performance metrics of structure- and ligand-based methods with data fusion. b) PCA plot of predicted inhibitors with the PknB inhibitors.

• Indo-US Science and Technology Forum for providing the travel grant and stipend.
• Open Source Drug Discovery for providing publication charges.
• Birla Institute of Technology Hyderabad Campus India.
Figure 1 Workflow of data fusion
1st Official Conference of the International Chemical Biology Society, Oct 4-5 2012, Cambridge, MA, USA