Docking Pose Assessment: The importance of keeping your GARD up
David C. Thompson
J. Christian Baber[a]
Jason B. Cross[b, c]
[a] Wyeth Research, Chemical Sciences, Cambridge, MA
[b] Wyeth Research, Chemical Sciences, Collegeville, PA
[c] Cubist Pharmaceuticals, Inc. Lexington, MA
2. The Why
• Large-scale docking evaluation study[1]
— Glide, DOCK6, PhDOCK, SurFlex, FlexX, and ICM
— Cognate ligand docking
— Virtual Screening
• Project aims:
— Assess our computational needs: Right tools for the job?
— Assess and revise best practices
[1] J. B. Cross et al., J. Chem. Inf. Model. (In press)
3. How do we assess a docking program’s ability
to regenerate a known binding mode?
Measures of Accuracy: RMSD
Pose #   Score    RMSD (Å)   Note
1        -72.0    1.9        Top scoring pose
2        -56.0    2.3
3        -24.0    1.8        Best RMSD
4        -9.00    2.7
…        …        …
• We dock the native ligand back into the protein
• We look at the RMSD of the top pose
• We look at the best RMSD of all the poses
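A minimal sketch of the two summaries described above, using the four example poses from the table. The `Pose` tuple and the function names are illustrative assumptions, not part of any docking package, and more negative scores are assumed to be better, as in the table.

```python
from typing import NamedTuple


class Pose(NamedTuple):
    """One docked pose: docking score and RMSD (in Å) to the native ligand."""
    score: float
    rmsd: float


def top_pose_rmsd(poses: list[Pose]) -> float:
    """RMSD of the top-scoring pose (assumes more negative scores are better)."""
    return min(poses, key=lambda p: p.score).rmsd


def best_rmsd(poses: list[Pose]) -> float:
    """Lowest RMSD among all retained poses, regardless of score."""
    return min(p.rmsd for p in poses)


# The four poses listed in the example table.
poses = [Pose(-72.0, 1.9), Pose(-56.0, 2.3), Pose(-24.0, 1.8), Pose(-9.0, 2.7)]
print(top_pose_rmsd(poses))  # 1.9
print(best_rmsd(poses))      # 1.8
```

The gap between the two numbers shows how often scoring, rather than sampling, is what keeps the best pose from being ranked first.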
4. Comparing docking programs is difficult …[2]
• RMSD, and statistics derived from RMSD, are used heavily in
comparing docking programs
• This is fine, as RMSD works a lot of the time; however, there are
some issues
— Not bounded (how big is too big?)
— Large RMSDs can dominate aggregate statistics
— RMSD is chemically agnostic (every atom counts equally)
• We may be losing useful information
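To illustrate how a single large RMSD can dominate an aggregate statistic, here is a small sketch. The RMSD values are invented for illustration and do not come from the study, and the 2.5 Å ceiling is an assumed cap.

```python
from statistics import mean, median

# Hypothetical top-pose RMSDs (Å) for five complexes; values are invented
# for illustration only and do not come from the docking evaluation.
rmsds = [0.8, 1.1, 1.3, 1.6, 14.0]

print(f"mean   = {mean(rmsds):.2f} Å")    # pulled up by the single 14.0 Å pose
print(f"median = {median(rmsds):.2f} Å")  # insensitive to how bad the worst pose is

# Capping each value at an assumed 2.5 Å ceiling bounds the influence
# that any one failed docking can have on the mean.
capped = [min(r, 2.5) for r in rmsds]
print(f"capped mean = {mean(capped):.2f} Å")
```

Four of the five poses are within 2 Å, yet the uncapped mean suggests a poor result overall.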
[2] J. C. Cole et al., Proteins, 60, 325 (2005)
5. What has come before
• These observations on RMSD are not new
• Relative Displacement Error (RDE)[3]
— Statistics compiled using the RDE measure are less dominated by very bad docking poses
— Would still miss poses that contain correct binding modes
• Interaction-Based Accuracy Classification (IBAC)[4]
— Would not miss poses that have a correct binding mode
— Highly subjective, not easily automated
• Real-space R-factor (RSR)[5]
— Inclusion of experimental information
— Un-bounded (how big is too big?)
• All of these methods address some of the issues associated with RMSD, but not
in one single measure
• RMSTanimoto[6]
[3] R. A. Abagyan et al., J. Mol. Biol., 268, 678 (1997)
[4] R. T. Kroemer et al., J. Chem. Inf. Comput. Sci., 44, 871 (2004)
[5] D. Yusuf et al., J. Chem. Inf. Model., 48, 1411 (2008)
[6] OpenEye Scientific Software, Santa Fe, NM
6. The Why: A Recap
RMSD works a lot of the time, so we need a function that preserves
this feature but that also accounts for those difficult cases where
useful information may be lost
We would also like:
• To avoid the skewing problem associated with large RMSDs
• To have an objective measure
• An element of chemical awareness
7. The How
• A Generally Applicable Replacement for RMSD: GARD[7]
• GARD is a metric for analyzing docking poses
• It is bounded on [0,1] to remove arbitrary cutoffs which distort
average measures
• It is based on an analysis performed by P. R. Andrews et al. [8]*
— Regression analysis of the binding constants and structural components of 200
drugs and enzyme inhibitors
• Automated, and no more expensive than RMSD
[7] Submitted, J. Chem. Inf. Model.
[8] P. R. Andrews et al., J. Med. Chem., 27, 1648 (1984)
* Yes, we know that this is an old study . . .
8. GARD: The Algorithm
Atomic RMSD = 3.68Å
• For each atom compute an RMSD (di)
• Use Andrews weight corresponding to the
atom type (wi)
• Define a ‘good’ and ‘bad’ RMSD: dmin and
dmax
— dmin = 1Å
— dmax = 2.5Å
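\[
\mathrm{GARD} = \frac{\sum_i \delta_i w_i}{\sum_i w_i},
\qquad
\delta_i =
\begin{cases}
1, & d_i \le d_{\min} \\[4pt]
1 - \dfrac{d_i - d_{\min}}{d_{\max} - d_{\min}}, & d_{\min} < d_i < d_{\max} \\[4pt]
0, & d_i \ge d_{\max}
\end{cases}
\]
so that \(\delta_i\) falls linearly from 1 at \(d_{\min}\) to 0 at \(d_{\max}\).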
[Figure: reference structure (cyan) and docking pose (tan); RMSD = 1.38 Å, GARD = 0.90]
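A minimal sketch of the calculation described on this slide. It assumes that δi falls linearly from 1 at dmin to 0 at dmax, which is the only reading under which the three cases join continuously. Per-atom deviations and weights are passed in as plain lists; pairing atoms between pose and reference and looking up Andrews weights by atom type are left out. The function and parameter names are illustrative, not taken from the authors' implementation.

```python
def delta(d: float, d_min: float = 1.0, d_max: float = 2.5) -> float:
    """Per-atom credit: 1 for d <= d_min, 0 for d >= d_max, linear in between."""
    if d <= d_min:
        return 1.0
    if d >= d_max:
        return 0.0
    # Assumed linear fall-off from 1 (at d_min) to 0 (at d_max).
    return 1.0 - (d - d_min) / (d_max - d_min)


def gard(deviations, weights, d_min=1.0, d_max=2.5):
    """Weighted mean of per-atom credits; in [0, 1] for non-negative weights."""
    if len(deviations) != len(weights):
        raise ValueError("need one weight per atom")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    credited = sum(delta(d, d_min, d_max) * w for d, w in zip(deviations, weights))
    return credited / total


# Three atoms with equal weight: one well placed, one halfway, one misplaced.
print(gard([0.5, 1.75, 3.0], [1.0, 1.0, 1.0]))  # (1 + 0.5 + 0) / 3 = 0.5
```

Because each atom contributes at most its own weight, one badly misplaced atom cannot push the score below what the remaining atoms earn.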
9. GARD: Worked Example
di (Å)   Atom type   wi    δiwi
0.28     C (sp3)     0.8   0.8
0.48     C (sp3)     0.8   0.8
0.69     N           1.2   1.2
0.60     C (sp3)     0.8   0.8
0.36     C (sp3)     0.8   0.8
0.96     C (sp2)     0.7   0.7
0.96     N           1.2   1.2
3.68     C (sp3)     0.8   0
0.60     C (sp3)     0.8   0.8
SUM                  7.9   7.1
GARD = 7.1/7.9 = 0.90
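The arithmetic in the table can be checked with a few lines of Python. The deviations and weights are copied from the table; since every atom here is either within dmin = 1 Å or beyond dmax = 2.5 Å, δi is simply 1 or 0 and no interpolation is needed. Variable names are illustrative.

```python
import math

# (d_i in Å, Andrews weight w_i) for the nine atoms in the table.
atoms = [
    (0.28, 0.8), (0.48, 0.8), (0.69, 1.2), (0.60, 0.8), (0.36, 0.8),
    (0.96, 0.7), (0.96, 1.2), (3.68, 0.8), (0.60, 0.8),
]
D_MIN, D_MAX = 1.0, 2.5

# Every deviation in this example is either <= D_MIN or >= D_MAX.
assert all(d <= D_MIN or d >= D_MAX for d, _ in atoms)

sum_w = sum(w for _, w in atoms)
sum_dw = sum(w for d, w in atoms if d <= D_MIN)  # delta_i = 1 for these atoms
gard = sum_dw / sum_w
rmsd = math.sqrt(sum(d * d for d, _ in atoms) / len(atoms))

print(f"sum w = {sum_w:.1f}, sum delta*w = {sum_dw:.1f}")  # 7.9 and 7.1
print(f"GARD = {gard:.2f}")    # 0.90
print(f"RMSD = {rmsd:.2f} Å")  # 1.38
```

The single 3.68 Å atom accounts for most of the squared deviation behind the 1.38 Å RMSD, while it costs GARD only its own weight of 0.8 out of 7.9.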
[Figure: reference structure (cyan) and docking pose (tan); RMSD = 1.38 Å, GARD = 0.90]
10. Comparing docking programs is difficult … but we do it anyway
“Cognate ligand docking to 68 diverse, high-resolution x-ray
complexes revealed that ICM, GLIDE, and Surflex generated
ligand poses close to the X-ray conformation more often than the
other docking programs. GLIDE and Surflex also outperformed
the other docking programs when used for virtual screening,
based on mean ROC AUC and ROC enrichment . . .[1]”
Protocol:
1. Initial ligand coordinates used as input for the docking were generated using
CORINA[9]
2. The 10 top scoring poses (or fewer, depending on the specific output for a
particular X-ray complex/docking program combination) were retained for
analysis
3. These poses were then evaluated using both the GARD and RMSD measures
[1] J. B. Cross et al., J. Chem. Inf. Model. (In press)
[9] CORINA v1.82, Molecular Networks GmbH: Erlangen, Germany, 1997 The What
11. The What
[Figure: scatter plot of RMSD (y-axis, 0 to 30) against GARD (x-axis, 0 to 1) with linear fit y = -7.3x + 7.2, R² = 0.59]
Correlation between GARD scores and RMSD across the top 10 poses of compounds from 68 different targets and 6 docking methods (4725 points)
12. The What: Some Specific Examples
[Figure: scatter plot of RMSD (y-axis, 0 to 5) against GARD (x-axis, 0.75 to 1), R² = 0.53; two poses are labeled: 1GLQ (RMSD = 4.44 Å, GARD = 0.77) and 1A4Q (RMSD = 4.90 Å, GARD = 0.78)]
Correlation between GARD scores and RMSD for those poses with a GARD score of at least 0.75 across the top 10 poses of compounds from 68 different targets and 6 docking methods (1469 points)
13. 1A4Q: Neuraminidase with dihydropyran-phenethyl-propyl-carboxamide inhibitor (1.90 Å)
[Figure: 1A4Q; SurFlex Ringflex docking pose (green wire) and X-tal (grey tube)]
RMSD = 4.90 Å
GARD = 0.78
14. 1GLQ: Glutathione-S-transferase with p-nitrobenzyl glutathione (1.80 Å)
[Figure: 1GLQ; ICM docking pose (green wire) and X-tal (grey tube)]
RMSD = 4.44 Å
GARD = 0.77
15. 1HPX: HIV Protease with KNI-272 inhibitor (2.00 Å)*
[Figure: three panels (Best RMSD, Crystal Structure, Top Scoring) with positions labeled 1 to 4]
Best RMSD: GARD = 0.63 / RMSD = 1.89; GLIDE SP 4.5 (10/30)
Top Scoring: GARD = 0.75 / RMSD = 2.35; GLIDE SP 4.5 (1/30)
* Additional example, not in the original docking evaluation data set
17. GPCR Model Validation: IFD[10]
β2 adrenergic receptor (2RH1); IFD, default parameters, Pose #1
[Figure: X-tal ligand (cyan), model protein (cyan); IFD pose (tan), IFD protein (tan)]
RMSD = 1.85 Å
GARD = 0.65
[10] Schrödinger Suite 2008, Induced Fit Docking protocol; Glide version 5.0, Schrödinger, LLC, New York, NY, 2008; Prime version 2.0, Schrödinger, LLC, New York, NY, 2008
18. Concluding remarks
• RMSD is a good measure most of the time, although it has known drawbacks
which can result in the discarding of useful information
• A Generally Applicable Replacement for RMSD (GARD) has been proposed
which overcomes most of the drawbacks of RMSD while preserving its
strengths. This measure is:
— Normalized
— ‘Chemically aware’
— Automated / objective
• The utility of GARD was illustrated with specific examples from a large-scale
docking evaluation exercise and with examples from the Protein Data Bank
• Future application: Use with RMSD to triage docking results for protein model
evaluation
— Of particular utility when considering multiple models, and tens/hundreds of
docking poses
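One way the triage idea could look in practice, as a sketch only. The RMSD and GARD values are the examples quoted on the earlier slides. The 2.0 Å RMSD cutoff is an assumption made for illustration, 0.75 is borrowed from the GARD threshold used for the filtered correlation plot, and the labels and function name are hypothetical.

```python
# (label, RMSD in Å, GARD) for poses quoted earlier in this deck.
poses = [
    ("1A4Q SurFlex Ringflex", 4.90, 0.78),
    ("1GLQ ICM", 4.44, 0.77),
    ("1HPX GLIDE SP best-RMSD", 1.89, 0.63),
    ("1HPX GLIDE SP top-scoring", 2.35, 0.75),
    ("2RH1 IFD pose #1", 1.85, 0.65),
]


def triage(rmsd: float, gard: float, rmsd_cut: float = 2.0, gard_cut: float = 0.75) -> str:
    """Classify a pose by whether RMSD and GARD agree (cutoffs are assumed)."""
    good_rmsd = rmsd <= rmsd_cut
    good_gard = gard >= gard_cut
    if good_rmsd and good_gard:
        return "accept"
    if not good_rmsd and not good_gard:
        return "reject"
    return "inspect"  # the two measures disagree, so look at the pose


for label, rmsd, gard in poses:
    print(f"{label:28s} RMSD={rmsd:.2f} GARD={gard:.2f} -> {triage(rmsd, gard)}")
```

With tens or hundreds of poses per model, only the "inspect" group needs visual review, which is where the two measures carry different information.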
19. Cultural highlight
• Ethnographic examination of
‘simulators’
— Crystallographers
— Architects
— Oceanographers
• “All models are wrong, but some
models are useful” – G. E. P. Box
• “If exactitude is elusive, it is better to
be approximately right than
certifiably wrong” – B. B. Mandelbrot
Simulation and its discontents, Sherry Turkle, Cambridge, MA: MIT Press (2009)
20. Acknowledgments
• Boehringer Ingelheim
— Dr. Ingo Mügge
— Dr. Sandy Farmer
• Wyeth Research
— The Docking Evaluation Team
(Dr. YongBo Hu, Dr. Kristi Yi Fan and Dr. Brajesh K. Rai*)
— Dr. Jack A. Bikker
— Dr. Christine Humblet
* Pfizer Global Research and Development, Groton, CT