VIRTUAL SCREENING IN DRUG DISCOVERY
Cheminformatics Approach
dsdht.wikispaces.com
Abhik Seal
Indiana University Bloomington (dsdht.wikispaces.com)
Challenge in Drug Discovery
[Figure: biological space (10^4 to 10^5 validated targets) set against chemical space (> 10^20 “available” compounds), linked by computational screening and experimental screening.]
Choosing the right molecule 
• Goal: to find a lead compound that can be optimized to give a drug candidate 
− Optimization: using chemical synthesis to modify the lead molecule in order to 
improve its chances of being a successful drug. 
• The challenge: chemical space is vast 
− Estimates vary 
• Reymond et al. suggest there are ~1 billion compounds with up to 13 heavy atoms 
• There are ~65 million known compounds (e.g. in UniChem, PubChem)
• A typical pharmaceutical compound collection contains ~1-5 million compounds 
• High throughput screening allows large (up to 1 million) numbers of compounds to be tested 
− But this is a very small proportion of “available” compounds
− Large scale screening is expensive 
− Not all targets are suitable for HTS 
Blum, L.C. & Reymond, J.-L. J. Am. Chem. Soc. 131, 8732-8733 (2009).
Screening Schema in Drug Discovery
[Figure: screening schema. Elements shown: target disease, metabolic pathways, target protein, 3D structure; virtual library, database, combinatorial library; virtual screening, HTS screening; leads, lead optimization.]
The basic goal of virtual screening is to reduce the enormous virtual chemical space to a manageable number of compounds that would inhibit a target protein responsible for disease and that have the highest chance of leading to a drug candidate.
Virtual Screening
[Figure: decision tree. 3D structure unknown: ligand-based methods (actives known: similarity searching or pharmacophore mapping; actives and inactives known: machine learning). 3D structure known: structure-based methods (docking).]
Depending on the structural and bioactivity data available:
• One or more active molecules known: perform similarity searching.
• Several actives known: try to identify a common 3D pharmacophore, then do a 3D database search.
• Reasonable number of actives and inactives known: train a machine learning model.
• 3D structure of the protein known: use protein-ligand docking.
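The decision rules above can be written down as a small helper. This is only a sketch: the function name and the numeric cut-offs for "several" and "a reasonable number" are assumptions made for illustration, not values given on the slide.

```python
def choose_screening_methods(n_actives, n_inactives, has_protein_structure):
    """Return the virtual screening methods that the available data supports."""
    methods = []
    if n_actives >= 1:
        methods.append("similarity searching")
    if n_actives >= 3:  # assumed cut-off for "several actives"
        methods.append("3D pharmacophore search")
    if n_actives >= 30 and n_inactives >= 30:  # assumed cut-off for "reasonable number"
        methods.append("machine learning model")
    if has_protein_structure:
        methods.append("protein-ligand docking")
    return methods


# One known active and no protein structure: only similarity searching applies.
print(choose_screening_methods(1, 0, False))
# Many actives and inactives plus a crystal structure: every route is open.
print(choose_screening_methods(50, 200, True))
```

In a hybrid campaign the returned list would be treated as a menu to combine rather than a single choice.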
Hybrid Virtual Screening
In practice, people in the pharmaceutical industry mostly do not follow one specific route; they use a hybrid of the methods discussed on the previous slide.
[Figure: hybrid workflow from starting database to potential lead compounds]
Starting database
• Cleaning molecules: remove isotopes, salts and mixtures; protonation and normalization; remove duplicates and invalid structures
• Filtering molecules: user-defined or other filters; remove problematic moieties using PAINS, frequent hitters etc.; physicochemical property descriptor calculation and filtration; apply protonation at pH 7.4
Prepared database
• Filter: Rule of 5, ADME, Tox
• Shape similarity: ROCS, FlexS
• Pharmacophore based screening: LigandScout, Phase, LigandFit
• Structure based pharmacophore: LUDI, LigandScout, Phase, DrugScore
• Docking based screening: DOCK, GOLD, Glide, ICM
• Post process: Cscore, MM/PBSA, solvation corrections
Potential lead compounds
Drug Like Properties
Drug-like properties are an integral element of drug discovery projects.
Properties of interest to discovery scientists include the following:
• Structural properties
Hydrogen bonding, polar surface area, lipophilicity, shape, molecular weight, reactivity, pKa
• Physicochemical properties
Solubility, permeability, chemical stability
• Biochemical properties
Metabolism (Phase 1 and 2), protein and tissue binding, transport
• Pharmacokinetics (PK) and toxicity
Clearance, half-life, bioavailability, drug-drug interaction, LD50
Leadlike & Druglike
• Leadlike
- Molecular weight (MW) = 200–350 (optimization might add 100–200)
- clogP < 1.0–3.0 (optimization might increase by 1–2 log units)
- Single charge present (secondary or tertiary amine preferred)
- Importantly, exclude chemically reactive functional groups, ‘promiscuous inhibitors’, ‘frequent hitters’ and warheads
- Non-substrate peptides are suitable.
• Druglike
- Importantly, exclude chemically reactive functional groups, ‘promiscuous inhibitors’, ‘frequent hitters’ and warheads
- MW < 500
- clogP < 5
- H-bond donors < 5
- Sum of N and O (H-bond acceptors) < 10
- Polar surface area < 140 Å²
- Number of rotatable bonds <= 10
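The property cut-offs listed above translate directly into a filter over precomputed descriptors. The sketch below assumes each molecule is a plain dictionary of already-calculated properties; the key names and the example values are hypothetical, and the lead-like clogP limit is read here as an upper bound of 3.0.

```python
def is_druglike(props):
    """Apply the drug-like cut-offs from the slide to a dict of properties."""
    return (
        props["mw"] < 500
        and props["clogp"] < 5
        and props["hbd"] < 5
        and props["hba"] < 10
        and props["psa"] < 140          # polar surface area in Å²
        and props["rotatable_bonds"] <= 10
    )


def is_leadlike(props):
    """Apply the lead-like cut-offs (assumes clogP upper bound of 3.0)."""
    return (
        200 <= props["mw"] <= 350
        and props["clogp"] <= 3.0
        and abs(props["charge"]) <= 1   # at most a single charge
    )


# Hypothetical descriptor values, for illustration only.
small = {"mw": 310.4, "clogp": 2.1, "hbd": 2, "hba": 5,
         "psa": 78.0, "rotatable_bonds": 4, "charge": 1}
large = {"mw": 612.7, "clogp": 5.8, "hbd": 6, "hba": 12,
         "psa": 160.0, "rotatable_bonds": 14, "charge": 0}

print(is_druglike(small), is_leadlike(small))
print(is_druglike(large), is_leadlike(large))
```

The reactive-group and frequent-hitter exclusions need substructure matching and are not covered by this property-only sketch.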
Filtering molecules using structural properties
Basic washing:
• Removing salts and unwanted elements
Filter out cations: Ca2+, Na+, etc.
Filter out metals:
Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, Y, Zr, Nb, Mo, Tc, Ru, Rh, Pd, Ag, Cd
Often the salt “filter” simply keeps the largest molecule in the SDF entry.
• ALLOWED_ELEMENTS: H, C, N, O, F, P, S, Cl, Br, I
• Check for proper atom types by adding hydrogens and checking that the O, N and C valences are correct.
• Check formal charge
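The washing steps can be illustrated without a cheminformatics toolkit. In this sketch a database entry is assumed to be a list of fragments, and each fragment a list of element symbols; that representation and the function names are hypothetical. A real pipeline would use a toolkit such as RDKit or Open Babel and would also check valences and formal charges.

```python
ALLOWED_ELEMENTS = {"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"}


def strip_salts(fragments):
    """Keep the largest fragment of an entry (by atom count)."""
    return max(fragments, key=len)


def has_only_allowed_elements(atoms):
    """True if every atom of the fragment is in ALLOWED_ELEMENTS."""
    return all(symbol in ALLOWED_ELEMENTS for symbol in atoms)


def wash(fragments):
    """Return the parent fragment, or None if it contains unwanted elements."""
    parent = strip_salts(fragments)
    return parent if has_only_allowed_elements(parent) else None


# Sodium acetate as heavy atoms: the Na counter-ion is dropped.
sodium_acetate = [["C", "C", "O", "O"], ["Na"]]
# An iron-containing parent fragment is rejected outright.
iron_complex = [["Fe", "C", "C", "N", "N"], ["Cl"]]

print(wash(sodium_acetate))
print(wash(iron_complex))
```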
Filter out Reactives (false positives for proteins) 
Rishton, G.M. “Nonleadlikeness and leadlikeness in biochemical 
screening” Drug Discovery Today (2003) 8, 86-96
Filter out: Synthesis Intermediates, Chelators
‘Warhead’ agents: functional groups that show high reactivity towards proteins, which contributes to the high attrition rate in drug development.
Rishton, G.M. “Nonleadlikeness and leadlikeness in biochemical 
screening” Drug Discovery Today (2003) 8, 86-96
PAINS Filter 
• PAINS = “Pan-Assay Interference Compounds” 
• Problematic scaffolds that have cost the authors' institute time and money
Baell, J.B. & Holloway, G.A. “New Substructure Filters for Removal of Pan Assay Interference Compounds (PAINS) from Screening Libraries and for Their Exclusion in Bioassays” J. Med. Chem. (2010) 53, 2719-2740
Rules-of-Thumb for Hit Selection & Lead Optimization 
Muchmore, SW et al. “Cheminformatic Tools for Medicinal Chemists” J. Med. Chem. (2010) 53, 4830 – 4841
Similarity Searching 
What is it?
Two compounds are similar when their chemical, pharmacological or biological properties match.
The more features they have in common, the higher the similarity between the two molecules.
Chemical
The two structures on top are chemically similar to each other. This is reflected in their common sub-graph, or scaffold: they share 14 atoms.
Pharmacophore
The two structures above are less similar chemically (topologically) yet have the same pharmacological activity: both are Angiotensin-Converting Enzyme (ACE) inhibitors.
What is required for a similarity search?
• A database, SQL or NoSQL (Postgres, MySQL, MongoDB), or a flat file of descriptors, e.g. chemfp
• A chemical cartridge to generate fingerprints (descriptors) for molecules (RDKit, Open Babel)
• A similarity function (Jaccard, Dice, Tversky); this can be written in C, C++ or Python as a function inside the SQL database
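The three ingredients can be mocked up in a few lines. In this sketch the "database" is an in-memory dictionary and each fingerprint is a set of on-bit indices; both are stand-ins chosen for illustration, as are the molecule names. Only the Tversky index is implemented, because it reduces to Jaccard/Tanimoto at alpha = beta = 1 and to Dice at alpha = beta = 0.5.

```python
def tversky(a, b, alpha=1.0, beta=1.0):
    """Tversky index between two fingerprints given as sets of on-bits."""
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


def search(query, database, threshold=0.5, alpha=1.0, beta=1.0):
    """Return (name, score) pairs at or above threshold, best first."""
    hits = []
    for name, fp in database.items():
        s = tversky(query, fp, alpha, beta)
        if s >= threshold:
            hits.append((name, s))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints (sets of on-bit positions).
database = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {1, 2, 7, 8},
    "mol_c": {9, 10},
}
query = {1, 2, 3, 5}

print(search(query, database))                 # Jaccard/Tanimoto ranking
print(tversky(query, database["mol_a"], 0.5, 0.5))  # Dice coefficient
```

With a database cartridge the same function would run inside the SQL engine, so that only the hits leave the database.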
2D fingerprints: molecules represented as binary vectors
• Each bit in the bit string (binary vector) represents one molecular fragment. Typical length is ~1000 bits 
• The bit string for a molecule records the presence (“1”) or absence (“0”) of each fragment in the molecule 
• Originally developed for speeding up substructure search 
− for a query substructure to be present in a database molecule each bit set to “1” 
in the query must also be set to “1” in the database structure 
• Similarity is based on determining the number of bits that are common to two structures
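The substructure screen described above is a single bitwise test. The sketch below stores each fingerprint as a Python integer; the fragment identifiers and the way they are mapped onto bits are assumptions for illustration, since real fingerprints come from a toolkit's fragment dictionary or hashing scheme.

```python
def fingerprint(fragment_ids, n_bits=1024):
    """Set one bit per fragment identifier (hypothetical fragment numbering)."""
    fp = 0
    for fragment in fragment_ids:
        fp |= 1 << (fragment % n_bits)
    return fp


def may_contain(query_fp, molecule_fp):
    """Screen: every bit set in the query must also be set in the molecule."""
    return (query_fp & molecule_fp) == query_fp


query_fp = fingerprint([3, 17])
molecule_fp = fingerprint([3, 17, 40])

print(may_contain(query_fp, molecule_fp))   # query bits are a subset
print(may_contain(molecule_fp, query_fp))   # bit 40 is missing from the query
```

Passing the screen only means the molecule might contain the substructure; a full atom-by-atom match is still needed to confirm it.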
Calculate molecular similarity
Sequences/vectors of bits or numeric values that can be compared by distance functions or similarity metrics.
T = Tanimoto index (Jaccard), where B(x) is the number of bits set in x:
$$T(x, y) = \frac{B(x \,\&\, y)}{B(x) + B(y) - B(x \,\&\, y)}$$
E = Euclidean distance:
$$E(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$
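Both measures are short to compute. This sketch assumes bit-string fingerprints are held as Python integers, with B(x) taken as the number of bits set, and numeric descriptors as equal-length lists; the example values are made up.

```python
import math


def bit_count(x):
    """B(x): number of bits set to 1."""
    return bin(x).count("1")


def tanimoto(x, y):
    """T(x, y) = B(x & y) / (B(x) + B(y) - B(x & y))."""
    common = bit_count(x & y)
    denom = bit_count(x) + bit_count(y) - common
    return common / denom if denom else 0.0


def euclidean(x, y):
    """E(x, y) = sqrt(sum_i (x_i - y_i)^2) for equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


print(tanimoto(0b1101, 0b1011))        # 2 common bits, 3 + 3 - 2 = 4 in the union
print(euclidean([0.0, 0.0], [3.0, 4.0]))
```

Tanimoto is a similarity (1 means identical bit strings), whereas Euclidean is a distance (0 means identical vectors), so rankings run in opposite directions.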
Don’t forget to check out the video of 2D 
similarity searching on a database 
https://www.youtube.com/watch?v=VHjUWCJITmE
3D based similarity 
• Shape-based 
ROCS (Rapid Overlay of Chemical Structures) 
Silicos-it.com (Shape it) 
• Computationally more expensive than 2D methods 
• Requires consideration of conformational flexibility 
− Rigid search - based on a single conformer 
− Flexible search 
• Conformation explored at search time 
• Ensemble of conformers generated prior to search time with each 
conformer of each molecule considered in turn 
• How many conformers are required? 
Check the video on ROCS:
https://www.youtube.com/watch?v=FEAUBlQWf0s
Multiple actives known: pharmacophore searching
• IUPAC definition: “An ensemble of steric and electronic features that is necessary to ensure the optimal supramolecular interactions with a specific biological target and to trigger (or block) its biological response”
[Figure: pharmacophore feature labels: H, Aromatic, HBA, R, HBD]
• In drug design, the term ‘pharmacophore’ refers to a set of features that is common to a series of active molecules
• Hydrogen-bond donors and acceptors, positively and negatively charged groups, and hydrophobic regions are typical features
We will refer to such features as ‘pharmacophoric groups’
3D pharmacophores
• A three-dimensional pharmacophore specifies the spatial relationships between the groups
• Expressed as distance ranges, angles and planes
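A distance-range query of this kind can be checked against one conformer in a few lines. The sketch below makes strong simplifying assumptions: only distance constraints are used (no angles or planes), each conformer carries at most one feature of each type, and the feature names, coordinates and ranges are invented for illustration.

```python
import math

# Hypothetical query: allowed distance range in Å between pairs of features.
QUERY = {
    ("HBD", "HBA"): (4.0, 6.0),
    ("HBA", "AROMATIC"): (3.0, 5.0),
}


def matches(features, query):
    """features maps a feature name to its (x, y, z) position in one conformer."""
    for (first, second), (low, high) in query.items():
        if first not in features or second not in features:
            return False
        distance = math.dist(features[first], features[second])
        if not low <= distance <= high:
            return False
    return True


hit = {"HBD": (0.0, 0.0, 0.0), "HBA": (5.0, 0.0, 0.0), "AROMATIC": (5.0, 4.0, 0.0)}
miss = {"HBD": (0.0, 0.0, 0.0), "HBA": (8.0, 0.0, 0.0), "AROMATIC": (8.0, 4.0, 0.0)}

print(matches(hit, QUERY))
print(matches(miss, QUERY))
```

In a 3D database search this test would be repeated over every stored conformer of every molecule, and a molecule counts as a hit if any conformer matches.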
Workflow of pharmacophore modeling 
http://i571.wikispaces.com/
Watch the video on pharmacophore searching
Ligand-based pharmacophore screening using LigandScout
https://www.youtube.com/watch?v=gwAB_Xz9g94
Tools to perform pharmacophore searching
1) Catalyst (Accelrys)
2) Phase (Schrödinger)
3) LigandScout (Inte:Ligand) 
4) PharmaGist 
5) Pharmer 
6) SHAFTS
Many actives and inactives known: machine learning methods
• SAR Modeling
• Use knowledge of known active and known inactive compounds to build a predictive model
• Quantitative Structure-Activity Relationships (QSARs)
− Long established (Hansch analysis, Free-Wilson analysis)
− Generally restricted to small, homogeneous datasets, e.g. lead optimization
• Structure-Activity Relationships (SARs) 
− “Activity” data is usually treated qualitatively 
− Can be used with data consisting of diverse structural classes and multiple binding 
modes 
− Some resistance to noisy data (HTS data) 
− Resulting models used to prioritize compounds for lead finding (not to identify 
candidates or drugs)
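A qualitative SAR model can be as simple as a nearest-neighbour vote. The sketch below is one possible illustration, not the method used in the slides: it classifies a query as active or inactive from its k most Tanimoto-similar training compounds. Fingerprints are sets of on-bit indices and all data are invented.

```python
def tanimoto(a, b):
    """Tanimoto similarity between fingerprints stored as sets of on-bits."""
    common = len(a & b)
    union = len(a) + len(b) - common
    return common / union if union else 0.0


def predict_active(query_fp, training_set, k=3):
    """Majority vote over the k most similar (fingerprint, is_active) pairs."""
    ranked = sorted(
        training_set,
        key=lambda item: tanimoto(query_fp, item[0]),
        reverse=True,
    )
    neighbours = ranked[:k]
    votes = sum(1 for _, is_active in neighbours if is_active)
    return 2 * votes > len(neighbours)


# Hypothetical training data: three actives and three inactives.
training_set = [
    ({1, 2, 3}, True), ({1, 2, 4}, True), ({2, 3, 5}, True),
    ({7, 8, 9}, False), ({8, 9, 10}, False), ({7, 9, 11}, False),
]

print(predict_active({1, 2, 3, 4}, training_set))
print(predict_active({7, 8, 10}, training_set))
```

The output is a priority label for lead finding, in line with the slide: such models rank compounds for testing and do not identify candidates on their own.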
Practical predictive modeling in Cheminformatics 
QSAR 
https://www.youtube.com/watch?v=EHbxFMfU_3A 
Binary Classification 
https://www.youtube.com/watch?v=6Mund7C964I
Protein Ligand Docking
A computational method that mimics the binding of a ligand to a protein.
It predicts:
a) the pose of the molecule in the binding site
b) the binding affinity, or a score representing the strength of binding
Pose and Binding Site
• Binding site (or “active site”)
- the part of the protein where the ligand binds
- generally a cavity on the protein surface
- can be identified by looking at the crystal structure of the protein bound to a known inhibitor
• Pose (or “binding mode”)
- the geometry of the ligand in the binding site
- geometry = location, orientation and conformation of the molecule
http://www.slideshare.net/baoilleach/proteinligand-docking
Protein Ligand Docking 
• How does a ligand (small molecule) bind into the active site of a 
protein? 
• Docking algorithms are based on two key components 
− search algorithm 
• to generate “poses” (conformation, position and orientation) of 
the ligand within the active site 
− scoring function 
• to identify the most likely pose for an individual ligand 
• to assign a priority order to a set of diverse ligands 
docked to the same protein – estimate binding affinity
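The split between search algorithm and scoring function can be shown with a toy example. Everything here is a deliberate simplification: the ligand is rigid and only translated on a 1 Å grid (no rotation or conformational change), and the contact-counting score with its 3 Å and 5 Å cut-offs is made up for illustration and is not a real docking score.

```python
import itertools
import math


def score(ligand, protein):
    """Toy score, lower is better: -1 per contact (3-5 Å), +10 per clash (< 3 Å)."""
    total = 0.0
    for lig_atom in ligand:
        for prot_atom in protein:
            d = math.dist(lig_atom, prot_atom)
            if d < 3.0:
                total += 10.0
            elif d <= 5.0:
                total -= 1.0
    return total


def translate(ligand, shift):
    """Move every ligand atom by the same (dx, dy, dz) shift."""
    return [tuple(c + s for c, s in zip(atom, shift)) for atom in ligand]


def grid_search(ligand, protein, half_width=10, step=1):
    """Search algorithm: try every translation on a cubic grid, keep the best pose."""
    offsets = range(-half_width, half_width + 1, step)
    best_pose, best_score = ligand, score(ligand, protein)
    for shift in itertools.product(offsets, repeat=3):
        pose = translate(ligand, shift)
        s = score(pose, protein)
        if s < best_score:
            best_pose, best_score = pose, s
    return best_pose, best_score


# Hypothetical coordinates in Å.
protein = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
ligand = [(8.0, 8.0, 8.0), (9.5, 8.0, 8.0)]

best_pose, best_score = grid_search(ligand, protein)
print(best_score, best_pose)
```

Ranking several ligands against the same protein would compare their best scores, which is where the quality of the scoring function matters most.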
The search space
• The difficulty with protein–ligand docking is in part due to the fact that it involves many degrees of freedom
− The translation and rotation of one molecule relative to another involves six degrees of freedom
− These are in addition to the conformational degrees of freedom of both the ligand and the protein
− The solvent may also play a significant role in determining the protein–ligand geometry (often 
ignored though) 
• The search algorithm generates poses, orientations of particular conformations of the 
molecule in the binding site 
− Tries to cover the search space, if not exhaustively, then as extensively as possible 
− There is a tradeoff between time and search space coverage 
AR Leach, VJ Gillet, An Introduction to Cheminformatics
Docking Algorithms
• DOCK: first docking program, by Kuntz et al. 1982
− Based on shape complementarity and rigid ligands
• Current algorithms
− Fragment-based methods: FlexX, DOCK (since version 4.0)
− Monte Carlo/simulated annealing: QXP (Flo), AutoDock, Affinity & LigandFit (Accelrys)
− Genetic algorithms: GOLD, AutoDock (since version 3.0)
− Systematic search: FRED (OpenEye), Glide (Schrödinger)
R. D. Taylor et al. “A review of protein-small molecule docking methods”, J. Comput. Aid. Mol. Des. 2002, 16, 151-166.
DOCK (Kuntz et al. 1982)
• Rigid docking based on shape
• A negative image of the cavity is constructed by filling it with spheres
• Spheres are of varying size
• Each touches the surface at two points
• The centres of the spheres become potential locations for ligand atoms.
AR Leach, VJ Gillet, An Introduction to Cheminformatics
DOCK
• Ligand atoms are matched to sphere centers so that the distances between atoms equal the distances between sphere centers.
• The matches are used to position the ligand within the active site.
• If there are no steric clashes, the ligand is scored.
• Many different mappings (poses) are possible
• Each pose is scored based on goodness of fit
• Highest scoring pose is presented to the user
AR Leach, VJ Gillet, An Introduction to Cheminformatics
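The distance-matching idea behind DOCK can be sketched with a brute-force enumeration. This is not the DOCK implementation, which uses far more efficient matching; the function name, the 0.5 Å tolerance and the coordinates are assumptions for illustration.

```python
import itertools
import math


def distance_matches(ligand_atoms, sphere_centers, tolerance=0.5):
    """Return assignments (one sphere index per ligand atom) whose pairwise
    sphere-sphere distances agree with the atom-atom distances."""
    n_atoms = len(ligand_atoms)
    found = []
    for assignment in itertools.permutations(range(len(sphere_centers)), n_atoms):
        consistent = all(
            abs(
                math.dist(ligand_atoms[i], ligand_atoms[j])
                - math.dist(sphere_centers[assignment[i]], sphere_centers[assignment[j]])
            ) <= tolerance
            for i, j in itertools.combinations(range(n_atoms), 2)
        )
        if consistent:
            found.append(assignment)
    return found


# Hypothetical sphere centres of a cavity and a three-atom rigid ligand (Å).
sphere_centers = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (10.0, 10.0, 10.0)]
ligand_atoms = [(1.0, 1.0, 1.0), (4.0, 1.0, 1.0), (1.0, 5.0, 1.0)]

print(distance_matches(ligand_atoms, sphere_centers))
```

Each returned assignment defines a candidate placement of the ligand, which would then be checked for steric clashes and scored.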
Energetics of protein-ligand binding 
a) Ligand-receptor binding is driven by 
• electrostatics (including hydrogen bonding interactions) 
• dispersion or van der Waals forces 
• hydrophobic interactions 
• desolvation: surfaces buried between the protein and the ligand 
have to be desolvated 
• Conformational changes to protein and ligand 
• ligand must be properly orientated and translated to interact and 
form a complex 
• loss of entropy of the ligand due to being fixed in one conformation. 
b) Free energy of binding 
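$$\Delta G_{\text{bind}} = \Delta G_{\text{solvent}} + \Delta G_{\text{conf}} + \Delta G_{\text{int}} + \Delta G_{\text{rot}} + \Delta G_{t/r} + \Delta G_{\text{vib}}$$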
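As a worked example of the additive form, the sketch below sums a set of contributions and converts the total into a dissociation constant with the standard relation ΔG = RT ln Kd (1 M standard state). The individual term values are invented for illustration and do not come from any measurement.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(terms):
    """Sum the individual contributions (kcal/mol) to get the binding free energy."""
    return sum(terms.values())


def dissociation_constant(delta_g, temperature=298.15):
    """Kd in mol/L from the binding free energy, using delta_g = RT ln(Kd)."""
    return math.exp(delta_g / (R_KCAL * temperature))


# Hypothetical contributions in kcal/mol.
terms = {
    "solvent": -3.0,
    "conf": 1.5,
    "int": -12.0,
    "rot": 2.0,
    "t/r": 3.0,
    "vib": -0.5,
}

delta_g = binding_free_energy(terms)
kd = dissociation_constant(delta_g)
print(delta_g, kd)
```

Because Kd depends exponentially on the free energy, an error of about 1.4 kcal/mol in any term changes the predicted affinity roughly tenfold at room temperature.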
Conclusions 
• AutoDock, DOCK, e-Hits, FlexX, FRED, Glide, GOLD, LigandFit, 
QXP, Surflex-Dock…among others 
• Different scoring functions, different search algorithms, different 
approaches 
• See Section 12.5 in DC Young, Computational Drug Design (Wiley 2009) 
for good overview of different packages 
http://www.slideshare.net/baoilleach/proteinligand-docking
Conclusions
• A wide range of virtual screening techniques has been developed
• The performance of different methods varies on different 
datasets 
• Increased complexity in descriptors and method does not 
necessarily lead to greater success. 
• Combining different approaches can lead to improved 
results. 
• Computational filters should be applied to remove 
undesirable compounds from further consideration.

Holdier Curriculum Vitae (April 2024).pdf
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
 

Virtual Screening in Drug Discovery

  • 1. VIRTUAL SCREENING IN DRUG DISCOVERY: Cheminformatics Approach. Abhik Seal, Indiana University Bloomington (dsdht.wikispaces.com)
  • 2. Challenge in Drug Discovery (diagram labels): biological space, 10^4 to 10^5; chemical space, > 10^20; “available” compounds; validated targets; computational screening, experimental screening.
  • 3. Choosing the right molecule • Goal: to find a lead compound that can be optimized to give a drug candidate − Optimization: using chemical synthesis to modify the lead molecule in order to improve its chances of being a successful drug. • The challenge: chemical space is vast − Estimates vary • Reymond et al. suggest there are ~1 billion compounds with up to 13 heavy atoms • There are ~65 million known compounds (for example in UniChem, PubChem) • A typical pharmaceutical compound collection contains ~1-5 million compounds • High-throughput screening (HTS) allows large numbers of compounds (up to 1 million) to be tested − But this is a very small proportion of the “available” compounds − Large-scale screening is expensive − Not all targets are suitable for HTS. Blum, L.C. & Reymond, J.-L. J. Am. Chem. Soc. 131, 8732-8733 (2009).
  • 4. Screening Schema in Drug Discovery (diagram labels): target disease, metabolic pathways, target protein, 3D structure; virtual library, database, combinatorial library; virtual screening, HTS, screening; leads, lead optimization. The basic goal of virtual screening is to reduce the enormous virtual chemical space to a manageable number of compounds that are likely to inhibit a target protein responsible for disease and that have the highest chance of leading to a drug candidate.
  • 5. Virtual Screening (diagram labels): ligand-based and structure-based; 3D structure unknown or known; actives known; actives and inactives known; similarity searching, pharmacophore mapping, machine learning, docking. Depending upon the structural and bioactivity data available: • One or more active molecules known: perform similarity searching. • Several actives known: try to identify a common 3D pharmacophore, then run a 3D database search. • A reasonable number of actives and inactives known: train a machine learning model. • 3D structure of the protein known: use protein-ligand docking.
  • 6. Hybrid Virtual Screening. In practice, people in the pharmaceutical industry rarely follow a single route; they combine the methods discussed on the previous slide. Workflow stages (diagram labels): starting database; prepared database; filter (Rule of 5, ADME, Tox); shape similarity; pharmacophore-based screening; structure-based pharmacophore; docking-based screening; post-processing; potential lead compounds. Tools named in the diagram: ROCS, FlexS; LUDI, LigandScout, Phase, DrugScore; LigandScout, Phase, LigandFit; DOCK, GOLD, Glide, ICM; CScore, MM/PBSA, solvation corrections. Cleaning molecules: remove isotopes, salts and mixtures; protonation and normalization; remove duplicates and invalid structures. Filtering molecules: user-defined or other filters; remove problematic moieties using PAINS, frequent-hitter filters etc.; physicochemical property descriptor calculation and filtering; apply protonation at pH 7.4.
  • 7. Drug-Like Properties. Drug-like properties are an integral element of drug discovery projects. Properties of interest to discovery scientists include the following: • Structural properties: hydrogen bonding, polar surface area, lipophilicity, shape, molecular weight, reactivity, pKa • Physicochemical properties: solubility, permeability, chemical stability • Biochemical properties: metabolism (Phase 1 and 2), protein and tissue binding, transport • Pharmacokinetics (PK) and toxicity: clearance, half-life, bioavailability, drug-drug interaction, LD50
  • 8. Leadlike & Druglike • Leadlike - Molecular weight (MW) = 200–350 (optimization might add 100–200) - clogP < 1.0–3.0 (optimization might increase it by 1–2 log units) - Single charge present (secondary or tertiary amine preferred) - Importantly, exclude chemically reactive functional groups, ‘promiscuous inhibitors’, ‘frequent hitters’ and warheads - Non-substrate peptides are suitable. • Druglike - Importantly, exclude chemically reactive functional groups, ‘promiscuous inhibitors’, ‘frequent hitters’ and warheads - MW < 500 - clogP < 5 - H-bond donors < 5 - Sum of N and O (H-bond acceptors) < 10 - Polar surface area < 140 Å² - Number of rotatable bonds <= 10
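The thresholds on slide 8 can be applied as a simple property filter. The following is a minimal sketch, assuming the descriptors have already been computed by some cheminformatics toolkit; the `Descriptors` container and both function names are hypothetical, and the lead-like check uses only the MW window and the upper clogP bound of 3.0, which is one possible reading of the slide.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties for one molecule (hypothetical container)."""
    mol_weight: float
    clogp: float
    h_donors: int
    h_acceptors: int           # sum of N and O atoms, as on the slide
    polar_surface_area: float  # square angstroms
    rotatable_bonds: int


def is_druglike(d: Descriptors) -> bool:
    """Apply the drug-like property thresholds listed on slide 8."""
    return (
        d.mol_weight < 500
        and d.clogp < 5
        and d.h_donors < 5
        and d.h_acceptors < 10
        and d.polar_surface_area < 140
        and d.rotatable_bonds <= 10
    )


def is_leadlike(d: Descriptors) -> bool:
    """MW window 200-350 and clogP up to 3.0 (assumed reading of the slide)."""
    return 200 <= d.mol_weight <= 350 and d.clogp <= 3.0


# Illustrative descriptor values, not measured data.
small = Descriptors(180.0, 1.2, 1, 4, 64.0, 3)
lead = Descriptors(310.0, 2.4, 2, 5, 80.0, 5)
big = Descriptors(720.0, 6.1, 6, 12, 190.0, 14)
```

The substructure-based exclusions (reactive groups, frequent hitters, warheads) are not covered here; they need a substructure search rather than a property threshold.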
  • 9. Filtering molecules using structural properties. Basic washing: • Removing salts and unwanted elements: filter out cationic atoms (Ca2+, Na+, etc.); filter out metals (Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, Y, Zr, Nb, Mo, Tc, Ru, Rh, Pd, Ag, Cd). Often the salt “filter” simply keeps the largest molecule in the SDF entry. • ALLOWED_ELEMENTS: H, C, N, O, F, P, S, Cl, Br, I • Check for proper atom types by adding hydrogens and checking that the O, N and C valences are correct. • Check the formal charge.
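The washing steps on slide 9 can be sketched as follows. This is a toy illustration under a strong assumption: each SDF entry is represented as a list of fragments, and each fragment as a list of element symbols. A real pipeline would parse structures with a cheminformatics toolkit and would also check valences and formal charges; the function names here are hypothetical.

```python
# Element whitelist taken from the slide.
ALLOWED_ELEMENTS = {"H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"}


def largest_fragment(fragments):
    """Salt 'filter': keep the fragment with the most atoms."""
    return max(fragments, key=len)


def passes_element_filter(atoms):
    """True if every atom is on the allowed-element list."""
    return all(atom in ALLOWED_ELEMENTS for atom in atoms)


def wash(fragments):
    """Return the washed parent fragment, or None if the entry is rejected."""
    if not fragments:
        return None
    parent = largest_fragment(fragments)
    return parent if passes_element_filter(parent) else None


# A sodium salt: the counter-ion is dropped and the organic part is kept.
salt_entry = [["C", "C", "O", "O"], ["Na"]]
# An organometallic entry: rejected because Fe is not an allowed element.
metal_entry = [["C", "Fe", "C"]]
```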
  • 10. Filter out reactives (false positives for proteins). Rishton, G.M. “Nonleadlikeness and leadlikeness in biochemical screening” Drug Discovery Today (2003) 8, 86-96
  • 11. Filter out synthesis intermediates, chelators and ‘warhead’ agents: functional groups that show high reactivity toward proteins, which contributes to the high attrition rate in drug development. Rishton, G.M. “Nonleadlikeness and leadlikeness in biochemical screening” Drug Discovery Today (2003) 8, 86-96
  • 12. PAINS Filter • PAINS = “Pan-Assay Interference Compounds” • Problematic scaffolds that have cost the authors' institute time and money. Baell, J.B. & Holloway, G.A. “New Substructure Filters for Removal of Pan Assay Interference Compounds (PAINS) from Screening Libraries and for Their Exclusion in Bioassays” J. Med. Chem. (2010) 53, 2719-2740
  • 13. Rules of Thumb for Hit Selection & Lead Optimization. Muchmore, S.W. et al. “Cheminformatic Tools for Medicinal Chemists” J. Med. Chem. (2010) 53, 4830-4841
  • 14. Similarity Searching: what is it? Two compounds are similar when their chemical, pharmacological or biological properties match. The more features two molecules have in common, the higher their similarity. Chemical: the two structures on top are chemically similar to each other. This is reflected in their common sub-graph, or scaffold: they share 14 atoms. Pharmacophore: the two structures above are less similar chemically (topologically) yet have the same pharmacological activity, namely they are both angiotensin-converting enzyme (ACE) inhibitors.
  • 15. What is required for a similarity search? • A database, SQL or NoSQL (Postgres, MySQL, MongoDB), or a flat file of descriptors, e.g. chemfp • A chemical cartridge to generate fingerprints (descriptors) for the molecules (RDKit, Open Babel) • A similarity function to calculate similarity (Jaccard, Dice, Tversky); this can be written in C, C++ or Python as a function inside the SQL database.
  • 16. 2D fingerprints: molecules represented as binary vectors • Each bit in the bit string (binary vector) represents one molecular fragment. Typical length is ~1000 bits • The bit string for a molecule records the presence (“1”) or absence (“0”) of each fragment in the molecule • Originally developed for speeding up substructure search − for a query substructure to be present in a database molecule, each bit set to “1” in the query must also be set to “1” in the database structure • Similarity is based on determining the number of bits that are common to two structures
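The substructure screening rule on slide 16 reduces to a single bitwise test. Below is a minimal sketch that stores a fingerprint as a Python integer; the fragment identifiers and the `fingerprint` helper are hypothetical stand-ins for whatever a real fingerprinting toolkit produces.

```python
def fingerprint(fragment_ids, n_bits=1024):
    """Set one bit per fragment identifier (folded into n_bits positions)."""
    fp = 0
    for frag in fragment_ids:
        fp |= 1 << (frag % n_bits)
    return fp


def may_contain(query_fp, db_fp):
    """Screening test: every bit set in the query is also set in the database entry."""
    return (query_fp & db_fp) == query_fp


query = fingerprint([3, 17])
candidate = fingerprint([3, 17, 40])
other = fingerprint([3, 40])
```

Passing the screen does not prove that the substructure is present, because different fragments can fold onto the same bit; it only rules out molecules that cannot match, so the expensive atom-by-atom comparison runs on fewer candidates.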
  • 17. Calculate molecular similarity. Fingerprints are sequences (vectors) of bits or numeric values that can be compared using distance functions or similarity metrics. T = Tanimoto index (Jaccard): $T(x, y) = \dfrac{B(x \,\&\, y)}{B(x) + B(y) - B(x \,\&\, y)}$, where $B(\cdot)$ counts the bits set to 1. E = Euclidean distance: $E(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$
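Both measures on slide 17 are short to write down. The sketch below assumes bit-string fingerprints stored as Python integers and numeric descriptors stored as equal-length lists; the `search` helper and its 0.7 threshold are illustrative choices, not values taken from the slides.

```python
import math


def popcount(fp):
    """B(x): number of bits set to 1."""
    return bin(fp).count("1")


def tanimoto(x, y):
    """T(x, y) = B(x & y) / (B(x) + B(y) - B(x & y))."""
    common = popcount(x & y)
    union = popcount(x) + popcount(y) - common
    return common / union if union else 0.0


def euclidean(x, y):
    """E(x, y) = sqrt(sum_i (x_i - y_i)^2) for equal-length numeric vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def search(query_fp, database, threshold=0.7):
    """Return (name, similarity) pairs at or above the threshold, best first."""
    scored = ((name, tanimoto(query_fp, fp)) for name, fp in database.items())
    hits = [hit for hit in scored if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


database = {"mol_a": 0b1110, "mol_b": 0b0001, "mol_c": 0b1111}
```

Calling `search(0b1110, database)` ranks `mol_a` first (identical fingerprint), keeps `mol_c` (three of four bits shared) and drops `mol_b` (no bits shared).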
  • 18. Don’t forget to check out the video of 2D similarity searching on a database: https://www.youtube.com/watch?v=VHjUWCJITmE
  • 19. 3D-based similarity • Shape-based: ROCS (Rapid Overlay of Chemical Structures), Shape-it (Silicos-it.com) • Computationally more expensive than 2D methods • Requires consideration of conformational flexibility − Rigid search: based on a single conformer − Flexible search • Conformations explored at search time • Ensemble of conformers generated prior to search time, with each conformer of each molecule considered in turn • How many conformers are required? Check the video on ROCS: https://www.youtube.com/watch?v=FEAUBlQWf0s
  • 20. Multiple actives known: pharmacophore searching • IUPAC definition: “An ensemble of steric and electronic features that is necessary to ensure the optimal supramolecular interactions with a specific biological target and to trigger (or block) its biological response” (figure labels: H, Aromatic, HBA, R, HBD) • In drug design, the term ‘pharmacophore’ refers to a set of features that is common to a series of active molecules • Hydrogen-bond donors and acceptors, positively and negatively charged groups, and hydrophobic regions are typical features. We will refer to such features as ‘pharmacophoric groups’.
  • 21. 3D pharmacophores • A three-dimensional pharmacophore specifies the spatial relationships between the groups • Expressed as distance ranges, angles and planes
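The distance-range part of a 3D pharmacophore can be checked directly against the feature coordinates of one conformer. The following is a minimal sketch, assuming each pharmacophoric group has already been located and reduced to a single 3D point; the labels, coordinates and ranges are made up for illustration, and angle and plane constraints are left out.

```python
import math


def matches_pharmacophore(features, constraints):
    """Check distance-range constraints for one conformer.

    features:    dict mapping a feature label to its (x, y, z) position
    constraints: dict mapping (label_a, label_b) to (min_dist, max_dist)
    """
    for (a, b), (low, high) in constraints.items():
        if a not in features or b not in features:
            return False  # a required pharmacophoric group is missing
        if not low <= math.dist(features[a], features[b]) <= high:
            return False  # the groups are too close or too far apart
    return True


# Hypothetical conformer and query, distances in angstroms.
conformer = {"HBD": (0.0, 0.0, 0.0), "HBA": (3.0, 4.0, 0.0), "AROM": (0.0, 0.0, 6.0)}
query = {("HBD", "HBA"): (4.5, 5.5), ("HBD", "AROM"): (5.0, 7.0)}
tight_query = {("HBD", "HBA"): (1.0, 2.0)}
```

In a flexible search this test would be repeated for every stored conformer of a molecule, and the molecule would count as a hit if any conformer passes.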
  • 22. Workflow of pharmacophore modeling: http://i571.wikispaces.com/
  • 23. Watch the video on pharmacophore searching: ligand-based pharmacophore screening using LigandScout, https://www.youtube.com/watch?v=gwAB_Xz9g94
  • 24. Tools to perform pharmacophore searching: 1) Catalyst (Accelrys) 2) Phase (Schrödinger) 3) LigandScout (Inte:Ligand) 4) PharmaGist 5) Pharmer 6) SHAFTS
  • 25. Many actives and inactives known: machine learning methods • SAR modeling • Use knowledge of known active and known inactive compounds to build a predictive model • Quantitative Structure-Activity Relationships (QSARs) − Long established (Hansch analysis, Free-Wilson analysis) − Generally restricted to small, homogeneous datasets, e.g. lead optimization • Structure-Activity Relationships (SARs) − “Activity” data is usually treated qualitatively − Can be used with data consisting of diverse structural classes and multiple binding modes − Some resistance to noisy data (HTS data) − Resulting models are used to prioritize compounds for lead finding (not to identify candidates or drugs)
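As one concrete example of a qualitative (active/inactive) model, the sketch below uses a k-nearest-neighbour vote over Tanimoto similarity. This is an illustrative choice rather than a method named on the slide; fingerprints are assumed to be Python integers and the training data is invented.

```python
def tanimoto(x, y):
    """Tanimoto similarity between two integer bit-string fingerprints."""
    common = bin(x & y).count("1")
    union = bin(x).count("1") + bin(y).count("1") - common
    return common / union if union else 0.0


def knn_predict(query_fp, training_set, k=3):
    """Predict activity by majority vote among the k most similar training molecules.

    training_set: list of (fingerprint, is_active) pairs
    """
    neighbours = sorted(
        training_set,
        key=lambda item: tanimoto(query_fp, item[0]),
        reverse=True,
    )[:k]
    votes = sum(1 for _, is_active in neighbours if is_active)
    return votes * 2 > len(neighbours)


# Invented training data: three actives and two inactives.
training = [
    (0b111100, True),
    (0b111000, True),
    (0b011100, True),
    (0b000011, False),
    (0b000111, False),
]
```

A model like this only ranks or flags compounds for follow-up, which matches the slide's point that such models prioritize compounds for lead finding.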
  • 26. Practical predictive modeling in cheminformatics. QSAR: https://www.youtube.com/watch?v=EHbxFMfU_3A Binary classification: https://www.youtube.com/watch?v=6Mund7C964I
  • 27. Protein-Ligand Docking: a computational method that mimics the binding of a ligand to a protein. It predicts a) the pose of the molecule in the binding site and b) the binding affinity, or a score representing the strength of binding.
  • 28. Pose and Binding Site • Binding site (or “active site”) - the part of the protein where the ligand binds - generally a cavity on the protein surface - can be identified by looking at the crystal structure of the protein bound to a known inhibitor • Pose (“binding mode”) - the geometry of the ligand in the binding site - geometry: location, orientation and conformation of the molecule. http://www.slideshare.net/baoilleach/proteinligand-docking
  • 29. Protein-Ligand Docking • How does a ligand (small molecule) bind into the active site of a protein? • Docking algorithms are based on two key components − search algorithm • to generate “poses” (conformation, position and orientation) of the ligand within the active site − scoring function • to identify the most likely pose for an individual ligand • to assign a priority order to a set of diverse ligands docked to the same protein, i.e. to estimate binding affinity
  • 30. The search space • The difficulty with protein–ligand docking is in part due to the fact that it involves many degrees of freedom − The translation and rotation of one molecule relative to another involves six degrees of freedom − These are in addition to the conformational degrees of freedom of both the ligand and the protein − The solvent may also play a significant role in determining the protein–ligand geometry (though it is often ignored) • The search algorithm generates poses, i.e. orientations of particular conformations of the molecule in the binding site − It tries to cover the search space, if not exhaustively, then as extensively as possible − There is a tradeoff between time and search space coverage. AR Leach, VJ Gillet, An Introduction to Cheminformatics
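The degrees of freedom listed on slide 30 can be written as a pose vector: three translations, three rotations and one angle per rotatable bond of the ligand, with the protein held rigid. The sketch below is a toy random search over that vector, assuming a user-supplied scoring function in which lower is better; it is not the algorithm of any particular docking program, and all names and defaults are hypothetical.

```python
import math
import random


def random_pose(n_torsions, box_half_width, rng):
    """Sample 3 translations, 3 rotation angles and n_torsions torsion angles."""
    translation = [rng.uniform(-box_half_width, box_half_width) for _ in range(3)]
    # Independent uniform angles do not sample rotations uniformly; good enough for a toy.
    rotation = [rng.uniform(-math.pi, math.pi) for _ in range(3)]
    torsions = [rng.uniform(-math.pi, math.pi) for _ in range(n_torsions)]
    return translation + rotation + torsions


def random_search(score, n_torsions, n_samples=1000, box_half_width=5.0, seed=0):
    """Keep the best-scoring pose out of n_samples random poses."""
    rng = random.Random(seed)
    best_pose, best_score = None, math.inf
    for _ in range(n_samples):
        pose = random_pose(n_torsions, box_half_width, rng)
        value = score(pose)
        if value < best_score:
            best_pose, best_score = pose, value
    return best_pose, best_score


def toy_score(pose):
    """Stand-in scoring function with its minimum at the all-zero pose."""
    return sum(v * v for v in pose)


quick_pose, quick_score = random_search(toy_score, n_torsions=2, n_samples=200)
long_pose, long_score = random_search(toy_score, n_torsions=2, n_samples=2000)
```

The two calls illustrate the tradeoff between time and coverage: with the same seed, the longer run visits every pose the shorter run did plus more, so its best score can only stay the same or improve.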
  • 31. Docking Algorithms • DOCK: first docking program, by Kuntz et al. 1982 − Based on shape complementarity and rigid ligands • Current algorithms − Fragment-based methods: FlexX, DOCK (since version 4.0) − Monte Carlo/simulated annealing: QXP (Flo), AutoDock, Affinity and LigandFit (Accelrys) − Genetic algorithms: GOLD, AutoDock (since version 3.0) − Systematic search: FRED (OpenEye), Glide (Schrödinger). R. D. Taylor et al. “A review of protein-small molecule docking methods”, J. Comput. Aid. Mol. Des. 2002, 16, 151-166.
  • 32. DOCK (Kuntz et al. 1982) • Rigid docking based on shape • A negative image of the cavity is constructed by filling it with spheres • Spheres are of varying size • Each touches the surface at two points • The centres of the spheres become potential locations for ligand atoms. AR Leach, VJ Gillet, An Introduction to Cheminformatics
  • 33. DOCK • Ligand atoms are matched to sphere centres so that the distances between atoms equal the distances between sphere centres. • The matches are used to position the ligand within the active site. • If there are no steric clashes, the ligand is scored. • Many different mappings (poses) are possible • Each pose is scored based on goodness of fit • The highest-scoring pose is presented to the user. AR Leach, VJ Gillet, An Introduction to Cheminformatics
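The matching condition on slide 33 can be tested for a proposed atom-to-sphere mapping by comparing every pair of distances. This is a minimal sketch of that consistency check only, not of the DOCK program itself; the 0.5 Å tolerance is an assumption (the slide says the distances are equal, and some tolerance is needed in practice), and the coordinates are invented.

```python
import math
from itertools import combinations


def is_consistent_match(atom_coords, sphere_centres, mapping, tolerance=0.5):
    """True if all atom-atom distances agree with the matched centre-centre distances.

    mapping: list of (atom_index, sphere_index) pairs
    """
    for (a1, s1), (a2, s2) in combinations(mapping, 2):
        d_atoms = math.dist(atom_coords[a1], atom_coords[a2])
        d_spheres = math.dist(sphere_centres[s1], sphere_centres[s2])
        if abs(d_atoms - d_spheres) > tolerance:
            return False
    return True


# Invented coordinates in angstroms.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
spheres = [(10.0, 10.0, 10.0), (11.5, 10.0, 10.0), (11.5, 11.5, 10.0), (20.0, 20.0, 20.0)]

good_mapping = [(0, 0), (1, 1), (2, 2)]
bad_mapping = [(0, 0), (1, 1), (2, 3)]
```

A consistent mapping defines the rigid-body transformation that places the ligand in the site; the clash check and scoring described on the slide would follow.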
  • 34. Energetics of protein-ligand binding a) Ligand-receptor binding is driven by • electrostatics (including hydrogen-bonding interactions) • dispersion or van der Waals forces • hydrophobic interactions • desolvation: surfaces buried between the protein and the ligand have to be desolvated • conformational changes to protein and ligand • the ligand must be properly oriented and translated to interact and form a complex • loss of entropy of the ligand due to being fixed in one conformation. b) Free energy of binding: $\Delta G_{bind} = \Delta G_{solvent} + \Delta G_{conf} + \Delta G_{int} + \Delta G_{rot} + \Delta G_{t/r} + \Delta G_{vib}$
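The additive form of the binding free energy is easy to evaluate once the individual terms are known. The sketch below sums the six terms and converts the result to a dissociation constant using the standard relation ΔG = RT ln Kd (1 M standard state), which is not on the slide; the term values are invented for illustration, in kcal/mol.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(terms):
    """Sum the contributions to the binding free energy (kcal/mol)."""
    return sum(terms.values())


def dissociation_constant(delta_g, temperature=298.15):
    """Kd in mol/L from delta_g = R * T * ln(Kd)."""
    return math.exp(delta_g / (R_KCAL * temperature))


# Invented contributions, not fitted or measured values.
terms = {
    "solvent": -4.0,
    "conf": 1.5,
    "int": -12.0,
    "rot": 2.0,
    "t/r": 3.0,
    "vib": -0.5,
}

delta_g_bind = binding_free_energy(terms)
kd = dissociation_constant(delta_g_bind)
```

With these numbers the favourable interaction and solvent terms outweigh the entropic penalties, giving a net ΔG of about -10 kcal/mol and a Kd in the tens-of-nanomolar range.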
  • 35. Conclusions • AutoDock, DOCK, e-Hits, FlexX, FRED, Glide, GOLD, LigandFit, QXP, Surflex-Dock, among others • Different scoring functions, different search algorithms, different approaches • See Section 12.5 in DC Young, Computational Drug Design (Wiley 2009) for a good overview of the different packages. http://www.slideshare.net/baoilleach/proteinligand-docking
  • 36. Conclusions • A wide range of virtual screening techniques has been developed • The performance of different methods varies across datasets • Increased complexity in descriptors and methods does not necessarily lead to greater success • Combining different approaches can lead to improved results • Computational filters should be applied to remove undesirable compounds from further consideration.
  • 37. References • Ripphausen et al. (2010) Quo vadis, virtual screening? A comprehensive review of prospective applications. Journal of Medicinal Chemistry, 53, 8461-8467. • Scior et al. (2012) Recognizing pitfalls in virtual screening: a critical review. Journal of Chemical Information and Modeling, 52, 867-881. • Sotriffer (Ed.) (2011) Virtual Screening: Principles, Challenges and Practical Guidelines. Wiley-VCH. • Varnek, A.; Baskin, I. (2012) Machine Learning Methods for Property Prediction in Chemoinformatics: Quo Vadis? Journal of Chemical Information and Modeling, 52, 1413-1437. • Hartenfeller, M.; Schneider, G. (2011) Enabling future drug discovery by de novo design. Wiley Interdisciplinary Reviews: Computational Molecular Science, 1, 742-759. • Riniker, S.; Landrum, G.A. (2013) Open-source platform to benchmark fingerprints for ligand-based virtual screening. Journal of Cheminformatics, 5:26.