SlideShare uma empresa Scribd logo
1 de 48
Baixar para ler offline
Reihaneh Safavi-Sohi, Jahan B Ghasemi
Drug Design in silico Lab
Chem Faculty, K. N. Toosi Univ of Tech
Tehran, Iran
History
Graph theory is a branch of mathematics which studies the
structure of graphs and networks.
Graph theory started in 1736, when Euler solved the
problem known as the Konigsberg bridges problem.
2
 Graph considers sets of objects, called nodes, and the
relationships, called edges, between pairs of these objects.
 A collection of dots that may or may not be connected to
each other by lines. It doesn't matter how big the dots are,
how long the lines are, or whether the lines are straight,
curved.
 Neighbor: The neighbors of a vertex are all the vertices
which are connected to that vertex by a single edge.
 The degree of a vertex in a graph is the number
of edges that touch it.
 The definition is completely general, allowing graphs to
be used in many different application domains as long as
an appropriate representation can be derived.
3
Seven Bridges of Königsberg
The problem was to find a walk through the city that would cross each bridge once and
only once. The islands could not be reached by any route other than the bridges, and
every bridge must have been crossed completely every time.
4
Another Examples Of Graphs
5
Isomorphism Algorithm
 Isomorphism procedures enable the comparison of pairs of
graphs .
 To detect whether two graphs, or parts of there, are
identical.
 Maximal common subgraph (MCS) isomorphism enables
the identification of the largest subgraph common to a pair
of graphs.
6
A subgraph isomorphism exists if all the nodes (atoms) of one graph (GQ)
can be mapped to a subset of the nodes of the other graph (GF) in such a
way that the edges (bonds) of GQ simultaneously map to a subset of the
edges in GF.
 The labels carried by the nodes and edges
(atom type and bond type, respectively) mustbe
identical if the nodes or edges are to be mapped to each other.
 Testing for subgraph isomorphism is an NP-complete problem.
 Trying every possible way of mapping each of the n nodes in GQ onto one
of the nF nodes in GF (nQ < nF);
Each of the nQ!/(nF - nQ)!nF! possible mappings must then be tested to see
if any of them obeys the adjacency condition.
 Even for very small graphs, the number of possible mappings rapidly
becomes unmanageable.
 The use of this approach has been limited due to its combinatorially
explosive nature and it has not been possible to apply the technique to
complex and large structural databases.
7
8
Molecular Similarity
Molecular similarity is a measure of the degree of overlap
between a pair of molecules in some property space.
 It can be calculated for a wide range of molecular properties.
 An important tool in the generation of quantitative structure-
activity relationships (QSAR's).
 QSAR aim to link systematically the chemical or biological
properties of molecules to their structures (rational drug
design)
9
Why similarity searching is important?
 Development of combinatorial chemistry and the ability to synthesise and to
assay vastly greater numbers of molecules than even a very few years ago
requires tools to rationalise the resulting structural and biological data.
 Simply increasing the throughput of the synthesis and test cycle in
itself does not necessarily lead to more high quality lead compounds.
 Many of the techniques that have been developed are based on the concept
of molecular similarity.
 Similarity methods have been used for many years and are typically applied
early in the drug discovery process when little is known about the biological
target.
10
The main aims and objectives
1. Similarity searching methods in chemical data
bases (Identification of new Cliques)
2. Exploring Binding Site Similarity (to find similarities
amongst various cavities)
3. Ligand Superposition (Alignment)
4. Clustering compounds
11
Similarity searching requirements
Definition of :
1. A chemistry space
2. Molecular descriptors 2D, 3D
3. A similarity index
12
2D descriptors (Physicochemical properties)
 Molecular weight
 Octanol - water partition coefficient
 Total energy
 Heat of formation
 Ionization potential
 Molar refractivity
 2D Fingerprints : record the presence or absence of molecular
fragments within a molecule.
13
2D Substructure Searching
Query
N
O
O
N N
O
O
O
O
N N
N
N
NN
N
N O
N
N
O
O
N
O
O
N
O
O
14
Problem
• Structurally similar molecules will tend to have the
same properties.
• Less effective at identifying compounds that have
similar activities but that do not share the same
structural skeleton.
 Similar Property Principle: Structurally similar compounds
will exhibit
similar physicochemical and biological properties
 For lead discovery want a diverse space to locate all possible
hits (actives) – called a diverse library
15
3D Similarity
 Distance-based and angle-based descriptors (e.g. inter-atomic distance)
 Field similarity
 Comparative Molecular Field Analysis (CoMFA), CoMSIA
 Electrostatic potential
 Shape
 Electron density
 Any grid-based structural property
 Shape descriptors
 van der Waals volume and surface (reflect the size of substituents)
 Molecular Shape Analysis
 WHIM descriptors (Weighted Holistic Invariant Molecular Descriptors)
 Receptor binding
16
3D Substructure Searching
O
O N
a
b
c
a = 8.62 0.58 Angstroms
b = 7.08 0.56 Angstroms
c = 3.35 0.65 Angstroms+-
+-
+-
O
O
O
O
OO
N
O
O
O
N
N
N O
O
O
O
O
O
N
N
N
N
S
O
O
O O P O
O
O P O
O P
O
O
O
O
N
N
N
N
N
O
O
O O
O
N
N
N
O
N
O
O
O
17
3D descriptors
 3D descriptors are highly selective than 2D descriptors
 Handling of conformational flexibility is a major problem !
18
Creativity is not about inventing something
totally new, it is about making new connections!
(Albert Einstein)
219
History
• A back-tracking graph algorithm for establishing a
mapping between two structures was first described
by Ray and Kirsch in 1957.
• The use of reduced graph representations of
individual structures and the creation of
hyperstructures encompassing all
the structures in a database
 was first described by Martin,Y.C.,PeterWillett
 University of Sheffield, UK,1990.
20
The HierarchicalTree Substructure Search (HTSS)
Each level in the hierarchical fragment tree is effectively part of an
hierarchical classification of all the atoms in the database as a
whole, initially by number of neighbors and atom type, and then
by bonding pattern and atom type of neighbors.
21
Similarity Searching using Reduced Graphs
Reduced graphs are topological graphs that
attempt to summarize the features of molecules
that can result in drug-receptor binding while
retaining the connections between the features.
potential to find structures that have the same binding
characteristics with different carbon skeletons and may
belong to different lead series.
22
 R ring systems
 Ac acyclic comp.
 C corbon
 H heteroatom
 Ar aromatic rings
 F functional group
23
24
25
 Many ways of computing the similarity between two
molecules
 Different representations
 Different similarity coefficients
26
2D Similarity Searching
OH
N
N
OH
N
H
N NH2
O
N
N
H
N
NH2
N
H
N
NH2
N
H
N
N
N
H
N
N
H
OH
Query
27
2D Substructure Searching
Query
N
O
O
N N
O
O
O
O
N N
N
N
NN
N
N O
N
N
O
O
N
O
O
N
O
O
28
Similarity indices
29
Similarity indices
30
31
Calculation of similarity
32
Enrichment Factor
EF =
na : Number of actives, in the top of 10% of the ranked
hit list
Na : Total number of actives
33
New Cliques Detection
 Discovery of novel bioactive molecules
34
Graph-Based Molecular Alignment
(GMA) was first carried out by J.Apostolakis
35
RMSD = 1.29
Comparison between aligned pose and crystal structure
RMSD= 0.59
Template is kept rigid, query molecule is flexible
By using torsion space optimization
36
Searching 3D Protein Structures
 Extensive collaboration between Information Studies and
Molecular Biology and Biotechnology to develop graph
representations of proteins that can be searched with
isomorphism algorithms analogous to those used for
chemical structures
 Focus here on folding motifs (secondary structure elements)
in proteins and protein amino acid side chains
37
Representation Of Protein Folding Motifs
 The helix and strand secondary structure elements (SSE)
are both approximately linear, repeating structures, which
can be represented by vectors drawn along their major axes
 The nodes of the graph are these vectors (individual amino
acid side-chains) and the edges comprise:
 The angle between a pair of vectors
 The distance of closest approach of the two vectors
 The distance between the vectors’ mid-points
 PROTEP compares such representation using a maximal
common subgraph isomorphism algorithm to identify
common folds
38
Secondary Structure Elements (SSE)
39
The majority of PDB files include lines labeled HELIX and
SHEET, detailing the residues involved in α-helices and β-
sheets within the protein.
 This information is used to label each residue in the vector
file with
 ‘h’ to indicate α-helix
 ‘s’ to indicate β-sheet
 ‘x’ for residues that are in neither of these.
 e.g. ASPh
 A minority of PDB files (about 7.5% in the search files)
do not have secondary structure information
40
Data Fusion
 Improved performance can be obtained in many
classification tasks by combing evidence from several
different sources
 Calculate the similarity between a user query and each of the
documents in a database
 Rank the documents in order of decreasing similarity
 Repeat using several different representations, coefficients,
etc.
 Add the rank positions for a given document to give an overall
fused rank position
 The resulting fused ranking is the output from the search
 Small, but consistent, improvements in performance over use
of a single ranking
41
M2
M41
M23
M3
M9
M18
.
.
.
.
M6
M89
M234
M13
M90
M180
.
.
.
.
M8
M130
M257
M16
M99
M198
.
.
.
.
sort
M8
M16
M99
M130
M198
M257
.
.
.
.
sum
Ar/F R/F Fused Data
42
Fusion Of Chemical Similarity Coefficients
 Searches were carried out using different
similarity coefficients, and the resulting rankings
fused to give rankings corresponding to all
combinations of 1, 2, 3…. 20, 21 coefficients
 The effectiveness of a combination was evaluated
by the number of actives in the 10% top positions
of the fused ranking
43
Advantages Data Fusion:
 Fusion of rankings can provide a small, but consistent,
improvement in the effectiveness of searching if an
appropriate combination of coefficients is chosen
 Data fusion of this sort provides a simple, but highly
cost-effective, way of enhancing existing systems for
chemical similarity searching
44
Advantages of using Reduced Graph
 Highly effective at finding compounds that share the same
activity but that are based on different lead series.
 High-throughput screening
 Method is capable of analysing large data sets and can
deal with noisy data.(High speed )
45
Structure is not the sole factor for biological
activity
 Interactions with
environment
 Solvation effects
 Metabolism
 Time dependence
 More...
46
References
 Searching Databases of Three-Dimensional Structures , Chapter 6, pp 213-263 Martin Y.C.,
Willett P.
 Applications of Graph Theory in Chemistry (J. Chem. InJ Compur. Sci. 1985, 25, 334-343)
 Substructure Searching Methods: Old and New (J. Chem. Inf. Comput. Sci. 1993, 33, 532-538)
 Similarity Searching Using Reduced Graphs(J. Chem. Inf. Comput. Sci. 2003, 43, 338-345)
 Searching for Patterns of Amino Acids in 3D Protein Structures (J. Chem. Inf. Comput. Sci.
2003, 43, 412-421)
 Graph-Based Molecular Alignment (GMA),( J. Chem. Inf. Model. 2007, 47, 591-601)
 Chemical Similarity Searching (J. Chem. Inf. Comput. Sci. 1998, 38, 983-996)
 Scaffold Hopping Using Clique Detection Applied to Reduced Graphs (J. Chem. Inf. Model.
2006, 46, 503-511)
 Further Development of Reduced Graphs for Identifying Bioactive Compounds (J. Chem. Inf.
Comput. Sci. 2003, 43, 346-356)
 Maximum common subgraph isomorphism algorithms for the matching of chemical structures
(J. Comp. Aided. Mol. Des, 16: 521–533, 2002)
 A Robust Clustering Method for Chemical Structures (J. Med. Chem. 2005, 48, 4358-4366)
 Automatic Identification of Molecular Similarity Using Reduced-Graph Representation of
Chemical Structure (J. Chem. Inf: Comput. Sci. 1992, 32, 639-643)
47
Thank You

Mais conteúdo relacionado

Mais procurados

Introduction to Graph Theory
Introduction to Graph TheoryIntroduction to Graph Theory
Introduction to Graph Theory Kazi Md. Saidul
 
CS6702 graph theory and applications notes pdf book
CS6702 graph theory and applications notes pdf bookCS6702 graph theory and applications notes pdf book
CS6702 graph theory and applications notes pdf bookappasami
 
Slides Chapter10.1 10.2
Slides Chapter10.1 10.2Slides Chapter10.1 10.2
Slides Chapter10.1 10.2showslidedump
 
Applications of graphs
Applications of graphsApplications of graphs
Applications of graphsTech_MX
 
Graph Application in Traffic Control
Graph Application in Traffic ControlGraph Application in Traffic Control
Graph Application in Traffic ControlMuhammadu Isa
 
Graph Theory Introduction
Graph Theory IntroductionGraph Theory Introduction
Graph Theory IntroductionMANISH T I
 
Isomorphism (Graph)
Isomorphism (Graph) Isomorphism (Graph)
Isomorphism (Graph) Pritam Shil
 
Cs6702 graph theory and applications 2 marks questions and answers
Cs6702 graph theory and applications 2 marks questions and answersCs6702 graph theory and applications 2 marks questions and answers
Cs6702 graph theory and applications 2 marks questions and answersappasami
 
PCA (Principal component analysis)
PCA (Principal component analysis)PCA (Principal component analysis)
PCA (Principal component analysis)Learnbay Datascience
 
Graph theory Application
Graph theory ApplicationGraph theory Application
Graph theory ApplicationSultan Mehmood
 
Discreet_Set Theory
Discreet_Set TheoryDiscreet_Set Theory
Discreet_Set Theoryguest68d714
 
A study on connectivity in graph theory june 18 pdf
A study on connectivity in graph theory  june 18 pdfA study on connectivity in graph theory  june 18 pdf
A study on connectivity in graph theory june 18 pdfaswathymaths
 

Mais procurados (20)

Introduction to Graph Theory
Introduction to Graph TheoryIntroduction to Graph Theory
Introduction to Graph Theory
 
Graph theory
Graph theory Graph theory
Graph theory
 
CS6702 graph theory and applications notes pdf book
CS6702 graph theory and applications notes pdf bookCS6702 graph theory and applications notes pdf book
CS6702 graph theory and applications notes pdf book
 
Graph theory
Graph  theoryGraph  theory
Graph theory
 
Bipartite graph
Bipartite graphBipartite graph
Bipartite graph
 
Graph theory
Graph theoryGraph theory
Graph theory
 
Slides Chapter10.1 10.2
Slides Chapter10.1 10.2Slides Chapter10.1 10.2
Slides Chapter10.1 10.2
 
Applications of graphs
Applications of graphsApplications of graphs
Applications of graphs
 
Graph Application in Traffic Control
Graph Application in Traffic ControlGraph Application in Traffic Control
Graph Application in Traffic Control
 
Graph Theory Introduction
Graph Theory IntroductionGraph Theory Introduction
Graph Theory Introduction
 
Number theory
Number theoryNumber theory
Number theory
 
demonstration lecture on Homology modeling
demonstration lecture on Homology modelingdemonstration lecture on Homology modeling
demonstration lecture on Homology modeling
 
Isomorphism (Graph)
Isomorphism (Graph) Isomorphism (Graph)
Isomorphism (Graph)
 
Clustering
ClusteringClustering
Clustering
 
Cs6702 graph theory and applications 2 marks questions and answers
Cs6702 graph theory and applications 2 marks questions and answersCs6702 graph theory and applications 2 marks questions and answers
Cs6702 graph theory and applications 2 marks questions and answers
 
PCA (Principal component analysis)
PCA (Principal component analysis)PCA (Principal component analysis)
PCA (Principal component analysis)
 
Bayes' theorem
Bayes' theoremBayes' theorem
Bayes' theorem
 
Graph theory Application
Graph theory ApplicationGraph theory Application
Graph theory Application
 
Discreet_Set Theory
Discreet_Set TheoryDiscreet_Set Theory
Discreet_Set Theory
 
A study on connectivity in graph theory june 18 pdf
A study on connectivity in graph theory  june 18 pdfA study on connectivity in graph theory  june 18 pdf
A study on connectivity in graph theory june 18 pdf
 

Semelhante a Graph Theory and Molecular Similarity Techniques

Disintegration of the small world property with increasing diversity of chemi...
Disintegration of the small world property with increasing diversity of chemi...Disintegration of the small world property with increasing diversity of chemi...
Disintegration of the small world property with increasing diversity of chemi...N. Sukumar
 
Gutell 119.plos_one_2017_7_e39383
Gutell 119.plos_one_2017_7_e39383Gutell 119.plos_one_2017_7_e39383
Gutell 119.plos_one_2017_7_e39383Robin Gutell
 
Gutell 075.jmb.2001.310.0735
Gutell 075.jmb.2001.310.0735Gutell 075.jmb.2001.310.0735
Gutell 075.jmb.2001.310.0735Robin Gutell
 
Quantative Structure-Activity Relationships (QSAR)
Quantative Structure-Activity Relationships (QSAR)Quantative Structure-Activity Relationships (QSAR)
Quantative Structure-Activity Relationships (QSAR)Atai Rabby
 
Use of eigenvalues and eigenvectors to analyze bipartivity of network graphs
Use of eigenvalues and eigenvectors to analyze bipartivity of network graphsUse of eigenvalues and eigenvectors to analyze bipartivity of network graphs
Use of eigenvalues and eigenvectors to analyze bipartivity of network graphscsandit
 
Gutell 061.nar.1997.25.01559
Gutell 061.nar.1997.25.01559Gutell 061.nar.1997.25.01559
Gutell 061.nar.1997.25.01559Robin Gutell
 
Coates p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...
Coates  p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...Coates  p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...
Coates p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...ArchiLab 7
 
IJBB-51-3-188-200
IJBB-51-3-188-200IJBB-51-3-188-200
IJBB-51-3-188-200sankar basu
 
Quantitative Structure Activity Relationship.pptx
Quantitative Structure Activity Relationship.pptxQuantitative Structure Activity Relationship.pptx
Quantitative Structure Activity Relationship.pptxRadhaChafle1
 
Urop poster 2014 benson and mat 5 13 14 (2)
Urop poster 2014 benson and mat 5 13 14 (2)Urop poster 2014 benson and mat 5 13 14 (2)
Urop poster 2014 benson and mat 5 13 14 (2)Mathew Shum
 
Cheminformatics
CheminformaticsCheminformatics
Cheminformaticsbaoilleach
 
Gutell 079.nar.2001.29.04724
Gutell 079.nar.2001.29.04724Gutell 079.nar.2001.29.04724
Gutell 079.nar.2001.29.04724Robin Gutell
 
Paper Explained: Understanding the wiring evolution in differentiable neural ...
Paper Explained: Understanding the wiring evolution in differentiable neural ...Paper Explained: Understanding the wiring evolution in differentiable neural ...
Paper Explained: Understanding the wiring evolution in differentiable neural ...Devansh16
 

Semelhante a Graph Theory and Molecular Similarity Techniques (20)

Disintegration of the small world property with increasing diversity of chemi...
Disintegration of the small world property with increasing diversity of chemi...Disintegration of the small world property with increasing diversity of chemi...
Disintegration of the small world property with increasing diversity of chemi...
 
Lanjutan kimed
Lanjutan kimedLanjutan kimed
Lanjutan kimed
 
Gutell 119.plos_one_2017_7_e39383
Gutell 119.plos_one_2017_7_e39383Gutell 119.plos_one_2017_7_e39383
Gutell 119.plos_one_2017_7_e39383
 
Qsar
QsarQsar
Qsar
 
acs.jpca.9b08723.pdf
acs.jpca.9b08723.pdfacs.jpca.9b08723.pdf
acs.jpca.9b08723.pdf
 
3 D QSAR Approaches and Contour Map Analysis
3 D QSAR Approaches and Contour Map Analysis3 D QSAR Approaches and Contour Map Analysis
3 D QSAR Approaches and Contour Map Analysis
 
Gutell 075.jmb.2001.310.0735
Gutell 075.jmb.2001.310.0735Gutell 075.jmb.2001.310.0735
Gutell 075.jmb.2001.310.0735
 
Quantative Structure-Activity Relationships (QSAR)
Quantative Structure-Activity Relationships (QSAR)Quantative Structure-Activity Relationships (QSAR)
Quantative Structure-Activity Relationships (QSAR)
 
Protein Threading
Protein ThreadingProtein Threading
Protein Threading
 
poster-lowe-6
poster-lowe-6poster-lowe-6
poster-lowe-6
 
A tutorial in Connectome Analysis (1) - Marcus Kaiser
A tutorial in Connectome Analysis (1) - Marcus KaiserA tutorial in Connectome Analysis (1) - Marcus Kaiser
A tutorial in Connectome Analysis (1) - Marcus Kaiser
 
Use of eigenvalues and eigenvectors to analyze bipartivity of network graphs
Use of eigenvalues and eigenvectors to analyze bipartivity of network graphsUse of eigenvalues and eigenvectors to analyze bipartivity of network graphs
Use of eigenvalues and eigenvectors to analyze bipartivity of network graphs
 
Gutell 061.nar.1997.25.01559
Gutell 061.nar.1997.25.01559Gutell 061.nar.1997.25.01559
Gutell 061.nar.1997.25.01559
 
Coates p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...
Coates  p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...Coates  p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...
Coates p 1999: exploring 3_d design worlds using lindenmeyer systems and gen...
 
IJBB-51-3-188-200
IJBB-51-3-188-200IJBB-51-3-188-200
IJBB-51-3-188-200
 
Quantitative Structure Activity Relationship.pptx
Quantitative Structure Activity Relationship.pptxQuantitative Structure Activity Relationship.pptx
Quantitative Structure Activity Relationship.pptx
 
Urop poster 2014 benson and mat 5 13 14 (2)
Urop poster 2014 benson and mat 5 13 14 (2)Urop poster 2014 benson and mat 5 13 14 (2)
Urop poster 2014 benson and mat 5 13 14 (2)
 
Cheminformatics
CheminformaticsCheminformatics
Cheminformatics
 
Gutell 079.nar.2001.29.04724
Gutell 079.nar.2001.29.04724Gutell 079.nar.2001.29.04724
Gutell 079.nar.2001.29.04724
 
Paper Explained: Understanding the wiring evolution in differentiable neural ...
Paper Explained: Understanding the wiring evolution in differentiable neural ...Paper Explained: Understanding the wiring evolution in differentiable neural ...
Paper Explained: Understanding the wiring evolution in differentiable neural ...
 

Último

Introduction of Organ-On-A-Chip - Creative Biolabs
Introduction of Organ-On-A-Chip - Creative BiolabsIntroduction of Organ-On-A-Chip - Creative Biolabs
Introduction of Organ-On-A-Chip - Creative BiolabsCreative-Biolabs
 
Q4-Mod-1c-Quiz-Projectile-333344444.pptx
Q4-Mod-1c-Quiz-Projectile-333344444.pptxQ4-Mod-1c-Quiz-Projectile-333344444.pptx
Q4-Mod-1c-Quiz-Projectile-333344444.pptxtuking87
 
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxEnvironmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxpriyankatabhane
 
BACTERIAL SECRETION SYSTEM by Dr. Chayanika Das
BACTERIAL SECRETION SYSTEM by Dr. Chayanika DasBACTERIAL SECRETION SYSTEM by Dr. Chayanika Das
BACTERIAL SECRETION SYSTEM by Dr. Chayanika DasChayanika Das
 
Environmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxEnvironmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxpriyankatabhane
 
The Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and FunctionThe Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and FunctionJadeNovelo1
 
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfKDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfGABYFIORELAMALPARTID1
 
6.2 Pests of Sesame_Identification_Binomics_Dr.UPR
6.2 Pests of Sesame_Identification_Binomics_Dr.UPR6.2 Pests of Sesame_Identification_Binomics_Dr.UPR
6.2 Pests of Sesame_Identification_Binomics_Dr.UPRPirithiRaju
 
Immunoblott technique for protein detection.ppt
Immunoblott technique for protein detection.pptImmunoblott technique for protein detection.ppt
Immunoblott technique for protein detection.pptAmirRaziq1
 
DNA isolation molecular biology practical.pptx
DNA isolation molecular biology practical.pptxDNA isolation molecular biology practical.pptx
DNA isolation molecular biology practical.pptxGiDMOh
 
Gas-ExchangeS-in-Plants-and-Animals.pptx
Gas-ExchangeS-in-Plants-and-Animals.pptxGas-ExchangeS-in-Plants-and-Animals.pptx
Gas-ExchangeS-in-Plants-and-Animals.pptxGiovaniTrinidad
 
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Sérgio Sacani
 
3.-Acknowledgment-Dedication-Abstract.docx
3.-Acknowledgment-Dedication-Abstract.docx3.-Acknowledgment-Dedication-Abstract.docx
3.-Acknowledgment-Dedication-Abstract.docxUlahVanessaBasa
 
Advances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of CancerAdvances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of CancerLuis Miguel Chong Chong
 
Telephone Traffic Engineering Online Lec
Telephone Traffic Engineering Online LecTelephone Traffic Engineering Online Lec
Telephone Traffic Engineering Online Lecfllcampolet
 
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests GlycosidesGLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests GlycosidesNandakishor Bhaurao Deshmukh
 
Observational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsObservational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsSérgio Sacani
 
FBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptxFBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptxPayal Shrivastava
 

Último (20)

Introduction of Organ-On-A-Chip - Creative Biolabs
Introduction of Organ-On-A-Chip - Creative BiolabsIntroduction of Organ-On-A-Chip - Creative Biolabs
Introduction of Organ-On-A-Chip - Creative Biolabs
 
Q4-Mod-1c-Quiz-Projectile-333344444.pptx
Q4-Mod-1c-Quiz-Projectile-333344444.pptxQ4-Mod-1c-Quiz-Projectile-333344444.pptx
Q4-Mod-1c-Quiz-Projectile-333344444.pptx
 
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxEnvironmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
 
BACTERIAL SECRETION SYSTEM by Dr. Chayanika Das
BACTERIAL SECRETION SYSTEM by Dr. Chayanika DasBACTERIAL SECRETION SYSTEM by Dr. Chayanika Das
BACTERIAL SECRETION SYSTEM by Dr. Chayanika Das
 
Ultrastructure and functions of Chloroplast.pptx
Ultrastructure and functions of Chloroplast.pptxUltrastructure and functions of Chloroplast.pptx
Ultrastructure and functions of Chloroplast.pptx
 
Environmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptxEnvironmental acoustics- noise criteria.pptx
Environmental acoustics- noise criteria.pptx
 
The Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and FunctionThe Sensory Organs, Anatomy and Function
The Sensory Organs, Anatomy and Function
 
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfKDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
 
6.2 Pests of Sesame_Identification_Binomics_Dr.UPR
6.2 Pests of Sesame_Identification_Binomics_Dr.UPR6.2 Pests of Sesame_Identification_Binomics_Dr.UPR
6.2 Pests of Sesame_Identification_Binomics_Dr.UPR
 
Immunoblott technique for protein detection.ppt
Immunoblott technique for protein detection.pptImmunoblott technique for protein detection.ppt
Immunoblott technique for protein detection.ppt
 
Introduction Classification Of Alkaloids
Introduction Classification Of AlkaloidsIntroduction Classification Of Alkaloids
Introduction Classification Of Alkaloids
 
DNA isolation molecular biology practical.pptx
DNA isolation molecular biology practical.pptxDNA isolation molecular biology practical.pptx
DNA isolation molecular biology practical.pptx
 
Gas-ExchangeS-in-Plants-and-Animals.pptx
Gas-ExchangeS-in-Plants-and-Animals.pptxGas-ExchangeS-in-Plants-and-Animals.pptx
Gas-ExchangeS-in-Plants-and-Animals.pptx
 
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
 
3.-Acknowledgment-Dedication-Abstract.docx
3.-Acknowledgment-Dedication-Abstract.docx3.-Acknowledgment-Dedication-Abstract.docx
3.-Acknowledgment-Dedication-Abstract.docx
 
Advances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of CancerAdvances in AI-driven Image Recognition for Early Detection of Cancer
Advances in AI-driven Image Recognition for Early Detection of Cancer
 
Telephone Traffic Engineering Online Lec
Telephone Traffic Engineering Online LecTelephone Traffic Engineering Online Lec
Telephone Traffic Engineering Online Lec
 
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests GlycosidesGLYCOSIDES Classification Of GLYCOSIDES  Chemical Tests Glycosides
GLYCOSIDES Classification Of GLYCOSIDES Chemical Tests Glycosides
 
Observational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive starsObservational constraints on mergers creating magnetism in massive stars
Observational constraints on mergers creating magnetism in massive stars
 
FBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptxFBI Profiling - Forensic Psychology.pptx
FBI Profiling - Forensic Psychology.pptx
 

Graph Theory and Molecular Similarity Techniques

  • 1. Reihaneh Safavi-Sohi, Jahan B Ghasemi Drug Design in silico Lab Chem Faculty, K. N. Toosi Univ of Tech Tehran, Iran
  • 2. History Graph theory is a branch of mathematics which studies the structure of graphs and networks. Graph theory started in 1736, when Euler solved the problem known as the Konigsberg bridges problem. 2
  • 3.  Graph considers sets of objects, called nodes, and the relationships, called edges, between pairs of these objects.  A collection of dots that may or may not be connected to each other by lines. It doesn't matter how big the dots are, how long the lines are, or whether the lines are straight, curved.  Neighbor: The neighbors of a vertex are all the vertices which are connected to that vertex by a single edge.  The degree of a vertex in a graph is the number of edges that touch it.  The definition is completely general, allowing graphs to be used in many different application domains as long as an appropriate representation can be derived. 3
  • 4. Seven Bridges of Königsberg The problem was to find a walk through the city that would cross each bridge once and only once. The islands could not be reached by any route other than the bridges, and every bridge must have been crossed completely every time. 4
  • 6. Isomorphism Algorithm  Isomorphism procedures enable the comparison of pairs of graphs .  To detect whether two graphs, or parts of there, are identical.  Maximal common subgraph (MCS) isomorphism enables the identification of the largest subgraph common to a pair of graphs. 6
  • 7. A subgraph isomorphism exists if all the nodes (atoms) of one graph (GQ) can be mapped to a subset of the nodes of the other graph (GF) in such a way that the edges (bonds) of GQ simultaneously map to a subset of the edges in GF.  The labels carried by the nodes and edges (atom type and bond type, respectively) mustbe identical if the nodes or edges are to be mapped to each other.  Testing for subgraph isomorphism is an NP-complete problem.  Trying every possible way of mapping each of the n nodes in GQ onto one of the nF nodes in GF (nQ < nF); Each of the nQ!/(nF - nQ)!nF! possible mappings must then be tested to see if any of them obeys the adjacency condition.  Even for very small graphs, the number of possible mappings rapidly becomes unmanageable.  The use of this approach has been limited due to its combinatorially explosive nature and it has not been possible to apply the technique to complex and large structural databases. 7
  • 8. 8
  • 9. Molecular Similarity Molecular similarity is a measure of the degree of overlap between a pair of molecules in some property space.  It can be calculated for a wide range of molecular properties.  An important tool in the generation of quantitative structure- activity relationships (QSAR's).  QSAR aim to link systematically the chemical or biological properties of molecules to their structures (rational drug design) 9
  • 10. Why similarity searching is important?  Development of combinatorial chemistry and the ability to synthesise and to assay vastly greater numbers of molecules than even a very few years ago requires tools to rationalise the resulting structural and biological data.  Simply increasing the throughput of the synthesis and test cycle in itself does not necessarily lead to more high quality lead compounds.  Many of the techniques that have been developed are based on the concept of molecular similarity.  Similarity methods have been used for many years and are typically applied early in the drug discovery process when little is known about the biological target. 10
  • 11. The main aims and objectives 1. Similarity searching methods in chemical data bases (Identification of new Cliques) 2. Exploring Binding Site Similarity (to find similarities amongst various cavities) 3. Ligand Superposition (Alignment) 4. Clustering compounds 11
  • 12. Similarity searching requirements Definition of : 1. A chemistry space 2. Molecular descriptors 2D, 3D 3. A similarity index 12
  • 13. 2D descriptors (Physicochemical properties)  Molecular weight  Octanol - water partition coefficient  Total energy  Heat of formation  Ionization potential  Molar refractivity  2D Fingerprints : record the presence or absence of molecular fragments within a molecule. 13
  • 14. 2D Substructure Searching Query N O O N N O O O O N N N N NN N N O N N O O N O O N O O 14
  • 15. Problem • Structurally similar molecules will tend to have the same properties. • Less effective at identifying compounds that have similar activities but that do not share the same structural skeleton.  Similar Property Principle: Structurally similar compounds will exhibit similar physicochemical and biological properties  For lead discovery want a diverse space to locate all possible hits (actives) – called a diverse library 15
  • 16. 3D Similarity  Distance-based and angle-based descriptors (e.g. inter-atomic distance)  Field similarity  Comparative Molecular Field Analysis (CoMFA), CoMSIA  Electrostatic potential  Shape  Electron density  Any grid-based structural property  Shape descriptors  van der Waals volume and surface (reflect the size of substituents)  Molecular Shape Analysis  WHIM descriptors (Weighted Holistic Invariant Molecular Descriptors)  Receptor binding 16
  • 17. 3D Substructure Searching O O N a b c a = 8.62 0.58 Angstroms b = 7.08 0.56 Angstroms c = 3.35 0.65 Angstroms+- +- +- O O O O OO N O O O N N N O O O O O O N N N N S O O O O P O O O P O O P O O O O N N N N N O O O O O N N N O N O O O 17
  • 18. 3D descriptors  3D descriptors are highly selective than 2D descriptors  Handling of conformational flexibility is a major problem ! 18
  • 19. Creativity is not about inventing something totally new, it is about making new connections! (Albert Einstein) 219
  • 20. History • A back-tracking graph algorithm for establishing a mapping between two structures was first described by Ray and Kirsch in 1957. • The use of reduced graph representations of individual structures and the creation of hyperstructures encompassing all the structures in a database  was first described by Martin,Y.C.,PeterWillett  University of Sheffield, UK,1990. 20
  • 21. The HierarchicalTree Substructure Search (HTSS) Each level in the hierarchical fragment tree is effectively part of an hierarchical classification of all the atoms in the database as a whole, initially by number of neighbors and atom type, and then by bonding pattern and atom type of neighbors. 21
  • 22. Similarity Searching using Reduced Graphs Reduced graphs are topological graphs that attempt to summarize the features of molecules that can result in drug-receptor binding while retaining the connections between the features. potential to find structures that have the same binding characteristics with different carbon skeletons and may belong to different lead series. 22
  • 23.  R ring systems  Ac acyclic comp.  C corbon  H heteroatom  Ar aromatic rings  F functional group 23
  • 24. 24
  • 25. 25
  • 26.  Many ways of computing the similarity between two molecules  Different representations  Different similarity coefficients 26
  • 27. 2D Similarity Searching OH N N OH N H N NH2 O N N H N NH2 N H N NH2 N H N N N H N N H OH Query 27
  • 28. 2D Substructure Searching Query N O O N N O O O O N N N N NN N N O N N O O N O O N O O 28
  • 31. 31
  • 33. Enrichment Factor EF = na : Number of actives, in the top of 10% of the ranked hit list Na : Total number of actives 33
  • 34. New Cliques Detection  Discovery of novel bioactive molecules 34
  • 35. Graph-Based Molecular Alignment (GMA) was first carried out by J.Apostolakis 35
  • 36. RMSD = 1.29 Comparison between aligned pose and crystal structure RMSD= 0.59 Template is kept rigid, query molecule is flexible By using torsion space optimization 36
  • 37. Searching 3D Protein Structures  Extensive collaboration between Information Studies and Molecular Biology and Biotechnology to develop graph representations of proteins that can be searched with isomorphism algorithms analogous to those used for chemical structures  Focus here on folding motifs (secondary structure elements) in proteins and protein amino acid side chains 37
  • 38. Representation Of Protein Folding Motifs  The helix and strand secondary structure elements (SSE) are both approximately linear, repeating structures, which can be represented by vectors drawn along their major axes  The nodes of the graph are these vectors (individual amino acid side-chains) and the edges comprise:  The angle between a pair of vectors  The distance of closest approach of the two vectors  The distance between the vectors’ mid-points  PROTEP compares such representation using a maximal common subgraph isomorphism algorithm to identify common folds 38
  • 40. The majority of PDB files include lines labeled HELIX and SHEET, detailing the residues involved in α-helices and β- sheets within the protein.  This information is used to label each residue in the vector file with  ‘h’ to indicate α-helix  ‘s’ to indicate β-sheet  ‘x’ for residues that are in neither of these.  e.g. ASPh  A minority of PDB files (about 7.5% in the search files) do not have secondary structure information 40
  • 41. Data Fusion  Improved performance can be obtained in many classification tasks by combing evidence from several different sources  Calculate the similarity between a user query and each of the documents in a database  Rank the documents in order of decreasing similarity  Repeat using several different representations, coefficients, etc.  Add the rank positions for a given document to give an overall fused rank position  The resulting fused ranking is the output from the search  Small, but consistent, improvements in performance over use of a single ranking 41
  • 43. Fusion Of Chemical Similarity Coefficients  Searches were carried out using different similarity coefficients, and the resulting rankings fused to give rankings corresponding to all combinations of 1, 2, 3…. 20, 21 coefficients  The effectiveness of a combination was evaluated by the number of actives in the 10% top positions of the fused ranking 43
  • 44. Advantages Data Fusion:  Fusion of rankings can provide a small, but consistent, improvement in the effectiveness of searching if an appropriate combination of coefficients is chosen  Data fusion of this sort provides a simple, but highly cost-effective, way of enhancing existing systems for chemical similarity searching 44
  • 45. Advantages of using Reduced Graph  Highly effective at finding compounds that share the same activity but that are based on different lead series.  High-throughput screening  Method is capable of analysing large data sets and can deal with noisy data.(High speed ) 45
  • 46. Structure is not the sole factor for biological activity  Interactions with environment  Solvation effects  Metabolism  Time dependence  More... 46
  • 47. References  Searching Databases of Three-Dimensional Structures , Chapter 6, pp 213-263 Martin Y.C., Willett P.  Applications of Graph Theory in Chemistry (J. Chem. InJ Compur. Sci. 1985, 25, 334-343)  Substructure Searching Methods: Old and New (J. Chem. Inf. Comput. Sci. 1993, 33, 532-538)  Similarity Searching Using Reduced Graphs(J. Chem. Inf. Comput. Sci. 2003, 43, 338-345)  Searching for Patterns of Amino Acids in 3D Protein Structures (J. Chem. Inf. Comput. Sci. 2003, 43, 412-421)  Graph-Based Molecular Alignment (GMA),( J. Chem. Inf. Model. 2007, 47, 591-601)  Chemical Similarity Searching (J. Chem. Inf. Comput. Sci. 1998, 38, 983-996)  Scaffold Hopping Using Clique Detection Applied to Reduced Graphs (J. Chem. Inf. Model. 2006, 46, 503-511)  Further Development of Reduced Graphs for Identifying Bioactive Compounds (J. Chem. Inf. Comput. Sci. 2003, 43, 346-356)  Maximum common subgraph isomorphism algorithms for the matching of chemical structures (J. Comp. Aided. Mol. Des, 16: 521–533, 2002)  A Robust Clustering Method for Chemical Structures (J. Med. Chem. 2005, 48, 4358-4366)  Automatic Identification of Molecular Similarity Using Reduced-Graph Representation of Chemical Structure (J. Chem. Inf: Comput. Sci. 1992, 32, 639-643) 47