Molecular Activity Prediction Using Graph Convolutional Deep Neural Network Considering Distance on a Molecular Graph
Int’l Workshop on Mathematical Modeling and Problem Solving (MPS)
2019 Int’l Conference on Parallel and Distributed Processing Techniques & Applications (PDPTA’19)
Session 2. July 29, 2019 @Luxor, Las Vegas
https://americancse.org/events/csce2019/program/pdp_csc_ipc_msv_gcc_29
1. Molecular Activity Prediction Using Graph Convolutional Deep Neural Network Considering Distance on a Molecular Graph
Masahito Ohue, Ryota Ii, Keisuke Yanagisawa, Yutaka Akiyama
Department of Computer Science, School of Computing,
Tokyo Institute of Technology, JAPAN
2. Agenda
• Introduction
– Computer-Aided Drug Discovery
– Graph Convolutional Deep Neural Network
– Weave Module
• Proposed Method
A) Modify atom distance in ring structures
B) Modify convolution of pair features
C) Modify assembling pair features
• Computational Experiments
• Conclusion
4. Computer-Aided Drug Discovery (CADD)
• Drug discovery and development
– Takes more than 10 years and costs more than 2 billion US dollars
– Computational approaches may reduce these costs
• Activity prediction, toxicity prediction, molecular property prediction
– Machine learning is a powerful tool for CADD
Paul SM, et al. Nat Rev Drug Discov. 2010, 9(3):203.
5. Graph Convolutional Network (GCN)
Traditional approach
molecule → (convert) → molecular vector (fingerprint, descriptor) → (input) → machine learning model (SVM, Random Forest, LightGBM, …)
Graph convolutional network (GCN) approach
molecule → (convert) → molecular graph → (input) → convolutional neural network (CNN)
[Figure: example molecule with C, O, N, S, and Br atoms drawn as a molecular graph; label: feature vector]
The GCN approach represents a molecule as a graph;
atoms → nodes, bonds → edges
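The atoms → nodes, bonds → edges mapping can be illustrated with a minimal sketch. The function name `build_molecular_graph`, the tuple layout for bonds, and the hand-written ethanol example are assumptions made for illustration; the slides do not show how the graph is actually stored.

```python
from collections import defaultdict


def build_molecular_graph(atoms, bonds):
    """Return (node labels, adjacency) for a molecule.

    atoms: list of element symbols, indexed by atom id (the nodes).
    bonds: list of (i, j, bond_type) tuples (the edges).
    """
    adjacency = defaultdict(dict)
    for i, j, bond_type in bonds:
        # Bonds are undirected, so store the edge in both directions.
        adjacency[i][j] = bond_type
        adjacency[j][i] = bond_type
    # Atoms without bonds still get an (empty) neighbor table.
    return list(atoms), {i: dict(adjacency[i]) for i in range(len(atoms))}


# Hypothetical example: ethanol with hydrogens omitted (C-C-O).
ETHANOL_ATOMS = ["C", "C", "O"]
ETHANOL_BONDS = [(0, 1, "single"), (1, 2, "single")]
nodes, edges = build_molecular_graph(ETHANOL_ATOMS, ETHANOL_BONDS)
```

Here `edges[1]` lists both neighbors of the central carbon together with their bond types, which is the information a graph convolution over 1-neighbors would read.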
6. Related Work
a) Neural graph fingerprints [Duvenaud+2015]
・Generate a molecular fingerprint with a neural network
・Update atom features using only adjacent atoms
・Use different weights for different node degrees
b) GCN by Altae-Tran [Altae-Tran+2017]
・Update atom features by convolutional and pooling layers
using only adjacent atoms
・Properties of edges (bonds) are not considered
・Atoms other than 1-neighbors are not considered
[Duvenaud+2015] Duvenaud DK, et al. In Proc NIPS, 2215-2223, 2015.
[Altae-Tran+2017] Altae-Tran H, et al. ACS Central Science, 3: 283–293, 2017.
7. Related Work
c) Weave module [Kearnes+2016]
The Weave module considers not only atoms but also atom pairs
▶ Information of distant atoms can be considered (not just 1-neighbors)
The difference in distance between atom pairs
was not considered
[Figure: Weave architecture. Atom features and pair features pass through Weave layer 0, Weave layer 1, …, Weave layer k, followed by a fully connected layer and softmax that output y]
[Kearnes+2016] Kearnes S, et al. J Comput-Aided Mol Des, 30(8): 593-608, 2016.
8. Distances on Molecular Graph
We counted all atom-atom distances in 3 datasets from MoleculeNet [Wu+2018]
[Figure: histograms for the HIV, MUV, and PCBA datasets; x-axis: distance (1 to 9), y-axis: count]
The distribution of interatomic distances is not uniform
▶ It is necessary to consider the difference of atom-atom distance
[Wu+2018] Wu Z, et al. Chem Sci, 9: 513–530, 2018.
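Counting atom-atom distances on a molecular graph amounts to a shortest-path search from every atom. The sketch below uses breadth-first search on a hand-written benzene ring; the function names and the adjacency-list format are assumptions, and this is not the authors' counting script.

```python
from collections import Counter, deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: graph distance from source to every reachable atom."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def count_pair_distances(adjacency):
    """Histogram of graph distances over all unordered atom pairs."""
    counts = Counter()
    for i in adjacency:
        for j, d in shortest_path_lengths(adjacency, i).items():
            if i < j:  # count each unordered pair once
                counts[d] += 1
    return counts


# Hypothetical example: the six carbons of a benzene ring.
BENZENE = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
benzene_counts = count_pair_distances(BENZENE)
```

For the benzene ring this gives six pairs at distance 1, six at distance 2, and three at distance 3; summing such counters over every molecule in a dataset yields a histogram of the kind shown on this slide.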
9. Graph vs. 3D Structure
[Figure: the same molecule shown as a molecular graph and as a 3D structure]
The distance on the graph does not necessarily correlate with
the Euclidean distance between atoms in the 3D structure
▶ The definition of graph distance needs to be modified
10. Purpose of This Study
Improve the Weave module considering the distance
between atom pairs on the molecular graph
Operation in Weave module | Ordinary Weave module | This study
(A) Generation of pair features | Use ordinary graph distance on molecular graph | Prop. A: Correction of distances related to atoms in ring structures
(B) Convolution of pair features | Use the same weight regardless of the distance on the graph | Prop. B: Convolution of pair features with different weights
(C) Assembly of pair features | Add in uniformly | Prop. C: Use weighted sum based on distance
14. Initial Features
Initial atom vector (atom feature input to Weave layer 0)
– atom type (1-hot or NULL): C, N, O, F, P, S, Cl, Br, I, metal
– chiral: R, S
– charge: FC, PC
– ring size (1-hot or NULL): R=3, R=4, R=5, R=6, R=7, R=8
– hybridization: sp, sp², sp³
– H-bond: HBA, HBD
– arom
Initial atom pair vector (pair feature input to Weave layer 0)
– bond type (1-hot or NULL): 1, 2, 3, arom
– shortest path length ≦ d: d=1, d=2, d=3, d=4, d=5, d=6, d=7
– same ring?: yes/no
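The initial atom pair vector can be sketched as a concatenation of the three groups named on this slide. The element order, the bond-type labels, and the function name `initial_pair_vector` are assumptions for illustration; only the groups themselves (bond type, shortest path length ≦ d for d = 1 to 7, same ring) come from the slide.

```python
# Assumed labels for the four bond-type bits (1, 2, 3, arom on the slide).
BOND_TYPES = ["single", "double", "triple", "aromatic"]
MAX_DISTANCE = 7  # the slide lists d = 1 ... 7


def initial_pair_vector(bond_type, path_length, same_ring):
    """Build a 12-element pair vector: 4 bond bits + 7 distance bits + 1 ring bit.

    bond_type: one of BOND_TYPES, or None when the atoms are not bonded (NULL).
    path_length: shortest path length between the two atoms on the graph.
    same_ring: True if both atoms belong to the same ring.
    """
    bond_bits = [1 if bond_type == b else 0 for b in BOND_TYPES]
    # Bit d is set when the shortest path length is at most d.
    distance_bits = [1 if path_length <= d else 0
                     for d in range(1, MAX_DISTANCE + 1)]
    return bond_bits + distance_bits + [1 if same_ring else 0]


# Two hypothetical pairs in a benzene ring: bonded neighbors and a para pair.
neighbor_pair = initial_pair_vector("aromatic", 1, True)
para_pair = initial_pair_vector(None, 3, True)
```

Note that the distance bits are cumulative rather than 1-hot, so a pair at distance 3 sets the bits for d = 3 through d = 7.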
15. Pair→Atom Transform Operation (convolution)
[Figure: in Weave layer k, the pair→atom convolution computes the output for atom i from the input pair features using a weight, a bias vector, and an activation function]
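The equation on this slide did not survive extraction, so the following is only a hedged sketch of a pair→atom transform in the spirit of the Weave module: each pair feature is passed through a shared affine map and an activation, and the results are summed over the partner atoms j. The ReLU activation, the array shapes, and the function name `pair_to_atom` are assumptions.

```python
import numpy as np


def pair_to_atom(pair_features, weight, bias):
    """Sketch of a pair->atom transform with one weight shared by all pairs.

    pair_features: array of shape (n_atoms, n_atoms, pair_dim)
    weight:        array of shape (pair_dim, out_dim)
    bias:          array of shape (out_dim,)
    returns:       array of shape (n_atoms, out_dim)
    """
    # Affine map + ReLU applied to every (i, j) pair feature.
    hidden = np.maximum(pair_features @ weight + bias, 0.0)
    # Sum over the partner atoms j to get one vector per atom i.
    return hidden.sum(axis=1)


# Tiny deterministic example: 2 atoms, pair_dim = 3, out_dim = 2.
example_out = pair_to_atom(np.ones((2, 2, 3)), np.ones((3, 2)), np.zeros(2))
```

Because the same `weight` is applied to every pair, a pair at graph distance 1 and a pair at graph distance 7 are treated identically, which is the behavior the proposed method sets out to change.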
16. Proposed Method
A. Correction of distances related to atoms in ring structures
B. Convolution of pair features with different weights
C. Reweighting pair features by their distance
17. A. Correction of distances related to atoms in ring structures
The ring structure is relatively rigid in terms of the actual
molecular conformation compared to the chain structure.
▶ We shortened the graph distance between atoms on a ring.
At the ortho position and meta position → dist = 1
At the para position → dist = 2
[Figure: examples of the ortho, meta, and para positions on a ring]
The graph distance becomes closer to the trend of atom-atom distance
in the 3D structure
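The correction for a six-membered ring can be written as a small lookup. The sketch below covers only pairs of atoms within the same six-membered ring; how pairs that merely pass through a ring are handled is not stated on the slide, and the names `RING_CORRECTION` and `corrected_distance` are hypothetical.

```python
# Graph distance within a six-membered ring -> corrected distance (from the slide):
# ortho (1 bond apart) -> 1, meta (2 bonds apart) -> 1, para (3 bonds apart) -> 2.
RING_CORRECTION = {1: 1, 2: 1, 3: 2}


def corrected_distance(graph_distance, in_same_six_ring):
    """Return the corrected distance for one atom pair.

    Pairs outside a shared six-membered ring keep their ordinary graph
    distance (an assumption; the slide gives only the ring cases).
    """
    if in_same_six_ring and graph_distance in RING_CORRECTION:
        return RING_CORRECTION[graph_distance]
    return graph_distance


ortho = corrected_distance(1, True)
meta = corrected_distance(2, True)
para = corrected_distance(3, True)
chain = corrected_distance(3, False)
```

A three-bond separation along a chain stays at 3, while the para position on a ring becomes 2, reflecting that ring atoms sit closer together in 3D than the bond count suggests.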
18. B. Convolution of pair features with different weights
Weave module
Pair features were convolved using
the same weight, regardless of the
distance.
This study
Pair features were convolved using
weights that depend on the distance
of the atom pair.
▶ Distances of atom pairs can be considered in the convolution process
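One way to realize a convolution with distance-dependent weights is to keep one weight matrix per distance bucket and pick the matrix by the graph distance of each pair. This is a sketch under assumptions: the bucket layout, the ReLU activation, the shared bias, and the function name `pair_to_atom_by_distance` are not taken from the slides.

```python
import numpy as np


def pair_to_atom_by_distance(pair_features, distances, weights, bias):
    """Pair->atom transform with a separate weight matrix per distance bucket.

    pair_features: (n_atoms, n_atoms, pair_dim)
    distances:     (n_atoms, n_atoms) integer bucket index for each pair
    weights:       (n_buckets, pair_dim, out_dim)
    bias:          (out_dim,)
    returns:       (n_atoms, out_dim)
    """
    # Select the weight matrix for each pair: shape (n, n, pair_dim, out_dim).
    per_pair_weights = weights[distances]
    hidden = np.einsum("ijp,ijpo->ijo", pair_features, per_pair_weights) + bias
    # ReLU, then sum over the partner atoms j.
    return np.maximum(hidden, 0.0).sum(axis=1)


# Three atoms in a chain; bucket index = graph distance (0, 1, or 2).
chain_distances = np.array([[0, 1, 2], [1, 0, 1], [2, 1, 0]])
# One 1x1 weight per bucket with values 0, 1, 2 so the effect is easy to read.
bucket_weights = np.array([[[0.0]], [[1.0]], [[2.0]]])
chain_out = pair_to_atom_by_distance(
    np.ones((3, 3, 1)), chain_distances, bucket_weights, np.zeros(1)
)
```

With identical pair features everywhere, the outputs differ only because each pair is multiplied by the weight of its own distance bucket; setting all buckets to the same matrix recovers the shared-weight case.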
19. C. Reweighting pair features by their distance
Distant atom pairs are less important than nearby atom pairs
▶ Represented by a weight that decreases as the distance increases
▶ The relative importance of distant and nearby atoms can be changed
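The results slide for Prop. C names three weighting schemes (step, linear, quadratic) without giving their formulas. The functional forms, the `cutoff` and `d_max` parameters, and the helper `reweighted_sum` below are therefore hypothetical; they only illustrate weights that do not increase with distance.

```python
def step_weight(d, cutoff=2):
    """Hypothetical step scheme: full weight up to a cutoff, zero beyond it."""
    return 1.0 if d <= cutoff else 0.0


def linear_weight(d, d_max=7):
    """Hypothetical linear scheme: 1 at d = 1, falling to 0 at d = d_max + 1."""
    return max(0.0, 1.0 - (d - 1) / d_max)


def quadratic_weight(d, d_max=7):
    """Hypothetical quadratic scheme: the square of the linear weight."""
    return linear_weight(d, d_max) ** 2


def reweighted_sum(pair_values, distances, weight_fn):
    """Assemble pair contributions as a distance-weighted sum."""
    return sum(weight_fn(d) * v for v, d in zip(pair_values, distances))


# Two contributions of equal size, one from a near pair and one from a far pair.
near_far = reweighted_sum([1.0, 1.0], [1, 3], step_weight)
```

The ordinary Weave module corresponds to a weight of 1 for every pair; the step scheme discards far pairs outright, while the linear and quadratic schemes keep them with reduced influence.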
21. Dataset
• We used benchmark datasets for molecular activity
prediction from MoleculeNet [Wu+2018]
– Hydrogen atoms were omitted
– Molecules whose number of heavy atoms exceeded the
maximum number of atoms, n_max (= 60), were excluded
[Wu+2018] Wu Z, et al. Chem Sci, 9: 513–530, 2018.
23. Prediction Scheme
[Figure: the dataset is split into training, validation, and test sets; a prediction model is trained and used to predict]
For each task, the epoch that gave the best
AUC value on the validation data was selected.
The prediction model at the selected epoch was then applied to the test data.
Results were evaluated by
AUC (area under the ROC curve)
Perfect prediction: 1.0
Random prediction: 0.5
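The evaluation scheme can be sketched in a few lines: AUC is the probability that a randomly chosen positive sample is scored above a randomly chosen negative one, and the epoch with the highest validation AUC is the one applied to the test data. The function names and the example numbers are illustrative assumptions, not the authors' code or results.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def select_best_epoch(validation_aucs):
    """Index of the epoch with the highest validation AUC (first one on ties)."""
    return max(range(len(validation_aucs)), key=lambda e: validation_aucs[e])


perfect = roc_auc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
uninformative = roc_auc([0, 1], [0.5, 0.5])
best_epoch = select_best_epoch([0.70, 0.78, 0.75])
```

The pairwise form is quadratic in the number of samples, so it is meant only to make the definition concrete; a rank-based implementation would be preferable for large datasets.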
25. Performance of Props. A and B
AUC | Weave | Prop. A | Prop. B | Prop. A&B
HIV | 0.801 | 0.803 | 0.806 | 0.806
MUV | 0.743 | 0.783 | 0.738 | 0.760
PCBA | 0.824 | 0.825 | 0.823 | 0.823
(Prop. A, Prop. B, and Prop. A&B are the proposed methods)
Prop. A improves accuracy on HIV and MUV, and shows particularly good performance on MUV.
Prop. B is slightly more accurate on HIV.
Performance on PCBA did not change ← Depending on the size of the dataset?
26. Performance of Prop. C
AUC | Weave | step | linear | quadratic
HIV | 0.801 | 0.772 | 0.807 | 0.803
MUV | 0.743 | 0.721 | 0.749 | 0.752
(step, linear, and quadratic are the Prop. C weighting schemes)
Performance improved with the linear and quadratic weights.
As with Props. A and B, the improvement on MUV is remarkable.
27. Why did accuracy improve so much on MUV?
• MUV is an unbalanced dataset with extremely few positive
samples.
• Considering the actual drug discovery hit rate,
MUV is closest to the real activity prediction problem.
▶ The improvements in this study may be better suited to
real-world data in the field of drug discovery.
28. Why did Prop. B not perform well?
We confirmed how the weight matrices changed as the learning progressed
by using the Frobenius norm.
[Figure: Frobenius norm vs. epoch (0 to 100) for the 0th Weave layer and the 1st Weave layer; curves for Weave and for dist 0, dist 1, dist 2, dist 3, dist 4, dist 5, dist ∞]
▶ It is possible to improve the model
performance by using different weights
The slopes are almost the same
▶ It may not be necessary to use different weights
in the 1st layer
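The Frobenius norm used for this analysis is the square root of the sum of squared matrix entries. A minimal sketch of tracking it per epoch is shown below; the snapshot list and its values are made-up placeholders, not the weights from the experiment.

```python
import math


def frobenius_norm(matrix):
    """Square root of the sum of squared entries of a 2D matrix (list of rows)."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def norm_history(weight_snapshots):
    """Frobenius norm of a weight matrix at each saved epoch."""
    return [frobenius_norm(w) for w in weight_snapshots]


# Placeholder snapshots of one weight matrix at three epochs.
snapshots = [
    [[3.0, 4.0], [0.0, 0.0]],
    [[6.0, 8.0], [0.0, 0.0]],
    [[6.0, 8.0], [0.0, 0.0]],
]
history = norm_history(snapshots)
```

Plotting one such history per distance-specific weight matrix gives curves of the kind compared on this slide; matrices whose norms evolve with nearly the same slope are behaving alike during training.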
30. Summary
This study targeted the Weave module in the activity prediction problem
[Figure: Weave architecture (atom features and pair features through Weave layer 0, Weave layer 1, …, Weave layer k, then a fully connected layer and softmax giving y), with the modified operations marked A and B, C]
We modified these Weave operations
A. Correction of distances related to atoms in ring structures
B. Convolution of pair features with different weights
C. Reweighting pair features by their distance
31. Summary
A. Correction of the distance on the graph
in the ring structures of the compound
– Prediction accuracy improved compared to Weave
– Pair features between distant atoms were also used
effectively
B. Convolution of pair features with
different weights for different distances
– A more generalized model by using different weights
in the convolution process
– Accuracy was slightly higher than that of Weave
– More effective on the 0th Weave layer
C. Reweighting pair features by their distance
– Using linear and quadratic weights is effective
32. Future Work
• The Weave transform operation is complicated, so it may not be
possible to achieve a drastic improvement in accuracy
simply by improving the pair→atom transform.
Other operations can also be improved by utilizing distance
information.
• It is worthwhile to verify that these improvements apply to
other supervised learning tasks on compounds,
e.g. side-effect prediction, toxicity prediction, stability prediction