MolGAN: An implicit generative
model for small molecular graphs
N. De Cao and T. Kipf
(Informatics Institute, University of Amsterdam)
ICML Deep Generative Models Workshop (2018)
arXiv:1805.11973
Gpat Journal Club 2018.10.12, Ryohei Suzuki
Research Summary
• Automatic generation of drug-like small molecules
• Generative Adversarial Net + Graph Neural Network
+ Reinforcement Learning
• Optimization of biochemical properties (e.g., solubility)
→ first step toward in-silico screening by ML
※ It does not aim to design drugs for specific purposes
About the authors
T. Kipf (Ph.D. cand.)
• https://tkipf.github.io/
• Supervisor: Max Welling (ML)
- Supervisor of D. Kingma (author of Adam, VAE, etc.)
- Pupil of G. 't Hooft (quantum gravity, string theory; 1999 (electro-weak))
N. De Cao (Ph.D. cand.)
• https://nicola-decao.github.io/
• Supervisor: Ivan Titov (NLP)
[Figure: citation count]
Drug design / drug discovery (DD)
Properties required for drugs
• Useful bioactivity
• Controllable side effect
• Synthesizability
• Having effect after metabolism (cf. drug delivery)
Vast time and monetary cost of animal/human experiments
→ in-silico screening using computers
Screening by simulation
Case of a targeted drug:
1. Structure determination of the target protein
2. Selection of the target site
3. Static affinity prediction
4. Dynamic binding simulation (MD)
→ days to weeks of computation time per molecule
Example: Gefitinib and mutated EGFR (non-small-cell lung cancer)
Why is drug design difficult?
1. Very large and high-dimensional search space
- over 60,000 permutations for only 10 C/N/O atoms
- only a very small share of atomic permutations gives a valid structure
2. Discrete optimization of molecular structure
- continuous/gradual optimization is not possible
3. A slight change in structure results in large effects
- COH and COOH are completely different
Why is drug design difficult?
4. No appropriate data structure for molecular structure
5. Predicting biochemical properties is essentially difficult
- Even QM/MM has limitations; wet experiments are necessary
[Figure: image, SMILES representation (CN1CCC[C@H]1c2cccnc2), and 3D structure (important for proteins)]
Will ML solve the problems?
1. Very large and high-dimensional search space
→ Generative models (e.g. GAN) can
effectively represent complex/high-dimensional data
2. Discrete optimization of molecular structure
→ Goal of this study is just rough screening
(not fine-tuning of specific drugs)
3. Slight change in structure results in large effects
→ Pinpoint affinity prediction can be difficult for ML.
ML is better suited to predicting general properties like solubility
Will ML solve the problems?
4. No appropriate data structure for molecular structure
→ Graph representation
+ Graph convolutional neural network
5. Predicting biochemical properties is essentially difficult
→ ML will not solve this fundamental problem.
Improved simulation methods are also needed
Problem definition
Generating molecular structures without a specific use in mind
• Generated molecules are evaluated by:
1. Druglikeness (QED: Bickerton et al., 2012)
2. Synthesizability (Synthetic Accessibility: Ertl & Schuffenhauer, 2009)
3. Solubility (logP: Comer & Tam, 2001)
• Methods are evaluated by:
1. Validity = valid structures / output structures
2. Novelty = ratio of valid structures not included in the training dataset
3. Uniqueness = unique valid molecules / total valid molecules
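The three method-level metrics above can be sketched in a few lines. This is only an illustration under stated assumptions: molecules are plain strings, and `is_valid` is a hypothetical predicate standing in for a real chemistry toolkit check; the function name `evaluate_generator` is also made up for this example.

```python
def evaluate_generator(generated, training_set, is_valid):
    """Return (validity, novelty, uniqueness) for a batch of generated structures."""
    valid = [m for m in generated if is_valid(m)]
    validity = len(valid) / len(generated) if generated else 0.0
    if not valid:
        return validity, 0.0, 0.0
    # Novelty: share of valid structures that do not appear in the training data.
    novelty = sum(m not in training_set for m in valid) / len(valid)
    # Uniqueness: share of distinct structures among the valid ones.
    uniqueness = len(set(valid)) / len(valid)
    return validity, novelty, uniqueness


# Toy data: strings stand in for canonical molecule encodings.
generated = ["CCO", "CCO", "CCN", "C(", "CC"]
training = {"CC", "CCC"}
# Hypothetical validity rule for the toy data: an unclosed "(" is invalid.
scores = evaluate_generator(generated, training, lambda m: "(" not in m)
```

A generator that repeats one valid molecule scores high on validity but low on uniqueness, which is why the metrics are reported together.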
Overview
• Generator: transforms noise into a structure
• Discriminator: judges whether a generated structure is valid or not
• Reward network: predicts the properties of molecular structures
Goal: obtaining a generator that can output
valid molecular structures with good properties
Revisiting neural networks
https://towardsdatascience.com/how-to-build-your-own-neural-network-from-scratch-in-python-68998a08e4f6
1. Input an image or some value
2. Apply multiple transformations
3. Output a value (regression) or a category (classification)
4. Calculate the “loss” value
5. Refine the transformation parameters to improve the loss value (back-propagation)
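The five steps can be made concrete with the smallest possible "network", a single linear unit trained by gradient descent. This is a sketch, not taken from the slides: the model, the mean-squared-error loss, and the names `train_linear`, `lr`, and `steps` are all illustrative assumptions.

```python
def train_linear(xs, ys, lr=0.1, steps=500):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        preds = [w * x + b for x in xs]             # steps 1-3: input, transform, output
        errs = [p - y for p, y in zip(preds, ys)]   # step 4: loss = mean(err^2)
        # Gradients of the loss with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errs, xs)) / n
        grad_b = 2 * sum(errs) / n
        w -= lr * grad_w                            # step 5: refine the parameters
        b -= lr * grad_b
    loss = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
    return w, b, loss


# Toy data generated from y = 2x + 1.
w, b, loss = train_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

In a deep network the gradients are obtained by back-propagation through many layers rather than written out by hand, but the update loop has the same shape.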
Generative models
• classification: judge whether an image shows a cat or a dog
• regression: predict f(0.5) from f(0) and f(1)
• generation: generate data distributed like the training data
https://blog.openai.com/generative-models/
Challenge:
How to calculate a “loss” value for training a model
to generate a distribution like that of the given dataset?
Generative Adversarial Net (GAN)
“Cat-and-mouse game between a counterfeiter and the police”
• generator: generates data that resembles the dataset samples as closely as possible
• discriminator: distinguishes real from fake data as precisely as possible
→ train the two modules alternately,
without calculating the actual distribution
→ danger of mode collapse
https://towardsdatascience.com/generative-adversarial-networks-explained-34472718707a
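The two opposing objectives can be written down directly. The sketch below uses the classic cross-entropy GAN losses as an assumption for illustration; MolGAN itself uses the WGAN-GP loss, as noted later in the slides. The function names and the toy discriminator outputs are hypothetical.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss: real samples should score 1, generated samples 0."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)


def generator_loss(d_fake):
    """Non-saturating loss: the generator wants its samples scored as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs (probability that a sample is real).
confused = discriminator_loss([0.5, 0.5], [0.5, 0.5])  # cannot tell real from fake
sharp = discriminator_loss([0.9, 0.9], [0.1, 0.1])     # discriminator is winning
```

Alternating training means taking a gradient step on `discriminator_loss` with the generator frozen, then a step on `generator_loss` with the discriminator frozen.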
Power of GANs
e.g., BigGAN (Brock et al., 2018)
[Figures: generated images; continuous morphing of the input noise]
A continuous change of the noise gives a semantically continuous
change of the image
= the model has learned a useful representation
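Morphing of this kind is usually produced by walking along a line between two noise vectors and decoding every point. A minimal sketch of the interpolation step, assuming simple linear interpolation; the trained generator that would consume these vectors is not shown, and the name `interpolate` is illustrative.

```python
def interpolate(z_start, z_end, steps):
    """Return `steps` noise vectors evenly spaced from z_start to z_end (steps >= 2)."""
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # runs from 0.0 to 1.0
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


# Each vector in `path` would be fed to a trained generator (not part of this sketch).
path = interpolate([0.0, 1.0], [1.0, -1.0], steps=5)
```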
Molecular structure representation
Image: human-interpretable, but inefficient
SMILES: rich information, but the syntax is too strict
3D: very rich information, but large data size and an invariance problem
[Figure: 2D image, SMILES (CN1CCC[C@H]1c2cccnc2), and 3D structure of the same molecule]
Graph and molecular structure
Graph: a network structure consisting of nodes V and edges E
Node=atom / Edge=bond → Graph = molecule
https://ja.wikipedia.org/wiki/%E9%9A%A3%E6%8E%A5%E8%A1%8C%E5%88%97
[Figure: a simple graph and its adjacency matrix; a molecule's node matrix and adjacency tensor]
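To make the node matrix and adjacency tensor concrete, here is a small encoder sketch. The atom and bond vocabularies, the bond-type-first layout `A[bond_type][i][j]`, and the name `encode_molecule` are assumptions chosen for illustration, not the exact encoding used in the paper.

```python
ATOM_TYPES = ["C", "N", "O"]        # node classes (illustrative choice)
BOND_TYPES = ["single", "double"]   # edge classes (illustrative choice)


def encode_molecule(atoms, bonds):
    """Return (X, A): one-hot node matrix and adjacency tensor A[bond_type][i][j]."""
    n = len(atoms)
    X = [[1 if a == t else 0 for t in ATOM_TYPES] for a in atoms]
    A = [[[0] * n for _ in range(n)] for _ in BOND_TYPES]
    for i, j, kind in bonds:
        k = BOND_TYPES.index(kind)
        A[k][i][j] = A[k][j][i] = 1  # bonds are undirected, so each slice is symmetric
    return X, A


# Heavy-atom skeleton C-C=O (hydrogens omitted).
X, A = encode_molecule(["C", "C", "O"], [(0, 1, "single"), (1, 2, "double")])
```

Each row of `X` is one atom, and each slice of `A` is an ordinary adjacency matrix for one bond type.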
2D-convolution for images
https://developer.nvidia.com/discover/convolutional-neural-network
Convolution: applying filters across an entire image
http://timdettmers.com/2015/03/26/convolution-deep-learning/
Convolutional Neural Network
Extract abstract information of images
by repeated 2D-convolutions
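A single filter pass can be sketched in plain Python. This assumes no padding and stride 1, and computes the cross-correlation form that CNN layers typically use; the name `conv2d` and the toy image are illustrative.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_filter = [[-1, 1]]  # responds where intensity increases from left to right
response = conv2d(image, edge_filter)
```

The response is non-zero only at the column where the dark and bright halves meet, which is the sense in which a filter extracts a feature.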
Graph convolution (Kipf&Welling ICLR2017)
Convolution can also be defined for graphs!
http://tkipf.github.io/misc/SlidesCambridge.pdf
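One layer of the graph convolution of Kipf & Welling can be sketched as follows, using the propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W) with dense Python lists. This handles a single edge type only; the MolGAN discriminator uses a relational variant that distinguishes bond types, which is not reproduced here. The name `gcn_layer` and the toy graph are illustrative.

```python
import math


def gcn_layer(A, H, W):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(A)
    # Add self-loops so that every node keeps its own features.
    A_hat = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A_hat]
    # Symmetric degree normalization.
    norm = [[A_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbor features, then apply the weights and ReLU.
    agg = [[sum(norm[i][k] * H[k][f] for k in range(n)) for f in range(len(H[0]))]
           for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * W[f][o] for f in range(len(W))))
             for o in range(len(W[0]))]
            for i in range(n)]


# Path graph 0-1-2; only node 0 carries a feature.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
H = [[1.0], [0.0], [0.0]]
out = gcn_layer(A, H, [[1.0]])
```

After one layer the feature has spread from node 0 to its neighbor node 1 but not yet to node 2; stacking layers widens the neighborhood each node can see.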
Reinforcement Learning
A learning framework, e.g., for robot movement
An action taken in an environment yields
a reward reflecting how good the action was
ex) walking into a hole results in Mario's death
Optimize the policy to maximize the reward
ex) jump when a hole is in front of Mario
https://en.wikipedia.org/wiki/Reinforcement_learning
RL for Molecular Design
Action: generation of a molecule
Environment/Reward: biochemical evaluation of the molecule
Policy: generative model
[Figure: external software scores a generated molecule (e.g., druglikeness: 0.9, synthesizability: 0.1, solubility: 0.3, …) and the scores are fed back]
Design of MolGAN (1) GAN
• The generator (Gen) directly outputs a graph as an adjacency matrix
• Gen is an MLP
• The discriminator (Dis) judges the validity of a molecule
• Dis is a graph convolutional network
• WGAN-GP* loss
*Please refer to the material from Fukuta-san's lecture
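For readers without the lecture material, the WGAN-GP discriminator objective is commonly stated as below. The symbols are not from the slides and are assumptions of this note: D_φ is the discriminator, G_θ the generator, x a real sample, z a noise vector, α the penalty weight, and x̂ a random point on the line between a real and a generated sample.

```latex
L(x, G_\theta(z); \phi) =
  - D_\phi(x) + D_\phi(G_\theta(z))
  + \alpha \left( \lVert \nabla_{\hat{x}} D_\phi(\hat{x}) \rVert - 1 \right)^2,
\qquad
\hat{x} = \epsilon\, x + (1 - \epsilon)\, G_\theta(z), \quad \epsilon \sim U(0, 1)
```

The first two terms push the discriminator to score real samples above generated ones; the last term penalizes gradient norms that move away from 1.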
Design of MolGAN (2) RL
Deep deterministic policy gradient
• The reward network mimics an external program that evaluates molecules
• The reward network has the same structure as Dis
• Reward loss = output of the reward network
• Blend the GAN loss and the reward loss
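The blending step can be sketched as a convex combination of the two losses. The weight name `lam`, the function name, and the toy loss values are illustrative assumptions; sign conventions for the reward term are left to the caller.

```python
def blended_generator_loss(gan_loss, reward_loss, lam):
    """Convex combination: lam = 1 is pure GAN training, lam = 0 is pure reward."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * gan_loss + (1.0 - lam) * reward_loss


# Hypothetical loss values for one training batch.
mixed = blended_generator_loss(gan_loss=2.0, reward_loss=4.0, lam=0.25)
```

Sweeping `lam` between 0 and 1 is one way to run the kind of loss-balance experiment reported in Exp.1.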
Examples of generated molecules
※numbers: druglikeness (QED score)
Exp.1: balance of GAN/reward loss
Evaluate generated molecules while changing the loss balance
Result: only the reward loss is necessary
Exp.2: comparison with other methods
• Validity:
Others: 85-95%
MolGAN: 98-100%
• Uniqueness:
Others: 10-70%
MolGAN: 2%
• Time consumption:
1/10 to 1/2 of that of the other methods
Exp.2: comparison with other methods
• druglikeness
• synthesizability
• solubility
Higher score than other methods
for all the properties
Discussion
Pros
• Very high (~100%) ratio of valid output structures
• GraphNN + RL is effective for biochemical optimization
• Light computational cost, fast learning
Cons / Future work
• Mode collapse = the same structure is generated repeatedly
→ could normalization techniques (e.g., spectral norm) help?
• Fixed atom count

Mais conteúdo relacionado

Mais procurados

Protien Structure Prediction
Protien Structure PredictionProtien Structure Prediction
Protien Structure PredictionSelimReza76
 
Computational Protein Design. 2. Computational Protein Design Techniques
Computational Protein Design. 2. Computational Protein Design TechniquesComputational Protein Design. 2. Computational Protein Design Techniques
Computational Protein Design. 2. Computational Protein Design TechniquesPablo Carbonell
 
IMMUNOINFORMATICS , MICROARRAY and Machine Learning - All about Immunology, I...
IMMUNOINFORMATICS , MICROARRAY and Machine Learning - All about Immunology, I...IMMUNOINFORMATICS , MICROARRAY and Machine Learning - All about Immunology, I...
IMMUNOINFORMATICS , MICROARRAY and Machine Learning - All about Immunology, I...Mekhla Diwan
 
An Introduction to Chemoinformatics for the postgraduate students of Agriculture
An Introduction to Chemoinformatics for the postgraduate students of AgricultureAn Introduction to Chemoinformatics for the postgraduate students of Agriculture
An Introduction to Chemoinformatics for the postgraduate students of AgricultureDevakumar Jain
 
System biology and its tools
System biology and its toolsSystem biology and its tools
System biology and its toolsGaurav Diwakar
 
methods for protein structure prediction
methods for protein structure predictionmethods for protein structure prediction
methods for protein structure predictionkaramveer prajapat
 
System's Biology
System's Biology System's Biology
System's Biology Pritam Shil
 
Molecular Mechanics in Molecular Modeling
Molecular Mechanics in Molecular ModelingMolecular Mechanics in Molecular Modeling
Molecular Mechanics in Molecular ModelingAkshay Kank
 
The Basic of Molecular Dynamics Simulation
The Basic of Molecular Dynamics SimulationThe Basic of Molecular Dynamics Simulation
The Basic of Molecular Dynamics SimulationSyed Lokman
 
Structural bioinformatics.
Structural bioinformatics.Structural bioinformatics.
Structural bioinformatics.SALIHAMUGHAL
 
Homology modeling of proteins (ppt)
Homology modeling of proteins (ppt)Homology modeling of proteins (ppt)
Homology modeling of proteins (ppt)Melvin Alex
 
Structure based drug design
Structure based drug designStructure based drug design
Structure based drug designADAM S
 
Computer Aided Molecular Modeling
Computer Aided Molecular ModelingComputer Aided Molecular Modeling
Computer Aided Molecular Modelingpkchoudhury
 
Chemo informatics scope and applications
Chemo informatics scope and applicationsChemo informatics scope and applications
Chemo informatics scope and applicationsshyam I
 
Molecular dynamics and Simulations
Molecular dynamics and SimulationsMolecular dynamics and Simulations
Molecular dynamics and SimulationsAbhilash Kannan
 

Mais procurados (20)

Virus Sequence Alignment and Phylogenetic Analysis 2019
Virus Sequence Alignment and Phylogenetic Analysis 2019Virus Sequence Alignment and Phylogenetic Analysis 2019
Virus Sequence Alignment and Phylogenetic Analysis 2019
 
Homology modelling
Homology modellingHomology modelling
Homology modelling
 
Protien Structure Prediction
Protien Structure PredictionProtien Structure Prediction
Protien Structure Prediction
 
Computational Protein Design. 2. Computational Protein Design Techniques
Computational Protein Design. 2. Computational Protein Design TechniquesComputational Protein Design. 2. Computational Protein Design Techniques
Computational Protein Design. 2. Computational Protein Design Techniques
 
IMMUNOINFORMATICS , MICROARRAY and Machine Learning - All about Immunology, I...
IMMUNOINFORMATICS , MICROARRAY and Machine Learning - All about Immunology, I...IMMUNOINFORMATICS , MICROARRAY and Machine Learning - All about Immunology, I...
IMMUNOINFORMATICS , MICROARRAY and Machine Learning - All about Immunology, I...
 
An Introduction to Chemoinformatics for the postgraduate students of Agriculture
An Introduction to Chemoinformatics for the postgraduate students of AgricultureAn Introduction to Chemoinformatics for the postgraduate students of Agriculture
An Introduction to Chemoinformatics for the postgraduate students of Agriculture
 
System biology and its tools
System biology and its toolsSystem biology and its tools
System biology and its tools
 
methods for protein structure prediction
methods for protein structure predictionmethods for protein structure prediction
methods for protein structure prediction
 
Kegg database resources
Kegg database resourcesKegg database resources
Kegg database resources
 
System's Biology
System's Biology System's Biology
System's Biology
 
Pubchem
PubchemPubchem
Pubchem
 
Molecular Mechanics in Molecular Modeling
Molecular Mechanics in Molecular ModelingMolecular Mechanics in Molecular Modeling
Molecular Mechanics in Molecular Modeling
 
The Basic of Molecular Dynamics Simulation
The Basic of Molecular Dynamics SimulationThe Basic of Molecular Dynamics Simulation
The Basic of Molecular Dynamics Simulation
 
Structural bioinformatics.
Structural bioinformatics.Structural bioinformatics.
Structural bioinformatics.
 
Homology modeling of proteins (ppt)
Homology modeling of proteins (ppt)Homology modeling of proteins (ppt)
Homology modeling of proteins (ppt)
 
Cheminformatics-1.ppt
Cheminformatics-1.pptCheminformatics-1.ppt
Cheminformatics-1.ppt
 
Structure based drug design
Structure based drug designStructure based drug design
Structure based drug design
 
Computer Aided Molecular Modeling
Computer Aided Molecular ModelingComputer Aided Molecular Modeling
Computer Aided Molecular Modeling
 
Chemo informatics scope and applications
Chemo informatics scope and applicationsChemo informatics scope and applications
Chemo informatics scope and applications
 
Molecular dynamics and Simulations
Molecular dynamics and SimulationsMolecular dynamics and Simulations
Molecular dynamics and Simulations
 

Semelhante a Report: "MolGAN: An implicit generative model for small molecular graphs"

SBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resourcesSBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resourcesMike Hucka
 
Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!
Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!
Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!Sri Ambati
 
Computational Approaches to Systems Biology
Computational Approaches to Systems BiologyComputational Approaches to Systems Biology
Computational Approaches to Systems BiologyMike Hucka
 
Predicting Molecular Properties
Predicting Molecular PropertiesPredicting Molecular Properties
Predicting Molecular PropertiesYassin Youssfi
 
Deep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryDeep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryKenta Oono
 
Machine learning ppt unit one syllabuspptx
Machine learning ppt unit one syllabuspptxMachine learning ppt unit one syllabuspptx
Machine learning ppt unit one syllabuspptxVenkateswaraBabuRavi
 
Interactive Machine Learning Appendix
Interactive  Machine Learning AppendixInteractive  Machine Learning Appendix
Interactive Machine Learning AppendixZitao Liu
 
Web Information Extraction Learning based on Probabilistic Graphical Models
Web Information Extraction Learning based on Probabilistic Graphical ModelsWeb Information Extraction Learning based on Probabilistic Graphical Models
Web Information Extraction Learning based on Probabilistic Graphical ModelsGUANBO
 
Deep learning: Cutting through the Myths and Hype
Deep learning: Cutting through the Myths and HypeDeep learning: Cutting through the Myths and Hype
Deep learning: Cutting through the Myths and HypeSiby Jose Plathottam
 
Evolutionary Symbolic Discovery for Bioinformatics, Systems and Synthetic Bi...
Evolutionary Symbolic Discovery for Bioinformatics,  Systems and Synthetic Bi...Evolutionary Symbolic Discovery for Bioinformatics,  Systems and Synthetic Bi...
Evolutionary Symbolic Discovery for Bioinformatics, Systems and Synthetic Bi...Natalio Krasnogor
 
Drug properties (ADMET) prediction using AI
Drug properties (ADMET) prediction using AIDrug properties (ADMET) prediction using AI
Drug properties (ADMET) prediction using AIIndrajeetKumar124
 
Deep Learning for Computer Vision: Generative models and adversarial training...
Deep Learning for Computer Vision: Generative models and adversarial training...Deep Learning for Computer Vision: Generative models and adversarial training...
Deep Learning for Computer Vision: Generative models and adversarial training...Universitat Politècnica de Catalunya
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273Abutest
 
An Updated Survey on Niching Methods and Their Applications
An Updated Survey on Niching Methods and Their ApplicationsAn Updated Survey on Niching Methods and Their Applications
An Updated Survey on Niching Methods and Their ApplicationsSajib Sen
 
The Incredible Disappearing Data Scientist
The Incredible Disappearing Data ScientistThe Incredible Disappearing Data Scientist
The Incredible Disappearing Data ScientistRebecca Bilbro
 
Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...
Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...
Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...Ichigaku Takigawa
 

Semelhante a Report: "MolGAN: An implicit generative model for small molecular graphs" (20)

SBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resourcesSBML (the Systems Biology Markup Language), model databases, and other resources
SBML (the Systems Biology Markup Language), model databases, and other resources
 
Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!
Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!
Get hands-on with Explainable AI at Machine Learning Interpretability(MLI) Gym!
 
Computational Approaches to Systems Biology
Computational Approaches to Systems BiologyComputational Approaches to Systems Biology
Computational Approaches to Systems Biology
 
Predicting Molecular Properties
Predicting Molecular PropertiesPredicting Molecular Properties
Predicting Molecular Properties
 
Deep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistryDeep learning for molecules, introduction to chainer chemistry
Deep learning for molecules, introduction to chainer chemistry
 
ProjectReport
ProjectReportProjectReport
ProjectReport
 
Machine learning ppt unit one syllabuspptx
Machine learning ppt unit one syllabuspptxMachine learning ppt unit one syllabuspptx
Machine learning ppt unit one syllabuspptx
 
Interactive Machine Learning Appendix
Interactive  Machine Learning AppendixInteractive  Machine Learning Appendix
Interactive Machine Learning Appendix
 
Web Information Extraction Learning based on Probabilistic Graphical Models
Web Information Extraction Learning based on Probabilistic Graphical ModelsWeb Information Extraction Learning based on Probabilistic Graphical Models
Web Information Extraction Learning based on Probabilistic Graphical Models
 
Deep learning: Cutting through the Myths and Hype
Deep learning: Cutting through the Myths and HypeDeep learning: Cutting through the Myths and Hype
Deep learning: Cutting through the Myths and Hype
 
Evolutionary Symbolic Discovery for Bioinformatics, Systems and Synthetic Bi...
Evolutionary Symbolic Discovery for Bioinformatics,  Systems and Synthetic Bi...Evolutionary Symbolic Discovery for Bioinformatics,  Systems and Synthetic Bi...
Evolutionary Symbolic Discovery for Bioinformatics, Systems and Synthetic Bi...
 
Drug properties (ADMET) prediction using AI
Drug properties (ADMET) prediction using AIDrug properties (ADMET) prediction using AI
Drug properties (ADMET) prediction using AI
 
Deep Learning for Computer Vision: Generative models and adversarial training...
Deep Learning for Computer Vision: Generative models and adversarial training...Deep Learning for Computer Vision: Generative models and adversarial training...
Deep Learning for Computer Vision: Generative models and adversarial training...
 
Conv xg
Conv xgConv xg
Conv xg
 
Machine Learning ICS 273A
Machine Learning ICS 273AMachine Learning ICS 273A
Machine Learning ICS 273A
 
An Updated Survey on Niching Methods and Their Applications
An Updated Survey on Niching Methods and Their ApplicationsAn Updated Survey on Niching Methods and Their Applications
An Updated Survey on Niching Methods and Their Applications
 
The Incredible Disappearing Data Scientist
The Incredible Disappearing Data ScientistThe Incredible Disappearing Data Scientist
The Incredible Disappearing Data Scientist
 
Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...
Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...
Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...
 
Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...
Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...
Machine Learning and Model-Based Optimization for Heterogeneous Catalyst Desi...
 
Learning to learn Model Behavior: How to use "human-in-the-loop" to explain d...
Learning to learn Model Behavior: How to use "human-in-the-loop" to explain d...Learning to learn Model Behavior: How to use "human-in-the-loop" to explain d...
Learning to learn Model Behavior: How to use "human-in-the-loop" to explain d...
 

Mais de Ryohei Suzuki

Transformer based approaches for visual representation learning
Transformer based approaches for visual representation learningTransformer based approaches for visual representation learning
Transformer based approaches for visual representation learningRyohei Suzuki
 
Paper memo: persistent homology on biological problems
Paper memo: persistent homology on biological problemsPaper memo: persistent homology on biological problems
Paper memo: persistent homology on biological problemsRyohei Suzuki
 
Paper memo: Optimal-Transport Analysis of Single-Cell Gene Expression Identif...
Paper memo: Optimal-Transport Analysis of Single-Cell Gene Expression Identif...Paper memo: Optimal-Transport Analysis of Single-Cell Gene Expression Identif...
Paper memo: Optimal-Transport Analysis of Single-Cell Gene Expression Identif...Ryohei Suzuki
 
Basic Concepts of Entanglement Measures
Basic Concepts of Entanglement MeasuresBasic Concepts of Entanglement Measures
Basic Concepts of Entanglement MeasuresRyohei Suzuki
 
Disentangled Representation Learning of Deep Generative Models
Disentangled Representation Learning of Deep Generative ModelsDisentangled Representation Learning of Deep Generative Models
Disentangled Representation Learning of Deep Generative ModelsRyohei Suzuki
 
論文紹介: "MolGAN: An implicit generative model for small molecular graphs"
論文紹介: "MolGAN: An implicit generative model for small molecular graphs"論文紹介: "MolGAN: An implicit generative model for small molecular graphs"
論文紹介: "MolGAN: An implicit generative model for small molecular graphs"Ryohei Suzuki
 
等号と不等号の物理学
等号と不等号の物理学等号と不等号の物理学
等号と不等号の物理学Ryohei Suzuki
 
Wolf et al. "Graph abstraction reconciles clustering with trajectory inferen...
Wolf et al. "Graph abstraction reconciles clustering with trajectory inferen...Wolf et al. "Graph abstraction reconciles clustering with trajectory inferen...
Wolf et al. "Graph abstraction reconciles clustering with trajectory inferen...Ryohei Suzuki
 
コンピュータは知恵熱を出すか?
コンピュータは知恵熱を出すか?コンピュータは知恵熱を出すか?
コンピュータは知恵熱を出すか?Ryohei Suzuki
 
身体の中の小宇宙:免疫研究の最前線
身体の中の小宇宙:免疫研究の最前線身体の中の小宇宙:免疫研究の最前線
身体の中の小宇宙:免疫研究の最前線Ryohei Suzuki
 
Single-cell pseudo-temporal ordering 近年の技術動向
Single-cell pseudo-temporal ordering 近年の技術動向Single-cell pseudo-temporal ordering 近年の技術動向
Single-cell pseudo-temporal ordering 近年の技術動向Ryohei Suzuki
 
Collaborative 3D Modeling by the Crowd
Collaborative 3D Modeling by the CrowdCollaborative 3D Modeling by the Crowd
Collaborative 3D Modeling by the CrowdRyohei Suzuki
 
汝は計算機なりや?
汝は計算機なりや?汝は計算機なりや?
汝は計算機なりや?Ryohei Suzuki
 
アナログとはなんだろう。―古くて新しい、もう一つの計算―
アナログとはなんだろう。―古くて新しい、もう一つの計算―アナログとはなんだろう。―古くて新しい、もう一つの計算―
アナログとはなんだろう。―古くて新しい、もう一つの計算―Ryohei Suzuki
 
色字共感覚と書記素学習
色字共感覚と書記素学習色字共感覚と書記素学習
色字共感覚と書記素学習Ryohei Suzuki
 
AnnoTone: 高周波音の映像収録時 埋め込みによる編集支援
AnnoTone: 高周波音の映像収録時埋め込みによる編集支援AnnoTone: 高周波音の映像収録時埋め込みによる編集支援
AnnoTone: 高周波音の映像収録時 埋め込みによる編集支援Ryohei Suzuki
 
立体音響とインタラクション
立体音響とインタラクション立体音響とインタラクション
立体音響とインタラクションRyohei Suzuki
 
SIGGRAPH 2014 Preview -"Shape Collection" Session
SIGGRAPH 2014 Preview -"Shape Collection" SessionSIGGRAPH 2014 Preview -"Shape Collection" Session
SIGGRAPH 2014 Preview -"Shape Collection" SessionRyohei Suzuki
 
Overview of User Interfaces
Overview of User InterfacesOverview of User Interfaces
Overview of User InterfacesRyohei Suzuki
 

Mais de Ryohei Suzuki (20)

Transformer based approaches for visual representation learning
Transformer based approaches for visual representation learningTransformer based approaches for visual representation learning
Transformer based approaches for visual representation learning
 
Paper memo: persistent homology on biological problems
Paper memo: persistent homology on biological problemsPaper memo: persistent homology on biological problems
Paper memo: persistent homology on biological problems
 
Paper memo: Optimal-Transport Analysis of Single-Cell Gene Expression Identif...
Paper memo: Optimal-Transport Analysis of Single-Cell Gene Expression Identif...Paper memo: Optimal-Transport Analysis of Single-Cell Gene Expression Identif...
Paper memo: Optimal-Transport Analysis of Single-Cell Gene Expression Identif...
 
Basic Concepts of Entanglement Measures
Basic Concepts of Entanglement MeasuresBasic Concepts of Entanglement Measures
Basic Concepts of Entanglement Measures
 
Disentangled Representation Learning of Deep Generative Models
Disentangled Representation Learning of Deep Generative ModelsDisentangled Representation Learning of Deep Generative Models
Disentangled Representation Learning of Deep Generative Models
 
論文紹介: "MolGAN: An implicit generative model for small molecular graphs"
論文紹介: "MolGAN: An implicit generative model for small molecular graphs"論文紹介: "MolGAN: An implicit generative model for small molecular graphs"
論文紹介: "MolGAN: An implicit generative model for small molecular graphs"
 
等号と不等号の物理学
等号と不等号の物理学等号と不等号の物理学
等号と不等号の物理学
 
Wolf et al. "Graph abstraction reconciles clustering with trajectory inferen...
Wolf et al. "Graph abstraction reconciles clustering with trajectory inferen...Wolf et al. "Graph abstraction reconciles clustering with trajectory inferen...
Wolf et al. "Graph abstraction reconciles clustering with trajectory inferen...
 
コンピュータは知恵熱を出すか?
コンピュータは知恵熱を出すか?コンピュータは知恵熱を出すか?
コンピュータは知恵熱を出すか?
 
身体の中の小宇宙:免疫研究の最前線
身体の中の小宇宙:免疫研究の最前線身体の中の小宇宙:免疫研究の最前線
身体の中の小宇宙:免疫研究の最前線
 
Single-cell pseudo-temporal ordering 近年の技術動向
Single-cell pseudo-temporal ordering 近年の技術動向Single-cell pseudo-temporal ordering 近年の技術動向
Single-cell pseudo-temporal ordering 近年の技術動向
 
Collaborative 3D Modeling by the Crowd
Collaborative 3D Modeling by the CrowdCollaborative 3D Modeling by the Crowd
Collaborative 3D Modeling by the Crowd
 
汝は計算機なりや?
汝は計算機なりや?汝は計算機なりや?
汝は計算機なりや?
 
アナログとはなんだろう。―古くて新しい、もう一つの計算―
アナログとはなんだろう。―古くて新しい、もう一つの計算―アナログとはなんだろう。―古くて新しい、もう一つの計算―
アナログとはなんだろう。―古くて新しい、もう一つの計算―
 
AnnoTone (CHI 2015)
AnnoTone (CHI 2015)AnnoTone (CHI 2015)
AnnoTone (CHI 2015)
 
色字共感覚と書記素学習
色字共感覚と書記素学習色字共感覚と書記素学習
色字共感覚と書記素学習
 
AnnoTone: 高周波音の映像収録時 埋め込みによる編集支援
AnnoTone: 高周波音の映像収録時埋め込みによる編集支援AnnoTone: 高周波音の映像収録時埋め込みによる編集支援
AnnoTone: 高周波音の映像収録時 埋め込みによる編集支援
 
立体音響とインタラクション
立体音響とインタラクション立体音響とインタラクション
立体音響とインタラクション
 
SIGGRAPH 2014 Preview -"Shape Collection" Session
SIGGRAPH 2014 Preview -"Shape Collection" SessionSIGGRAPH 2014 Preview -"Shape Collection" Session
SIGGRAPH 2014 Preview -"Shape Collection" Session
 
Overview of User Interfaces
Overview of User InterfacesOverview of User Interfaces
Overview of User Interfaces
 

Último

All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid

Report: "MolGAN: An implicit generative model for small molecular graphs"

  • 1. MolGAN: An implicit generative model for small molecular graphs N. De Cao and T. Kipf (Informatics Institute, University of Amsterdam) ICML Deep Generative Models Workshop (2018) arXiv:1805.11973 Gpat Journal Club 2018.10.12, Ryohei Suzuki
  • 2. Research Summary • Automatic generation of drug-like small molecules • Generative Adversarial Net + Graph Neural Network + Reinforcement Learning • Optimization of biochemical properties (e.g., solubility) → first step toward in-silico screening by ML ※It is not aimed at designing drugs for specific purposes
  • 3. About the authors T. Kipf (Ph.D. cand.) • https://tkipf.github.io/ • Supervisor: Max Welling (ML) N. De Cao (Ph.D. cand.) • https://nicola-decao.github.io/ • Supervisor: Ivan Titov (NLP) Supervisor of D. Kingma (author of Adam, VAE, etc.) Pupil of G. 't Hooft (quantum gravity, string theory) citation count 1999 (electro-weak)
  • 4. Drug design / drug discovery (DD) Properties required for drugs • Useful bioactivity • Controllable side effect • Synthesizability • Having effect after metabolism (cf. drug delivery) Vast time and monetary cost of animal/human experiments → in-silico screening using computers
  • 5. Screening by simulation Case of a targeted drug: 1. Structure determination of the target protein 2. Decision of the target site 3. Static affinity prediction 4. Dynamic binding simulation (MD) Days to weeks of computation time per molecule Gefitinib Mutated EGFR (non-small-cell lung cancer)
  • 6. Why is drug design difficult? 1. Very large and high-dimensional search space - over 60,000 permutations for only 10 C/N/O atoms - only a very limited subset of atomic permutations gives valid structures 2. Discrete optimization of molecular structure - continuous/gradual optimization is not possible 3. A slight change in structure results in large effects - COH and COOH are completely different
  • 7. Why is drug design difficult? 4. No appropriate data structure for molecular structure 5. Predicting biochemical properties is essentially difficult - Even QM/MM has limitations. Wet experiments are necessary Figure labels: image; SMILES representation (CN1CCC[C@H]1c2cccnc2); 3D structure (important for proteins)
  • 8. Will ML solve the problems? 1. Very large and high-dimensional search space → Generative models (e.g. GAN) can effectively represent complex/high-dimensional data 2. Discrete optimization of molecular structure → The goal of this study is just rough screening (not fine-tuning of specific drugs) 3. A slight change in structure results in large effects → Pinpoint affinity prediction can be difficult for ML. ML is better suited to predicting general properties like solubility
  • 9. Will ML solve the problems? 4. No appropriate data structure for molecular structure → Graph representation + Graph convolutional neural network 5. Predicting biochemical properties is essentially difficult → ML wouldn’t solve this fundamental problem. Improved simulation methods are also needed
  • 10. Problem definition Generating molecular structures without a specific use • Generated molecules are evaluated by: 1. Druglikeness (QED: Bickerton et al., 2012) 2. Synthesizability (Synthetic Accessibility: Ertl & Schuffenhauer, 2009) 3. Solubility (logP: Comer & Tam, 2001) • Methods are evaluated by: 1. Validity = valid structures / output structures 2. Novelty = ratio of valid structures not included in the training dataset 3. Uniqueness = unique valid molecules / total valid molecules
  • 11. Overview Generator: Transforms noise into a structure Generated structure Discriminator: Judges whether a structure is valid or not Reward Network: Predicts the properties of molecular structures Goal: obtaining a generator that can output valid molecular structures with good properties
  • 12. Revisiting neural networks https://towardsdatascience.com/how-to-build-your-own-neural-network-from-scratch-in-python-68998a08e4f6 1. Input an image or some value 2. Apply multiple transformations 3. Output a value (regression) or a category (classification) 4. Calculate the “loss” value 5. Refine the transformation parameters to improve the loss value (back-propagation)
  • 13. Generative models • classification: judge whether an image is a cat or a dog • regression: predict f(0.5) from f(0), f(1) • generation: generate data whose distribution resembles the training data https://blog.openai.com/generative-models/
  • 14. Generative models • Discriminative model: takes an image as input and judges its category (dog or cat) • Regression model: predicts f(0.5) when f(0) and f(1) are known • Generative model: generates data that follows a distribution similar to the dataset https://blog.openai.com/generative-models/ Challenge: How to calculate the “loss” value to train the model to generate a “distribution like the given dataset”?
  • 15. Generative Adversarial Net (GAN) “Rat race between a counterfeiter and the police” • generator: generates data that resembles the dataset samples as closely as possible • discriminator: distinguishes real from fake data as precisely as possible → the two modules are trained alternately; the actual distribution is not calculated → danger of mode collapse https://towardsdatascience.com/generative-adversarial-networks-explained-34472718707a
  • 16. Power of GANs e.g., BigGAN (Brock et al., 2018) Generated images Continuous morphing of input noise A continuous change of the noise gives a semantically continuous change of the image = a useful representation has been learned
  • 17. Molecular structure representation Image: human-interpretable, but inefficient SMILES: rich information, but the syntax is too strict 3D: very rich information, large data size, invariance problem Figure labels: 2D image; SMILES (CN1CCC[C@H]1c2cccnc2); 3D structure
  • 18. Graph and molecular structure Graph: a network structure consisting of nodes V and edges E Node = atom / Edge = bond → Graph = molecule https://ja.wikipedia.org/wiki/%E9%9A%A3%E6%8E%A5%E8%A1%8C%E5%88%97 Figure labels: simple graph, adjacency matrix, node matrix, adjacency tensor
  • 19. 2D-convolution for images https://developer.nvidia.com/discover/convolutional-neural-network Convolution: applying filters across an entire image http://timdettmers.com/2015/03/26/convolution-deep-learning/ Convolutional Neural Network: extracts abstract information from images by repeated 2D-convolutions
  • 20. Graph convolution (Kipf & Welling, ICLR 2017) Convolution can also be defined for graphs! http://tkipf.github.io/misc/SlidesCambridge.pdf
  • 21. Reinforcement Learning Learning framework for robot movement An action in an environment gives a reward reflecting how good the action was e.g., going toward a hole results in the death of Mario Optimizing the policy to maximize the reward e.g., jump when a hole is located in front of Mario https://en.wikipedia.org/wiki/Reinforcement_learning
  • 22. RL for Molecular Design Action: generation of a molecule Environment/Reward: biochemical evaluation of the molecule Policy: generative model Figure labels: druglikeness: 0.9, synthesizability: 0.1, solubility: 0.3, … ; Feedback; External software
  • 23. Design of MolGAN (1) GAN • Gen directly outputs a graph as an adjacency matrix • Gen is an MLP • Dis judges the validity of a molecule • Dis is a graph convolutional network • WGAN-GP* loss *Please refer to the material from Fukuta-san’s lecture
  • 24. Design of MolGAN (2) RL Deep deterministic policy gradient • Reward network mimics the external program that evaluates molecules • Reward network has the same structure as the discriminator • Reward loss = output of the reward network • Blend GAN loss & reward loss
  • 25. Examples of generated molecules ※numbers: druglikeness (QED score)
  • 26. Exp.1: balance of GAN/reward loss Evaluate generated molecules while varying the balance between the two losses Result: Only the reward loss is necessary
  • 27. Exp.2: comparison with other methods • Validity: Others: 85-95% MolGAN: 98-100% • Uniqueness: Others: 10-70% MolGAN: 2% • Time consumption: 1/10 to 1/2 of that of other methods
  • 28. Exp.2: comparison with other methods • druglikeness • synthesizability • solubility Higher score than other methods for all the properties
  • 29. Discussion Pros • Very high (~100%) ratio of valid output structures • GraphNN + RL is effective for biochemical optimization • Light computational cost, fast learning Cons / Future work • mode collapse = the same structure is generated repeatedly → normalization techniques (e.g., spectral norm) may be useful? • Fixed atom count
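
The three method-level metrics defined on slide 10 (validity, novelty, uniqueness) can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code: the function name `generation_metrics`, the use of canonical strings to identify molecules, and the `is_valid` callback are all assumptions made for the example. In practice the validity check would come from a cheminformatics toolkit.

```python
def generation_metrics(generated, training_set, is_valid):
    """Compute validity, novelty and uniqueness as defined on slide 10.

    generated    -- list of molecule identifiers output by a model
                    (assumed here to be canonical strings such as SMILES)
    training_set -- set of identifiers seen during training
    is_valid     -- hypothetical callback returning True for valid structures
    """
    valid = [m for m in generated if is_valid(m)]

    # Validity = valid structures / output structures
    validity = len(valid) / len(generated) if generated else 0.0

    # Novelty = ratio of valid structures not included in the training dataset
    novel = [m for m in valid if m not in training_set]
    novelty = len(novel) / len(valid) if valid else 0.0

    # Uniqueness = unique valid molecules / total valid molecules
    uniqueness = len(set(valid)) / len(valid) if valid else 0.0

    return {"validity": validity, "novelty": novelty, "uniqueness": uniqueness}


# Toy usage: "bad" stands in for an invalid structure.
example = generation_metrics(
    generated=["CCO", "CCO", "CCN", "bad", "C"],
    training_set={"C"},
    is_valid=lambda m: m != "bad",
)
```

A low uniqueness together with a high validity, as reported for MolGAN on slide 27, is what mode collapse looks like under these definitions: almost every output is valid, but most outputs are repeats.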
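
The graph representation from slide 18 (node = atom, edge = bond, stored as a node matrix and an adjacency tensor) can be illustrated as follows. This is a sketch under stated assumptions: the atom and bond vocabularies, the one-hot layout, and the helper name `encode_molecule` are chosen for the example and are not taken from the slides.

```python
ATOM_TYPES = ["C", "N", "O"]        # assumed atom vocabulary
BOND_TYPES = ["single", "double"]   # assumed bond vocabulary


def encode_molecule(atoms, bonds):
    """Encode a molecule as (node matrix X, adjacency tensor A).

    atoms -- list of atom symbols, one per node
    bonds -- list of (i, j, bond_type) tuples, one per undirected edge
    X[i][t] is 1 if atom i has type ATOM_TYPES[t].
    A[i][j][k] is 1 if atoms i and j share a bond of type BOND_TYPES[k].
    """
    n = len(atoms)
    X = [[1 if atom == t else 0 for t in ATOM_TYPES] for atom in atoms]
    A = [[[0] * len(BOND_TYPES) for _ in range(n)] for _ in range(n)]
    for i, j, kind in bonds:
        k = BOND_TYPES.index(kind)
        # Bonds are undirected, so the tensor is symmetric in (i, j).
        A[i][j][k] = 1
        A[j][i][k] = 1
    return X, A


# Heavy-atom skeleton C-C=O (hydrogens omitted).
X, A = encode_molecule(
    atoms=["C", "C", "O"],
    bonds=[(0, 1, "single"), (1, 2, "double")],
)
```

Because both arrays have a fixed size once the number of atoms is chosen, a generator that emits them directly is tied to a fixed atom count, which is the limitation listed on slide 29.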