We will discuss how we calculated and aggregated properties for over 21 million commercially available compounds, and classified them, using a variety of ChemAxon and in-house tools. We will then summarise our analysis of the data and show how we will use it to build better virtual screens.
3. PAGE
1) Manfred Eigen (*1927), German biophysical chemist and one of the world's leading pioneers in biotechnology. In 1967, he won the Nobel Prize in Chemistry for his work on a method for measuring fast chemical reactions, which had until then been considered immeasurable. He initiated the foundation of Evotec AG.
A global company with a complete offering
Evotec worldwide operations
Sales representation (Boston, Tokyo)
Operations & sales representation
San Francisco, US (~30 employees): Compound procurement; Compound QC and storage
Abingdon, UK (~215 employees): Med Chem; Comp Chem; DMPK; Structural biology
Munich, Germany (~30 employees): Phosphoproteomics; Chemical proteomics
Göttingen, Germany (~50 employees): Metabolics; Regenerative Medicine
Thane, India (~130 employees): Library synthesis & mgmt.; Development chemistry
Hamburg, Germany (~200 employees): Screening; HTS, NMR; in vitro & in vivo biology
Manfred Eigen 1) Campus
6. PAGE
EVOsource
Compound selection
Can order from stores
Can see if ordered
Can order from supplier
Can request a quote
Can see if available in another lab (or site)
Additional information displayed
JChem Cartridge accessed through Java Persistence API
Marvin Applet
Structure to Name
Standardizer
7. PAGE
The challenges of loading supplier catalogues
Integrated Cyclic Process
Cycle stages: Contact, Receive, Prepare, Load, Process
Contact suppliers
– Existing suppliers
– Preferred suppliers
– New suppliers
Receive catalogues
– New catalogues
– Catalogue updates
Prepare catalogues
– Convert to SD file
– Structure Checker
– Structure to Name
– Name to Structure
Load catalogue data
– Multiple parallel processes
Process
– Fix errors
– Expire old data
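The "convert to SD file" and "multiple parallel processes" steps can be illustrated with a minimal sketch. It relies only on the fact that each record in an SD file ends with a `$$$$` line. Everything else is an assumption for illustration: the function names, the `load_record` callback (a stand-in for whatever writes a record to the database) and the use of a thread pool in place of separate loader processes. It is not the EVOsource loader.

```python
from concurrent.futures import ThreadPoolExecutor


def split_sd_records(sd_text):
    """Split SD file text into records; each record ends with a '$$$$' line.

    Text after the last '$$$$' delimiter is ignored as an incomplete record.
    """
    records, current = [], []
    for line in sd_text.splitlines():
        if line.strip() == "$$$$":
            if current:
                records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    return records


def load_catalogues(catalogues, load_record, max_workers=4):
    """Load several supplier catalogues concurrently.

    catalogues: dict mapping supplier name -> SD file text (hypothetical input).
    load_record: hypothetical callback taking (supplier_name, record_text).
    Returns a dict mapping supplier name -> number of records loaded.
    """
    def load_one(item):
        name, sd_text = item
        records = split_sd_records(sd_text)
        for record in records:
            load_record(name, record)
        return name, len(records)

    # One worker per catalogue at a time; a thread pool stands in for
    # the parallel loader processes mentioned on the slide.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(load_one, catalogues.items()))
```

Loading one catalogue per worker keeps each supplier's records together, which may make the "fix errors" and "expire old data" steps easier to apply per supplier.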
9. PAGE
Drug likeness categories
Classification of compounds
Feature count is defined as: # of 5- and 6-membered aromatic rings + # of Lipinski acceptors + # of Lipinski donors
Drug like
– No SS fails
– Lipinski fails ≤ 1
– Additional property constraints: MOE LogP ≤ 6; MW ≤ 600; Rot bonds ≤ 10; EVO LogS ≥ -7; TPSA ≤ 180
Amber
– No red SS fails and Lipinski fails ≤ 1, or
– No SS fails and Lipinski fails ≤ 2
Red
– Everything else
Lead like
– No SS fails
– No Lipinski fails
– Additional property constraints: MOE LogP ≤ 3.5; MW ≤ 350; Rot bonds ≤ 6; EVO LogS ≥ -5; TPSA ≤ 140
Fragments
– No SS fails
– No FSS fails
– Lip. donors 1-3
– Lip. acc. 0-4
– MOE LogP ≤ 3.0
– MW 150-350
– Rot bonds ≤ 5
– EVO LogS ≥ -3
– TPSA ≤ 70
– Feature count 4-7
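The category rules above can be written down as a small decision function. This is a sketch under stated assumptions, not the Evotec implementation. The slide does not define a "Lipinski fail", so the standard rule-of-five thresholds are assumed (MW > 500, LogP > 5, donors > 5, acceptors > 10). It does not say which category wins when a compound satisfies several, so the most restrictive category is tested first. The `CompoundProps` field names are hypothetical, and the SS/FSS fail counts are taken as precomputed inputs, with `ss_fails` assumed to include any red alerts.

```python
from dataclasses import dataclass


@dataclass
class CompoundProps:
    """Precomputed properties for one compound (field names are hypothetical)."""
    mw: float
    logp: float            # MOE LogP on the slide
    logs: float            # EVO LogS on the slide
    tpsa: float
    rot_bonds: int
    donors: int            # Lipinski donors
    acceptors: int         # Lipinski acceptors
    aromatic_rings_5_6: int
    ss_fails: int = 0      # all structural alert hits (assumed to include red ones)
    red_ss_fails: int = 0  # structural alert hits flagged red
    fss_fails: int = 0     # fragment filter hits


def lipinski_fails(p):
    """Count rule-of-five violations (standard thresholds, assumed here)."""
    return sum([p.mw > 500, p.logp > 5, p.donors > 5, p.acceptors > 10])


def feature_count(p):
    """Feature count as defined on the slide."""
    return p.aromatic_rings_5_6 + p.acceptors + p.donors


def classify(p):
    """Assign a category; testing most restrictive first is an assumption."""
    lip = lipinski_fails(p)
    if (p.ss_fails == 0 and p.fss_fails == 0
            and 1 <= p.donors <= 3 and 0 <= p.acceptors <= 4
            and p.logp <= 3.0 and 150 <= p.mw <= 350
            and p.rot_bonds <= 5 and p.logs >= -3 and p.tpsa <= 70
            and 4 <= feature_count(p) <= 7):
        return "fragment"
    if (p.ss_fails == 0 and lip == 0
            and p.logp <= 3.5 and p.mw <= 350
            and p.rot_bonds <= 6 and p.logs >= -5 and p.tpsa <= 140):
        return "lead-like"
    if (p.ss_fails == 0 and lip <= 1
            and p.logp <= 6 and p.mw <= 600
            and p.rot_bonds <= 10 and p.logs >= -7 and p.tpsa <= 180):
        return "drug-like"
    if (p.red_ss_fails == 0 and lip <= 1) or (p.ss_fails == 0 and lip <= 2):
        return "amber"
    return "red"
```

With this ordering every compound receives exactly one label, which matches the way the composition figures are reported per class.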
10. PAGE
Property calculations
Historical process
Export all structures
Calculate properties using MOE
Structural alerts using MOE
Stored in a MOE database
Problems
Only available to Comp Chem Group
Difficult to update and time consuming
Different tools give different values, and chemists use ChemAxon tools to calculate properties
11. PAGE
Property calculations
New process stage 1
Calculate simple properties
Use ChemAxon JChem cartridge calculations where available
Weekly job that calculates properties for new structures
LogS is calculated using MOE
LogS is updated manually every 2 months
13. PAGE
Property calculations
New process stage 2
MOE SMARTS converted to ChemAxon format
Structural alerts using two types of filter
121 SMARTS structural alerts: general screening compounds
19 SMARTS fragment filters: fragment screening
Assign compounds to a category (fragment, lead like, etc.)
Problems
The substructure search took 6 months to calculate because it had to be broken into small chunks
MOE SMARTS could not be converted to ChemAxon format automatically
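Breaking a long-running search into small chunks can be sketched as follows. This is only an illustration of the batching pattern. The `search_chunk` callback is hypothetical and stands in for whatever runs the structural alert queries against one batch of structures, and the chunk size is arbitrary.

```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def run_in_chunks(structure_ids, search_chunk, size=10000):
    """Run `search_chunk` (hypothetical) on each batch and collect all hits."""
    hits = []
    for batch in chunked(structure_ids, size):
        # Each call handles one small batch, so a failure or restart
        # only costs the current chunk rather than the whole run.
        hits.extend(search_chunk(batch))
    return hits
```

For example, `run_in_chunks(range(10), lambda b: [x for x in b if x % 2 == 0], size=4)` processes three batches and returns the even numbers in order.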
14. PAGE
Quantitative Estimate of Drug-likeness (QED)
New process stage 3
Quantifying the chemical beauty of drugs, A. L. Hopkins et al, Nature Chemistry 2012, 4, 90–98
The QED calculation is based on parameters similar to those used to assign compounds to the Evotec categories; each parameter is weighted by model fitting
A proof of concept was carried out on a subset of our screening collection
Literature weights were modified with a bias in favour of structural alerts
LogD was substituted for ALogP to account for ionisation
The QED calculation runs as a weekly job
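In the cited paper, the weighted QED is the weighted geometric mean of per-property desirability values, QED_w = exp(Σ w_i ln d_i / Σ w_i). The sketch below shows only that aggregation step. The desirability functions and the Evotec-modified weights are not given in the slides, so both are passed in as inputs. The banding helper assumes that the 0.49 and 0.67 values marked on the plots separate low, medium and high QED, and that each boundary value falls in the upper band; both points are assumptions.

```python
import math


def weighted_qed(desirabilities, weights):
    """Weighted geometric mean of desirabilities, each in the range (0, 1]."""
    if len(desirabilities) != len(weights):
        raise ValueError("need one weight per desirability")
    if any(d <= 0 or d > 1 for d in desirabilities):
        raise ValueError("desirabilities must lie in (0, 1]")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    log_sum = sum(w * math.log(d) for d, w in zip(desirabilities, weights))
    return math.exp(log_sum / total_weight)


def qed_band(qed, low_cut=0.49, high_cut=0.67):
    """Bin a QED value; the cut-offs and their inclusivity are assumptions."""
    if qed < low_cut:
        return "low"
    if qed < high_cut:
        return "medium"
    return "high"
```

Because the mean is geometric, a single property with very low desirability pulls the score down sharply, which is consistent with weighting the structural alert term more heavily.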
[Figure: relative frequency against weighted QED, with 0.49 and 0.67 marked on the QED axis]
15. PAGE
What is a beautiful molecule?
Evotec weighted QED
[Figure panels: QED = low, QED = medium, QED = high]
16. PAGE
Agenda
About Evotec
About EVOsource
Calculating Properties
Analysis of the data
Final Thoughts
17. PAGE
EVOsource composition
EVOsource composition for Evotec drug-likeness classes
Lead-like 3.1M
Fragment-like 1M
Red 2.1M
Amber 4.2M
Drug-like 8.1M
EVOsource composition for QED index
[Figure: relative frequency against weighted QED, divided into three bands at 0.49 and 0.67]
15.3% (2.9M)
26.9% (5.3M)
57.8% (11.1M)
18. PAGE
Excellent agreement of QED and Evotec flags to discriminate between different drug-like classes
[Figure: QED index profile of each Evotec drug-likeness class; relative frequency against weighted QED, with 0.49 and 0.67 marked on the QED axis]
19. PAGE
EVOsource QED distribution shows large overlap with orally available drugs
[Figure: relative frequency against weighted QED for EVOsource and for orally available drugs (682 analyzed out of 770), with 0.49 and 0.67 marked on the QED axis]
21. PAGE
Final thoughts and Acknowledgements
EVOsource provides:
A useful tool for our chemists when ordering compounds for our clients
A tool for our computational chemists to analyse chemical space and so better support our clients in library design
The QED calculation provides a useful filter for building virtual libraries quickly
QED has been added to the latest version of ChEMBL
Thanks to:
Dr Oliver Barker
Dr Mirco Meniconi
Catherine Reisser
Dr Dan Warner
22. Your contact:
Building innovative
drug discovery alliances
Ian Berry
Manager, Informatics
+44.(0).1235.441451 Office
+44.(0) 7802.438044 Mobile
+44.(0).1235.441503 Fax
ian.berry@evotec.com
Bob Marmon
Senior Applications Developer
+44.(0).1235.441402 Office
+44.(0).1235.861561 Switchboard
+44.(0).1235.441503 Fax
bob.marmon@evotec.com