Screening heuristics pope-final

Screening Heuristics & Chemical Property Bias - New Directions for Lead Identification and Optimization

Andy Pope
Platform Technology & Science, GlaxoSmithKline, Collegeville PA, USA

SLAS 2012, San Diego
February 4-8, 2012

Why Screening Heuristics?

1. Huge complex datasets -> screening wisdom? (customers)

2. Refining approaches/deliverables -> success rates -> attrition

Some available datasets inside GSK

[Figure: map of GSK datasets, each carrying descriptor metadata]
- Hit ID: HTS, FS by target class, ELT, FBDD
- Compound structures + properties: GSK compounds, marketed drugs etc.
- Profiles: program profiling data, connectivity maps, phys-chem, DMPK, safety profiling
- Public data: e.g. PubChem, literature
- Dataset sizes shown in the figure range from >20 to >>10^6

Other GSK Data – e.g. genomic, bio-informatic, clinical

300+ HTS Campaigns – 2004-11

[Figure: campaigns broken down by target class (13 classes) and assay technology (15 classes); 2007-11 screens, sized by count of screens]

Twin approaches to screening heuristics

1. Building collective wisdom
- Capture, combine and share the experiences of screeners and data from screens (and screeners)
e.g.
- How well do different assay methods perform?
- What is the impact of screen quality and what should be targeted in assay development?
- What policies do I need in place to have a high quality screening process?

2. New “big” data analysis/insights
- Look for data patterns in large aggregated datasets
e.g.
- Do chemical properties influence the results of screens?
- How are screen results related between targets and assay methods?
- Which is the best method to use to discover hits?
- Which assay technology works best?
- How are library properties reflected in the hits?

Building Collective Wisdom – a simple example

Some questions:
- What actually happens in practice as Z’ varies?
- What Z’ should we be aiming for?
- Is this affected by the type of assay?
- What is the appropriate trade-off between cost, robustness and sensitivity?
- How are we doing?

From SBS Virtual Seminar Series 2007 - HTS Module 1

Z’ Heuristics

- Z’ > 0.8 is ideal, > 0.7 acceptable
- Z’ < 0.7: many aspects of performance degrade (e.g. failures, cycle times, false +ve/-ve, hit confirmation)
- Z’ vs “sensitivity” trade-off arguments may be based on false hunches
- Target & assay type does not make a major difference

[Figure panel: statistical cut-off (% effect)]
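
The slides quote Z’ thresholds without spelling out the calculation. As a minimal sketch, assuming the standard Z’ definition (1 minus three times the summed control standard deviations, divided by the absolute difference of the control means), the statistic for one plate could be computed as below. The function name and the control-well values are illustrative assumptions, not GSK data.

```python
from statistics import mean, stdev


def z_prime(pos_controls, neg_controls):
    """Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    separation = abs(mean(pos_controls) - mean(neg_controls))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    spread = stdev(pos_controls) + stdev(neg_controls)
    return 1 - 3 * spread / separation


# Hypothetical control-well signals from a single plate (illustrative only).
high_signal = [100.0, 98.0, 102.0, 101.0, 99.0]
low_signal = [10.0, 12.0, 8.0, 11.0, 9.0]

plate_z_prime = z_prime(high_signal, low_signal)
print(f"Z' = {plate_z_prime:.3f}")
```

Under the heuristics listed above, a plate scoring above 0.8 would count as ideal and one between 0.7 and 0.8 as acceptable.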

[Figure: production failure rate (% of plates) and cycle time (weeks/campaign) vs. average Z’ of assay in HTS production; Z’ bins: 0.4-0.5, 0.5-0.55, 0.55-0.6, 0.6-0.65, 0.65-0.7, 0.7-0.75, 0.75-0.8, >0.8]


Properties, properties, properties...
...But do they affect screening data?
...Are we selecting hits with the best properties?

...Bottom line: high cLogP (greasiness) is BAD
...This needs to be fixed at the start, i.e. in hit ID
...and it tends to creep up during Lead Op.

Do compound molecular properties impact how they behave in screens?

Aggregate results from all 330 campaigns 2005-2010 with >500K tests

e.g. Compound total polar surface area (tPSA) makes no difference

Compounds with tPSA 80-85 Å²:
- 26M measured responses in this bin
- 485k marked as “hit”
- Hit rate = 100*(485k/26M) = 1.86%

“hit” = % effect >= 3 RSD of sample population in that specific screen

The total polar surface area (tPSA) is defined as the surface sum over all polar atoms
- < 60 Å² predicts brain penetration
- > 140 Å² predicts poor cell penetration

[Figure: hit rate (%) for compounds in each tPSA bin vs. polar surface area (tPSA, Å²)]
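
The binned hit-rate calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used for the slides: "RSD" is treated here as a plain sample standard deviation (a robust estimator could be substituted), the hit cut-off is taken as mean + 3 SD within a single screen, and all function names and data values are hypothetical. With the rounded counts quoted above, 100*(485k/26M) evaluates to about 1.87%, so the quoted 1.86% presumably comes from the unrounded counts.

```python
from collections import defaultdict
from statistics import mean, stdev


def mark_hits(responses, n_sd=3.0):
    """Flag responses at or above mean + n_sd * SD of one screen's samples."""
    cutoff = mean(responses) + n_sd * stdev(responses)
    return [r >= cutoff for r in responses]


def hit_rate_by_bin(records, bin_width=5.0):
    """records: iterable of (tpsa, is_hit). Returns {bin lower edge: hit rate %}."""
    tested = defaultdict(int)
    hits = defaultdict(int)
    for tpsa, is_hit in records:
        lower_edge = (tpsa // bin_width) * bin_width
        tested[lower_edge] += 1
        hits[lower_edge] += int(is_hit)
    return {edge: 100.0 * hits[edge] / tested[edge] for edge in sorted(tested)}


# Hypothetical single screen: 20 inactive wells and one strong responder.
responses = [0.0, 1.0, -1.0, 2.0, -2.0] * 4 + [80.0]
tpsas = [82.0] * 10 + [97.0] * 11  # made-up tPSA values, one per compound

flags = mark_hits(responses)
rates = hit_rate_by_bin(zip(tpsas, flags))
for edge, rate in rates.items():
    print(f"tPSA {edge:.0f}-{edge + 5:.0f}: {rate:.2f}% hit rate")

# The aggregate figure quoted on the slide, from its rounded counts.
slide_bin_rate = 100 * 485_000 / 26_000_000
```

In the aggregate analysis, hits would be marked screen by screen and the flags pooled across all campaigns before binning.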

Size Matters...

- Overall hit rate rises 1.7-fold across the middle 80% of the screening deck, i.e. 70% rise in hit rate from MW = 270 to MW = 470
- 3.3-fold rise across full MW range
- Only bins containing 1M or more records are shown

[Figure: hit rate (%), % cpds in MW bin, and cumulative % cpds vs. molecular weight (MW); middle 80% of cpds spans MW 270 to 470; labelled hit rates: 1.2%, 1.50%, 2.62%, 4.0%]

Greasiness matters most...

- Overall hit rate rises 2.9-fold across the middle 80% of the screening deck, i.e. from ClogP = 1 to 5
- 4.1-fold rise across full ClogP range
- Only bins containing 1M or more records are shown

[Figure: hit rate (%), % cpds in ClogP bin, and cumulative % cpds vs. ClogP; middle 80% of cpds spans ClogP 1 to 5; labelled hit rates: 1.1%, 1.14%, 3.31%, 4.5%]
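
The fold-rise figures on the MW and ClogP slides can be checked with simple arithmetic. The sketch below assumes that the labelled hit rates pair up as (1.50%, 2.62%) and (1.2%, 4.0%) for MW and (1.14%, 3.31%) and (1.1%, 4.5%) for ClogP. That pairing is inferred from the quoted ratios rather than stated on the slides.

```python
def fold_rise(low_rate, high_rate):
    """Ratio of the higher hit rate to the lower one."""
    return high_rate / low_rate


# Hit rates in %, as labelled on the two slides (pairing is an assumption).
mw_middle_80 = fold_rise(1.50, 2.62)     # MW 270 to 470
mw_full_range = fold_rise(1.2, 4.0)
clogp_middle_80 = fold_rise(1.14, 3.31)  # ClogP 1 to 5
clogp_full_range = fold_rise(1.1, 4.5)

for label, value in [
    ("MW, middle 80%", mw_middle_80),
    ("MW, full range", mw_full_range),
    ("ClogP, middle 80%", clogp_middle_80),
    ("ClogP, full range", clogp_full_range),
]:
    print(f"{label}: {value:.1f}-fold")
```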

HTS Promiscuity - cLogP

[Figure: cLogP distributions (frequency per cLogP bin) for compounds grouped by inhibition frequency index* (%), from compounds hitting ~1 target to compounds hitting >10% of targets]

Note: Compounds required to have been run in 50 HTS and yielded > 50% effect in a single screen to be included

*Inhibition frequency index (IFI) = % of screens where cpd yielded >50% inhibition, where total screens run >= 50
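
The IFI definition above translates directly into a short function. This is a minimal sketch; the function name, the choice to return None for compounds with too few screens, and the example values are assumptions made for illustration.

```python
def inhibition_frequency_index(percent_inhibitions, threshold=50.0, min_screens=50):
    """IFI (%) = share of screens with inhibition above threshold.

    Returns None when the compound was run in fewer than min_screens screens.
    """
    n_screens = len(percent_inhibitions)
    if n_screens < min_screens:
        return None
    n_active = sum(1 for value in percent_inhibitions if value > threshold)
    return 100.0 * n_active / n_screens


# Hypothetical compounds (one % inhibition value per screen).
promiscuous = [75.0] * 9 + [5.0] * 51   # active in 9 of 60 screens
never_active = [3.0] * 55               # no response above 50% in 55 screens
too_few_screens = [90.0] * 10           # excluded: fewer than 50 screens

ifi_promiscuous = inhibition_frequency_index(promiscuous)
ifi_never_active = inhibition_frequency_index(never_active)
ifi_too_few = inhibition_frequency_index(too_few_screens)
print(ifi_promiscuous, ifi_never_active, ifi_too_few)
```

A compound with an IFI above 10% falls in the "hitting >10% of targets" group, while an IFI of 0 over at least 50 screens means the compound never crossed the 50% threshold.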

“Dark” Matter is small and polar
- Compounds which have not yielded >50% effect once in >50 screens

[Figure: distributions of molecular weight (Da) and cLogP for “dark” compounds]

Biases translate to full-curve follow-up and beyond
Property biases in primary HTS hit marking are propagated forward to dose-response follow-up

[Figure: % compounds tested vs. cLogP and vs. molecular weight, for SS (single-concentration) testing, FC (full-curve) testing, and the FC – SS differential]

- Elevated testing of large, lipophilic compounds in the full-curve phase of HTS
- Reduced testing of small, polar compounds in the full-curve phase of HTS

Note: Plots represent data from 402M single-concentration responses & 2.1M full-curve results

Property bias detection at an individual screen level
e.g. Screens with largest response to cLogP

[Figure: hit rate as % of HR at cLogP = 3.5 vs. cLogP]

Assay Technology vs. property bias
e.g. By assay technology, normalized to HR for that screen at median collection cLogP value

[Figure: hit rate as % of HR at cLogP = 3.5 vs. cLogP, colored by hit rate (%)]

- No clear origins in any meta-data, e.g. Assay Technology, Target class, Screen quality etc.
- ...But effects detectable even at single screen level

Lipophilicity trends in PubChem HTS Data
Primary data from around 100 Academic HTS campaigns obtained from PubChem BioAssay

- Lipophilicity: similar to GSK HTS
- Compound size: little effect (pretty flat)

[Figure: hit rate (%) vs. ClogP and hit rate (%) vs. MW; labelled hit rates: 1.28%, 2.14%, 2.27%, 3.80%]

ClogP (MW)
- GSK screening deck (>50 HTSs, 2.01M cpds): ClogP = 0.00835*MW – 0.058, R² = 0.18
- PubChem Compounds (405k): ClogP = 0.00554*MW + 0.97, R² = 0.09
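
To see what the two fitted lines imply, the sketch below evaluates them at the MW bounds of the middle 80% of the GSK deck (270 and 470, taken from the MW slide). The slopes and intercepts are those quoted above; the function and variable names are illustrative assumptions. With R² of 0.18 and 0.09, MW explains only a small share of the ClogP variance in either set, so these are trend values, not predictions for individual compounds.

```python
def clogp_from_mw(mw, slope, intercept):
    """Evaluate a fitted line ClogP = slope * MW + intercept."""
    return slope * mw + intercept


# (slope, intercept) as quoted on the slide.
GSK_FIT = (0.00835, -0.058)
PUBCHEM_FIT = (0.00554, 0.97)

trend = {}
for name, (slope, intercept) in [("GSK", GSK_FIT), ("PubChem", PUBCHEM_FIT)]:
    for mw in (270, 470):
        trend[(name, mw)] = clogp_from_mw(mw, slope, intercept)
        print(f"{name}, MW {mw}: trend ClogP = {trend[(name, mw)]:.2f}")
```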

Not just HTS... Lipophilicity trends in kinase focused set screens

Primary data from ~50 focused screen campaigns against protein kinases

Lipophilicity and size: similar to GSK HTS

[Figure: hit rate (% of cpds >50% I) at 10 uM vs. ClogP and vs. MW; hit rates labelled X% to Y% in each panel]

Bias from other simple chemical properties?

Hit rate bias:
- +ve: cLogP, MW (HAC)
- -ve: fCsp3, flexibility

[Figure: hit rate (%) vs. fraction of carbons that are sp3 (fCsp3)]

Property       R², ± vs MW    R², ± vs ClogP
MW             1, +           0.21, +
ClogP          0.21, +        1.0, +
HAC            0.92, +        0.19, +
fCsp3          0.15, +        0.00
RotBonds       0.36, +        0.04, +
tPSA           0.16, +        0.08, -
Chiral         0.02, +        0.00
HetAtmRatio    0.02, -        0.34, -
Complexity     0.31, +        0.02, +
Flexibility    0.02, +        0.00
AromRings      0.22, + 0.16, +
HBA            0.11, +        0.10, -
HBD            0.01, +        0.02, -

Improving hit marking – Property Biasing

- Ordinary HTS hit marking: mean + 3 x RSD cut-off
- Property-biased hit marking: more attractive properties are promoted, less attractive properties are demoted

[Figure: % compounds vs. response (% control) with the hit cut-off marked; hit rate (%) vs. MW and vs. ClogP for ordinary and property-biased hit marking]
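
The slide does not say how the promote/demote step is implemented. The sketch below shows one hypothetical way to do it: the multiplier on the standard deviation is lowered for compounds below a reference ClogP and raised for compounds above it. The reference value, the shift per ClogP unit, the use of a plain sample SD, and all names and data are assumptions made for illustration, not the method used at GSK.

```python
from statistics import mean, stdev


def cutoff_multiplier(clogp, reference_clogp=3.0, shift_per_unit=0.25, base=3.0):
    """Hypothetical rule: relax the cut-off below the reference ClogP, tighten it above."""
    return base + shift_per_unit * (clogp - reference_clogp)


def mark_hits_property_biased(responses, clogps, biased=True):
    """Flag each response against mean + k * SD, where k may depend on ClogP."""
    mu, sd = mean(responses), stdev(responses)
    flags = []
    for response, clogp in zip(responses, clogps):
        k = cutoff_multiplier(clogp) if biased else 3.0
        flags.append(response >= mu + k * sd)
    return flags


# Hypothetical screen: 60 inactive wells plus two borderline compounds.
responses = [0.0, 1.0, -1.0, 2.0, -2.0] * 12 + [5.0, 6.0]
clogps = [3.0] * 60 + [1.0, 5.0]  # the last two: one polar, one lipophilic

ordinary = mark_hits_property_biased(responses, clogps, biased=False)
biased = mark_hits_property_biased(responses, clogps, biased=True)
print("ordinary:", ordinary[-2:], "biased:", biased[-2:])
```

In this toy example the ordinary cut-off selects only the lipophilic borderline compound, while the biased cut-off promotes the polar one and demotes the lipophilic one.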

Evolving the screening collection...
- GSK’s Compound Collection Enhancement (CCE) strategy: moving the HTS deck towards decreased size and lipophilicity with the aim of improving chemical starting points

[Figure: ClogP distribution of compounds tested in HTS test datasets (% of total compounds in HTS) for 2004, 2010, and the difference 2010 vs. 2004; % compounds exceeding property limit (ClogP > 5, MW > 500) by year, including new 2011 compounds]

CCE Acquisition, Property Bounds
2004-05: Lipinski criteria (MW<500, ClogP<5)
Most recently: MW<360, ClogP<3
Inclusion of DPU lead-op cpds: MW<500, ClogP<5
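
The acquisition bounds above can be expressed as a simple filter. The sketch below uses the MW and ClogP limits quoted on the slide; the dictionary layout, the function name, and the three example compounds are hypothetical.

```python
# Upper bounds (exclusive) quoted on the slide for each acquisition rule.
PROPERTY_BOUNDS = {
    "2004-05 (Lipinski criteria)": {"mw": 500.0, "clogp": 5.0},
    "most recent": {"mw": 360.0, "clogp": 3.0},
    "DPU lead-op cpds": {"mw": 500.0, "clogp": 5.0},
}


def within_bounds(mw, clogp, bounds):
    """True when both properties fall strictly below the given limits."""
    return mw < bounds["mw"] and clogp < bounds["clogp"]


# Hypothetical candidate compounds: (name, MW, ClogP).
candidates = [("cpd_A", 340.0, 2.5), ("cpd_B", 450.0, 4.2), ("cpd_C", 520.0, 5.6)]

accepted = {
    rule: [name for name, mw, clogp in candidates if within_bounds(mw, clogp, bounds)]
    for rule, bounds in PROPERTY_BOUNDS.items()
}
for rule, names in accepted.items():
    print(f"{rule}: {names}")
```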

Can property biases translate into lead optimization?

Screening cascade: Med. chem -> Biochemical target assay -> Cellular “mechanistic” target assay -> Rodent DMPK, efficacy model

Example from current Lead Optimization Program

[Figure: pIC50 Cell - Biochem vs. binned cLogP; upper region "More potent in cell", lower region "More potent in biochem"]

“patient in a plate” Or... “biochemistry in a (grease-selective) bag”!

- Cellular activity favors cLogP >4
- Directional “pull” to more lipophilic cpds?
- Good DMPK at cLogP <3
- Value of cellular assay?

Property bias in broad pharmacological profiling
Early safety cross screening panel (eXP)

[Figure: average % of assays giving IC50 <= 10 uM vs. binned ClogP, for GSK Lead Op. compounds 2009-11 (n = ~2500), marketed drugs (n = ~1000), and GSK terminated leads & candidates (n = ~400)]

Panel composition:
- GPCRs – 17
- Ion Channels – 8
- Enzymes – 3
- Kinases – 4
- Nuclear Receptors – 2
- Transporter – 3
- Phenotypic – 3 (Blue Screen, Cell Health, Phospholipidosis)

Kinome profiling – no impact of cLogP

~400 kinase Lead Op compounds vs 300 protein kinases

[Figure: % inhibition values (>300 kinase assays) vs. binned ClogP and vs. kinase structural classifier]

Conclusions

- Heuristic approaches allow both refinement of best practice and new insights

- Standard screening processes favor the selection of lipophilic compounds
  - A contributing factor in current issues with drug Lead/Candidate property space occupancy
  - Improvement in screening collections and analysis methods can overcome this, BUT
  - All this effort is wasted if Lead Optimization pathways pull compounds back towards unfavorable property space!!

- The very large datasets generated from screening have considerable value beyond the lifetime of individual campaigns
  - Particularly crucial now that quality and cycle time problems are largely solved
  - Many other examples exist beyond those shown here
  - Please go look for these effects in your data!

Acknowledgements

Snehal Bhatt, Stuart Baddeley, James Chan, Sue Crimmin, Pat Brady, Tony Jurewicz, Emilio Diez, Darren Green, Glenn Hofmann, Maite De Los Frailes, Stephen Pickett, Stan Martens, Bob Hertzberg, Sunny Hung, Deb Jaworski, Jeff Gross, Ricardo Macarron, Subhas Chakravorty, Carl Machutta, Nicola Richmond, Julio Martin-Plaza, Jesus Herranz, Barry Morgan, Gonzalo Colmeranjo-Sanchez, Juan Antonio Mostacero, Dave Morris, Dwight Morrow, Mehul Patel, Amy Quinn, Geoff Quinique, Mike Schaber, Zining Wu, Ana Roa

...and numerous others who contributed to programs run by GSK 2004-2011

And colleagues...

Screening & Compound Profiling
