Stephen Friend ICR UK 2012-06-18

Exploring Disease Bionetworks and How We Perform Our Science

Stephen Friend

June 18, 2012

ICR

Information Commons for Biological Function

Oncogenes only make good targets in particular molecular contexts: EGFR story

•  EGFR pathway commonly mutated/activated in cancer
•  30% of all epithelial cancers
•  Blocking Abs approved for treatment of metastatic colon cancer
•  Subsequently found that RASMUT tumors don’t respond
–  “Negative Predictive Biomarker”
•  However, still EGFR+ / RASWT patients who don’t respond
–  need “Positive Predictive Biomarker”
•  And in lung cancer it is not clear that RASMUT status is a useful biomarker

[Pathway diagram labels: ERBB2, EGFR, EGFRi, BCR/ABL, KRAS, NRAS, BRAF, MEK1/2, Proliferation, Survival]

Predicting treatment response to known oncogenes is complex and requires detailed understanding of how different genetic backgrounds function

Causal Relationships ≠ Correlative Relationships? : CETPi story

•  Epidemiological Data provides strong
support for independent association of low
LDL and high HDL with reduced incidence
of heart disease

•  Statins reduce LDL and reduce incidence
of CVD deaths establishing causal
relationship

•  CETP inhibition raises HDL – Does this
have positive clinical benefit?

•  Torcetrapib (Pfizer) - $800M drug failed Ph3 (2006): a) Lack of efficacy; b) Increased mortality (off target?)
•  Dalcetrapib (Roche) – development halted in Ph3 (May 2012) for lack of efficacy (no increase in mortality)
•  Anacetrapib (Merck) / Evacetrapib (Lilly) – development ongoing. Hoped that they are better inhibitors and
this will lead to clinical benefit. Will cost $1Billion+ to find out

Can we save billions of dollars by generating and sharing datasets that let us better understand causal relationships?

Is there a common framework for testing clinical hypotheses (ARCH2POCM)?

what will it take to understand disease?

DNA

RNA
PROTEIN

MOVING BEYOND ALTERED COMPONENT LISTS

Preliminary Probabilistic Models - Rosetta

Networks facilitate direct
identification of genes that are
causal for disease
Evolutionarily tolerated weak spots

Gene symbol  Gene name                                   Variance of OFPM explained  Mouse model  Source
                                                         by gene expression*
Zfp90        Zinc finger protein 90                      68%                         tg           Constructed using BAC transgenics
Gas7         Growth arrest specific 7                    68%                         tg           Constructed using BAC transgenics
Gpx3         Glutathione peroxidase 3                    61%                         tg           Provided by Prof. Oleg Mirochnitchenko (University of Medicine and Dentistry at New Jersey, NJ) [12]
Lactb        Lactamase beta                              52%                         tg           Constructed using BAC transgenics
Me1          Malic enzyme 1                              52%                         ko           Naturally occurring KO
Gyk          Glycerol kinase                             46%                         ko           Provided by Dr. Katrina Dipple (UCLA) [13]
Lpl          Lipoprotein lipase                          46%                         ko           Provided by Dr. Ira Goldberg (Columbia University, NY) [11]
C3ar1        Complement component 3a receptor 1          46%                         ko           Purchased from Deltagen, CA
Tgfbr2       Transforming growth factor beta receptor 2  39%                         ko           Purchased from Deltagen, CA

Nat Genet (2005) 205:370

Extensive Publications now Substantiating Scientific Approach
Probabilistic Causal Bionetwork Models
• >80 Publications from Rosetta Genetics

Metabolic Disease
"Genetics of gene expression surveyed in maize, mouse and man." Nature. (2003)
"Variations in DNA elucidate molecular networks that cause disease." Nature. (2008)
"Genetics of gene expression and its effect on disease." Nature. (2008)
"Validation of candidate causal genes for obesity that affect..." Nat Genet. (2009)
….. Plus 10 additional papers in Genome Research, PLoS Genetics, PLoS Comp.Biology, etc

CVD
"Identification of pathways for atherosclerosis." Circ Res. (2007)
"Mapping the genetic architecture of gene expression in human liver." PLoS Biol. (2008)
…… Plus 5 additional papers in Genome Res., Genomics, Mamm.Genome

Bone
"Integrating genotypic and expression data …for bone traits…" Nat Genet. (2005)
"..approach to identify candidate genes regulating BMD…" J Bone Miner Res. (2009)

Methods
"An integrative genomics approach to infer causal associations ..." Nat Genet. (2005)
"Increasing the power to detect causal associations…" PLoS Comput Biol. (2007)
"Integrating large-scale functional genomic data ..." Nat Genet. (2008)
…… Plus 3 additional papers in PLoS Genet., BMC Genet.

List of Influential Papers in Network Modeling

  50 network papers
  http://sagebase.org/research/resources.php

Fundamentally Biological Science hasn’t changed because of the ‘Omics Revolution……

…..it is about the process of linking a system to a hypothesis to some data to some analyses

[Diagram labels: Biological System, Data, Analysis]

But the way we do it has changed…………………………………………

Driven by molecular technologies we have become more data intensive, leading to more specialization: data generators (centralized cores), data analyzers (bioinformaticians), validators (experimentalists: lab & clinical)

This is reflected in the tendency for more multi-lab, consortium-style grants in which the data generators, analyzers, validators may be different labs.

Single Lab Model

•  R01 Funding
•  Hypothesis->data->analysis->paper
•  Small-scale data / analysis
•  Reproducible?

[Diagram labels: Biological System, Data, Analysis]

Multiple Lab Model

•  P01 Funding
•  Hypothesis->data->analysis->paper
•  Medium-scale data / analysis
•  Data Generators/Analysts/Validators may be different groups
•  Reproducible?

[Diagram labels: Biological System, Data, Analysis]

Iterative Networked Approaches
To Generating, Analyzing and Supporting New Models

[Diagram labels: Biological System, Data, Analysis]

Uncouple the automatic linkage between the
data generators, analyzers, and validators

Networked Approaches

BioMedicine Information Commons

[Diagram: SYNAPSE connects Patients/Citizens, Data Generators, Data Analysts, Clinicians and Experimentalists; shared contents are RAW DATA, CURATED DATA, TOOLS/METHODS and ANALYZES/MODELS]

Networked Approaches

1  USABLE DATA
2  REWARDS / RECOGNITION
3  GOVERNANCE
4  HOW TO DISTRIBUTE TASKS
5  PRIVACY BARRIERS

[Diagram: BioMedical Information Commons; SYNAPSE connects Patients/Citizens, Data Generators, Data Analysts, Clinicians and Experimentalists; shared contents are RAW DATA, CURATED DATA, TOOLS/METHODS and ANALYZES/MODELS]

Barriers to Engaging Networked Approaches
to a BioMedicine Information Commons

1  USABLE DATA: SYNAPSE
2  REWARDS / RECOGNITION: SYNAPSE
3  RULES / GOVERNANCE: THE FEDERATION
4  HOW TO DISTRIBUTE TASKS: COLLABORATIVE CHALLENGES
5  PRIVACY BARRIERS: PORTABLE LEGAL CONSENT

Open and Networked Approaches: Democratization of Science

1  USABLE DATA: SYNAPSE
2  REWARDS / RECOGNITION: SYNAPSE

Two approaches to building common
scientific and technical knowledge

First approach
•  Text summary of the completed project
•  Assembled after the fact

Second approach (Social Coding)
•  Every code change versioned
•  Every issue tracked
•  Every project the starting point for new work
•  All evolving and accessible in real time

Synapse is GitHub for Biomedical Data

GitHub (Social Coding)
•  Every code change versioned
•  Every issue tracked
•  Every project the starting point for new work
•  All evolving and accessible in real time

Synapse (Social Science)
•  Data and code versioned
•  Analysis history captured in real time
•  Work anywhere, and share the results with anyone

Why not share clinical/genomic data and model building in the ways currently used by the software industry (power of tracking workflows and versioning)?
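To make "data and code versioned" and "analysis history captured" concrete, here is a minimal sketch of content-addressed provenance. It is not the Synapse API: the functions `content_id` and `record_step`, the field names and the toy byte strings are all hypothetical, chosen only to illustrate how each analysis step can be tied to exact versions of its inputs, code and output.

```python
import hashlib
import json


def content_id(payload: bytes) -> str:
    """Return a SHA-256 hex digest that identifies this exact version of the bytes."""
    return hashlib.sha256(payload).hexdigest()


def record_step(history, name, inputs, code, output):
    """Append one analysis step, keyed by hashes of its inputs, code and output."""
    entry = {
        "step": name,
        "inputs": [content_id(i) for i in inputs],
        "code": content_id(code),
        "output": content_id(output),
    }
    history.append(entry)
    return entry


# Toy stand-ins for a raw dataset, an analysis script and its result (hypothetical).
history = []
raw = b"sample,expr\nA,1.0\nB,2.0\n"
script_v1 = b"normalize(raw)"
normalized = b"sample,expr\nA,-1.0\nB,1.0\n"

step = record_step(history, "normalize", [raw], script_v1, normalized)
print(json.dumps(step, indent=2))
```

Because every entry refers to content hashes rather than file names, any change to the data or the script yields a new identifier, which is the property that lets a later project start from a known, reproducible state.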

Leveraging Existing Technologies

Addama

Taverna
tranSMART

sage bionetworks synapse project
Watch What I Do, Not What I Say


Reduce, Reuse, Recycle

Most of the People You Need to Work with Don’t Work with You

My Other Computer is “The Cloud”

Data Analysis with Synapse

Run Any Tool

On Any Platform

Record in Synapse

Share with Anyone

Public or Private Projects
Find Public Data

Use Existing Tools Publish Your Work

my other computer is the cloud… let me hand it to you…

pilot advisors!

clearScience links the components of a ‘big science’ project to a cloud computing environment...

so with a click from your browser you can push code into a virtual machine... or figures... or data... or models... or entire compute environments...

conveniently pre-populated with data, code, and the library and version dependencies

Downloading through TCGA data portal

•  Automated workflows for curation, QC, and sharing of large-scale datasets.
•  All of TCGA, GEO, and user-submitted data processed with standard normalization methods.
•  Searchable TCGA data:
•  23 cancers
•  11 data platforms
•  Standardized meta-data ontologies

[Figure: workflow diagram with panels labeled Copy number, Expression, Mutation, Phenotype, Predictive model generation, Performance assessment]

•  Data accessible at multiple levels of aggregation.
•  Links to upstream and downstream processing of data.
•  Displayed is TCGA Glioblastoma data normalized for each platform across batches.

[Figure: workflow diagram with panels labeled Copy number, Expression, Mutation, Phenotype, Predictive model generation, Performance assessment]

•  Data accessible through programmatic environments such as R.
•  Standardized formats allow reuse of analysis pipelines on all processed datasets.
•  TCGA, GEO, user-submitted data.

[Figure: workflow diagram with panels labeled Copy number, Expression, Mutation, Phenotype, Predictive model generation, Performance assessment]

•  Comparison of many modeling approaches applied to the same data.
•  Models transparently shared and reusable through Synapse.
•  Displayed is comparison of 6 modeling approaches to predict sensitivity to 130 drugs.
•  Extending pipeline to evaluate prediction of TCGA phenotypes.
•  Hosting of collaborative competitions to compare models from many groups.

[Figure: workflow diagram with panels labeled Copy number, Expression, Mutation, Phenotype, Predictive model generation, Performance assessment]

Open and Networked Approaches

3  RULES / GOVERNANCE: THE FEDERATION

Pipeline Strategy
[Diagram: groups A, B, C]

Divide and Conquer Strategy
[Diagram: groups A, B, C and D]

Parallel/Iterative Strategy
[Diagram: groups A, B, C]

sage federation:
model of biological age

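[Figure: scatter plot of Predicted Age (liver expression) against Chronological Age (years); samples above the trend are labeled Faster Aging, samples below are labeled Slower Aging]

Age Differential is related to:
•  Clinical Association
-  Gender
-  BMI
-  Disease
•  Genotype Association
•  Gene Pathway Expression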
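The age differential in this slide can be illustrated with a short sketch. The slide does not say how the differential was computed, so the definition used here (residual of predicted age after a least-squares fit on chronological age) is an assumption, and the function names and numbers are invented for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def age_differential(chronological, predicted):
    """Residual of predicted age around the fitted trend (assumed definition).

    Positive values sit above the trend ("faster aging" in the slide's terms),
    negative values below it ("slower aging").
    """
    slope, intercept = fit_line(chronological, predicted)
    return [p - (slope * c + intercept) for c, p in zip(chronological, predicted)]


# Invented example values, not data from the talk.
chronological = [30, 40, 50, 60, 70]
predicted = [28, 45, 50, 55, 72]
diffs = age_differential(chronological, predicted)
for age, d in zip(chronological, diffs):
    print(age, round(d, 2))
```

The resulting per-sample differential is the quantity that would then be tested for association with clinical variables, genotype or pathway expression.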

REDEFINING HOW WE WORK TOGETHER:
Sage/DREAM Breast Cancer Prognosis Challenge

4  HOW TO DISTRIBUTE TASKS: COLLABORATIVE CHALLENGES

What is the problem?
Our current models of disease biology are primitive and limit doctors’ understanding and ability to treat patients

Current incentives reward those who silo information and work in closed systems

The Solution: Competitions to crowd-source research
in biology and other fields

  Why competitions?
•  Objective assessments
•  Acceleration of progress
•  Transparency
•  Reproducibility
•  Extensible, reusable models

  Competitions in biomedical research
•  CASP (protein structure)
•  Foldit / EteRNA (protein / RNA structure)
•  CAGI (genome annotation)
•  Assemblathon / Alignathon (genome assembly / alignment)
•  SBV Improver (industrial methodology benchmarking)
•  DREAM (co-organizer of Sage/DREAM competition)

  Generic competition platforms
•  Kaggle, InnoCentive, MLComp

The Sage/DREAM breast cancer prognosis
challenge
Goal: Challenge to assess the accuracy of computational models designed to
predict breast cancer survival using patient clinical and genomic data

Why this is unique:
  This Sage/DREAM Challenge uses a pre-collated cohort: 2000 breast cancer samples from the METABRIC cohort
  Accessible to all: A cloud-based common compute architecture is being made available by Google to support the computation needed to develop and test challenge models
  New Rigor:
•  Contestants will evaluate their models on a validation data set composed of newly generated data (provided by Dr. Anne-Lise Børresen-Dale)
•  Contestants must demonstrate their models can be reproduced by others
  New incentives: leaderboard to energize participants, Science Translational Medicine publication for winning team
  Breast cancer patients, funders and researchers can track this Challenge on BRIDGE, an open source online community being built by Sage and Ashoka Changemakers and affiliated with this Challenge

Sage/DREAM Challenge: Details and Timing

Phase 1: Apr thru end-Sep 2012

  Training data: 2,000 breast cancer samples from METABRIC cohort
•  Gene expression
•  Copy number
•  Clinical covariates
•  10 year survival

  Supporting data: Other Sage-curated breast cancer datasets
•  >1,000 samples from GEO
•  ~800 samples from TCGA
•  ~500 additional samples from Norway group
•  Curated and available on Synapse, Sage’s compute platform

  Data released in phases on Synapse from now through end-September

  Will evaluate accuracy of models built on METABRIC data to predict survival in:
•  Held out samples from METABRIC
•  Other datasets

Phase 2: Oct 1 thru Nov 12, 2012

  Evaluation of models in novel dataset.

  Validation data: ~500 fresh frozen tumors from Norway group with:
•  Clinical covariates
•  10 year survival

  Gene expression and copy number data to be generated for model evaluation
•  Sent to Cancer Research UK to generate data at same facility as METABRIC
•  Models built on training data evaluated on newly generated data

  Winners announced at November 12 DREAM conference
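The slide does not name the metric used to score survival models. One widely used choice for this kind of task is the concordance index, sketched below as an assumption rather than as the Challenge's official scoring. The function name, the handling of ties (tied survival times are skipped, tied risk scores count as half) and the toy data are illustrative choices.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Fraction of comparable patient pairs that the model orders correctly.

    times:       observed follow-up time per patient
    events:      True if death was observed, False if the patient was censored
    risk_scores: model output, higher meaning shorter predicted survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier time is censored, so the true order is unknown
        comparable += 1
        if risk_scores[first] > risk_scores[second]:
            concordant += 1.0
        elif risk_scores[first] == risk_scores[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented example: four patients, the third one censored.
times = [2, 4, 6, 8]
events = [True, True, False, True]
good_model = [0.9, 0.7, 0.4, 0.1]
bad_model = [0.1, 0.4, 0.7, 0.9]
print(concordance_index(times, events, good_model))
print(concordance_index(times, events, bad_model))
```

A score of 0.5 corresponds to random ordering, which makes the metric convenient for a leaderboard comparing many models on the same held-out samples.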

Summary

•  Transparency, reproducibility
•  Validation in novel dataset
•  Publication in Science Translational Medicine
•  Donation of Google-scale compute space.

[Figure: workflow diagram with panels labeled Copy number, Expression, Mutation, Phenotype, Predictive model generation, Performance assessment]

For the goal of promoting democratization of medicine…

Registration starting NOW…

sign up at synapse.sagebase.org

Presentation outline

1) Predicting drug response from cancer cell lines
2) Predicting clinical cancer phenotypes
3) Workflows for data management, versioning and method comparison
4) Network-based predictors and multi-task learning

Cancer cell line encyclopedia
Molecular characterization
•  1,000 cell lines
–  mRNA
–  copy number
–  Sequencing (1,600 genes)
Viability screens
•  500 cell lines
•  24 compounds

Primary tumor datasets (TCGA, METABRIC)
Molecular characterization
–  genomics
–  transcriptomics
–  epigenetics
Clinical data (e.g. survival time)

Predictive model

[Figure: workflow diagram with panels labeled Copy number, Expression, Mutation, Phenotype, Predictive model generation, Performance assessment]

Developing predictive models of genotype-specific sensitivity to compound treatment

•  Genetic Feature Matrix: expression, copy number, somatic mutations, etc.
•  Predictive Features (biomarkers)
•  Cancer samples with varying degrees of response to therapy, from Sensitive to Refractory (e.g. EC50)
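The idea of ranking features in a genetic feature matrix by how well they predict response can be sketched as follows. The slides do not state which modeling method was used, so this example substitutes a simple univariate ranking by Pearson correlation; the feature names (`GENE_A_mut` and so on), the response values and the function names are all invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def rank_features(feature_matrix, response):
    """Sort features by absolute correlation with the response, strongest first."""
    scores = {name: pearson(values, response) for name, values in feature_matrix.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Six hypothetical cell lines; response is log10(EC50), lower meaning more sensitive.
features = {
    "GENE_A_mut": [1, 1, 1, 0, 0, 0],
    "GENE_B_expr": [2.0, 3.5, 1.0, 3.0, 2.5, 1.5],
    "GENE_C_copy": [2, 2, 2, 2, 2, 2],
}
log_ec50 = [-2.0, -1.8, -2.2, 0.5, 0.7, 0.4]

ranking = rank_features(features, log_ec50)
for rank, (name, r) in enumerate(ranking, start=1):
    print(f"#{rank} {name} r={r:.2f}")
```

In this toy data the mutation feature ranks first with a strongly negative correlation, meaning mutant lines have lower EC50 and are therefore more sensitive. A multivariate model would be needed to handle correlated features, which this sketch ignores.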

Our approach identifies mutations in genes upstream of
MEK as top predictors of sensitivity to MEK inhibition

[Figure: ranked predictive features for MEK inhibitors, including PD-0325901. Labels shown: #1 Mut NRAS, #3 Mut BRAF, #9 Mut KRAS in one panel; #9 Mut BRAF, #312 Mut NRAS in the other]

For 11/12 compounds, the #1 predictive feature in an unbiased
analysis corresponds to the known stratifier of sensitivity
[Figure: one panel per compound showing the top-ranked predictive features. Labels shown:
•  #1 EGFR mut (three panels)
•  #1 CML lineage (one panel); #2 CML lineage (one panel)
•  #1 ERBB2 expr
•  #1 HGF expr
•  #1 BRAF mut (one panel)
•  #1 BRAF mut, #2 NRAS mut, #3 KRAS mut (three panels)
•  #1 MDM2 expr, #2 TP53 mut, #3 CDKN2A copy]

Can the approach make new discoveries?

Predicted biomarkers supported by literature evidence

Prediction: HDAC inhibitors are effective in haematopoietic tumors
Literature evidence: Supported in current clinical trials. "Responses with single agent HDACi have been predominantly observed in advanced hematologic malignancies including T-cell lymphoma, Hodgkin lymphoma, and myeloid malignancies."
Model / Significance: Typical pharma: >10 phase 2 clinical trials in solid tumors @ $millions per trial.
[Figure: LBH589 (HDACi) sensitivity, solid vs haematopoietic]

Prediction: NQO1 over-expression predicts 17-AAG sensitivity
Literature evidence: NQO1 metabolizes 17-AAG to stable intermediary with 32-fold increase in activity.

Prediction: MYC amplification predicts sensitivity to HSP70 inhibition.
Literature evidence: HSP70 inhibits MYC-mediated apoptosis.

Novel predictions are functionally validated

Prediction: AHR expression predicts sensitivity to MEK inhibitors in NRAS mutant cell lines
Validation: Functionally validated by AHR knockdown
[Figure legend: AHR shRNA, Control shRNA]
Wei G.*, Margolin A.A.*, et al, Cancer Cell

Prediction: BCL-xL expression predicts sensitivity to several chemotherapeutics
Validation: Functionally validated by:
•  BCL-xL knockdown
•  BCL-xL inhibitor drug synergy
•  Mouse models
•  Clinical trials

Open and Networked Approaches

5  PRIVACY BARRIERS: PORTABLE LEGAL CONSENT: weconsent.us

John Wilbanks

The Current R&D Ecosystem Is In Need of a New
Approach to Drug Development

•  $200B per year in biomedical and drug discovery R&D

•  Only a handful of new medicines are approved each year

•  Productivity in steady decline since 1950

•  >90% of novel drugs entering clinical trials fail, and negative POC
information is not shared

•  Significant pharma revenues going off patent in next 5 years

•  >30,000 pharma employees laid off from downsizing in each of last four
years

•  90% of 2013 prescriptions will be for generic drugs


Issues With Drug Discovery

1.  The greatest attrition is at clinical proof-of-concept – once
a “target” is linked to a disease in the clinic, the risk of
failure is far lower

2.  Most novel targets are pursued by multiple companies in
parallel (and most fail at clinical POC)

3.  The complete data from failed trials are rarely, if ever,
released to the public


Open access research tools drive science


SGC: Open Access Chemical Biology
a great success

•  PPP:
-  GSK, Pfizer, Novartis, Lilly, Abbott, Takeda
-  Genome Canada, Ontario, CIHR, Wellcome Trust
•  Based in Universities of Toronto and Oxford
•  200 scientists
•  Academic network of more than 250 labs
•  Generate freely available reagents (proteins, assays, structures, inhibitors, antibodies) for novel, human, therapeutically relevant proteins
•  Give these to academic collaborators to dissect pathways and disease networks, and thereby discover new targets for drug discovery

Some SGC Achievements

•  Structural impact
–  SGC contributed ~25% of global output of human structures annually
–  SGC contributes >40% of global output of human parasite structures annually

•  High quality science (some publications from 2011)

Vedadi et al, Nature Chem Biol, in press (2011)
Evans et al, Nature Genetics in press (2011)
Norman et al Science Transl Med. 3(88):88mr1 (2011)
Kochan G et al PNAS 108:7745 (2011)
Clasquin MF et al Cell 145:969 (2011)
Colwill et al, Nature Methods 8:551 (2011)
Ceccarelli et al, Cell 145:1075 (2011)
Strushkevich et al, PNAS 108:10139 (2011)
Bian et al EMBO J in press (2011)
Norman et al Science Trans. Med. 3:76cm10 (2011)
Xu et al Nature Comm. 2: art. no. 227 (2011)
Edwards et al Nature 470:163 (2011)
Fairman et al Nature Struct. and Mol. Biol. 18:316 (2011)
Adams-Cioaba et al, Nature Comm. 2 (1) (2011)
Carr et al EMBO J 30:317 (2011)
Deutsch et al Cell 144:566 (2011)
Filippakopoulos et al Cell, in press; Nature Chem. Biol. in press, Nature in press

Impact Of SGC’s Open Access JQ1 BET Probe

  Paper published Dec 23 has already been cited >60 times
  Harvard spin off (15 M$ seed funding raised)
  > 5 pharma have launched bromodomain programs
  JQ1/SGCB01 has been distributed to >250 labs/companies
  Already used by some to link Brd4 to new areas of science

Zuber et al : BRD4 as target in acute leukaemia Nature, 2011
Delmore et al: JQ1 suppresses myc in multiple myeloma Cell, 2011
Dawson et al: BRD4 in MLL (isoxazole inhibitor) Nature, 2011
Blobel et al: Novel Targets in AML Cancer Cell, 2011
Mertz et al : Myc dependent cancer PNAS, 2011
Zhao et al: Post mitotic transcriptional re-activation Nature Cell Biol., 2011


Open access to the clinic?


Drug Discovery Is a Lottery Because:

Knowledge about clinical disease is limiting
-  patients are heterogeneous
-  do not know how some drugs work, e.g. paracetamol
-  different doses effective in different patients
-  efficacy is short lived
-  poor biomarkers…..

Too many targets/preclinical assays do not prioritize

Other Problems With How We Do Drug
Discovery

•  Same targets, in parallel, in secret
•  No one organisation has all capabilities
•  Early IP is making it even harder (makes process slower, harder and more expensive)

Most Novel Targets Fail at Clinical POC

[Pipeline diagram. Stages, in order: Target ID/Discovery; HTS; Hit/Probe/Lead ID; LO; Clinical candidate ID; Tox./Pharmacy; Phase I; Phase IIa/b. Percentages shown along the pipeline, in order: 50%, 10%, 30%, 30%, 90+%]

this is killing our industry

…we can generate “safe” molecules, but they are not developable in chosen patient group
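The scale of the problem in this slide can be illustrated with a short calculation. It rests on assumptions the slide does not state: that the five percentages are per-stage failure rates, that the stages are independent, and that "90+%" is rounded down to 90%, so the result is an optimistic upper bound. The function name is illustrative.

```python
# Percentages from the slide, read here as per-stage failure rates (an assumption).
failure_rates = [0.50, 0.10, 0.30, 0.30, 0.90]


def overall_success(rates):
    """Probability of passing every stage, assuming the stages are independent."""
    probability = 1.0
    for rate in rates:
        probability *= 1.0 - rate
    return probability


p_success = overall_success(failure_rates)
print(f"overall success probability: {p_success:.4f}")
print(f"targets started per success: {1.0 / p_success:.1f}")
```

Under these assumptions the product is 0.5 × 0.9 × 0.7 × 0.7 × 0.1, or roughly 2%, and the final clinical proof-of-concept stage contributes by far the largest single reduction.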

This Failure Is Repeated, Many Times

[Figure: the same pipeline (Target ID/Discovery; Hit/Probe/Lead ID; Clinical candidate ID; Toxicology/Pharmacy; Phase I; Phase IIa/b; with 50%, 10%, 30%, 30%, 90+%) drawn several times in parallel]

…and outcomes are not shared

A Possible Solution: Arch2POCM
An Open Access Clinical Validation PPP
•  PPP to clinically validate (Ph IIa) pioneer targets
•  Pharma, public, academia, regulators and patient groups are active participants
•  Cultivate a common stream of knowledge
–  Avoid patents
–  Place all data into the public domain
–  Crowdsource the PPP’s druglike compounds
•  Invalidated targets are identified before pharma makes a substantial proprietary investment
–  Reduces the number of redundant trials on bad targets
–  Reduces safety concerns
•  Validated targets are de-risked for pharma investment
–  Pharma can initiate proprietary effort when risks are balanced with returns
–  PPP pharma members can acquire Arch2POCM IND for validated targets and benefit from shorter development timeline and data exclusivity for sales

Arch2POCM: Scale and Scope
•  Proposed Vertical Goal:
–  Initiate 2 programs. One for Oncology/Epigenetics/Immunology. One for
Neuroscience/Schizophrenia/Autism.
–  Both programs will have 8 drug discovery projects (targets)
–  By Year 5, 30% of projects will have started Ph I and 20% will have completed Ph IIa
–  $200-250M over five years is projected as necessary to advance up to 8 drug
discovery projects within each of the two therapeutic programs
–  By investing $1.6 M annually into one or both of Arch2POCM’s selected disease
areas, partnered pharmaceutical companies:
1.  obtain a vote on Arch2POCM target selection
2.  gain real time data access to Arch2POCM’s 16 drug discovery projects
3.  have the strategic opportunity to expand their overall portfolio
•  Proposed Horizontal Goal:
–  Initiate 1-2 projects, (1-2 novel target mechanisms), as pilots to assess
Arch2POCM principles
–  In either Oncology or Neuroscience
–  Specific target mechanisms to be determined by funders’ interest
–  Interested funders include pharma, public research foundations and venture
philanthropists

Epigenetics: Exciting Science and Also A New Area
For Drug Discovery

[Diagram labels: Lysine, DNA, Histone]

Modification  Write  Read   Erase
Acetyl        HAT    Bromo  HDAC
Methyl        HMT    MBT    DeMethyl

The Case For Epigenetics/Chromatin Biology

1.  There are epigenetic oncology drugs on the market (HDACs)

2.  A growing number of links to oncology, notably many genetic links (i.e.
fusion proteins, somatic mutations)

3.  A pioneer area: More than 400 targets amenable to small molecule
intervention - most of which only recently shown to be “druggable”, and
only a few of which are under active investigation

4.  Open access, early-stage science is developing quickly – significant
collaborative efforts (e.g. SGC, NIH) to generate proteins, structures,
assays and chemical starting points


The Current Epigenetics Universe
Domain Family               Typical substrate class*            Total Targets
Histone Lysine demethylase  Histone/Protein K/R(me)n/ (meCpG)   30
Bromodomain                 Histone/Protein K(ac)               57
Tudor domain                Histone Kme2/3 - Rme2s              59
Chromodomain                Histone/Protein K(me)3              34
MBT repeat                  Histone K(me)3                      9
PHD finger                  Histone K(me)n                      97
Acetyltransferase           Histone/Protein K                   17
Methyltransferase           Histone/Protein K&R                 60
PARP/ADPRT                  Histone/Protein R&E                 17
MACRO                       Histone/Protein (p)-ADPribose       15
Histone deacetylases        Histone/Protein KAc                 11
Total                                                           395

ROYAL (side label on the slide): Tudor domain, Chromodomain, MBT repeat, PHD finger

Now known to be amenable to small molecule inhibition

BET family chemical biology

SGC Toronto, SGC Oxford

What Are Bromodomains and How Do They
Function?
What Are Bromodomains:
• Small highly conserved protein recognition
domains (~110 residues)
• Bundle of four α-helices and two loops that form
a pocket with a conserved Asn residue
• 56 unique human bromodomains identified:
spread across 42 proteins

How Do They Function:
• Selectively bind to acetylated lysine residues
located on histones
• Histone/BRD complex leads to transcription and
gene expression
• Inhibition of BRD binding to acetylated histones
leads to gene silencing


Bromodomains: Genetic Links to Cancer

Genetic abnormality

Publications


Available Reagents for Bromodomain Family

28 crystal structures
42 purified proteins


Robust Assays Available
Peptide library screen using SPR Peptide array screens using dot blots

Histone peptide
Targets

  We now have a suite of assays for bromodomains
•  Filippakopoulos et al Cell. 2012 149(1):214-31.


A Series of Chemical Starting Points

CBP/PCAF

BET


Proof-of-concept
JQ1: A Selective Inhibitor for BETs

Panagis Filippakopoulos, Jun Qi, Stefan Knapp, Jay Bradner

  NUT midline carcinoma (NMC) is a rare,
highly lethal cancer that occurs in children
and young adults.

  NMCs uniformly present in the midline,
most commonly in the head, neck, or
mediastinum, as poorly differentiated
carcinomas

  Rearrangement of the Nuclear protein in
testis (NUT) that creates a BRD4-NUT
fusion gene

 Variant rearrangements, some involving
the BRD3 gene

  NMC is diagnosed by fluorescence in situ hybridization and NUT antibodies.

It is unclear how common NUT rearrangements are in squamous cell carcinomas due to lack of routine diagnostic

JQ1 Inhibits NMC Tumour Growth

FDG-PET

4 days 50mg/kg IP

Jay Bradner/Andrew Kung, Harvard

Potential Year 1 Aims of an Arch2POCM Bromodomain
Program

1.  Select two pre-clinical candidates: Leverage SGC’s existing open
access network of labs, compounds, assays and information to identify
two chemotypes for medicinal chemistry optimization

2.  Develop a biomarker strategy for clinical development: opportunities for
surrogate endpoints and patient stratification

3.  Implement crowdsourced research: manufacture and distribute
optimized pre-clinical candidates to academic and clinical researchers


Process For Arch2POCM Target Selection
Arch2POCM creates a disease area spreadsheet of relevant
information for pioneer targets such as:
1.  Novelty: Target selection should focus on addressing fundamental
questions on biology and disease association
•  No clinical precedent
•  Exception: advance an existing asset into a new disease area
2.  Targets should be tractable
•  In vitro assay availability
•  Cell-based assay availability
•  Characterized protein (e.g. 3D structure; antibody, cell lines, mouse model)
•  Availability of starting chemical matter

3.  Evidence of genetic linkages
•  Translocations, mutations, splicing alterations specifically linked to disease
•  “Peripheral” genetic linkages:
•  Gene expression profiles or GWAS data indicate correlation
–  Implicated in pathway with clear genetic link (SLS, Networks)

4.  Key research contacts (academic or industry)


Potential Targets - Bromodomain Family

Fields for each target: evidence that this target plays an important role in tumors (in vitro, in vivo, animal model data); maturity of the program; positive evidence of the compound playing a role in the given disease; data showing a failed result of the compound for the given disease; mouse knockout model (MGI).

SMARCA4
Evidence: Expression correlates with development of prostate cancer BUT SMARCA4 in general acts as tumor suppressor and is necessary for genome stability; targeted knockdown of SMARCA4 potentiates lung cancer development
Maturity of the program: potent, selective, cell active compound identified
Positive evidence for the compound: NA
Failed result for the compound: NA
Mouse knockout model (MGI): Homozygotes for a null allele die in utero before implantation. Embryos heterozygous for this null allele and an ENU-induced allele show impaired definitive erythropoiesis, anemia and lethality during organogenesis. Heterozygotes show cyanosis and cardiovascular defects and are pre-disposed to breast tumors

SMARCA2A
Evidence: Gastric cancer; mutated in CLL; depletion of BRM causes accelerated progression to the differentiation phenotype BUT targeted deletion is causative for the development of prostatic hyperplasia in mice
Maturity of the program: potent, selective, cell active compound identified
Positive evidence for the compound: NA
Failed result for the compound: NA
Mouse knockout model (MGI): Mice homozygous for a targeted mutation in this gene may exhibit infertility and a slightly increased body weight in some genetic backgrounds.

CBP
Evidence: Translocation of CBP with MOZ, monocytic leukemia zinc finger protein, cause acute myeloid leukemia; other translocations involve MLL (HRX); mutated in ALL BUT CBP has also been proposed as a classical tumor suppressor
Maturity of the program: potent, selective, cell active compound identified
Positive evidence for the compound: NA
Failed result for the compound: NA
Mouse knockout model (MGI): Homozygotes for null or altered alleles die around midgestation with defects in hemopoiesis, blood vessel formation, and neural tube closure. Heterozygotes may exhibit skeletal, cardiac, and hematopoietic defects, retarded growth, and hematologic tumors.

ATAD2
Evidence: Correlated with survival of high-grade osteosarcoma patients after chemo-therapy; required for breast cancer cell proliferation; differentially expressed in NSCLC
Maturity of the program: Weak hits
Positive evidence for the compound: NA
Failed result for the compound: NA
Mouse knockout model (MGI): NA

BRD4
Evidence: Translocations produce BRD4-NUT fusion oncogene causing midline carcinoma
Maturity of the program: JQ1
Positive evidence for the compound: JQ1 in BRD-NUT fusion and MLL
Failed result for the compound: NA
Mouse knockout model (MGI): Homozygotes for a gene-trap null mutation die soon after implantation. Heterozygotes exhibit impaired pre- and postnatal growth, head malformations, lack of subcutaneous fat, cataracts, and abnormal liver cells.

BRD2
Evidence: In transgenic mice, constitutive lymphoid expression of Brd2 causes a malignancy most similar to human diffuse large B cell lymphoma
Maturity of the program: JQ1
Positive evidence for the compound: JQ1 in BRD-NUT fusion and MLL
Failed result for the compound: NA
Mouse knockout model (MGI): Mice homozygous for a null mutation display embryonic lethality during organogenesis with decreased embryo size, decreased cell proliferation, a delay in the cell cycle, and increased cell death. Heterozygous mice also display decreased cell proliferation.

Potential Targets - Demethylases

Fields for each target: evidence that this target plays an important role in tumors (in vitro, in vivo, animal model data); maturity of the program; positive evidence of the compound playing a role in the given disease; data showing a failed result of the compound for the given disease; mouse model (MGI).

JMJD3
Evidence: Upregulated in prostate cancer; expression is higher in metastatic prostate cancer BUT JMJD3 contributes to the activation of the INK4A-ARF tumor suppressor locus in response to oncogene- and stress-induced senescence.
Maturity of the program: potent, selective, cell active compound identified
Positive evidence for the compound: NA; inhibits TNF-alpha production in macrophages of RA patients
Failed result for the compound: NA
Mouse model (MGI): Mice homozygous for a knock-out allele exhibit perinatal lethality associated with thick alveolar septum and absences of air space in the lungs. Bone marrow chimera mice derived from fetal liver cells exhibit impaired eosinophil recruitment and abnormal response to helminth infection.

JARID1B
Evidence: High levels in breast cancer cell lines, strong expression in the invasive but not in the benign components of primary breast carcinomas. BUT tumor suppressor in melanoma cells
Maturity of the program: No progress
Positive evidence for the compound: NA
Failed result for the compound: NA
Mouse model (MGI): NA

Potential Targets - Histone Methyltransferases

Columns: Evidence that this target plays an important role in tumors (in vitro, in vivo, animal model data); Maturity of the program; Positive evidence of the compound playing a role in the given disease; Data showing a failed result of the compound for the given disease

SETD8
•  Evidence: Recent data indicates that SETD8 deregulates PCNA expression by degradation accelerated by methylation at K248. Expression levels of SETD8 and PCNA upregulated in cancer cells. Cancer Research May 2012, Takawa et al.
•  Maturity of the program: Weak inhibitors identified (8 microM) in chemistry optimization.
•  Positive evidence for the compound: NA
•  Failed result for the compound: NA

EZH2
•  Evidence: EZH2 upregulated in cancer cells. Studies on mutants indicate an interesting profile where both wild-type and mutant (Y641F) are required for malignant phenotype. Sneeringer et al. PNAS 2012. Compounds identified in GSK patents WO 2011/140324 and 140315 and WO 2012/005805 and 075080.
•  Maturity of the program: potent, selective, cell active compound identified.
•  Positive evidence for the compound: NA
•  Failed result for the compound: NA

MMSET
•  Evidence: MMSET (WHSC1, NSD2) is overexpressed in cancer cells. Hudlebusch et al. Clinical Cancer Res 2011.
•  Maturity of the program: No hits; currently screening
•  Positive evidence for the compound: NA
•  Failed result for the compound: NA

DOT1L
•  Evidence: Daigle et al. Cancer Cell 2011 elegantly show that potent DOT1L inhibitors kill cells containing MLL translocations and do not kill cells not containing the translocations.
•  Maturity of the program: potent, selective, cell active compound identified.
•  Positive evidence for the compound: Transgenic mouse model tumors shrunk by SC dosing of inhibitor

Proposed Metrics For Measuring Arch2POCM Success
Use a therapeutic product profile (TPP) with stage-gates and defined milestones
to monitor project progression:
•  Small molecule screening hit rate achieved
•  SAR/In vitro testing
–  Target EC50 achieved by at least XX compounds
–  Selectivity target achieved by at least YY compounds
–  Biological activity demonstrated for at least XX compounds in human tissue models (disease tissue, stem cells)
•  Manufacturing and Quality
–  Steady and cost-effective supply of lead compound achieved
–  Stability of lead compound demonstrated (sufficient to support POCM testing)
–  Lead compound formulation identified to support pre-clinical and clinical studies
–  Lead compound demonstrates selected quality attributes (sufficient to support pre-clinical studies and distribution to the crowd)
•  Pre-clinical testing
–  Lead compounds achieve pre-clinical safety
–  Lead compounds surpass target TI
–  Lead compounds demonstrate cross-reactivity sufficient to support pre-clinical tox testing
•  Clinical
–  Lead compounds demonstrate Ph I safety
–  Lead compounds demonstrate Ph II POCM
•  Data management
–  IT database infrastructure populated with XX epigenetics investigators/grant application/publications
–  Database QC and compliance defined and implemented (internal and external)
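
As an illustration only, the stage-gates and milestones above could be tracked with a small data structure. The sketch below is not part of the Arch2POCM proposal: the names Milestone, StageGate, current_gate and enough_compounds are hypothetical, and because the slide leaves the XX and YY thresholds unspecified, they appear here as caller-supplied parameters rather than fixed values.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Milestone:
    """One defined milestone inside a stage-gate (hypothetical structure)."""
    description: str
    achieved: bool = False


@dataclass
class StageGate:
    """A stage-gate such as 'SAR/In vitro testing' with its milestones."""
    name: str
    milestones: List[Milestone] = field(default_factory=list)

    def passed(self) -> bool:
        # A gate counts as passed only if it has milestones and all are achieved.
        return bool(self.milestones) and all(m.achieved for m in self.milestones)


def current_gate(gates: List[StageGate]) -> Optional[str]:
    """Return the name of the first gate not yet passed, or None if all passed."""
    for gate in gates:
        if not gate.passed():
            return gate.name
    return None


def enough_compounds(ec50_values: List[float], target_ec50: float,
                     min_compounds: int) -> bool:
    """Check an 'achieved by at least N compounds' milestone.

    target_ec50 and min_compounds stand in for the unspecified thresholds
    on the slide; a compound counts when its EC50 is at or below the target.
    """
    hits = sum(1 for value in ec50_values if value <= target_ec50)
    return hits >= min_compounds
```

Calling current_gate on the ordered list of gates gives the stage a project is currently working through, which is the kind of progression monitoring the TPP is meant to support.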

Program Activities Grid For Arch2POCM

Columns: Activity; Arch2POCM Location/Investigator (TBD)

•  Target Structure
•  Compound libraries
•  Assay development for epigenetic screens and biomarkers
•  HTP screens for epigenetic hits
•  Med Chem SAR To ID Two Suitable Binding Arch2POCM Test Compounds
•  Non-GLP scaleup of Arch2POCM Test Compounds and associated analytics
•  Distribution of Arch2POCM Test Compounds
•  PK, PD, ADME, Tox Testing
•  GMP Manufacturing of Arch2POCM Test Compounds
•  GMP Formulation
•  GMP Drug Storage and Distribution
•  IND Preparation Support
•  Clinical Assay Development and Qualification
•  Ph I-II Clinical Trials
•  Ph I-II Database Management and CSR Production

DISCUSSION

•  Opportunities to Review Targets
•  Opportunities to Discuss Approach
•  Opportunities to Consider Potential Lead Groups for funding using this Open Approach
