2. Outline
Background
Methods
  Data
  Constructing the networks
  Graphlets
  K-core decomposition
The Core Diseasome
  Topological uniqueness
  Functional annotation
  Drug targets
Computing the Core Diseasome
  Key cardiovascular disease genes
  G-protein coupled receptors
Imperial College London Vuk Janjić vj11@imperial.ac.uk
7. Background
Advances in biotechnology have produced a large amount of
system-level biological data
We’re looking for a “core subnetwork” of the human
protein-protein interaction (PPI) network in which genes (via their
protein products) involved in a multitude of diseases reside
We find it using k-core decomposition, with no a priori knowledge
of the genes’ involvement in disease
Other studies have used a similar approach, but with a
different goal in mind
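The k-core decomposition mentioned above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes the "core" is the innermost (maximum-k) core of an undirected graph, and the function names and the toy edge list are hypothetical.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {node: core number} for an undirected graph given as edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # self-loops do not count towards the degree
            adj[u].add(v)
            adj[v].add(u)

    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel off the node with the smallest remaining degree.
        n = min(remaining, key=degree.__getitem__)
        k = max(k, degree[n])  # core numbers never decrease while peeling
        core[n] = k
        remaining.remove(n)
        for m in adj[n]:
            if m in remaining:
                degree[m] -= 1
    return core


def innermost_core(edges):
    """Return (k_max, nodes whose core number equals k_max)."""
    core = core_numbers(edges)
    k_max = max(core.values())
    return k_max, {n for n, c in core.items() if c == k_max}


# Toy graph (hypothetical): a 4-clique A-B-C-D with a tail D-E-F.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
             ("B", "D"), ("C", "D"), ("D", "E"), ("E", "F")]
toy_k, toy_core = innermost_core(toy_edges)
```

On the toy graph the tail nodes E and F are peeled first, leaving the 4-clique as the innermost core. The repeated `min` scan keeps the sketch short; a bucket-based implementation would be preferable for a network with thousands of nodes.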
9. Data
Table: Interaction data
Interaction type    # of nodes                    # of edges    Reference
Protein-protein     11,100                        56,708        HPRD, BioGRID
Genetic             274                           281           BioGRID
Disease-gene        561 diseases / 4,004 genes    4,029         Disease Ontology
Janjić V. & Pržulj N., Molecular BioSystems, 8, 2614-2625 (2012).
10. Constructing the networks
Table: Basic network properties for our four networks
                          H-ALL     H-SIM    REST      CORE
Number of nodes           11,100    1,706    8,227     88
Number of edges           56,807    8,655    24,730    865
Clustering coefficient    0.125     0.173    0.102     0.462
Diameter                  13        9        16        3
Radius                    7         5        8         2
Avg. degree               10.23     10.14    4.53      19.65
Avg. path length          3.69      3.48     4.53      1.87
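The properties in the table can be computed along the following lines. This is a sketch under stated assumptions: the graph is undirected and connected, and "clustering coefficient" is taken to be the average of the local clustering coefficients (the slides do not say which variant was used). The function names and the toy graph are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets; duplicate edges collapse automatically."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        n = queue.popleft()
        for m in adj[n]:
            if m not in dist:
                dist[m] = dist[n] + 1
                queue.append(m)
    return dist


def basic_properties(edges):
    """Basic properties of a connected, undirected graph."""
    adj = build_adjacency(edges)
    n_nodes = len(adj)
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2

    # Local clustering: fraction of neighbour pairs that are themselves linked.
    local = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            local.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        local.append(2.0 * links / (k * (k - 1)))

    # Eccentricities and path-length totals from one BFS per node.
    eccentricity = {}
    total_length = 0
    ordered_pairs = 0
    for n in adj:
        dist = bfs_distances(adj, n)
        eccentricity[n] = max(dist.values())
        total_length += sum(dist.values())
        ordered_pairs += len(dist) - 1

    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "clustering": sum(local) / n_nodes,
        "diameter": max(eccentricity.values()),
        "radius": min(eccentricity.values()),
        "avg_degree": 2.0 * n_edges / n_nodes,
        "avg_path_length": total_length / ordered_pairs,
    }


# Toy graph (hypothetical): triangle A-B-C with a pendant node D attached to C.
toy_props = basic_properties([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
```

One BFS per node is adequate for a sketch; for a network the size of H-ALL a graph library would be the more practical choice.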
24. Topological uniqueness
Maximum EC = 10.52%
[Figure: edge correctness (%) over algorithm executions 1-4,000; vertical axis ticks from 6 to 13]
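Edge correctness (EC) can be illustrated with a short sketch. It assumes the definition commonly used in network alignment: the percentage of edges in the smaller network that a node mapping sends onto edges of the larger network. The slides do not state which aligner or which pair of networks produced the 10.52% figure, so the function name, the toy networks and the mapping below are all hypothetical.

```python
def edge_correctness(edges_small, edges_large, alignment):
    """Percentage of edges of the smaller network conserved under the alignment.

    alignment maps every node of the smaller network to a distinct node of
    the larger network; both networks are undirected and free of self-loops.
    """
    large = {frozenset(e) for e in edges_large}
    small = {frozenset(e) for e in edges_small}
    conserved = sum(
        1 for e in small
        if frozenset(alignment[n] for n in e) in large
    )
    return 100.0 * conserved / len(small)


# Toy example (hypothetical): a triangle aligned onto a path X-Y-Z-W.
toy_small = [("a", "b"), ("b", "c"), ("a", "c")]
toy_large = [("X", "Y"), ("Y", "Z"), ("Z", "W")]
toy_alignment = {"a": "X", "b": "Y", "c": "Z"}
toy_ec = edge_correctness(toy_small, toy_large, toy_alignment)
```

In the toy case two of the three triangle edges land on path edges. A low maximum EC across many alignment runs is what one would expect if the aligned subnetwork's wiring has no close counterpart elsewhere.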
27. Functional annotation
Statistics performed using:
  hypergeometric test
  H-ALL as the background model
  Benjamini-Hochberg False Discovery Rate correction for
  multiple hypothesis testing
Enriched Molecular Function Gene Ontology (GO) terms:
  enzyme binding, transcription factor binding, transcription
  regulator activity, DNA binding, promoter binding
Enriched Biological Process GO terms (mostly regulatory):
  positive regulation of macromolecule metabolic process,
  positive regulation of cellular biosynthetic process, response to
  organic substance, regulation of cell proliferation, positive
  regulation of gene expression
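The enrichment statistics listed above can be sketched as follows. This is a minimal illustration using only the standard library, not the code used in the study: the function names are hypothetical, and the toy counts and p-values are invented purely to exercise the functions.

```python
from math import comb


def hypergeom_enrichment_p(k, N, K, n):
    """P(X >= k) for a hypergeometric draw.

    N: genes in the background (for example, all of H-ALL)
    K: background genes annotated with the GO term
    n: genes in the tested subnetwork
    k: annotated genes observed in the tested subnetwork
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy numbers (hypothetical): 20 background genes, 5 annotated, 4 drawn, 3 hits.
toy_p = hypergeom_enrichment_p(k=3, N=20, K=5, n=4)
toy_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

In practice each GO term gets its own hypergeometric p-value against the background, and the whole list of p-values is then passed through the correction before terms are declared enriched.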
28. Functional annotation
Table: Regulation of cell death and apoptosis enrichment.
         regulation of cell death     regulation of apoptosis
         (GO:0010941)                 (GO:0042981)
H-ALL    8.9%                         8.8%
H-SIM    19.9% (p = 8.59 × 10^-60)    19.8% (p = 1.13 × 10^-59)
REST     no enrichment                no enrichment
CORE     32.1% (p = 6.93 × 10^-10)    29.8% (p = 1.1 × 10^-8)
31. Functional annotation
The top 1% of hubs contain only 9 (out of 185) apoptosis-annotated
proteins
These 9 are split almost evenly between H-SIM and REST (5 are in
H-SIM and 4 in REST)
The top 1% of hubs contain no proteins annotated with cell death
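The hub comparison above amounts to ranking nodes by degree and intersecting the top 1% with an annotation set. A minimal sketch follows; the rounding rule, the tie-breaking by node name, the function name and the toy data are all assumptions made for illustration.

```python
import math


def top_hubs(degree, fraction=0.01):
    """Return the top `fraction` of nodes by degree (at least one node).

    Ties are broken by node name so that the result is deterministic.
    """
    n_hubs = max(1, math.ceil(fraction * len(degree)))
    ranked = sorted(degree, key=lambda node: (-degree[node], node))
    return set(ranked[:n_hubs])


# Toy data (hypothetical): 200 proteins where protein p<i> has degree i.
toy_degree = {f"p{i}": i for i in range(200)}
toy_annotated = {"p199", "p5", "p17"}

toy_hubs = top_hubs(toy_degree)
toy_hub_hits = toy_hubs & toy_annotated
```

Counting `len(toy_hub_hits)` against the size of the annotation set gives the kind of "9 out of 185" figure quoted above, which is the basis for arguing that the enrichment is not simply a by-product of high degree.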
32. Functional annotation
Could the Core Diseasome be capturing genes causal to
diseases for which we generally have no effective cure,
including cancer, hematologic diseases, neurodegenerative
diseases, progression of viral and HIV infection?
33. Driver genes
Genetic interaction studies increasingly show that a
very small number of genetic changes may trigger disease
onset. The mutations responsible are usually called driver mutations.
Ashworth A. et al., Cell, 145, 30–38 (2011).
35. Driver genes
We verify that CORE genes:
  are enriched in genetic interactions (GIs)
    22 of them participate in 21 GIs within CORE (p = 10^-16)
    32 of them participate in 100 GIs in total (including 59 genes
    outside of CORE)
  capture 15 driver genes (both known and predicted)
39. Drug targets
Amongst the 22 genes participating in genetic interactions
within CORE, there are 11 drug targets linked to 116 distinct
drugs (p = 8.64 × 10^-5)
[Figure: the 11 drug targets: MDM2, JUN, RB1, AR, SMAD2, NCOA2, KAT2B, CCND1, ESR1, CTNNB1, CREBBP]
Out of these 11 drug targets, 3 are targeted by 23 or more drugs:
ESR1 is targeted by 61 different drugs, AR by 40, and NCOA2 by 23
(the p-value of any target being hit by more than 22 drugs is 0.0017)
2 known driver genes in CORE are drug targets: RB1 and CTNNB1
51. G-protein coupled receptors
New unpublished interaction network of human G-protein
coupled receptors (GPCRs) from the Štagljar Lab (University of Toronto)
The whole GPCR network is basically a signal transduction
“backbone” of the human PPI network: its wiring allows it
to quickly reach all parts of the interactome
The “core” of this GPCR network has 68 interactions between
25 proteins
Its “core” proteins are primarily expressed in the brain and are
involved in a range of personality and behavioural disorders:
  attention deficit hyperactivity disorder, weight gain, bipolar
  disorder, antipsychotic agent-induced weight gain, attention
  deficit disorder / conduct disorder / oppositional defiant
  disorder, schizophrenia, weight loss, obesity, mood disorders,
  tardive dyskinesia, and personality traits.
54. We’ve seen that... (i.e., take-home messages)
A sub-network of the human PPI network exists whose
topology is unique within that network and which captures disease
genes, driver genes and their drug targets
...and it can be obtained purely computationally
The “core” approach is usable for identifying therapeutically
relevant regions of the interactome, as shown in two case studies:
cardiovascular disease and G-protein coupled receptors