Network Biology Lent 2010 - lecture 1

Exploratory analysis of phenotyping screens: enrichment, clustering, ranking Network Biology - lecture 2

High-dimensional phenotypes by microscopy or molecular profiling Low-dimensional phenotypes A- Time Size

A challenge for computation and statistics

Today’s lecture ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

High-throughput phenotyping Weak Strong Phenotypes of 100s to 10.000s perturbed genes Hits

Gene Ontology (GO) www.geneontology.org A GO Term with a gene set annotated to it

GO over-representation Hyper-geometric test Hits Weak Strong Phenotype GO All genes All Hits Hits in GO term Genes in GO term

Hyper-geometric distribution N genes altogether n hits k hits in GO term m genes in GO term Probability to observe k hits in GO term Number of possibilities to choose n hits out of N genes k hits fall into the GO term of size m The other n-k hits are all genes outside the GO term

Hyper-geometric test: example pvalue = phyper ( k , m , N-m , n , lower.tail = FALSE ) Hold these fixed : N = 10 000 m = 200 See what happens if we vary: k = 1,2,…, 30 n = 50, 100, 200, 300, 400 k or more! N n k m Number k of hits in GO term p-value [log10] 50 100 200 300 400

Gene Set Enrichment Analysis Not restricted to GO , could be any collection of gene sets, e.g. MSigDB at http://www.broad.mit.edu/gsea/ Subramanian et al. (2005) Weak Strong Phenotype GO Weak Strong Phenotype No significant trend Significant trend

GSEA: construction Weak Strong Phenotype Nr. non- hits before i Nr. all non- hits i Nr. hits before i Nr. all hits if p = 0

PROs and CONs Gene set 1 Gene set 2 Gene set 3 Gene set 4 … e.g. Wnt pathway DNA repair Chromosome 1 Expressed in liver p -values (hyper-geometric or GSEA) ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Result:

Map phenotypes to network Where do the hits fall in the network and what are they connected to?

Sub-networks with highly correlated phenotypes

Predicting phenotypes Guilt by association Use known phenotypes in the network Use edge weights if possible Success depends on quality and coverage of linkage in the network ? ? ?

Which networks? Networks from large-scale experiments Networks from analyzing the experimental literature Networks from probabilistic data integration

Cluster first, think later Perturbed Genes Phenotypic profiles Features = Expression profiles, parameters of cell shape, protein concentration or localization Genes with similar phenotypic profiles often have similar molecular functions or act in the same pathways. Principle for function prediction: Guilt by association

From data to distances Phenotypic Profiles Perturbed genes D [i,j] = dist( M [i,] , M [j,] ) M D D [j,i] = D [i,j] D [i,i] = 0 for all i dist(. , .) What distance measure should we use? Distance or dissimilarity matrix i j i j j i

Examples of distances how is this related to correlation? a b Euclidean distance dist( a,b ) = a b dist( a,b ) = Manhattan distance  a b dist( a,b ) = Cosine distance

Linkage: distances to clusters dist(C1, C2) = max { dist( i , j ) : i in C1, j in C2 } complete linkage = min { dist( i , j ) : i in C1, j in C2 } single linkage = mean { dist( i , j ) : i in C1, j in C2 } average linkage D Phenotype 2 Phenotype 1 1 2 3 6 5 4 D[3,4] 1 2 3 6 5 4 Phenotype 2 Phenotype 1 D[ (2,3) ,4] = ??? Distances between individual genes 3

Hierarchical clustering Ingredients: data matrix , distance function, linkage function Phenotype 2 Phenotype 1 1 2 3 6 5 4 1 2 3 6 5 4 Dendrogram

Phenotypic Data 332 knock-downs in 5 conditions Correlation matrix Dendrogram Data by Klaas Mulder, CRI

Clustering: PROs and CONs ,[object Object],[object Object],[object Object],PRO NEG ,[object Object],Brown et al (2005) Ivanova et al (2006) Bakal et al (2007)

Query-based gene ranking ,[object Object],Often addressed by clustering: Which cluster does the query gene fall into? ,[object Object],[object Object],[object Object],[object Object],[object Object],Cluster 1 Cluster 2 Cluster 3 Query gene

Ranking by PhenoBLAST ,[object Object],[object Object],PhenoBLAST Gunsalus et al (2004) Perturbed Genes Ordered by similarity to query gene Phenotypes

Summary ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Three take-home messages ,[object Object],[object Object],[object Object]

From clusters to pathways ,[object Object],Next lectures: graph-based and probabilistic models to infer (causal) pathway structure from phenotypic data.

Network Biology - lecture 2 ≥ 3 Questions !

Network Biology Lent 2010 - lecture 1

Recomendados

Recomendados

Mais conteúdo relacionado

Mais procurados

Mais procurados (20)

Semelhante a Network Biology Lent 2010 - lecture 1

Semelhante a Network Biology Lent 2010 - lecture 1 (20)

Network Biology Lent 2010 - lecture 1