3. STRING provides a modular protein network by integrating diverse types of evidence: genomic neighborhood, species co-occurrence, gene fusions, database imports, experimental interaction data, microarray expression data, and literature co-mentioning.
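The slide does not say how the individual evidence channels are merged into one score. As a minimal sketch, assuming each channel yields an independent, probability-like confidence between 0 and 1, a noisy-OR combination could look like the following. The function name `combine_scores` and the example values are hypothetical.

```python
from math import prod


def combine_scores(scores):
    """Noisy-OR combination: probability that at least one channel is right.

    Assumes the channels are independent, which is a simplification.
    """
    scores = list(scores)
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range: {s}")
    return 1.0 - prod(1.0 - s for s in scores)


# Hypothetical per-channel confidences for one protein pair.
evidence = {"neighborhood": 0.5, "fusion": 0.2, "coexpression": 0.5}
combined = combine_scores(evidence.values())
print(round(combined, 3))
```

Under this scheme, several weak but independent lines of evidence can add up to a confident association, while a single strong channel is never diluted by weaker ones.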
6. Formalizing the phylogenetic profile method: align all proteins against all; calculate best-hit profile; join similar species by PCA; calculate PC profile distances; calibrate against KEGG maps.
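The profile-distance step can be illustrated with a small sketch. It assumes the best-hit profiles have already been computed and normalized to the range 0 to 1, and it deliberately omits the alignment, PCA, and KEGG calibration steps. The protein names and profile values are made up for illustration.

```python
from itertools import combinations
from math import sqrt


def profile_distance(p, q):
    """Euclidean distance between two equal-length profiles."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same species")
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def rank_pairs(profiles):
    """Return (distance, protein1, protein2) tuples, closest pair first."""
    return sorted(
        (profile_distance(profiles[a], profiles[b]), a, b)
        for a, b in combinations(sorted(profiles), 2)
    )


# Hypothetical normalized best-hit scores across four species.
profiles = {
    "protA": [1.0, 0.9, 0.0, 0.0],
    "protB": [0.9, 1.0, 0.0, 0.1],
    "protC": [0.0, 0.1, 1.0, 0.9],
}
ranked = rank_pairs(profiles)
for dist, a, b in ranked:
    print(f"{a} {b} {dist:.3f}")
```

Proteins with similar presence patterns across species (here `protA` and `protB`) end up with the smallest distance and would be the strongest candidates for a functional association.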
7. Predicting functional and physical interactions from gene fusion/fission events: find in A genes that match the same gene in B; exclude overlapping alignments; calibrate against KEGG maps; calculate all-against-all pairwise alignments.
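As a rough sketch of the fusion search, assume the pairwise alignments are already available as hits that record which region of a gene in genome B each gene in genome A aligns to. Two different genes in A that align to non-overlapping regions of the same gene in B are reported as a candidate pair. The `Hit` record, its field names, and the coordinates are hypothetical; calibration against KEGG is not shown.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Hit(NamedTuple):
    gene_a: str   # gene in genome A
    gene_b: str   # gene in genome B (the putative fused gene)
    b_start: int  # first aligned position on gene_b, inclusive
    b_end: int    # last aligned position on gene_b, inclusive


def overlaps(h1, h2):
    """True if the two hits cover a common position on the B gene."""
    return h1.b_start <= h2.b_end and h2.b_start <= h1.b_end


def fusion_pairs(hits):
    """Pairs of A genes matching disjoint regions of the same B gene."""
    by_b = defaultdict(list)
    for hit in hits:
        by_b[hit.gene_b].append(hit)
    pairs = set()
    for gene_b, group in by_b.items():
        for h1, h2 in combinations(group, 2):
            if h1.gene_a != h2.gene_a and not overlaps(h1, h2):
                a1, a2 = sorted((h1.gene_a, h2.gene_a))
                pairs.add((a1, a2, gene_b))
    return pairs


hits = [
    Hit("a1", "b1", 1, 200),
    Hit("a2", "b1", 250, 480),
    Hit("a3", "b1", 150, 300),  # overlaps both a1 and a2 on b1
    Hit("a4", "b2", 1, 100),
]
candidates = fusion_pairs(hits)
print(candidates)
```

Excluding overlapping alignments keeps the sketch from pairing up paralogs that simply match the same domain of the B gene.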
8. Inferring functional associations from evolutionarily conserved operons: identify runs of adjacent genes with the same direction; score each gene pair based on intergenic distances; calibrate against KEGG maps; infer associations in other species.
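The first two steps can be sketched as follows, assuming each gene is given as a (name, start, end, strand) tuple on a single chromosome. The sketch groups adjacent genes on the same strand into runs and reports the intergenic distance between neighbors. The slide does not give the scoring function, so none is implemented here, and the gene coordinates are invented.

```python
def same_strand_runs(genes):
    """Split genes, ordered by start position, into same-strand runs."""
    runs, current = [], []
    for gene in sorted(genes, key=lambda g: g[1]):
        if current and gene[3] != current[-1][3]:
            runs.append(current)
            current = []
        current.append(gene)
    if current:
        runs.append(current)
    return runs


def adjacent_gaps(run):
    """Intergenic distance in bases for each adjacent gene pair in a run.

    Negative values mean the two genes overlap.
    """
    return [(a[0], b[0], b[1] - a[2] - 1) for a, b in zip(run, run[1:])]


# Hypothetical genes: (name, start, end, strand)
genes = [
    ("g1", 100, 400, "+"),
    ("g2", 420, 900, "+"),
    ("g3", 1000, 1500, "-"),
    ("g4", 1510, 2000, "-"),
    ("g5", 2300, 2600, "+"),
]
runs = same_strand_runs(genes)
for run in runs:
    print([g[0] for g in run], adjacent_gaps(run))
```

A scoring step would then map short gaps to high scores, since genes in the same operon tend to sit close together.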
9.
10. Integrating physical interaction screens: complex pull-down experiments; yeast two-hybrid data sets are inherently binary; calculate score from number of (co-)occurrences; calculate score from non-shared partners; calibrate against KEGG maps; infer associations in other species; combine evidence from experiments.
11. Mining microarray expression databases: re-normalize arrays by modern method to remove biases; build expression matrix; combine similar arrays by PCA; construct predictor by Gaussian kernel density estimation; calibrate against KEGG maps; infer associations in other species.
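A minimal sketch of the predictor step, under stated assumptions: each gene pair has a single co-expression value (for example a correlation coefficient), pairs on the same KEGG map serve as positives, and all other pairs serve as negatives. Gaussian kernel density estimates of the two distributions are combined with Bayes' rule. The bandwidth, function names, and training values are all hypothetical, and the normalization and PCA steps are not shown.

```python
from math import exp, pi, sqrt


def gaussian_kde(samples, bandwidth):
    """Return a density function estimated with Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * sqrt(2.0 * pi))

    def density(x):
        return norm * sum(
            exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def make_predictor(positives, negatives, bandwidth=0.1):
    """Estimate P(same pathway | co-expression value) via Bayes' rule."""
    f_pos = gaussian_kde(positives, bandwidth)
    f_neg = gaussian_kde(negatives, bandwidth)
    prior = len(positives) / (len(positives) + len(negatives))

    def predict(x):
        p = prior * f_pos(x)
        q = (1.0 - prior) * f_neg(x)
        # Fall back to the prior where neither density has any support.
        return p / (p + q) if p + q > 0.0 else prior

    return predict


# Hypothetical correlations for pairs on the same KEGG map vs. other pairs.
same_map = [0.8, 0.9, 0.7, 0.85]
other = [0.0, 0.1, -0.2, 0.05, 0.2, -0.1]
predict = make_predictor(same_map, other)
print(predict(0.85), predict(0.0))
```

The output of `predict` is already on a probability scale, which is what makes scores from different evidence types comparable after calibration.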
14. Predicting and defining metabolic pathways and other functional modules. Image: Molecular Biology of the Cell, 3rd edition. Metabolism overview. Defined manually: cutting metabolic maps into pathways. Purine biosynthesis. Histidine biosynthesis. Defined objectively: standard clustering of genome-scale data.
15.
16.
17. Getting the parts list. Yeast culture; microarrays; gene expression; expression profile; Cho et al. & Spellman et al. 600 periodically expressed genes (with associated peak times) that encode “dynamic proteins”. The parts list. New analysis.
18.
19. Interactions are close in time. Observation: interacting dynamic proteins are typically expressed close in time.
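To compare peak times, the difference has to be measured around the cycle rather than along a line. A minimal sketch, assuming peak times are expressed as a percentage of the cell cycle (0 to 100); the function name and that convention are assumptions, not taken from the slides.

```python
def cyclic_time_difference(t1, t2, period=100.0):
    """Shortest distance between two time points on a cycle."""
    d = abs(t1 - t2) % period
    return min(d, period - d)


# A protein peaking at 95% and one peaking at 5% are only 10% apart,
# because the cycle wraps around.
print(cyclic_time_difference(95, 5))
print(cyclic_time_difference(10, 40))
```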
20. Static proteins play a major role. Observation: static (scaffold) proteins comprise about a third of the network and participate in interactions throughout the entire cycle.
21. Just-in-time synthesis? Yes and no! Observation: the dynamic proteins are generally expressed just before they are needed to carry out their function, a pattern generally referred to as just-in-time synthesis. However, the general design principle seems to be that only some key components of each module/complex are dynamic. This suggests a mechanism of just-in-time assembly, or partial just-in-time synthesis.
22. Network as a discovery tool. Observation: the network places 30+ uncharacterized proteins in a temporal interaction context. The network thus generates detailed hypotheses about their function. Observation: the network contains entire novel modules and complexes.
23. Network hubs: “party” versus “date”. “Date” hub: the hub protein interacts with different proteins at different times. “Party” hub: the hub protein and its interactors are expressed close in time.
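The two definitions can be turned into a simple rule of thumb. The sketch below assumes peak times are given as a percentage of the cell cycle and labels a hub "party" when every interactor peaks within a fixed window of the hub's own peak, and "date" otherwise. The 15% window and the function names are hypothetical choices for illustration only.

```python
def cyclic_diff(t1, t2, period=100.0):
    """Shortest distance between two time points on a cycle."""
    d = abs(t1 - t2) % period
    return min(d, period - d)


def classify_hub(hub_peak, partner_peaks, max_diff=15.0):
    """Label a hub by how tightly its partners cluster around its peak."""
    if all(cyclic_diff(hub_peak, p) <= max_diff for p in partner_peaks):
        return "party"
    return "date"


print(classify_hub(20, [15, 25, 30]))  # partners cluster near the hub
print(classify_hub(20, [15, 60, 90]))  # partners spread over the cycle
```

A stricter analysis would likely use a statistical test instead of a hard cutoff, but the cutoff makes the distinction between the two hub types easy to see.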