2. Modules / communities
Cellular functions are carried out by groups of
biomolecules (e.g., proteins, RNA) acting in a
coordinated fashion.
Problem: how does this structure change under different conditions?
4. Approaches to module detection
• Many algorithms for detecting modules in a single network
– Link clustering [Shi et al. 2013], label propagation [Gregory 2010],
Tensor decomposition [Anandkumar et al. 2013], mixed-membership
stochastic blockmodels [Airoldi et al. 2008], etc.
• Not obvious how to extend to the multiple network case:
– Combine networks, then detect modules: likely to miss rare modules
– Detect modules, then combine results: inconsistent module definition
• Multi-MMSB: jointly learns modules from all networks, allowing each module to be present in only a subset of networks
8. Learning the model
Goal: optimize model likelihood
Expectation-Maximization algorithm to deal with latent variables
Need variational approximation
Random restarts to alleviate local optima issue
9. Performance metric
• Normalized mutual information (NMI)
[Diagram: a sequence of structural queries is sent through both the learned and the true community structures; mutual information is calculated between the two sets of answers]
[Esquivel and Rosvall, 2012]
18. Summary
• We developed Multi-MMSB, a flexible way of
learning community structure over multiple
networks
• Multi-MMSB outperformed naive methods on
synthetic data
• When applied to real data, Multi-MMSB identified
context-specific modules that are biologically
plausible
19. Future directions
• Extending the model:
– Directed networks
– Weighted edges
• Application to other types of biological networks:
– Regulatory networks
– Protein-protein interaction (PPI) networks
For instance, suppose we are observing an individual over the course of an environmental stress, such as a viral infection or a physical injury.
In this case we expect to see some groups of genes that are co-regulated only temporarily, in a specific situation. For example, genes involved in the immune response to injuries would be temporarily co-regulated. This would lead to something like the blue module, which is active in only one of the networks.
On the other hand, we also expect some housekeeping genes to always be turned on and highly co-regulated. This would lead to something like the red module, which is present in all of the networks.
By identifying and functionally characterizing such modules with different patterns of occurrences,
one can start to reason about the biological processes that are affected or unaffected by the given context of interest.
With this motivation in mind, the goal of this project was to develop an algorithm
that takes as input multiple networks from different contexts,
and outputs the overall community structure together with an associated activity pattern
that tells us in which subset of contexts each module appears.
So how do we go about doing this?
First, it is important to know that there are a large number of module detection algorithms that work on a single network.
These include link clustering, label propagation, spectral decomposition, stochastic blockmodels, and so on.
However, extending these methods to the multiple-network case is not trivial.
One naïve approach one might consider is to combine all the networks into a single representative network
(for example, by taking the average of the adjacency matrices)
and to run an existing module detection algorithm on it.
Once we have a global set of modules,
we can go back to the individual networks and check whether each identified module
is active or not.
While this approach is fairly simple and easy to implement, it suffers from the limitation that
modules that are active in only a small number of networks
are more difficult to identify in the combined network.
This is because the merging process dilutes the signal in the data.
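As a minimal sketch of this baseline (the function and parameter names here are hypothetical, and `detect_fn` stands in for any existing single-network module detector):

```python
import numpy as np

def combine_then_detect(adjacency_list, detect_fn):
    """Naive baseline: average the adjacency matrices into one
    representative network, then run a single-network module
    detection algorithm (detect_fn) on the result.

    Rarely active modules are diluted by the averaging step.
    """
    A_avg = np.mean(adjacency_list, axis=0)
    return detect_fn(A_avg)
```

Note how an edge present in only one of K networks ends up with weight 1/K in the averaged matrix, which is exactly why rare modules become hard to detect.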
Another naïve approach would be to apply a module detection algorithm
on each network independently to learn the modules, and then to combine the outputs by matching modules detected from different networks.
While this approach has no problem identifying modules that are rarely active,
when the detected boundaries of a module differ between networks,
it is not clear how to resolve such disagreements in a principled manner.
In this talk, I present a hierarchical Bayesian model named Multi-MMSB
that avoids both of these issues.
Our model learns a global community structure jointly from all networks,
while allowing each module to be only present in a subset of networks,
thereby increasing power to detect rare modules.
In the following section, I will describe the details of Multi-MMSB. Let’s first start with a simple Bayesian model that forms the basis of our model.
The stochastic blockmodel is a probabilistic, generative model of random graphs that originates from the social network analysis literature.
The basic idea is that the adjacency matrix of a graph with modular structure has a “blocky” pattern, where each block corresponds to a single module.
So the goal is to cluster the nodes such that we see many edges within each cluster and few edges between different clusters.
Now we can formalize this model as follows.
First, we introduce a parameter p_m that represents the connectivity level of each module m.
Next, p_0 represents the background connectivity between nodes of different modules, which can be thought of as the amount of noise in the data.
Lastly, for each node in the network, we have a latent label z_i that represents which module the node belongs to.
Given these variables, each edge is sampled independently from a Bernoulli distribution with parameter p_m if both nodes belong to module m, and p_0 otherwise.
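This generative process can be sketched in a few lines of Python (a hypothetical illustration; the function name and argument layout are mine, not from the talk):

```python
import numpy as np

def sample_sbm(z, p, p0, rng=None):
    """Sample an undirected graph from a stochastic blockmodel.

    z  : module label z_i for each node
    p  : dict mapping module m -> within-module edge probability p_m
    p0 : background edge probability between nodes of different modules
    """
    rng = np.random.default_rng(rng)
    n = len(z)
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            # Edge probability is p_m if both endpoints share module m,
            # and the background level p_0 otherwise.
            prob = p[z[i]] if z[i] == z[j] else p0
            A[i, j] = A[j, i] = int(rng.random() < prob)
    return A
```

With p_m well above p_0, the sampled adjacency matrix shows exactly the blocky structure described above.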
A key limitation of the stochastic blockmodel is that each node can only be assigned to a single module. However, in many applications, modules often overlap with each other. This is the motivation behind the mixed-membership stochastic blockmodel, or MMSB.
In this version of the model, we allow each node to have a fractional membership in the modules rather than a hard assignment. This is represented by the vector c_i.
In addition, we introduce a latent label z_ij for every pair of i and j to represent the conditional membership of node i with respect to node j. Intuitively speaking, this allows each node to be multi-faceted: it can change its module membership based on the node it is interacting with.
In this new setup, an edge is sampled with probability p_m when the conditional memberships on both sides agree with each other.
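The edge-sampling step of MMSB can be sketched as follows (again a hypothetical illustration; `p` is the vector of within-module probabilities p_m and `p0` the background probability):

```python
import numpy as np

def sample_mmsb_edge(c_i, c_j, p, p0, rng=None):
    """Sample one edge (i, j) under MMSB.

    c_i, c_j : fractional membership vectors (each sums to 1)
    p        : within-module probabilities p_m, indexed by module
    p0       : background probability when memberships disagree
    """
    rng = np.random.default_rng(rng)
    K = len(c_i)
    # Draw the conditional membership of i w.r.t. j, and of j w.r.t. i.
    z_ij = rng.choice(K, p=c_i)
    z_ji = rng.choice(K, p=c_j)
    # The edge uses p_m only when both conditional memberships agree.
    prob = p[z_ij] if z_ij == z_ji else p0
    return int(rng.random() < prob)
```

Setting every c_i to a one-hot vector recovers the plain stochastic blockmodel as a special case.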
While this model has been shown to be effective in a variety of settings, by design it only works on a single network.
In order to extend this to the multiple-network case, we first duplicate the latent variables z_ij across the networks while keeping only a single copy of c_i, so that the fractional membership of each node remains identical in every network.
Furthermore, we introduce another layer of latent variables denoted as d_km, which represents context-specific activity of module m in network k.
Now, when we sample the edges using p_m, in addition to checking whether the conditional memberships match, we also check whether the module is active in the given network.
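Concretely, the modified edge-sampling step adds one extra check against the activity indicators d_km (a hypothetical sketch; `d_k` here is the row of activity indicators for network k):

```python
import numpy as np

def sample_multi_edge(c_i, c_j, p, p0, d_k, rng=None):
    """Sample edge (i, j) in network k under the multi-network extension.

    c_i, c_j : shared fractional membership vectors (identical in all networks)
    p, p0    : within-module and background edge probabilities
    d_k      : booleans; d_k[m] says whether module m is active in network k
    """
    rng = np.random.default_rng(rng)
    K = len(c_i)
    z_ij = rng.choice(K, p=c_i)  # conditional memberships are per-network
    z_ji = rng.choice(K, p=c_j)
    # p_m applies only if the memberships agree AND the shared module
    # is active in this particular network; otherwise fall back to p_0.
    if z_ij == z_ji and d_k[z_ij]:
        prob = p[z_ij]
    else:
        prob = p0
    return int(rng.random() < prob)
```

An inactive module thus contributes only background-level edges in that network, which is what lets the model express context-specific modules.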
Note that, in practice, we are only given the edges and none of the latent variables.
A standard approach to learning a Bayesian model is to optimize the likelihood function given the observed data.
In this case, since there are variables that are not observed, we want to optimize what’s called the marginal likelihood, which is the complete likelihood with the latent variables integrated out.
The expectation-maximization (EM) algorithm can be used to optimize this objective. But because the posterior distribution over the latent variables is intractable in this case, we need to use variational EM, which makes the simplifying assumption that the latent variables are independent of each other.
At the end of this training procedure, what we get is the optimal set of model parameters and our belief over the latent variables, from which we can extract the community structure learned by the model.
Because this approach is susceptible to local optima, we typically learn the model several times for a given setting and select the run with the highest objective for further analysis.
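The random-restart wrapper is simple enough to sketch (a hypothetical illustration; `fit_fn` stands in for one run of the variational EM procedure):

```python
def fit_with_restarts(fit_fn, n_restarts=10, seed=0):
    """Run variational EM from several random initializations and keep
    the fit with the highest final objective, to mitigate local optima.

    fit_fn(seed) -> (params, objective) is assumed to run one EM fit
    initialized from the given random seed.
    """
    best_params, best_obj = None, float("-inf")
    for r in range(n_restarts):
        params, obj = fit_fn(seed + r)
        if obj > best_obj:
            best_params, best_obj = params, obj
    return best_params, best_obj
```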
Once we learn the model, we need a way to measure its accuracy, assuming the ground truth is available, which is the case for simulated data.
To quantify the similarity between two community structures, we use a metric called normalized mutual information which was first developed in the context of network covers by Esquivel and Rosvall in 2012.
I won’t go into too much detail here, but the basic intuition is as follows. First, we randomly generate a sequence of structural queries.
[Give example]
Then we send these queries through both the learned and the true community structures to get two sets of answers. Calculating mutual information between these two answer sets gives us our similarity score.
In the limiting case, if the two structures are exactly the same then the answers we get would be identical in all cases and this leads to an NMI of 1.
Note that this procedure does not require us to know the mapping between modules across the two structures, because mutual information doesn’t change even if we relabel the modules on either side.
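As a simplified illustration, NMI for hard, non-overlapping partitions can be computed directly from label counts (the Esquivel-Rosvall metric generalizes this idea to overlapping covers; this sketch is mine, not the metric used in the talk):

```python
import math
from collections import Counter

def nmi(labels_a, labels_b):
    """Normalized mutual information between two hard partitions,
    given as one module label per node. Relabeling the modules on
    either side leaves the score unchanged."""
    n = len(labels_a)
    pa = Counter(labels_a)                 # module sizes in partition A
    pb = Counter(labels_b)                 # module sizes in partition B
    joint = Counter(zip(labels_a, labels_b))
    # Mutual information between the two labelings.
    mi = sum((c / n) * math.log((c * n) / (pa[a] * pb[b]))
             for (a, b), c in joint.items())
    # Entropies of each partition, for normalization.
    ha = -sum((c / n) * math.log(c / n) for c in pa.values())
    hb = -sum((c / n) * math.log(c / n) for c in pb.values())
    if ha == 0 and hb == 0:
        return 1.0  # both partitions trivial: identical by convention
    return 2 * mi / (ha + hb)  # normalize by the average entropy
```

Identical partitions score 1, independent ones score 0, and permuting the labels on either side does not change the result.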
Now we’ve established everything about the model. In the following section I will present some results on synthetic data.