SlideShare uma empresa Scribd logo
1 de 17
Baixar para ler offline
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
A single source k-shortest paths algorithm
to infer regulatory pathways in a gene
network
Ekaterina Starostina
30.03.2013
1 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Gene interaction network
Node - gene or its corresponding protein
Edge - proteinprotein interaction (PPI) or a transcription factor
(TF)DNA binding
Regulatory pathway - chain of interacting genes within a network.
A regulatory pathway begins with a causal gene and ends at a target
gene
Goal - identify the potential regulatory pathways passing through
the given gene in the gene network
2 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Existing approach
Random walk. A random walk typically starts from the given gene,
walks through several nodes, and terminates according to some
pre-dened parameters (such as length and edge weights).
k shortest paths algorithm Executes O(n) times Dijkstra algorithm
to generate candidate paths for each of the k shortest paths. Time
complexity - O(kn(m + nlogn)), where n is the number of nodes and
m is the number of edges.
3 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Problem denition
Let G = (V, E) denote a gene network, which is a weighted directed
graph, where
V is the set of nodes (genes or proteins),
E is the set of directed edges (interactions)
n = |V|, m = |E|.
Each edge e ∈ E denoted by (ga, gb), ga, gb ∈ V, represents the
direction of interaction from ga to gb.
w(e) ∈ [0, 1] is the weight of an edge e, and the weight represents the
condence level of the interaction.
4 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Sub problems
UnknownCausal
infer possible causal genes if the given gene is a target gene
UnknownTarget
infer possible target genes if the given gene is a causal gene
CandidateCausal
infer the true causal gene in a given set of candidate causal genes, if
the given gene is a target gene
5 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Random walk model
P is n × n transition matrix constructed by normalizing the adjacent
matrix A of the graph G, where
Aij = w((gi, gj))
Pij = αAij / k
Aik , α ∈ (0, 1) is called `damping factor'.
If the current node is gi , the walk would terminate at gi with
probability 1 − α and go to another node gj with probability Pij
A random walk starts from a source node (causal gene) and once
the walk reaches a sink node(target gene), the walk immediately
terminates.
6 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Problems of random walk model
(a) Normalization of edge weights is lossy
(b) The walks repeatedly go through the bidirectional edge
7 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Single-source k-shortest paths
First we convert the weight of an edge into a distance as follows:
d(e) = − log(w(e)) + c (1)
, where d(∗) denotes the distance of an edge or a path.
Then, the importance value of a gene ga is dened as
Imp(ga) =
k
i=1
1/d(Pi ) (2)
,where P1 ,P2 ,...,Pk are k-shortest paths from the given gene to ga
With this transformation, it is obvious that genes having shorter distances
to the given gene of interest will be assigned greater importance values.
8 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Sub problems
The inference problem is now a single-source k-shortest paths problem,
dened for subroblems as it follows:
UnknownTarget - starting node is the given causal gene
UnknownCasual - starting node is the given target gene, the
direction of all edges should be reversed rst
CandidateCasual - starting node is the given target gene, the
direction of all edges should be reversed rst and we select the
candidate causal gene with the highest importance value as our
predicted causal gene
9 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Denitions
Def. 1 Given all top k-shortest paths from the starting node to each other
node, a pseudo tree stores all paths in a tree structure. If k = 1,
the pseudo tree is equivalent to the shortest path tree; as k  1, all
2nd to k-th shortest paths are iteratively merged into the pseudo
tree by sharing the longest common prex path.
Def. 2 A tree-path is a path from the root to another node in a pseudo
tree.
Def. 3 Path nodes and dummy nodes. A tree-path to a path node is a
top-k shortest path from the root to this path node, while a
tree-path to a dummy node is not.
10 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Theorems
T. 1 If a top k-shortest path from the root S to a node A contains a
dummy node B, let the sub-path from the dummy node to A be
PBA , then for each top k-shortest path from S to B, denoted by
PSB , either PSB contains a node in PBA besides B or there exists a
top k-shortest path from S to A using PSB as a prex path.
T. 2 Given a graph G, the pseudo tree for the k-shortest paths contains
at most O(n2k) nodes.
T. 3 The time complexity of Algorithm 1 is
(nklog(nk) + mk(h + k)) (3)
11 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Algorithm
12 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Diversity of a path
Diversity of a path is the number of edges in this path but not
appearing in any already found paths divided by the total number of
edges in the path.
If diverse paths P1 ,P2 ,...Pê, êk, are already found, the diversity of a
new candidate path Pnew is:
div(Pnew ) =
|{e|e ∈ Pnew , Pi : e ∈ Pi , 1 ≤ i ≤ k}|
|{e|e ∈ Pnew }|
(4)
13 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Single-source k-shortest diverse paths algorithm
1 Execute Algorithm 1
2 For each node, examine the diversities of k shortest paths in the
increasing order of their distances
3 If the diverse paths are not enough ( k) for some nodes, remove
edges from the graph according to the probability
4 Repeat the procedure until k diverse paths are found for every node
or less than m/n edges are removed in the last iteration
14 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Overview of the inference framework
15 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Testing aspects
CandidateCausial The goal is to correctly pick the true causal gene
from the ten candidate causal genes.
UnknownCasual and UnknownTarget SaddleSam was used to
evaluate the enrichment levels of the GO terms for the results of
UnknownCausal and UnknownTarget of a gene set V.
Diversity, importance value and eciency
Diversity and enriched functions The goal is to show that using
dierent λ, the importance values generated by the k diverse paths
algorithm can identify dierent potential regulatory pathways with
dierent functions.
16 / 17
A single source k-shortest paths algorithm to infer regulatory pathways in a gene network
Conclusions
Algorithm can identify pathways with higher potentiality than
current methods based on the random walk model, and requiring the
paths to be more diverse can further uncover other potential
regulated functions.
Heuristic algorithm can achieve a huge speedup than the previous
single-pair shortest paths algorithm while the found paths are
equivalent in our yeast gene network.
17 / 17

Mais conteúdo relacionado

Mais procurados

A New Analysis for Wavelength Translation in Regular WDM Networks
A New Analysis for Wavelength Translation in Regular WDM NetworksA New Analysis for Wavelength Translation in Regular WDM Networks
A New Analysis for Wavelength Translation in Regular WDM NetworksVishal Sharma, Ph.D.
 
A New Key Agreement Protocol Using BDP and CSP in Non Commutative Groups
A New Key Agreement Protocol Using BDP and CSP in Non Commutative GroupsA New Key Agreement Protocol Using BDP and CSP in Non Commutative Groups
A New Key Agreement Protocol Using BDP and CSP in Non Commutative GroupsEswar Publications
 
Sparse Random Network Coding for Reliable Multicast Services
Sparse Random Network Coding for Reliable Multicast ServicesSparse Random Network Coding for Reliable Multicast Services
Sparse Random Network Coding for Reliable Multicast ServicesAndrea Tassi
 
Hossein Taghavi : Codes on Graphs
Hossein Taghavi : Codes on GraphsHossein Taghavi : Codes on Graphs
Hossein Taghavi : Codes on Graphsknowdiff
 
Question bank of module iv packet switching networks
Question bank of module iv packet switching networksQuestion bank of module iv packet switching networks
Question bank of module iv packet switching networksEngineering
 
Security aware relaying scheme for cooperative networks with untrusted relay ...
Security aware relaying scheme for cooperative networks with untrusted relay ...Security aware relaying scheme for cooperative networks with untrusted relay ...
Security aware relaying scheme for cooperative networks with untrusted relay ...Pvrtechnologies Nellore
 
Ijarcet vol-2-issue-7-2351-2356
Ijarcet vol-2-issue-7-2351-2356Ijarcet vol-2-issue-7-2351-2356
Ijarcet vol-2-issue-7-2351-2356Editor IJARCET
 
Enforcing end to-end proportional fairness with bounded buffer overflow proba...
Enforcing end to-end proportional fairness with bounded buffer overflow proba...Enforcing end to-end proportional fairness with bounded buffer overflow proba...
Enforcing end to-end proportional fairness with bounded buffer overflow proba...ijwmn
 
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)Enrique Monzo Solves
 

Mais procurados (11)

A New Analysis for Wavelength Translation in Regular WDM Networks
A New Analysis for Wavelength Translation in Regular WDM NetworksA New Analysis for Wavelength Translation in Regular WDM Networks
A New Analysis for Wavelength Translation in Regular WDM Networks
 
Ijcnc050215
Ijcnc050215Ijcnc050215
Ijcnc050215
 
A New Key Agreement Protocol Using BDP and CSP in Non Commutative Groups
A New Key Agreement Protocol Using BDP and CSP in Non Commutative GroupsA New Key Agreement Protocol Using BDP and CSP in Non Commutative Groups
A New Key Agreement Protocol Using BDP and CSP in Non Commutative Groups
 
Sparse Random Network Coding for Reliable Multicast Services
Sparse Random Network Coding for Reliable Multicast ServicesSparse Random Network Coding for Reliable Multicast Services
Sparse Random Network Coding for Reliable Multicast Services
 
Hossein Taghavi : Codes on Graphs
Hossein Taghavi : Codes on GraphsHossein Taghavi : Codes on Graphs
Hossein Taghavi : Codes on Graphs
 
Distributed Hash Table
Distributed Hash TableDistributed Hash Table
Distributed Hash Table
 
Question bank of module iv packet switching networks
Question bank of module iv packet switching networksQuestion bank of module iv packet switching networks
Question bank of module iv packet switching networks
 
Security aware relaying scheme for cooperative networks with untrusted relay ...
Security aware relaying scheme for cooperative networks with untrusted relay ...Security aware relaying scheme for cooperative networks with untrusted relay ...
Security aware relaying scheme for cooperative networks with untrusted relay ...
 
Ijarcet vol-2-issue-7-2351-2356
Ijarcet vol-2-issue-7-2351-2356Ijarcet vol-2-issue-7-2351-2356
Ijarcet vol-2-issue-7-2351-2356
 
Enforcing end to-end proportional fairness with bounded buffer overflow proba...
Enforcing end to-end proportional fairness with bounded buffer overflow proba...Enforcing end to-end proportional fairness with bounded buffer overflow proba...
Enforcing end to-end proportional fairness with bounded buffer overflow proba...
 
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
High Speed Decoding of Non-Binary Irregular LDPC Codes Using GPUs (Paper)
 

Semelhante a Slides -e._starostina

Multiple optimal path identification using ant colony optimisation in wireles...
Multiple optimal path identification using ant colony optimisation in wireles...Multiple optimal path identification using ant colony optimisation in wireles...
Multiple optimal path identification using ant colony optimisation in wireles...ijwmn
 
Amtr the ant based qos aware multipath temporally ordered routing algorithm ...
Amtr  the ant based qos aware multipath temporally ordered routing algorithm ...Amtr  the ant based qos aware multipath temporally ordered routing algorithm ...
Amtr the ant based qos aware multipath temporally ordered routing algorithm ...csandit
 
AMTR: THE ANT BASED QOS AWARE MULTIPATH TEMPORALLY ORDERED ROUTING ALGORITHM ...
AMTR: THE ANT BASED QOS AWARE MULTIPATH TEMPORALLY ORDERED ROUTING ALGORITHM ...AMTR: THE ANT BASED QOS AWARE MULTIPATH TEMPORALLY ORDERED ROUTING ALGORITHM ...
AMTR: THE ANT BASED QOS AWARE MULTIPATH TEMPORALLY ORDERED ROUTING ALGORITHM ...cscpconf
 
A genetic algorithm to solve the
A genetic algorithm to solve theA genetic algorithm to solve the
A genetic algorithm to solve theIJCNCJournal
 
Turbo Detection in Rayleigh flat fading channel with unknown statistics
Turbo Detection in Rayleigh flat fading channel with unknown statisticsTurbo Detection in Rayleigh flat fading channel with unknown statistics
Turbo Detection in Rayleigh flat fading channel with unknown statisticsijwmn
 
An Ant Algorithm for Solving QoS Multicast Routing Problem
An Ant Algorithm for Solving QoS Multicast Routing ProblemAn Ant Algorithm for Solving QoS Multicast Routing Problem
An Ant Algorithm for Solving QoS Multicast Routing ProblemCSCJournals
 
OPTIMAL PATH IDENTIFICATION USING ANT COLONY OPTIMISATION IN WIRELESS SENSOR ...
OPTIMAL PATH IDENTIFICATION USING ANT COLONY OPTIMISATION IN WIRELESS SENSOR ...OPTIMAL PATH IDENTIFICATION USING ANT COLONY OPTIMISATION IN WIRELESS SENSOR ...
OPTIMAL PATH IDENTIFICATION USING ANT COLONY OPTIMISATION IN WIRELESS SENSOR ...cscpconf
 
Optimal path identification using ant colony optimisation in wireless sensor ...
Optimal path identification using ant colony optimisation in wireless sensor ...Optimal path identification using ant colony optimisation in wireless sensor ...
Optimal path identification using ant colony optimisation in wireless sensor ...csandit
 
A Genetic Algorithm for Reliability Evaluation of a Stochastic-Flow Network w...
A Genetic Algorithm for Reliability Evaluation of a Stochastic-Flow Network w...A Genetic Algorithm for Reliability Evaluation of a Stochastic-Flow Network w...
A Genetic Algorithm for Reliability Evaluation of a Stochastic-Flow Network w...CSCJournals
 
Enhanced Threshold Sensitive Stable Election Protocol
Enhanced Threshold Sensitive Stable Election ProtocolEnhanced Threshold Sensitive Stable Election Protocol
Enhanced Threshold Sensitive Stable Election Protocoliosrjce
 
Fuzzy Controller Based Stable Routes with Lifetime Prediction in MANETs
Fuzzy Controller Based Stable Routes with Lifetime Prediction in MANETsFuzzy Controller Based Stable Routes with Lifetime Prediction in MANETs
Fuzzy Controller Based Stable Routes with Lifetime Prediction in MANETsCSCJournals
 
α Nearness ant colony system with adaptive strategies for the traveling sales...
α Nearness ant colony system with adaptive strategies for the traveling sales...α Nearness ant colony system with adaptive strategies for the traveling sales...
α Nearness ant colony system with adaptive strategies for the traveling sales...ijfcstjournal
 
An approach to dsr routing qos by fuzzy genetic algorithms
An approach to dsr routing qos by fuzzy genetic algorithmsAn approach to dsr routing qos by fuzzy genetic algorithms
An approach to dsr routing qos by fuzzy genetic algorithmsijwmn
 
International Journal of Computational Engineering Research(IJCER)
 International Journal of Computational Engineering Research(IJCER)  International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER) ijceronline
 
A NOVEL ANT COLONY ALGORITHM FOR MULTICAST ROUTING IN WIRELESS AD HOC NETWORKS
A NOVEL ANT COLONY ALGORITHM FOR MULTICAST ROUTING IN WIRELESS AD HOC NETWORKS A NOVEL ANT COLONY ALGORITHM FOR MULTICAST ROUTING IN WIRELESS AD HOC NETWORKS
A NOVEL ANT COLONY ALGORITHM FOR MULTICAST ROUTING IN WIRELESS AD HOC NETWORKS cscpconf
 

Semelhante a Slides -e._starostina (20)

Multiple optimal path identification using ant colony optimisation in wireles...
Multiple optimal path identification using ant colony optimisation in wireles...Multiple optimal path identification using ant colony optimisation in wireles...
Multiple optimal path identification using ant colony optimisation in wireles...
 
Amtr the ant based qos aware multipath temporally ordered routing algorithm ...
Amtr  the ant based qos aware multipath temporally ordered routing algorithm ...Amtr  the ant based qos aware multipath temporally ordered routing algorithm ...
Amtr the ant based qos aware multipath temporally ordered routing algorithm ...
 
AMTR: THE ANT BASED QOS AWARE MULTIPATH TEMPORALLY ORDERED ROUTING ALGORITHM ...
AMTR: THE ANT BASED QOS AWARE MULTIPATH TEMPORALLY ORDERED ROUTING ALGORITHM ...AMTR: THE ANT BASED QOS AWARE MULTIPATH TEMPORALLY ORDERED ROUTING ALGORITHM ...
AMTR: THE ANT BASED QOS AWARE MULTIPATH TEMPORALLY ORDERED ROUTING ALGORITHM ...
 
Week13 lec1
Week13 lec1Week13 lec1
Week13 lec1
 
A genetic algorithm to solve the
A genetic algorithm to solve theA genetic algorithm to solve the
A genetic algorithm to solve the
 
Turbo Detection in Rayleigh flat fading channel with unknown statistics
Turbo Detection in Rayleigh flat fading channel with unknown statisticsTurbo Detection in Rayleigh flat fading channel with unknown statistics
Turbo Detection in Rayleigh flat fading channel with unknown statistics
 
An Ant Algorithm for Solving QoS Multicast Routing Problem
An Ant Algorithm for Solving QoS Multicast Routing ProblemAn Ant Algorithm for Solving QoS Multicast Routing Problem
An Ant Algorithm for Solving QoS Multicast Routing Problem
 
OPTIMAL PATH IDENTIFICATION USING ANT COLONY OPTIMISATION IN WIRELESS SENSOR ...
OPTIMAL PATH IDENTIFICATION USING ANT COLONY OPTIMISATION IN WIRELESS SENSOR ...OPTIMAL PATH IDENTIFICATION USING ANT COLONY OPTIMISATION IN WIRELESS SENSOR ...
OPTIMAL PATH IDENTIFICATION USING ANT COLONY OPTIMISATION IN WIRELESS SENSOR ...
 
Optimal path identification using ant colony optimisation in wireless sensor ...
Optimal path identification using ant colony optimisation in wireless sensor ...Optimal path identification using ant colony optimisation in wireless sensor ...
Optimal path identification using ant colony optimisation in wireless sensor ...
 
A Genetic Algorithm for Reliability Evaluation of a Stochastic-Flow Network w...
A Genetic Algorithm for Reliability Evaluation of a Stochastic-Flow Network w...A Genetic Algorithm for Reliability Evaluation of a Stochastic-Flow Network w...
A Genetic Algorithm for Reliability Evaluation of a Stochastic-Flow Network w...
 
EESRDA
EESRDAEESRDA
EESRDA
 
E017332733
E017332733E017332733
E017332733
 
Enhanced Threshold Sensitive Stable Election Protocol
Enhanced Threshold Sensitive Stable Election ProtocolEnhanced Threshold Sensitive Stable Election Protocol
Enhanced Threshold Sensitive Stable Election Protocol
 
Fuzzy Controller Based Stable Routes with Lifetime Prediction in MANETs
Fuzzy Controller Based Stable Routes with Lifetime Prediction in MANETsFuzzy Controller Based Stable Routes with Lifetime Prediction in MANETs
Fuzzy Controller Based Stable Routes with Lifetime Prediction in MANETs
 
α Nearness ant colony system with adaptive strategies for the traveling sales...
α Nearness ant colony system with adaptive strategies for the traveling sales...α Nearness ant colony system with adaptive strategies for the traveling sales...
α Nearness ant colony system with adaptive strategies for the traveling sales...
 
An approach to dsr routing qos by fuzzy genetic algorithms
An approach to dsr routing qos by fuzzy genetic algorithmsAn approach to dsr routing qos by fuzzy genetic algorithms
An approach to dsr routing qos by fuzzy genetic algorithms
 
International Journal of Computational Engineering Research(IJCER)
 International Journal of Computational Engineering Research(IJCER)  International Journal of Computational Engineering Research(IJCER)
International Journal of Computational Engineering Research(IJCER)
 
A NOVEL ANT COLONY ALGORITHM FOR MULTICAST ROUTING IN WIRELESS AD HOC NETWORKS
A NOVEL ANT COLONY ALGORITHM FOR MULTICAST ROUTING IN WIRELESS AD HOC NETWORKS A NOVEL ANT COLONY ALGORITHM FOR MULTICAST ROUTING IN WIRELESS AD HOC NETWORKS
A NOVEL ANT COLONY ALGORITHM FOR MULTICAST ROUTING IN WIRELESS AD HOC NETWORKS
 
Shortest path analysis
Shortest path analysis Shortest path analysis
Shortest path analysis
 
Small world
Small worldSmall world
Small world
 

Mais de BioinformaticsInstitute

Comparative Genomics and de Bruijn graphs
Comparative Genomics and de Bruijn graphsComparative Genomics and de Bruijn graphs
Comparative Genomics and de Bruijn graphsBioinformaticsInstitute
 
Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес...
 Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес... Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес...
Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес...BioinformaticsInstitute
 
Вперед в прошлое. Методы генетической диагностики древней днк
Вперед в прошлое. Методы генетической диагностики древней днкВперед в прошлое. Методы генетической диагностики древней днк
Вперед в прошлое. Методы генетической диагностики древней днкBioinformaticsInstitute
 
"Зачем биологам суперкомпьютеры", Александр Предеус
"Зачем биологам суперкомпьютеры", Александр Предеус"Зачем биологам суперкомпьютеры", Александр Предеус
"Зачем биологам суперкомпьютеры", Александр ПредеусBioinformaticsInstitute
 
Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...
Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...
Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...BioinformaticsInstitute
 
Рак 101 (Мария Шутова, ИоГЕН РАН)
Рак 101 (Мария Шутова, ИоГЕН РАН)Рак 101 (Мария Шутова, ИоГЕН РАН)
Рак 101 (Мария Шутова, ИоГЕН РАН)BioinformaticsInstitute
 
Секвенирование как инструмент исследования сложных фенотипов человека: от ген...
Секвенирование как инструмент исследования сложных фенотипов человека: от ген...Секвенирование как инструмент исследования сложных фенотипов человека: от ген...
Секвенирование как инструмент исследования сложных фенотипов человека: от ген...BioinformaticsInstitute
 
Инвестиции в биоинформатику и биотех (Андрей Афанасьев)
Инвестиции в биоинформатику и биотех (Андрей Афанасьев)Инвестиции в биоинформатику и биотех (Андрей Афанасьев)
Инвестиции в биоинформатику и биотех (Андрей Афанасьев)BioinformaticsInstitute
 

Mais de BioinformaticsInstitute (20)

Graph genome
Graph genome Graph genome
Graph genome
 
Nanopores sequencing
Nanopores sequencingNanopores sequencing
Nanopores sequencing
 
A superglue for string comparison
A superglue for string comparisonA superglue for string comparison
A superglue for string comparison
 
Comparative Genomics and de Bruijn graphs
Comparative Genomics and de Bruijn graphsComparative Genomics and de Bruijn graphs
Comparative Genomics and de Bruijn graphs
 
Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес...
 Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес... Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес...
Биоинформатический анализ данных полноэкзомного секвенирования: анализ качес...
 
Вперед в прошлое. Методы генетической диагностики древней днк
Вперед в прошлое. Методы генетической диагностики древней днкВперед в прошлое. Методы генетической диагностики древней днк
Вперед в прошлое. Методы генетической диагностики древней днк
 
Knime & bioinformatics
Knime & bioinformaticsKnime & bioinformatics
Knime & bioinformatics
 
"Зачем биологам суперкомпьютеры", Александр Предеус
"Зачем биологам суперкомпьютеры", Александр Предеус"Зачем биологам суперкомпьютеры", Александр Предеус
"Зачем биологам суперкомпьютеры", Александр Предеус
 
Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...
Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...
Иммунотерапия раковых опухолей: взгляд со стороны системной биологии. Максим ...
 
Рак 101 (Мария Шутова, ИоГЕН РАН)
Рак 101 (Мария Шутова, ИоГЕН РАН)Рак 101 (Мария Шутова, ИоГЕН РАН)
Рак 101 (Мария Шутова, ИоГЕН РАН)
 
Плюрипотентность 101
Плюрипотентность 101Плюрипотентность 101
Плюрипотентность 101
 
Секвенирование как инструмент исследования сложных фенотипов человека: от ген...
Секвенирование как инструмент исследования сложных фенотипов человека: от ген...Секвенирование как инструмент исследования сложных фенотипов человека: от ген...
Секвенирование как инструмент исследования сложных фенотипов человека: от ген...
 
Инвестиции в биоинформатику и биотех (Андрей Афанасьев)
Инвестиции в биоинформатику и биотех (Андрей Афанасьев)Инвестиции в биоинформатику и биотех (Андрей Афанасьев)
Инвестиции в биоинформатику и биотех (Андрей Афанасьев)
 
Biodb 2011-everything
Biodb 2011-everythingBiodb 2011-everything
Biodb 2011-everything
 
Biodb 2011-05
Biodb 2011-05Biodb 2011-05
Biodb 2011-05
 
Biodb 2011-04
Biodb 2011-04Biodb 2011-04
Biodb 2011-04
 
Biodb 2011-03
Biodb 2011-03Biodb 2011-03
Biodb 2011-03
 
Biodb 2011-01
Biodb 2011-01Biodb 2011-01
Biodb 2011-01
 
Biodb 2011-02
Biodb 2011-02Biodb 2011-02
Biodb 2011-02
 
Ngs 3 1
Ngs 3 1Ngs 3 1
Ngs 3 1
 

Último

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdfChristopherTHyatt
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Evaluating the top large language models.pdf
Evaluating the top large language models.pdfEvaluating the top large language models.pdf
Evaluating the top large language models.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 

Slides -e._starostina

  • 1. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Ekaterina Starostina 30.03.2013 1 / 17
  • 2. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Gene interaction network Node - gene or its corresponding protein Edge - proteinprotein interaction (PPI) or a transcription factor (TF)DNA binding Regulatory pathway - chain of interacting genes within a network. A regulatory pathway begins with a causal gene and ends at a target gene Goal - identify the potential regulatory pathways passing through the given gene in the gene network 2 / 17
  • 3. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Existing approach Random walk. A random walk typically starts from the given gene, walks through several nodes, and terminates according to some pre-dened parameters (such as length and edge weights). k shortest paths algorithm Executes O(n) times Dijkstra algorithm to generate candidate paths for each of the k shortest paths. Time complexity - O(kn(m + nlogn)), where n is the number of nodes and m is the number of edges. 3 / 17
  • 4. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Problem denition Let G = (V, E) denote a gene network, which is a weighted directed graph, where V is the set of nodes (genes or proteins), E is the set of directed edges (interactions) n = |V|, m = |E|. Each edge e ∈ E denoted by (ga, gb), ga, gb ∈ V, represents the direction of interaction from ga to gb. w(e) ∈ [0, 1] is the weight of an edge e, and the weight represents the condence level of the interaction. 4 / 17
  • 5. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Sub problems UnknownCausal infer possible causal genes if the given gene is a target gene UnknownTarget infer possible target genes if the given gene is a causal gene CandidateCausal infer the true causal gene in a given set of candidate causal genes, if the given gene is a target gene 5 / 17
  • 6. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Random walk model P is n × n transition matrix constructed by normalizing the adjacent matrix A of the graph G, where Aij = w((gi, gj)) Pij = αAij / k Aik , α ∈ (0, 1) is called `damping factor'. If the current node is gi , the walk would terminate at gi with probability 1 − α and go to another node gj with probability Pij A random walk starts from a source node (causal gene) and once the walk reaches a sink node(target gene), the walk immediately terminates. 6 / 17
  • 7. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Problems of random walk model (a) Normalization of edge weights is lossy (b) The walks repeatedly go through the bidirectional edge 7 / 17
  • 8. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Single-source k-shortest paths First we convert the weight of an edge into a distance as follows: d(e) = − log(w(e)) + c (1) , where d(∗) denotes the distance of an edge or a path. Then, the importance value of a gene ga is dened as Imp(ga) = k i=1 1/d(Pi ) (2) ,where P1 ,P2 ,...,Pk are k-shortest paths from the given gene to ga With this transformation, it is obvious that genes having shorter distances to the given gene of interest will be assigned greater importance values. 8 / 17
  • 9. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Sub problems The inference problem is now a single-source k-shortest paths problem, dened for subroblems as it follows: UnknownTarget - starting node is the given causal gene UnknownCasual - starting node is the given target gene, the direction of all edges should be reversed rst CandidateCasual - starting node is the given target gene, the direction of all edges should be reversed rst and we select the candidate causal gene with the highest importance value as our predicted causal gene 9 / 17
  • 10. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Denitions Def. 1 Given all top k-shortest paths from the starting node to each other node, a pseudo tree stores all paths in a tree structure. If k = 1, the pseudo tree is equivalent to the shortest path tree; as k 1, all 2nd to k-th shortest paths are iteratively merged into the pseudo tree by sharing the longest common prex path. Def. 2 A tree-path is a path from the root to another node in a pseudo tree. Def. 3 Path nodes and dummy nodes. A tree-path to a path node is a top-k shortest path from the root to this path node, while a tree-path to a dummy node is not. 10 / 17
  • 11. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Theorems T. 1 If a top k-shortest path from the root S to a node A contains a dummy node B, let the sub-path from the dummy node to A be PBA , then for each top k-shortest path from S to B, denoted by PSB , either PSB contains a node in PBA besides B or there exists a top k-shortest path from S to A using PSB as a prex path. T. 2 Given a graph G, the pseudo tree for the k-shortest paths contains at most O(n2k) nodes. T. 3 The time complexity of Algorithm 1 is (nklog(nk) + mk(h + k)) (3) 11 / 17
  • 12. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Algorithm 12 / 17
  • 13. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Diversity of a path Diversity of a path is the number of edges in this path but not appearing in any already found paths divided by the total number of edges in the path. If diverse paths P1 ,P2 ,...Pê, êk, are already found, the diversity of a new candidate path Pnew is: div(Pnew ) = |{e|e ∈ Pnew , Pi : e ∈ Pi , 1 ≤ i ≤ k}| |{e|e ∈ Pnew }| (4) 13 / 17
  • 14. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Single-source k-shortest diverse paths algorithm 1 Execute Algorithm 1 2 For each node, examine the diversities of k shortest paths in the increasing order of their distances 3 If the diverse paths are not enough ( k) for some nodes, remove edges from the graph according to the probability 4 Repeat the procedure until k diverse paths are found for every node or less than m/n edges are removed in the last iteration 14 / 17
  • 15. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Overview of the inference framework 15 / 17
  • 16. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Testing aspects CandidateCausial The goal is to correctly pick the true causal gene from the ten candidate causal genes. UnknownCasual and UnknownTarget SaddleSam was used to evaluate the enrichment levels of the GO terms for the results of UnknownCausal and UnknownTarget of a gene set V. Diversity, importance value and eciency Diversity and enriched functions The goal is to show that using dierent λ, the importance values generated by the k diverse paths algorithm can identify dierent potential regulatory pathways with dierent functions. 16 / 17
  • 17. A single source k-shortest paths algorithm to infer regulatory pathways in a gene network Conclusions Algorithm can identify pathways with higher potentiality than current methods based on the random walk model, and requiring the paths to be more diverse can further uncover other potential regulated functions. Heuristic algorithm can achieve a huge speedup than the previous single-pair shortest paths algorithm while the found paths are equivalent in our yeast gene network. 17 / 17