A multithreaded method for network alignment

A multithreaded algorithm for network alignment

David F. Gleich
Computer Science
Purdue University

Arif Khan, Alex Pothen
Purdue University, Computer Science

Mahantesh Halappanavar
Pacific Northwest National Labs

[Figure: graphs A and B joined by the candidate edges L, with vertices r, s, t, u, v, w, an edge weight w_tu, and an overlap marked.]

Work supported by DOE CSCAPES Institute grant (DE-FC02-08ER25864), NSF CAREER grant 1149756-CCF, and the Center for Adaptive Super Computing Software Multithreaded Architectures (CASS-MT) at PNNL. PNNL is operated by Battelle Memorial Institute under contract DE-AC06-76RL01830.

1

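Network alignment
What is the best way of matching graph A to B using only edges in L?

[Figure: graphs A and B joined by the candidate edges L, with vertices r, s, t, u, v, w, an edge weight w_tu, and an overlap marked.]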

Find a 1-1 matching between vertices
with as many overlaps as possible.

2

Network alignment
… is NP-hard
… has no approximation algorithm

•  Computer Vision
•  Ontology matching
•  Database matching
•  Bioinformatics

[Figure: graphs A and B joined by the candidate edges L, with an overlap marked.]

objective = α matching + β overlap
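
As a rough illustration of the objective above, the following Python sketch scores a candidate matching. It assumes that L is given as weighted pairs (a, b), that A and B are sets of undirected edges, and that an overlap is a pair of matched edges (a1, b1), (a2, b2) with (a1, a2) an edge of A and (b1, b2) an edge of B. The function name `alignment_objective` and the data layout are hypothetical and are not taken from the netalignmc code.

```python
from itertools import combinations

def alignment_objective(matching, weights, edges_a, edges_b, alpha=1.0, beta=1.0):
    """Score a matching as alpha * (matched weight) + beta * (overlap count).

    matching: list of (a, b) pairs, each an edge of L (assumed to be 1-1).
    weights:  dict mapping (a, b) -> weight of that edge in L.
    edges_a, edges_b: sets of frozenset({u, v}) edges of graphs A and B.
    """
    weight = sum(weights[e] for e in matching)
    overlap = 0
    # Every pair of matched edges that closes a "square" counts as one overlap.
    for (a1, b1), (a2, b2) in combinations(matching, 2):
        if frozenset((a1, a2)) in edges_a and frozenset((b1, b2)) in edges_b:
            overlap += 1
    return alpha * weight + beta * overlap, weight, overlap

# Toy instance: A has the edge r-t, B has the edge s-u.
edges_a = {frozenset(("r", "t"))}
edges_b = {frozenset(("s", "u"))}
weights = {("r", "s"): 1.0, ("t", "u"): 2.0, ("r", "u"): 5.0}
score, weight, overlap = alignment_objective(
    [("r", "s"), ("t", "u")], weights, edges_a, edges_b, alpha=1.0, beta=2.0
)
```

In the toy instance the matching {(r, s), (t, u)} has weight 3 and one overlap, so the heavier single edge (r, u) is not automatically the better choice once β is large enough.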

3

Network alignment

Figure 2. The NetworkBLAST local network alignment algorithm. Given two input networks, a network alignment graph is constructed. Nodes in this graph correspond to pairs of sequence-similar proteins, one from each species, and edges correspond to conserved interactions. A search algorithm identifies highly similar subnetworks that follow a prespecified interaction pattern. Adapted from Sharan and Ideker.

From Sharan and Ideker, Modeling cellular machinery through biological network comparison. Nat. Biotechnol. 24, 4 (Apr. 2006), 427–433.

4

Our contribution

Multi-threaded network alignment via a new, multi-threaded approximation procedure for max-weight bipartite matching with linear complexity

High performance C++ implementations
415 sec → 10 sec.
40-times faster (on 16 cores – Xeon E5-2670)
(C++ ~ 3, complexity ~ 2, threading ~ 8)
www.cs.purdue.edu/~dgleich/codes/netalignmc

... enabling interactive computation!

5

… the best methods in a recent survey …
Bayati, Gleich, et al. TKDE forthcoming

Belief Propagation
Use a probabilistic relaxation and iteratively find the probability that an edge is in the matching, given the probabilities of its neighboring edges

Klau’s Matching Relaxation
Iteratively improve an upper-bound on the solution via a sub-gradient method applied to the Lagrangian

6

Each iteration involves

Matrix-vector-ish computations with a sparse matrix, e.g. sparse matrix vector products in a semi-ring, dot-products, axpy, etc.

Bipartite max-weight matching using a different weight vector at each iteration

No “convergence”
100-1000 iterations

Let x[i] be the score for each pair-wise match in L

for i=1 to ...
  update x[i] to y[i]
  compute a max-weight match with y
  update y[i] to x[i] (using match in MR)

7

The methods
Each iteration involves

Matrix-vector-ish computations with a sparse matrix, e.g. sparse matrix vector products in a semi-ring, dot-products, axpy, etc.

Bipartite max-weight matching using a different weight vector at each iteration

Belief Propagation
Listing 2. A belief-propagation message passing procedure for network alignment. See the text for a description of othermax and round heuristic.

y^(0) = 0, z^(0) = 0, d^(0) = 0, S^(k) = 0
for k = 1 to niter
  F = bound_{0,β}[ βS + S^(k)T ]                  Step 1: compute F
  d = αw + Fe                                     Step 2: compute d
  y^(k) = d − othermaxcol(z^(k−1))                Step 3: othermax
  z^(k) = d − othermaxrow(y^(k−1))
  S^(k) = diag(y^(k) + z^(k) − d)S − F            Step 4: update S
  (y^(k), z^(k), S^(k)) ← γ^k (y^(k), z^(k), S^(k))
      + (1 − γ^k)(y^(k−1), z^(k−1), S^(k−1))      Step 5: damping
  round heuristic (y^(k))                         Step 6: matching
  round heuristic (z^(k))                         Step 6: matching
end
return y^(k) or z^(k) with the largest objective value

… interpretation, the weight vectors are usually called messages as they communicate the “beliefs” of each “agent.” In this particular problem, the neighborhood of an agent represents all of the other edges in graph L incident on the same vertex in graph A (1st vector), all edges in L incident on the same vertex in graph B (2nd vector), or the edges in L that are …

9
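
Step 3 of the listing relies on an "othermax" operation that the slide defers to the paper's text. The Python sketch below is one plausible reading, stated here as an assumption: each edge of L receives the largest value among the other edges in its group (edges sharing a vertex of A for rows, or a vertex of B for columns), clipped below at zero. The function name `othermax`, its arguments, and the clipping are illustrative and are not confirmed by the slides.

```python
from collections import defaultdict

def othermax(values, groups):
    """For each entry i, return max(0, largest value among the other
    entries that share groups[i]). One pass tracks the top two values
    per group, so the cost is linear in the number of entries."""
    neg_inf = float("-inf")
    # Per group: (largest value, second largest value, index of the largest).
    best = defaultdict(lambda: (neg_inf, neg_inf, -1))
    for i, (v, g) in enumerate(zip(values, groups)):
        m1, m2, arg = best[g]
        if v > m1:
            best[g] = (v, m1, i)
        elif v > m2:
            best[g] = (m1, v, arg)
    out = []
    for i, g in enumerate(groups):
        m1, m2, arg = best[g]
        other = m2 if i == arg else m1  # skip the entry's own value
        out.append(max(0.0, other))
    return out

# Four edges of L; the first two share vertex "r" of A, the last two share "t".
values = [3.0, 1.0, 2.0, -4.0]
rows = ["r", "r", "t", "t"]
result = othermax(values, rows)
```

Passing the B-side endpoint of each edge as `groups` would give the column variant under the same assumptions.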

The NEW methods
Each iteration involves

Matrix-vector-ish computations with a sparse matrix, e.g. sparse matrix vector products in a semi-ring, dot-products, axpy, etc.

Approximate bipartite max-weight matching is used here instead!

Belief Propagation
[Listing 2, the belief-propagation message passing procedure for network alignment, repeated from the previous slide with two annotations: "Parallel" alongside the listing, and "approx matching" on Step 6 (round heuristic).]

10

Approximation doesn’t hurt the belief propagation algorithm

[Figure: fraction of correct match (0 to 1) versus expected degree of noise in L (p ⋅ n, 0 to 20) for MR, ApproxMR, BP, and ApproxBP. Annotation: BP and ApproxBP are indistinguishable.]

Randomly perturb one power-law graph to get A, B. Generate L by the true-match + random edges. The amount of randomness in L, in average expected degree.

Fig. 2. Alignment with a power-law graph shows the large effect that approximate rounding can have on solutions from Klau’s method (MR). With that method, using exact rounding will yield the identity matching for all problems (bottom figure), whereas using the approximation results in over a …

11

The methods (in more detail)

Belief Propagation
for i=1 to ...
  update x[i] to y[i]
  compute a max-weight match with y
  save y if it is the best result so far

Klau’s Matching Relaxation
for i=1 to ...
  update x[i] to y[i]
  compute a max-weight match with y
  update y[i] to x[i] based on the match

The matching is incidental to the BP method, but integral to Klau’s MR method

12

On real-world problems, ApproxMR isn’t so different

On a protein-protein alignment problem, there is little difference with exact vs. approximate matching

[Figure: overlap (0 to 400) versus weight (0 to 600) for BP, AppBP, AppMR, and MR, with the upper bound on overlap (381) and the upper bound on matching (max weight 671.551) marked.]

13

Algorithmic analysis

Exact runtime
matrix + matching, with matrix ≪ matching
O(|EL| + |S|) + O(|EL| N log N)

Our approx. runtime
matrix + approx. matching
O(|EL| + |S|) + O(|EL|)

Algorithmic parameters
|EL| number of edges in L
|S| number of potential overlaps

[Figure: graphs A and B joined by the edges L, with vertices r, s, t, u, v, w.]

14

A local dominating edge method for bipartite matching

The method guarantees
•  ½ approximation
•  maximal matching
based on work by Preis (1999), Manne and Bisseling (2008), and Halappanavar et al (2012)

A locally dominating edge is an edge heavier than all neighboring edges.

For bipartite: work on smaller side only

[Figure: graphs A and B joined by the edges L, with vertices i, j, r, s, t, u and edge weight w_tu.]

15

Queue all vertices
Until queue is empty!
  In Parallel over vertices!
    Match to heavy edge and if there’s a conflict, check the winner, and find an alternative for the loser
    Add endpoint of non-dominating edges to the queue
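
The queue-based procedure above can be sketched serially in Python. This is an illustrative single-threaded version under stated assumptions, not the multithreaded C++ implementation: every queued vertex points at its heaviest unmatched neighbor, two vertices that point at each other form a locally dominating edge and are matched, and any vertex whose candidate was just taken is re-queued. The function name, the dictionary-based graph layout, and the tie-breaking rule on vertex names are all hypothetical.

```python
from collections import deque

def locally_dominant_matching(adj):
    """adj: dict vertex -> dict neighbor -> weight (stored symmetrically).
    Returns a dict mapping every matched vertex to its partner."""
    mate = {}
    candidate = {}

    def heaviest_free_neighbor(v):
        best_key, best_u = None, None
        for u, w in adj[v].items():
            if u in mate:
                continue
            # Ties are broken by the sorted endpoint names so that both
            # endpoints rank a shared edge identically (an assumption).
            key = (w, tuple(sorted((str(u), str(v)))))
            if best_key is None or key > best_key:
                best_key, best_u = key, u
        return best_u

    queue = deque(adj)  # queue all vertices
    while queue:
        v = queue.popleft()
        if v in mate:
            continue
        u = heaviest_free_neighbor(v)
        candidate[v] = u
        if u is not None and candidate.get(u) == v:
            mate[v], mate[u] = u, v  # the edge (v, u) is locally dominating
            # Vertices that pointed at v or u lost; they need an alternative.
            for x in (v, u):
                for y in adj[x]:
                    if y not in mate and candidate.get(y) == x:
                        queue.append(y)
    return mate

adj = {
    "a1": {"b1": 3.0, "b2": 2.0},
    "a2": {"b1": 2.0, "b2": 1.0},
    "b1": {"a1": 3.0, "a2": 2.0},
    "b2": {"a1": 2.0, "a2": 1.0},
}
mate = locally_dominant_matching(adj)
total = sum(adj[v][u] for v, u in mate.items()) / 2  # each edge is counted twice
```

In the example, a1 and b1 claim each other first, a2 loses the conflict over b1 and is re-queued, and it then settles for b2.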


16

Customized first iteration (with all vertices)

Use OpenMP locks to update choices

Use __sync_fetch_and_add for queue updates.


17

Remaining multi-threading procedures are straightforward

Standard OpenMP for matrix-computations
use schedule(dynamic) to handle skew

We can batch the matching procedures in the BP method for additional parallelism

for i=1 to ...
  update x[i] to y[i]
  save y[i] in a buffer
  when the buffer is full
    compute max-weight match for all in buffer and save the best
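
The buffering idea can be sketched in Python as follows. This shows the control flow only: the rounding routine is passed in as a hypothetical callable `round_and_score`, the default batch size of 20 is just an illustrative value, and Python threads do not give the CPU parallelism that OpenMP threads provide in the C++ code.

```python
from concurrent.futures import ThreadPoolExecutor

def batched_rounding(iterates, round_and_score, batch_size=20, workers=4):
    """Collect iterates in a buffer; when it fills, round every buffered
    iterate concurrently and keep the best (score, solution) seen so far."""
    best = (float("-inf"), None)
    buffer = []

    def flush():
        nonlocal best
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for score, solution in pool.map(round_and_score, buffer):
                if score > best[0]:
                    best = (score, solution)
        buffer.clear()

    for y in iterates:
        buffer.append(y)
        if len(buffer) == batch_size:
            flush()
    if buffer:  # round whatever is left after the last iteration
        flush()
    return best

def toy_round(y):
    """Stand-in for a matching step: keep positive entries, score their sum."""
    return sum(v for v in y if v > 0), [v > 0 for v in y]

best_score, best_solution = batched_rounding(
    [[1, -1], [2, 3], [0, 1]], toy_round, batch_size=2
)
```

Batching is possible for BP because the rounded matching does not feed back into the next iterate, so the rounding steps in a buffer are independent of each other.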

18

Real-world data sets

TABLE II
FOR EACH PROBLEM IN OUR BIOINFORMATICS AND ONTOLOGY SETS, WE REPORT THE NUMBER OF VERTICES IN GRAPH A AND B, THE NUMBER OF EDGES IN THE GRAPH L, AND THE NUMBER OF NONZEROS IN S.

Problem        |VA|       |VB|       |EL|         |S|
dmela-scere    9,459      5,696      34,582       6,860
homo-musm      3,247      9,695      15,810       12,180
lcsh-wiki      297,266    205,948    4,971,629    1,785,310
lcsh-rameau    154,974    342,684    20,883,500   4,929,272

Our approx. runtime
matrix + approx. matching
O(|EL| + |S|) + O(|EL|)

Algorithmic parameters
|EL| number of edges in L
|S| number of potential overlaps

… order to match vertices. We experimented with an initialization algorithm tailored for bipartite graphs by spawning threads only from one of the vertex sets VA or VB to identify locally dominant edges. If the thread is responsible for matching a vertex in VA, then it has to check the adjacency sets of the vertices in VB that are adjacent to it in order to determine if the …

19

Performance evaluation
(2x4)-10 core Intel E7-8870, 2.4 GHz (80-cores)
16 GB memory/proc (128 GB)

Scaling study
1.  Thread binding: scattered vs. compact
2.  Memory binding: interleaved vs. bind

[Figure: eight CPUs, each with its own memory bank.]

20

Scaling
BP with no batching
lcsh-rameau, 400 iterations
[Figure: speedup (0 to 25) versus threads (0 to 80) with scatter and interleave.]
1450 seconds for 1-thread
115 seconds for 40-thread
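
The two timings on this slide translate into a speedup and a parallel efficiency with a short calculation, sketched here in Python. The helper name `speedup_and_efficiency` is hypothetical; only the two timings and the thread count come from the slide.

```python
def speedup_and_efficiency(t_serial, t_parallel, threads):
    """Speedup is serial time over parallel time; efficiency divides
    the speedup by the number of threads used."""
    speedup = t_serial / t_parallel
    return speedup, speedup / threads

# 1450 seconds on 1 thread versus 115 seconds on 40 threads.
speedup, efficiency = speedup_and_efficiency(1450.0, 115.0, 40)
```

The speedup works out to roughly 12.6, which sits inside the 12-15 range quoted for 40 cores later in the talk.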

21

Scaling
BP with no batching
[Figure: speedup (0 to 25) versus threads (0 to 80) for four configurations: compact and interleave, compact and membind, scatter and interleave, scatter and membind.]

22

Scaling

[Figure: three plots of speedup (0 to 25) versus threads (0 to 80): BP with no batching, Klau’s MR method, and BP with batch=20. Legend entries include compact and interleave, compact and membind, scatter and interleave, and scatter and membind.]

In all cases, we get a speedup of around 12-15 on 40-cores with scatter threads and interleaved memory

23

Summary & Conclusions
•  Tailored algorithm for approx. max-weight bipartite matching
•  Algorithmic improvement in network alignment methods
•  Multi-threaded C++ code for network alignment

415 seconds -> 10 seconds (40-times overall speedup)
For large problems, interactive network alignment is possible
Future work: memory control, improved methods

Code and data available!
www.cs.purdue.edu/~dgleich/codes/netalignmc

Work supported by DOE CSCAPES Institute grant (DE-FC02-08ER25864), NSF CAREER grant 1149756-CCF, and the Center for Adaptive Super Computing Software Multithreaded Architectures (CASS-MT) at PNNL. PNNL is operated by Battelle Memorial Institute under contract DE-AC06-76RL01830.

24
