1. Dept. of Computer Science, Central University of Rajasthan
Course: Advance Data Algorithm
Course Id: MAI312
Vikram Singh Slathia, 2011MAI025, MSc CS III sem.
2. Overview of Parallel Sorting
Odd–Even Sorting
  Overview
  Algorithm
  Example
  Complexity
Bitonic Sort
  Overview
  Binary Split
  Example
  Complexity
References
Dept. of Computer Science Curaj 2
3. What is a parallel sorted sequence?
The sorted list is partitioned across the processors with the
property that each partitioned list is sorted and every
element in processor Pi's list is no greater than every element in
Pj's list if i < j.
4. What is the parallel counterpart to a sequential
comparator?
If each processor has one element, the compare-exchange
operation stores the smaller element at the
processor with the smaller id. This can be done in ts + tw
time, where ts is the message startup time and tw is the
per-word transfer time.
If we have more than one element per processor, we
call this operation a compare-split. Assume each of
the two processors has n/p elements.
5. After the compare-split operation, the smaller n/p
elements are at processor Pi and the larger n/p
elements are at Pj, where i < j.
The time for a compare-split operation is (ts +
tw·n/p), assuming that the two partial lists were
initially sorted.
6. A parallel compare-exchange operation.
Processes Pi and Pj send their elements to each
other. Process Pi keeps min{ai,aj}, and Pj keeps
max{ai, aj}.
7. A compare-split operation. Each process sends its block of size n/p to the
other process. Each process merges the received block with its own block
and retains only the appropriate half of the merged block. In this
example, process Pi retains the smaller elements and process Pj retains the
larger elements.
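The compare-split step can be sketched in C++ as below. This is a sequential simulation under assumptions: the function name compareSplit and the use of std::vector<int> for a block are illustrative choices that do not appear in the slides, and the message exchange between Pi and Pj is replaced by a plain merge.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical helper: compare-split between two simulated processes Pi and Pj (i < j).
// Both blocks must already be sorted and hold the same number of elements (n/p).
// Returns {block kept by Pi (smaller half), block kept by Pj (larger half)}.
std::pair<std::vector<int>, std::vector<int>>
compareSplit(const std::vector<int>& blockI, const std::vector<int>& blockJ)
{
    assert(blockI.size() == blockJ.size());
    assert(std::is_sorted(blockI.begin(), blockI.end()));
    assert(std::is_sorted(blockJ.begin(), blockJ.end()));

    // Each process receives the other's block and merges it with its own.
    std::vector<int> merged;
    merged.reserve(blockI.size() + blockJ.size());
    std::merge(blockI.begin(), blockI.end(), blockJ.begin(), blockJ.end(),
               std::back_inserter(merged));

    // Pi retains the lower half of the merged block, Pj the upper half.
    const auto mid = merged.begin() + blockI.size();
    return {std::vector<int>(merged.begin(), mid),
            std::vector<int>(mid, merged.end())};
}
```

For example, compareSplit({1, 6, 8, 11, 13}, {2, 7, 9, 10, 12}) leaves {1, 2, 6, 7, 8} with Pi and {9, 10, 11, 12, 13} with Pj.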
9. Odd–even sort, or odd–even transposition
sort, is also known as brick sort.
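11. Odd–even transposition sort in C++:
#include <utility>  // std::swap

// Sorts a[0..n-1] in ascending order using n alternating passes.
template <typename T>
void OddEvenSort(T a[], int n)
{
    for (int i = 0; i < n; ++i)
    {
        if (i & 1)
        {
            // Odd-numbered pass: compare pairs (1,2), (3,4), ...
            for (int j = 2; j < n; j += 2)
                if (a[j] < a[j-1])
                    std::swap(a[j-1], a[j]);
        }
        else
        {
            // Even-numbered pass: compare pairs (0,1), (2,3), ...
            for (int j = 1; j < n; j += 2)
                if (a[j] < a[j-1])
                    std::swap(a[j-1], a[j]);
        }
    }
}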
12. Odd-Even Transposition Sort - example
[Figure: steps 0 to 7 of the sort, with time advancing from one step to the next]
Parallel time complexity: Tpar = O(n) (for P=n)
13. Unsorted elements
3 2 3 8 5 6 4 1
Solution
▪ Sorting n = 8 elements, using the odd-even transposition
sort algorithm.
▪ During each phase, n = 8 elements are compared.
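To follow this example phase by phase, the sketch below records the array after every phase. The helper name oddEvenPhases is hypothetical, and the phase numbering (the first phase compares pairs (0,1), (2,3), ...) follows the loop order of OddEvenSort, which may differ from the numbering used in the original figures.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: runs odd-even transposition sort on a copy of the input
// and returns the state of the array after each of the n phases.
std::vector<std::vector<int>> oddEvenPhases(std::vector<int> a)
{
    std::vector<std::vector<int>> history;
    const std::size_t n = a.size();
    for (std::size_t phase = 0; phase < n; ++phase) {
        // Phases 0, 2, 4, ... compare pairs (0,1), (2,3), ...;
        // phases 1, 3, 5, ... compare pairs (1,2), (3,4), ...
        for (std::size_t j = (phase % 2 == 0) ? 1 : 2; j < n; j += 2) {
            if (a[j] < a[j - 1]) {
                std::swap(a[j - 1], a[j]);
            }
        }
        history.push_back(a);
    }
    // After n phases the sequence is sorted.
    assert(std::is_sorted(a.begin(), a.end()));
    return history;
}
```

With the input 3 2 3 8 5 6 4 1, the first recorded state is 2 3 3 8 5 6 1 4 and the last is 1 2 3 3 4 5 6 8.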
22. After n phases of odd-even exchanges, the
sequence is sorted.
Each phase of the algorithm (either odd or
even) requires Θ(n) comparisons.
Serial complexity is Θ(n²).
23. Consider the one-item-per-processor case.
There are n iterations; in each iteration, each
processor does one compare-exchange.
The parallel run time of this formulation is
Θ(n).
25. Consider a block of n/p elements per
processor.
The first step is a local sort.
In each subsequent step, the compare-exchange
operation is replaced by the compare-split
operation.
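The block formulation can be sketched as a sequential simulation. The names Block, compareSplit and blockOddEvenSort are assumptions made for illustration; blocks[i] stands for the n/p elements held by processor Pi, and equal block sizes are assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

using Block = std::vector<int>;  // the n/p elements of one simulated processor

// Compare-split: merge two sorted blocks; lo keeps the smaller half, hi the larger.
void compareSplit(Block& lo, Block& hi)
{
    Block merged;
    std::merge(lo.begin(), lo.end(), hi.begin(), hi.end(),
               std::back_inserter(merged));
    std::copy(merged.begin(), merged.begin() + lo.size(), lo.begin());
    std::copy(merged.begin() + lo.size(), merged.end(), hi.begin());
}

// Odd-even transposition sort on p simulated processors.
void blockOddEvenSort(std::vector<Block>& blocks)
{
    const std::size_t p = blocks.size();
    for (const Block& b : blocks) {
        assert(b.size() == blocks.front().size());  // equal block sizes assumed
    }
    // First step: every processor sorts its own block locally.
    for (Block& b : blocks) {
        std::sort(b.begin(), b.end());
    }
    // Then p phases in which neighbouring processors perform a compare-split.
    for (std::size_t phase = 0; phase < p; ++phase) {
        for (std::size_t i = phase % 2; i + 1 < p; i += 2) {
            compareSplit(blocks[i], blocks[i + 1]);
        }
    }
}
```

In a real parallel program the inner loop over i would run concurrently, one pair of neighbouring processors per iteration.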
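27. Time complexity:
Tpar = (Local Sort) + (p merge-splits) + (p exchanges)
Tpar = (n/p)log(n/p) + n + n = (n/p)log(n/p) + 2n
Each of the p phases merges and exchanges blocks of n/p elements, so the
p merge-splits and the p exchanges each cost on the order of p · (n/p) = n.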
29. A bitonic sequence is defined as a list with no more
than one LOCAL MAXIMUM and no more than one
LOCAL MINIMUM.
30. A bitonic sequence is a list with no more than one LOCAL MAXIMUM
and no more than one LOCAL MINIMUM.
(Endpoints must be considered: the list wraps around.)
[Figure: a list with 1 local maximum and 1 local minimum. This list is bitonic.]
[Figure: a list with 1 local maximum and 2 local minima. This list is NOT bitonic.]
31. Binary split:
1. Divide the bitonic list into two equal halves.
2. Compare-exchange each item in the first half
with the corresponding item in the second half.
Result:
Two bitonic sequences where the numbers in one sequence are all less
than the numbers in the other sequence.
32. Bitonic list:
24 20 15 9 4 2 5 8 | 10 11 12 13 22 30 32 45
Result after Binary-split:
10 11 12 9 4 2 5 8 | 24 20 15 13 22 30 32 45
If you keep applying the BINARY-SPLIT to each half repeatedly, you will
get a SORTED LIST!
10 11 12 9 . 4 2 5 8 | 24 20 15 13 . 22 30 32 45
4 2 . 5 8 | 10 11 . 12 9 | 22 20 . 15 13 | 24 30 . 32 45
4 . 2 | 5 . 8 | 10 . 9 | 12 . 11 | 15 . 13 | 22 . 20 | 24 . 30 | 32 . 45
2 4 5 8 9 10 11 12 13 15 20 22 24 30 32 45
Q: How many parallel steps does it take to sort ?
A: log n
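The binary split and its repeated application can be sketched as follows. The function names binarySplit and sortBitonic are hypothetical, the input is assumed to be a bitonic list whose length is a power of two, and the rounds are executed one after another here, whereas in the parallel setting all splits of one round happen at the same time.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One binary split of the range [lo, lo + len): compare-exchange element i of
// the first half with the corresponding element of the second half.
void binarySplit(std::vector<int>& a, std::size_t lo, std::size_t len)
{
    assert(len % 2 == 0 && lo + len <= a.size());
    const std::size_t half = len / 2;
    for (std::size_t i = lo; i < lo + half; ++i) {
        if (a[i] > a[i + half]) {
            std::swap(a[i], a[i + half]);
        }
    }
}

// Applies the split to the whole list, then to each half, each quarter, ...
// Returns the number of rounds, which is log2(n) for n a power of two.
int sortBitonic(std::vector<int>& a)
{
    assert((a.size() & (a.size() - 1)) == 0);  // length must be a power of two
    int rounds = 0;
    for (std::size_t len = a.size(); len >= 2; len /= 2) {
        for (std::size_t lo = 0; lo < a.size(); lo += len) {
            binarySplit(a, lo, len);
        }
        ++rounds;
    }
    return rounds;
}
```

On the 16-element list of this slide, sortBitonic performs 4 rounds, which agrees with the answer log n.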
33. A bitonic sorting network sorts n elements in
Θ(log² n) time.
A bitonic sequence has two tones - increasing
and decreasing, or vice versa. Any cyclic rotation
of such a sequence is also considered bitonic.
1,2,4,7,6,0 is a bitonic sequence, because it
first increases and then decreases. 8,9,2,1,0,4 is
another bitonic sequence, because it is a cyclic
shift of 0,4,8,9,2,1 .
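One way to test this definition is to walk around the sequence as a cycle and count how often the direction changes between rising and falling; a bitonic sequence changes direction at most twice. The sketch below follows that idea. The function name isBitonic is an assumption, and equal neighbours are simply skipped.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true if `a` is bitonic, including cyclic rotations.
bool isBitonic(const std::vector<int>& a)
{
    const std::size_t n = a.size();
    int firstSign = 0;  // direction of the first non-flat step
    int prevSign = 0;   // direction of the most recent non-flat step
    int changes = 0;    // number of direction changes seen so far
    for (std::size_t i = 0; i < n; ++i) {
        const int cur = a[i];
        const int next = a[(i + 1) % n];  // the last step wraps around to a[0]
        const int sign = (next > cur) - (next < cur);
        if (sign == 0) {
            continue;  // equal neighbours do not change the direction
        }
        if (prevSign == 0) {
            firstSign = sign;
        } else if (sign != prevSign) {
            ++changes;
        }
        prevSign = sign;
    }
    // Close the cycle: compare the wrap-around step with the first step.
    if (prevSign != 0 && prevSign != firstSign) {
        ++changes;
    }
    return changes <= 2;
}
```

Both 1,2,4,7,6,0 and its relative 8,9,2,1,0,4 pass this check, while a sequence such as 1,3,2,4,1,5 does not.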
34. Let s = a_0, a_1, …, a_{n-1} be a bitonic sequence
such that a_0 ≤ a_1 ≤ ··· ≤ a_{n/2-1} and a_{n/2} ≥ a_{n/2+1} ≥
··· ≥ a_{n-1}.
Consider the following subsequences of s:
s1 = min{a_0, a_{n/2}}, min{a_1, a_{n/2+1}}, …, min{a_{n/2-1}, a_{n-1}}
s2 = max{a_0, a_{n/2}}, max{a_1, a_{n/2+1}}, …, max{a_{n/2-1}, a_{n-1}}
Note that s1 and s2 are both bitonic and each
element of s1 is less than every element in s2.
We can apply the procedure recursively on s1
and s2 to get the sorted sequence.
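The recursive procedure can be written almost literally from the definitions of s1 and s2. The sketch below is a sequential illustration; the function name bitonicMerge is an assumption, and the input is assumed to be a bitonic sequence whose length is a power of two.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sorts a bitonic sequence s in increasing order by building s1 and s2
// and recursing on each of them.
std::vector<int> bitonicMerge(const std::vector<int>& s)
{
    const std::size_t n = s.size();
    if (n <= 1) {
        return s;
    }
    assert(n % 2 == 0);  // length assumed to be a power of two

    std::vector<int> s1, s2;
    for (std::size_t i = 0; i < n / 2; ++i) {
        s1.push_back(std::min(s[i], s[i + n / 2]));  // s1 = min{a_i, a_{i+n/2}}
        s2.push_back(std::max(s[i], s[i + n / 2]));  // s2 = max{a_i, a_{i+n/2}}
    }

    // No element of s1 exceeds any element of s2, so the sorted s1 followed by
    // the sorted s2 is the sorted sequence.
    std::vector<int> result = bitonicMerge(s1);
    const std::vector<int> upper = bitonicMerge(s2);
    result.insert(result.end(), upper.begin(), upper.end());
    return result;
}
```

The two recursive calls are independent of each other, which is what allows a parallel implementation to process s1 and s2 at the same time.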
35. We can easily build a sorting network to implement
this bitonic merge algorithm.
Such a network is called a bitonic merging network.
The network contains log n columns. Each column
contains n/2 comparators and performs one step of the
bitonic merge.
We denote a bitonic merging network with n inputs by
⊕BM[n].
Replacing the ⊕ comparators by ⊖ comparators results
in a decreasing output sequence; such a network is
denoted by ⊖BM[n].
36. How do we sort an unsorted sequence using a
bitonic merge?
We must first build a single bitonic sequence
from the given sequence.
A sequence of length 2 is a bitonic sequence.
A bitonic sequence of length 4 can be built by sorting
the first two elements using ⊕BM[2] and the next
two using ⊖BM[2].
This process can be repeated to generate larger bitonic
sequences.
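The construction can be sketched recursively: sort the first half in increasing order, the second half in decreasing order, and merge the resulting bitonic sequence. The names bitonicMergeRange, bitonicSortRange and bitonicSort are assumptions, the ascending flag plays the role of choosing between the increasing and the decreasing merging network, and the length is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Merges the bitonic range [lo, lo + len) into increasing (ascending == true)
// or decreasing (ascending == false) order.
void bitonicMergeRange(std::vector<int>& a, std::size_t lo, std::size_t len, bool ascending)
{
    if (len < 2) {
        return;
    }
    const std::size_t half = len / 2;
    for (std::size_t i = lo; i < lo + half; ++i) {
        // Swap when the pair is out of order for the requested direction.
        if ((a[i] > a[i + half]) == ascending) {
            std::swap(a[i], a[i + half]);
        }
    }
    bitonicMergeRange(a, lo, half, ascending);
    bitonicMergeRange(a, lo + half, half, ascending);
}

// Builds a bitonic sequence from two oppositely sorted halves, then merges it.
void bitonicSortRange(std::vector<int>& a, std::size_t lo, std::size_t len, bool ascending)
{
    if (len < 2) {
        return;
    }
    const std::size_t half = len / 2;
    bitonicSortRange(a, lo, half, true);           // first half: increasing
    bitonicSortRange(a, lo + half, half, false);   // second half: decreasing
    bitonicMergeRange(a, lo, len, ascending);
}

// Sorts the whole vector in increasing order.
void bitonicSort(std::vector<int>& a)
{
    assert((a.size() & (a.size() - 1)) == 0);  // length must be a power of two
    bitonicSortRange(a, 0, a.size(), true);
}
```

The recursion bottoms out at ranges of length 1, so the first merges that do any work are on ranges of length 2, which matches the observation that a sequence of length 2 is already bitonic.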
37. A bitonic merging network for n = 16. The input
wires are numbered 0,1,…, n - 1, and the binary
representation of these numbers is shown. Each
column of comparators is drawn separately; the
entire figure represents a ⊕BM[16] bitonic
merging network. The network takes a bitonic
sequence and outputs it in sorted order.
41. A schematic representation of a network that
converts an input sequence into a bitonic
sequence. In this example, ⊕BM[k] and
⊖BM[k] denote bitonic merging networks of
input size k that use ⊕ and ⊖
comparators, respectively. The last merging
network (⊕BM[16]) sorts the input.
In this example, n = 16.
42. Six steps of Bitonic Sort on a hypercube
of dimension 3
(L = the processor keeps the lower value, H = the processor keeps the higher value)
Step No. | Processor No.
         | 000 001 010 011 100 101 110 111
1        |  L   H   H   L   L   H   H   L
2        |  L   L   H   H   H   H   L   L
3        |  L   H   L   H   H   L   H   L
4        |  L   L   L   L   H   H   H   H
5        |  L   L   H   H   L   L   H   H
6        |  L   H   L   H   L   H   L   H
44. The depth of the network is Θ(log² n).
Each stage of the network contains n/2
comparators. A serial implementation of the
network would have complexity Θ(n log² n).
51. Bitonic sort (for N = P)
P0  P1  P2  P3  P4  P5  P6  P7
000 001 010 011 100 101 110 111
K   G   J   M   C   A   N   F
L   H   H   L   L   H   H   L
G   K   M   J   A   C   N   F
L   L   H   H   H   H   L   L
G   J   M   K   N   F   A   C
L   H   L   H   H   L   H   L
G   J   K   M   N   F   C   A
L   L   L   L   H   H   H   H
G   F   C   A   N   J   K   M
L   L   H   H   L   L   H   H
C   A   G   F   K   J   N   M
L   H   L   H   L   H   L   H
A   C   F   G   J   K   M   N
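The L/H pattern and the trace above can be reproduced with a short simulation. This is a sketch under assumptions: the function name hypercubeBitonicSort is hypothetical, each character of the string stands for the single key held by one processor, and the rule used for the direction (a processor sorts in decreasing order when the bit of its id just above the current sub-cube is 1) is inferred from the rows of the table.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Simulates bitonic sort with one key per processor on a d-dimensional
// hypercube (p = 2^d processors). Appends one row of L/H roles per step.
std::string hypercubeBitonicSort(std::string keys, int d, std::vector<std::string>& roles)
{
    const unsigned p = 1u << d;
    assert(keys.size() == p);
    for (int phase = 1; phase <= d; ++phase) {
        // Phase `phase` consists of `phase` steps, one per dimension, highest first.
        for (int bit = phase - 1; bit >= 0; --bit) {
            std::string row(p, '?');
            for (unsigned id = 0; id < p; ++id) {
                const unsigned partner = id ^ (1u << bit);
                // Bit `phase` of the id selects the direction of this sub-cube;
                // in the last phase that bit is always 0, so everything ascends.
                const bool ascending = ((id >> phase) & 1u) == 0;
                const bool lowerId = ((id >> bit) & 1u) == 0;
                const bool keepLow = (lowerId == ascending);
                row[id] = keepLow ? 'L' : 'H';
                if (id < partner) {  // perform each compare-exchange once
                    const char lo = std::min(keys[id], keys[partner]);
                    const char hi = std::max(keys[id], keys[partner]);
                    keys[id] = keepLow ? lo : hi;
                    keys[partner] = keepLow ? hi : lo;
                }
            }
            roles.push_back(row);
        }
    }
    return keys;
}
```

Calling it with the keys "KGJMCANF" and d = 3 yields "ACFGJKMN" and six role rows, the first of which is "LHHLLHHL".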
52. In general, with n = 2^k, there are k phases, of
1, 2, 3, …, k steps respectively.
Hence the total number of steps is:
$T_{par}^{bitonic} = \sum_{i=1}^{\log n} i = \frac{\log n\,(\log n + 1)}{2} = O(\log^2 n)$
59. Bitonic sort (for N >> P)
P0       | P1       | P2       | P3       | P4       | P5       | P6       | P7
000      | 001      | 010      | 011      | 100      | 101      | 110      | 111
2 7 4    | 13 6 9   | 4 18 5   | 12 1 7   | 6 3 14   | 11 6 8   | 4 10 5   | 2 15 17
Local Sort (ascending):
2 4 7    | 6 9 13   | 4 5 18   | 1 7 12   | 3 6 14   | 6 8 11   | 4 5 10   | 2 15 17
L        | H        | H        | L        | L        | H        | H        | L
2 4 6    | 7 9 13   | 7 12 18  | 1 4 5    | 3 6 6    | 8 11 14  | 10 15 17 | 2 4 5
L        | L        | H        | H        | H        | H        | L        | L
2 4 6    | 1 4 5    | 7 12 18  | 7 9 13   | 10 15 17 | 8 11 14  | 3 6 6    | 2 4 5
L        | H        | L        | H        | H        | L        | H        | L
1 2 4    | 4 5 6    | 7 7 9    | 12 13 18 | 14 15 17 | 8 10 11  | 5 6 6    | 2 3 4
L        | L        | L        | L        | H        | H        | H        | H
1 2 4    | 4 5 6    | 5 6 6    | 2 3 4    | 14 15 17 | 8 10 11  | 7 7 9    | 12 13 18
L        | L        | H        | H        | L        | L        | H        | H
1 2 4    | 2 3 4    | 5 6 6    | 4 5 6    | 7 7 9    | 8 10 11  | 14 15 17 | 12 13 18
L        | H        | L        | H        | L        | H        | L        | H
1 2 2    | 3 4 4    | 4 5 5    | 6 6 6    | 7 7 8    | 9 10 11  | 12 13 14 | 15 17 18
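The same schedule works with blocks if every compare-exchange is replaced by a compare-split. The sketch below is a sequential simulation; the names Block and blockBitonicSort are assumptions, blocks[i] stands for the elements held by processor Pi, and equal block sizes are assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

using Block = std::vector<int>;  // the N/P elements of one simulated processor

// Bitonic sort on p = 2^d simulated processors with N/P elements each.
void blockBitonicSort(std::vector<Block>& blocks, int d)
{
    const unsigned p = 1u << d;
    assert(blocks.size() == p);
    for (Block& b : blocks) {
        assert(b.size() == blocks.front().size());  // equal block sizes assumed
        std::sort(b.begin(), b.end());              // local sort (ascending)
    }
    for (int phase = 1; phase <= d; ++phase) {
        for (int bit = phase - 1; bit >= 0; --bit) {
            for (unsigned id = 0; id < p; ++id) {
                const unsigned partner = id | (1u << bit);
                if (partner == id) {
                    continue;  // each pair is handled once, from its lower id
                }
                // Direction of the sub-cube containing this pair.
                const bool ascending = ((id >> phase) & 1u) == 0;
                // Compare-split: merge both blocks and hand out the two halves.
                Block merged;
                std::merge(blocks[id].begin(), blocks[id].end(),
                           blocks[partner].begin(), blocks[partner].end(),
                           std::back_inserter(merged));
                const auto mid = merged.begin() + blocks[id].size();
                const Block low(merged.begin(), mid);
                const Block high(mid, merged.end());
                blocks[id] = ascending ? low : high;
                blocks[partner] = ascending ? high : low;
            }
        }
    }
}
```

Every block stays sorted in ascending order internally throughout; only the choice of which processor keeps the low half changes from step to step.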
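60. Complexity (for N >> P)
$T_{par}^{bitonic} = \text{Local Sort} + \text{Parallel Bitonic Merge}$
$= \frac{N}{P}\log\frac{N}{P} + 2\,\frac{N}{P}\,(1 + 2 + 3 + \dots + \log P)$
$= \frac{N}{P}\left\{\log\frac{N}{P} + 2\cdot\frac{\log P\,(1 + \log P)}{2}\right\}$
$= \frac{N}{P}\left(\log N - \log P + \log P + \log^2 P\right)$
$T_{par}^{bitonic} = \frac{N}{P}\left(\log N + \log^2 P\right)$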
61. Computational time complexity using P=n
processors
• Odd-even transposition sort: O(n)
• Bitonic Mergesort: O(log² n) (** BEST! **)
62. Books
Parallel Programming in C with MPI and OpenMP, Michael J.
Quinn, McGraw Hill Higher Education, 2003
Introduction to Parallel Processing: Algorithms and
Architectures, Behrooz Parhami, Springer
The Art of Concurrency: A Thread Monkey's Guide to Writing
Parallel Applications, Clay Breshears, O'Reilly Media
Links
http://www-users.cs.umn.edu/~karypis/parbook/Lectures/AG/chap9_slides.pdf
A Library of Parallel Algorithms,
▪ www.cs.cmu.edu/~scandal/nesl/algorithms.html
Image Source
http://www-users.cs.umn.edu/~karypis/parbook/Lectures/AG/chap9_slides.pdf