1. ICME MapReduce Workshop
April 29 – May 1, 2013
David F. Gleich
Computer Science
Purdue University
David Gleich · Purdue
Website www.stanford.edu/~paulcon/icme-mapreduce-2013
Paul G. Constantine
Center for Turbulence Research
Stanford University
MRWorkshop
2. Goals
Learn the basics of MapReduce & Hadoop
Be able to process large volumes of data from
science and engineering applications
… help enable you to explore on your own!
3. Workshop overview
Monday
Me: Sparse matrix computations in MapReduce
Austin Benson: Tall-and-skinny matrix computations in MapReduce
Tuesday
Joe Buck: Extending MapReduce for scientific computing
Chunsheng Feng: Large scale video analytics on Pivotal Hadoop
Wednesday
Joe Nichols: Post-processing CFD dynamics data in MapReduce
Lavanya Ramakrishnan: Evaluating MapReduce and Hadoop for science
4. Sparse matrix computations in MapReduce
David F. Gleich
Computer Science
Purdue University
Slides online soon!
Code https://github.com/dgleich/mapreduce-matrix-tutorial
5. How to compute with big matrix data
A tale of two computers
                ORNL 2010 Supercomputer    Google’s 2010? Data computer
Cores           224k                       80k
Storage         10 PB drive                50 PB drive
Performance     1.7 Pflops                 ? Pflops
Power           7 MW                       ? MW
Interconnect    Custom interconnect        GB ethernet
Cost            $104 M                     $?? M
Disk per core   45 GB/core                 625 GB/core
Balance         High CPU to disk           High disk to CPU
6. My data computers
Nebula Cluster @ Sandia CA: 2 TB/core storage, 64 nodes, 256 cores, GB ethernet, cost $150k
ICME Hadoop @ Stanford: 3 TB/core storage, 11 nodes, 44 cores, GB ethernet, cost $30k
These systems are good for working with enormous matrix data!
7. My data computers
These systems are good, but not great, for working with some enormous matrix data!
8. By 2013(?) all Fortune 500
companies will have a data
computer
9. How do you program them?
11. MapReduce is designed to
solve a different set of problems
from standard parallel libraries
12. The MapReduce programming model
Input: a list of (key, value) pairs
Map: apply a function f to all pairs
Reduce: apply a function g to all values with key k (for all k)
Output: a list of (key, value) pairs
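The four steps of the model can be sketched in a few lines of plain Python. This is a single-process illustration only, not how Hadoop runs a job; the names `map_reduce`, `word_map`, and `count_reduce` are made up for this sketch, and the dictionary of lists stands in for the shuffle.

```python
from collections import defaultdict

def map_reduce(pairs, f, g):
    """Apply f to every (key, value) pair, group by key, then apply g per key."""
    groups = defaultdict(list)
    # Map: f takes one (key, value) pair and yields zero or more new pairs.
    for key, value in pairs:
        for out_key, out_value in f(key, value):
            groups[out_key].append(out_value)  # stands in for the shuffle
    # Reduce: g sees one key with all of its values and yields output pairs.
    output = []
    for key in sorted(groups):
        output.extend(g(key, groups[key]))
    return output

def word_map(doc_id, text):
    # Hypothetical map function: emit (word, 1) for each word in a document.
    for word in text.split():
        yield (word, 1)

def count_reduce(word, ones):
    # Hypothetical reduce function: total the 1s collected for one word.
    yield (word, sum(ones))

counts = map_reduce([(1, "a b a"), (2, "b c")], word_map, count_reduce)
print(counts)
```

Both `f` and `g` are generators so that each call may emit any number of pairs, which matches the "list in, list out" shape of the model.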
13. Computing a histogram
A simple MapReduce example
Input
    Key ImageId
    Value Pixels
Map(ImageId, Pixels)
    for each pixel
        emit
            Key = (r,g,b)
            Value = 1
Reduce(Color, Values)
    emit
        Key = Color
        Value = sum(Values)
Output
    Key Color
    Value # of pixels
[Figure: Map emits a 1 for every pixel, the shuffle groups the 1s by color, and Reduce sums them into per-color pixel counts]
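The histogram pseudocode translates almost line for line into Python. The sketch below runs in one process; `run_histogram` and the two tiny "images" are invented for illustration, and a real job would leave the grouping to the framework.

```python
from collections import defaultdict

def hist_map(image_id, pixels):
    # Emit Key = (r, g, b), Value = 1 for each pixel.
    for pixel in pixels:
        yield (pixel, 1)

def hist_reduce(color, values):
    # Emit Key = Color, Value = sum(Values).
    yield (color, sum(values))

def run_histogram(images):
    # Group the map output by color; this dictionary plays the role of the shuffle.
    shuffled = defaultdict(list)
    for image_id, pixels in images.items():
        for color, one in hist_map(image_id, pixels):
            shuffled[color].append(one)
    result = {}
    for color, ones in shuffled.items():
        for key, count in hist_reduce(color, ones):
            result[key] = count
    return result

RED, BLUE = (255, 0, 0), (0, 0, 255)
images = {"img0": [RED, RED, BLUE], "img1": [BLUE, RED]}  # made-up input
histogram = run_histogram(images)
print(histogram)
```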
14. Many matrix computations are possible in MapReduce
Column sums are easy
Input: Key (i,j), Value Aij
Map((i,j), val)
    emit Key = j, Value = val
Reduce(j, Values)
    emit Key = j, Value = sum(Values)
“Coordinate storage”
    (3,4) -> 5
    (1,2) -> -6.0
    (2,3) -> -1.2
    (1,1) -> 3.14
    …
[Figure: a 4-by-4 matrix with entries A11 through A44]
Other basic methods can use common parallel/out-of-core algs
    Sparse matrix-vector products y = Ax
    Sparse matrix-matrix products C = AB
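As a concrete instance of the column-sum job, the sketch below feeds coordinate-storage entries through the Map and Reduce functions from the slide. The first four entries are the ones shown above; the entry at (1,4) is an extra, hypothetical one added so that at least one column has two values to sum. `column_sums` is an illustrative helper, not part of any library.

```python
from collections import defaultdict

def colsum_map(index, val):
    # Map((i, j), val): emit Key = j, Value = val.
    i, j = index
    yield (j, val)

def colsum_reduce(j, values):
    # Reduce(j, Values): emit Key = j, Value = sum(Values).
    yield (j, sum(values))

def column_sums(entries):
    grouped = defaultdict(list)  # in-memory stand-in for the shuffle
    for index, val in entries:
        for j, v in colsum_map(index, val):
            grouped[j].append(v)
    sums = {}
    for j, values in grouped.items():
        for key, total in colsum_reduce(j, values):
            sums[key] = total
    return sums

entries = [
    ((3, 4), 5.0),
    ((1, 2), -6.0),
    ((2, 3), -1.2),
    ((1, 1), 3.14),
    ((1, 4), 2.0),   # hypothetical extra entry, not from the slide
]
sums = column_sums(entries)
print(sums)
```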
15. Many matrix computations
are possible in MapReduce
Beware of un-thoughtful ideas
16. Why so many limitations?
17. The MapReduce programming model
Input: a list of (key, value) pairs
Map: apply a function f to all pairs
Reduce: apply a function g to all values with key k (for all k)
Output: a list of (key, value) pairs
Map function f must be side-effect free
All map functions run in parallel
Reduce function g must be side-effect free
All reduce functions run in parallel
18. A graphical view of the MapReduce programming model
[Figure: four data blocks each feed a Map task; the maps emit (key, value) pairs; the Shuffle groups the values by key; each Reduce task receives one key with its values and writes data]
19. Data scalability
The idea: Bring the computations to the data
MR can schedule map functions without moving data.
[Figure: data blocks 1 to 5 with map tasks (M) placed next to the blocks they read; map output passes through the Shuffle to the reduce tasks (R)]
20. After waiting in the queue for a month and after 24 hours of finding eigenvalues, one node randomly hiccups.
heartbreak on node rs252
21. Fault tolerant
Redundant input helps make maps data-local
Just one type of communication: shuffle
Input stored in triplicate
Map output persisted to disk before shuffle
Reduce input/output on disk
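One way to see why this design tolerates faults: a side-effect-free task depends only on its stored input, so a failed attempt can simply be thrown away and rerun. The sketch below is a toy model under that assumption; `run_with_faults`, the failure probability, and the seeded random generator are all invented for illustration and do not reflect Hadoop's actual scheduler.

```python
import random

def square_sum_map(block):
    """A side-effect-free map task: its result depends only on its input block."""
    return sum(v * v for v in block)

def run_with_faults(blocks, task, fail_prob, seed=0):
    """Run one task per block, rerunning any attempt that hits an injected fault."""
    rng = random.Random(seed)
    results, attempts = [], 0
    for block in blocks:
        while True:
            attempts += 1
            if rng.random() < fail_prob:
                continue  # injected fault: discard this attempt and rerun it
            results.append(task(block))
            break
    return results, attempts

blocks = [list(range(i, i + 4)) for i in range(0, 40, 4)]
clean, clean_attempts = run_with_faults(blocks, square_sum_map, fail_prob=0.0)
faulty, faulty_attempts = run_with_faults(blocks, square_sum_map, fail_prob=0.2)
print(clean == faulty, clean_attempts, faulty_attempts)
```

The faulty run costs extra attempts but produces exactly the same output, which is the property that persisting inputs and map outputs to disk is meant to buy.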
22. Fault injection
[Figure: time to completion (sec) versus 1/Prob(failure), the mean number of successes per failure, for a 200M-by-200 matrix and an 800M-by-10 matrix, each with and without injected faults]
With 1/5 tasks failing, the job only takes twice as long.
24. Computing a histogram
A simple MapReduce example
The entire dataset is “transposed” from images to pixels.
This moves the data to the computation!
(Using a combiner helps to reduce the data moved, but it cannot always be used)
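To make the combiner remark concrete, the sketch below counts how many (key, value) pairs would enter the shuffle for the histogram with and without map-side pre-summing. The function names and the two made-up images are illustrative only. A combiner works here because addition is associative and commutative; a reduce step such as a median has no such partial form, which is one reason a combiner cannot always be used.

```python
from collections import Counter

def map_without_combiner(pixels):
    # One (color, 1) pair per pixel goes into the shuffle.
    return [(pixel, 1) for pixel in pixels]

def map_with_combiner(pixels):
    # Pre-sum the 1s on the map side: one (color, partial_count) pair per color.
    return list(Counter(pixels).items())

def reduce_counts(pairs):
    totals = Counter()
    for color, count in pairs:
        totals[color] += count
    return dict(totals)

RED, BLUE = (255, 0, 0), (0, 0, 255)
images = {"img0": [RED] * 4 + [BLUE] * 2, "img1": [BLUE] * 3 + [RED]}  # made-up input

plain_pairs = [kv for px in images.values() for kv in map_without_combiner(px)]
combined_pairs = [kv for px in images.values() for kv in map_with_combiner(px)]

print(len(plain_pairs), len(combined_pairs))
print(reduce_counts(plain_pairs) == reduce_counts(combined_pairs))
```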
25. Hadoop and MapReduce are
bad systems for some matrix
computations.
26. How should you evaluate a MapReduce algorithm?
Build a performance model!
Measure the worst mapper: usually not too bad
Measure the data moved: could be very bad
Measure the worst reducer: could be very bad
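A rough way to apply this checklist before running anything on a cluster is to count records on a small sample. The sketch below is one possible model, not a standard tool: `job_profile` and its three counters are invented here, and record counts are used as a crude proxy for time and bytes. It compares the column-sum job on a balanced matrix and on one whose entries all sit in a single column.

```python
from collections import Counter

def job_profile(map_inputs, mapper):
    """Count records for the three quantities in the performance model."""
    per_key = Counter()
    worst_mapper = 0
    for records in map_inputs:          # one list of records per map task
        worst_mapper = max(worst_mapper, len(records))
        for record in records:
            for key, _ in mapper(record):
                per_key[key] += 1
    return {
        "worst_mapper_records": worst_mapper,
        "pairs_moved": sum(per_key.values()),
        "worst_reducer_values": max(per_key.values(), default=0),
    }

def column_sum_map(entry):
    (i, j), val = entry
    yield (j, val)

# Two map tasks each; entries are ((i, j), value) in coordinate storage.
balanced = [[((i, j), 1.0) for j in range(4)] for i in range(2)]
skewed = [[((i, 0), 1.0) for i in range(4)], [((i, 0), 1.0) for i in range(4, 8)]]

balanced_profile = job_profile(balanced, column_sum_map)
skewed_profile = job_profile(skewed, column_sum_map)
print(balanced_profile)
print(skewed_profile)
```

Both inputs move the same number of pairs, but in the skewed case a single reducer receives all of them, which is the "could be very bad" situation.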
27. Tools I like
hadoop streaming
dumbo
mrjob
hadoopy
C++
28. Tools I don’t use but other
people seem to like …
pig
java
hbase
mahout
Eclipse
Cassandra
29. hadoop streaming
the map function is a program
    (key,value) pairs are sent via stdin
    output (key,value) pairs go to stdout
the reduce function is a program
    (key,value) pairs are sent via stdin
    keys are grouped
    output (key,value) pairs go to stdout
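The stdin/stdout contract can be tried locally without a cluster. The sketch below assumes the default streaming text format of one tab-separated key and value per line; the function names and the `simulate` helper are made up, and the sort inside `simulate` stands in for the grouping that Hadoop performs between the two programs. In a real job the two functions would live in separate scripts that read `sys.stdin` and write `sys.stdout`.

```python
import io
from itertools import groupby

def stream_map(stdin, stdout):
    # The map program: read lines, write "key<TAB>value" lines.
    for line in stdin:
        for word in line.split():
            stdout.write(f"{word.lower()}\t1\n")

def stream_reduce(stdin, stdout):
    # The reduce program: input lines arrive with equal keys adjacent.
    pairs = (line.rstrip("\n").split("\t", 1) for line in stdin)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(value) for _, value in group)
        stdout.write(f"{word}\t{total}\n")

def simulate(text):
    # Local stand-in for: map program | sort | reduce program.
    mapped = io.StringIO()
    stream_map(io.StringIO(text), mapped)
    shuffled = "".join(sorted(mapped.getvalue().splitlines(keepends=True)))
    reduced = io.StringIO()
    stream_reduce(io.StringIO(shuffled), reduced)
    return reduced.getvalue()

result = simulate("the map the reduce\nThe shuffle\n")
print(result, end="")
```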
30. mrjob from
a wrapper around hadoop streaming for map and reduce functions in python
    from mrjob.job import MRJob

    class MRWordFreqCount(MRJob):
        # map: emit (word, 1) for every word on the input line
        def mapper(self, _, line):
            for word in line.split():
                yield (word.lower(), 1)

        # reduce: total the counts collected for one word
        def reducer(self, word, counts):
            yield (word, sum(counts))

    if __name__ == '__main__':
        MRWordFreqCount.run()
31. How can Hadoop streaming possibly be fast?
             Iter 1 QR   Iter 1 Total   Iter 2 Total   Overall Total
             (secs.)     (secs.)        (secs.)        (secs.)
Dumbo        67725       960            217            1177
Hadoopy      70909       612            118            730
C++          15809       350            37             387
Java                     436            66             502
Synthetic data test: 100,000,000-by-500 matrix (~500GB)
Codes implemented in MapReduce streaming
Matrix stored as TypedBytes lists of doubles
Python frameworks use Numpy+Atlas
Custom C++ TypedBytes reader/writer with Atlas
New non-streaming Java implementation too
All timing results from the Hadoop job tracker
C++ in streaming beats a native Java implementation.
Example available from github.com/dgleich/mrtsqr for verification
mrjob could be faster if it used typedbytes for intermediate storage; see https://github.com/Yelp/mrjob/pull/447
32. Code samples and short tutorials at
github.com/dgleich/mrmatrix
github.com/dgleich/mapreduce-matrix-tutorial
33. Matrix-vector product
$Ax = y$, $y_i = \sum_k A_{ik} x_k$
Follow along! mapreduce-matrix-tutorial/codes/smatvec.py
34. Matrix-vector product
A is stored by row
    $ head samples/smat_5_5.txt
    0 0 0.125 3 1.024 4 0.121
    1 0 0.597
    2 2 1.247
    3 4 -1.45
    4 2 0.061
x is stored entry-wise
    $ head samples/vec_5.txt
    0 0.241
    1 -0.98
    2 0.237
    3 -0.32
    4 0.080
35. Matrix-vector product (in pictures)
Input: A and x
Map 1: Align on columns
Reduce 1: Output Aik xk keyed on row i
Reduce 2: Output sum(Aik xk), which gives y
36. Matrix-vector product (in pictures)
Map 1: Align on columns
    def joinmap(self, key, line):
        vals = line.split()
        if len(vals) == 2:
            # the vector
            yield (vals[0],               # row
                   (float(vals[1]),))     # xi
        else:
            # the matrix
            row = vals[0]
            for i in range(1, len(vals), 2):
                yield (vals[i],           # column
                       (row,              # i, Aij
                        float(vals[i+1])))
37. Matrix-vector product (in pictures)
Reduce 1: Output Aik xk keyed on row i
    def joinred(self, key, vals):
        vecval = 0.
        matvals = []
        for val in vals:
            if len(val) == 1:
                # a 1-tuple is the vector entry for this column
                vecval += val[0]
            else:
                # a 2-tuple is a matrix entry (row, Aij)
                matvals.append(val)
        for val in matvals:
            yield (val[0], val[1]*vecval)
Note that you should use a secondary sort to avoid reading both in memory
38. Matrix-vector product (in pictures)
Reduce 2: Output sum(Aik xk), which gives y
    def sumred(self, key, vals):
        yield (key, sum(vals))
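The three functions above can be exercised end to end without Hadoop. The sketch below copies their logic into plain functions (no `self`), uses the sample files from slide 34 as inline strings, and replaces each shuffle with an in-memory dictionary. The helpers `shuffle` and `matvec` are invented for this illustration and are not part of the tutorial code.

```python
from collections import defaultdict

MATRIX = """0 0 0.125 3 1.024 4 0.121
1 0 0.597
2 2 1.247
3 4 -1.45
4 2 0.061"""

VECTOR = """0 0.241
1 -0.98
2 0.237
3 -0.32
4 0.080"""

def joinmap(line):
    vals = line.split()
    if len(vals) == 2:                      # a vector line: index, value
        yield (vals[0], (float(vals[1]),))
    else:                                   # a matrix row: row, (col, value)...
        row = vals[0]
        for i in range(1, len(vals), 2):
            yield (vals[i], (row, float(vals[i + 1])))

def joinred(key, vals):
    vecval = 0.0
    matvals = []
    for val in vals:
        if len(val) == 1:
            vecval += val[0]
        else:
            matvals.append(val)
    for row, aval in matvals:
        yield (row, aval * vecval)

def sumred(key, vals):
    yield (key, sum(vals))

def shuffle(pairs):
    groups = defaultdict(list)              # in-memory stand-in for the shuffle
    for key, value in pairs:
        groups[key].append(value)
    return groups

def matvec(matrix_text, vector_text):
    lines = matrix_text.splitlines() + vector_text.splitlines()
    stage1 = shuffle(kv for line in lines for kv in joinmap(line))
    products = [kv for k, vs in stage1.items() for kv in joinred(k, vs)]
    stage2 = shuffle(products)
    return dict(kv for k, vs in stage2.items() for kv in sumred(k, vs))

y = matvec(MATRIX, VECTOR)
for i in sorted(y):
    print(i, y[i])
```

Keys stay as strings here because the mapper never converts them, just as in the slide code.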
39. Move the computations to the data? Not really!
[Figure: Input (A, x), Map 1, Reduce 1, Reduce 2, output y]
Copy data once, now aligned on column
Copy data again, align on row
40. Matrix-matrix product
$AB = C$, $C_{ij} = \sum_k A_{ik} B_{kj}$
Follow along! mapreduce-matrix-tutorial/codes/matmat.py
41. Matrix-matrix product
A is stored by row
    $ head samples/smat_10_5_A.txt
    0 0 0.599 4 -1.53
    1
    2 2 0.260
    3
    4 0 0.267 1 0.839
B is stored by row
    $ head samples/smat_5_5.txt
    0 0 0.125 3 1.024 4 0.121
    1 0 0.597
    2 2 1.247
42. Matrix-matrix product (in pictures)
Input: A and B
Map 1: Align on columns
Reduce 1: Output Aik Bkj keyed on (i,j)
Reduce 2: Output sum(Aik Bkj), which gives C
43. Matrix-matrix product (in code)
Map 1: Align on columns
    def joinmap(self, key, line):
        mtype = self.parsemat()
        vals = line.split()
        row = vals[0]
        rowvals = [(vals[i], float(vals[i+1]))
                   for i in range(1, len(vals), 2)]
        if mtype == 1:
            # matrix A, output by col
            for val in rowvals:
                yield (val[0], (row, val[1]))
        else:
            # matrix B, output the whole row keyed by its row index
            yield (row, (rowvals,))
44. Matrix-matrix product (in code)
Reduce 1: Output Aik Bkj keyed on (i,j)
    def joinred(self, key, vals):
        # load the data into memory
        brow = []
        acol = []
        for val in vals:
            if len(val) == 1:
                # a 1-tuple holds row k of B
                brow.extend(val[0])
            else:
                # a 2-tuple is an entry (i, Aik) from column k of A
                acol.append(val)

        for (bcol, bval) in brow:
            for (arow, aval) in acol:
                yield ((arow, bcol), aval*bval)
45. Matrix-matrix product (in pictures)
Reduce 2: Output sum(Aik Bkj), which gives C
    def sumred(self, key, vals):
        yield (key, sum(vals))
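The same two-stage pattern can be checked on a tiny product. In the sketch below the matrix type is passed to `joinmap` as an argument, which is an assumption made for this illustration (the slide code obtains it from `self.parsemat()`); the 2-by-2 matrices, `shuffle`, and `matmat` are likewise made up, and the shuffle is simulated in memory.

```python
from collections import defaultdict

A_ROWS = ["0 0 1 1 2", "1 1 3"]        # A = [[1, 2], [0, 3]], stored by row
B_ROWS = ["0 0 4", "1 0 5 1 6"]        # B = [[4, 0], [5, 6]], stored by row

def joinmap(line, mtype):
    vals = line.split()
    row = vals[0]
    rowvals = [(vals[i], float(vals[i + 1])) for i in range(1, len(vals), 2)]
    if mtype == 1:
        # matrix A: emit each entry keyed by its column k
        for col, val in rowvals:
            yield (col, (row, val))
    else:
        # matrix B: emit the whole row keyed by its row index k
        yield (row, (rowvals,))

def joinred(key, vals):
    brow, acol = [], []
    for val in vals:
        if len(val) == 1:
            brow.extend(val[0])
        else:
            acol.append(val)
    for bcol, bval in brow:
        for arow, aval in acol:
            yield ((arow, bcol), aval * bval)

def sumred(key, vals):
    yield (key, sum(vals))

def shuffle(pairs):
    groups = defaultdict(list)           # in-memory stand-in for the shuffle
    for key, value in pairs:
        groups[key].append(value)
    return groups

def matmat(a_rows, b_rows):
    mapped = [kv for line in a_rows for kv in joinmap(line, 1)]
    mapped += [kv for line in b_rows for kv in joinmap(line, 2)]
    products = [kv for k, vs in shuffle(mapped).items() for kv in joinred(k, vs)]
    return dict(kv for k, vs in shuffle(products).items() for kv in sumred(k, vs))

C = matmat(A_ROWS, B_ROWS)
print(C)
```

Only structurally nonzero products are emitted, so an entry of C that receives no products simply never appears in the output.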
46. Why is MapReduce so popular?
    if (root) {
        PetscInt cur_nz=0;
        unsigned char* root_nz_buf;
        unsigned int *root_nz_buf_i,*root_nz_buf_j;
        double *root_nz_buf_v;
        PetscMalloc((sizeof(unsigned int)*2+sizeof(double))*root_nz_bufsize,&root_nz_buf);
        PetscMalloc(sizeof(unsigned int)*root_nz_bufsize,&root_nz_buf_i);
        PetscMalloc(sizeof(unsigned int)*root_nz_bufsize,&root_nz_buf_j);
        PetscMalloc(sizeof(double)*root_nz_bufsize,&root_nz_buf_v);

        unsigned long long int nzs_to_read = total_nz;

        while (send_rounds > 0) {
            // check if we are near the end of the file
            // and just read that amount
            size_t cur_nz_read = root_nz_bufsize;
            if (cur_nz_read > nzs_to_read) {
                cur_nz_read = nzs_to_read;
            }
            PetscInfo2(PETSC_NULL,"reading %i non-zeros of %lli\n",
                cur_nz_read, nzs_to_read);
600 lines of gross code in order to load a sparse matrix into memory, streaming from one processor.
MapReduce offers a better alternative
47. Thoughts on a better system
Default quadruple precision
Matrix computations without indexing
Easy setup of MPI data jobs
[Figure labels: Initial data load of any MPI job; Compute task]
49. Error analysis of summation
s = 0; for i=1 to n: s = s + x[i]
A simple summation formula has error that is not always small if n is a billion
$fl(x + y) = (x + y)(1 + \delta)$, with $|\delta| \le \mu$
$\left| fl\left(\sum_i x_i\right) - \sum_i x_i \right| \le n \mu \sum_i |x_i|$, where $\mu \approx 10^{-16}$
50. If your application matters
then watch out for this issue.
Use quad-precision arithmetic
or compensated summation
instead.
51. Compensated Summation
“Kahan summation algorithm” on Wikipedia
    s = 0.; c = 0.;
    for i=1 to n:
        y = x[i] - c
        t = s + y
        c = (t - s) - y
        s = t
Mathematically, c is always zero.
On a computer, c can be non-zero.
The parentheses matter!
$\left| fl(\mathrm{csum}(x)) - \sum_i x_i \right| \le (\mu + n\mu^2) \sum_i |x_i|$, where $\mu \approx 10^{-16}$
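The difference between the two bounds shows up even on a modest input. The sketch below is a direct Python transcription of the simple loop and of the compensated loop above, compared against `math.fsum` from the standard library, which is used here as the reference value. The input of one million copies of 0.1 is an arbitrary choice for illustration.

```python
import math

def naive_sum(xs):
    s = 0.0
    for x in xs:
        s = s + x
    return s

def kahan_sum(xs):
    s = 0.0
    c = 0.0                  # running compensation for lost low-order bits
    for x in xs:
        y = x - c
        t = s + y
        c = (t - s) - y      # the parentheses matter: this recovers the rounding error
        s = t
    return s

xs = [0.1] * 1_000_000
reference = math.fsum(xs)    # correctly rounded sum of the stored doubles
naive_err = abs(naive_sum(xs) - reference)
kahan_err = abs(kahan_sum(xs) - reference)
print("naive error:", naive_err)
print("kahan error:", kahan_err)
```

In a reducer, the same idea applies to `sum(vals)`: swapping in a compensated sum (or `math.fsum`) is a small change when a key can collect a very large number of values.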
52. Summary
MapReduce is a powerful but limited tool that has a role
in the future of computational math.
… but it should be used carefully! See Austin’s talk next!
Code samples and short tutorials at
github.com/dgleich/mrmatrix
github.com/dgleich/mapreduce-matrix-tutorial