The Performance of MapReduce: An In-depth Study
1. Dawei Jiang, Beng Chin Ooi, Lei Shi, Sai Wu
School of Computing, NUS
Presented by Tang Kai
2. Introduction
Factors affecting Performance of MR
Pruning search space
Implementation
Benchmark
3. MapReduce-based systems are increasingly
being used.
◦ Simple yet expressive interface
Map() Reduce()
◦ Flexible
Storage system independence
◦ Scalable
◦ Fine-grain fault tolerance
4. Previous studies
◦ Fundamental differences
Schema support
Data access
Fault tolerance
◦ Benchmark
Parallel DB >> MR-based
5. Is it not possible to have flexible, scalable,
and efficient MapReduce-based systems?
This work
◦ Identifies several performance bottlenecks
◦ Manages the bottlenecks and tunes performance with
well-known engineering and database techniques
Conclusion
◦ Overall performance improves by 2.5x-3.5x
6. Introduction
Factors affecting Performance of MR
Pruning search space
Implementation
Benchmark
7. 7 steps of a MapReduce job
1) Map
2) Parse
3) Process
4) Sort
5) Shuffle
6) Merge
7) Reduce
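The seven steps can be walked through with a toy single-node sketch (hypothetical word-count code, not the paper's implementation; `run_job` and its helpers are illustrative names):

```python
from itertools import groupby
from operator import itemgetter

def run_job(lines, map_fn, reduce_fn):
    """Toy single-node walk-through of the seven MapReduce steps."""
    # 1) Map / 2) Parse / 3) Process: decode each raw line, emit <k, v> pairs
    intermediate = []
    for line in lines:                        # read a raw record
        for k, v in map_fn(line):             # parse + process it
            intermediate.append((k, v))
    # 4) Sort: order intermediate pairs by key (map-side sort)
    intermediate.sort(key=itemgetter(0))
    # 5) Shuffle / 6) Merge: group values that share a key
    # (on a cluster this moves data between nodes; here it is just groupby)
    grouped = ((k, [v for _, v in kvs])
               for k, kvs in groupby(intermediate, key=itemgetter(0)))
    # 7) Reduce: fold each group into a final value
    return {k: reduce_fn(k, vs) for k, vs in grouped}

# Word count as the classic example
counts = run_job(["a b a", "b c"],
                 map_fn=lambda line: [(w, 1) for w in line.split()],
                 reduce_fn=lambda k, vs: sum(vs))
# counts == {"a": 2, "b": 2, "c": 1}
```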
9. Direct I/O
◦ read data from the disk directly
◦ Local
Streaming I/O
◦ streaming data from the storage system by an
inter-process communication scheme,
such as TCP/IP or JDBC.
◦ Local and remote
Direct I/O outperforms streaming I/O by 10%-15%
10. Input of a MapReduce job
◦ a set of files stored in a distributed file system, i.e.
HDFS
Indexes boost the selection task 2x-10x depending
on the selectivity
Range-indexes
Block-level indexes
◦ input HDFS files are not sorted, but each data chunk
in the files is indexed by keys
Database indexed tables
◦ tables stored in database servers
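A block-level index can be sketched as a min/max key range kept per chunk (a hypothetical illustration; the function names and chunk layout are assumptions, not the paper's implementation):

```python
# Hypothetical sketch of a block-level index: each unsorted chunk records
# the min/max key it contains, so a selection can skip chunks whose key
# range cannot match the predicate.

def build_block_index(chunks):
    """chunks: list of chunks, each a list of (key, value) records."""
    return [(min(k for k, _ in c), max(k for k, _ in c)) for c in chunks]

def select(chunks, index, key):
    """Scan only the chunks whose [min, max] range may contain `key`."""
    hits = []
    for chunk, (lo, hi) in zip(chunks, index):
        if lo <= key <= hi:                  # index check: prune this chunk?
            hits.extend(v for k, v in chunk if k == key)
    return hits

chunks = [[(5, "e"), (1, "a")], [(9, "i"), (7, "g")]]
idx = build_block_index(chunks)              # [(1, 5), (7, 9)]
```

A lookup for key 7 then touches only the second chunk; keys outside every range touch no chunk at all, which is where the 2x-10x selection speed-up comes from.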
11. Record decoding: raw data -> <k,v> pairs
Immutable decoding
◦ Read-only records (fields set once; a new record per decode)
Mutable decoding
◦ A single record buffer is reused across decodes
The mutable decoder is 10x faster
◦ boosts the selection task 2x overall
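The two decoding schemes can be contrasted with a small sketch (hypothetical code; the `Record` class and tab-separated text format are assumptions, not the paper's):

```python
# Hypothetical sketch of immutable vs. mutable record decoding.

class Record:
    """A decoded <k, v> record; set() lets a mutable decoder reuse it."""
    __slots__ = ("key", "value")
    def set(self, key, value):
        self.key, self.value = key, value
        return self

def decode_immutable(lines):
    # Immutable decoding: allocate a fresh read-only record per input line
    for line in lines:
        k, v = line.split("\t")
        yield Record().set(k, v)

def decode_mutable(lines):
    # Mutable decoding: one buffer record is overwritten for every line,
    # avoiding a per-record allocation (and the GC pressure it causes)
    buf = Record()
    for line in lines:
        k, v = line.split("\t")
        yield buf.set(k, v)

immutable = [(r.key, r.value) for r in decode_immutable(["a\t1", "b\t2"])]
mutable = list(decode_mutable(["a\t1", "b\t2"]))
```

The trade-off: a mutable decoder's caller must consume or copy each record before advancing; as `mutable` shows, every yielded item is the same reused object, now holding the last line's fields.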
12. Map-side sorting affects the performance of
aggregation tasks
◦ The cost of key comparison is non-trivial.
Example
◦ sourceIP in the UserVisits table
◦ Sort intermediate records by sourceIP,
a variable-length string
String compare (byte-to-byte)
Fingerprint compare (integer)
Fingerprint-based comparison is 4x-5x faster
◦ 20%-25% faster overall
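Fingerprint comparison can be sketched as follows (the hash function is an arbitrary stand-in, not the paper's; the rest of the code is likewise illustrative):

```python
# Hypothetical sketch: precompute an integer fingerprint per key so most
# sort comparisons are cheap integer compares rather than byte-to-byte
# comparisons of variable-length strings.

def fingerprint(s: str) -> int:
    # Arbitrary 32-bit polynomial hash; any well-mixing hash would do here
    h = 0
    for b in s.encode("utf-8"):
        h = (h * 131 + b) & 0xFFFFFFFF
    return h

def sort_records(records):
    """Sort (sourceIP, value) pairs for grouping.

    The integer fingerprint is compared first; the expensive string
    comparison only breaks ties when two fingerprints collide.
    """
    return sorted(records, key=lambda r: (fingerprint(r[0]), r[0]))

recs = [("10.0.0.2", 1), ("10.0.0.1", 2), ("10.0.0.2", 3)]
out = sort_records(recs)
```

Note the output is in fingerprint order, not lexicographic order; that is fine for MapReduce grouping, which only needs equal keys to land next to each other.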
13. Why a greedy strategy?
◦ 4 tuning factors
Resulting in a large search space (2*2*3*2 = 24 combinations)
◦ Budget limit on Amazon EC2
Greedy search instead of exhaustive benchmarking
14. Greedy strategy
Benchmark: 3 datasets, 4 queries, in various architectures
I/O mode
◦ Direct I/O
◦ Stream I/O
Parser
◦ Hadoop Writable
◦ Google's ProtocolBuffer
◦ Berkeley DB
Different sort schemes
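The greedy search over this factor space can be sketched as below (hypothetical code: the toy cost model stands in for real EC2 benchmark runs, and the option names are illustrative):

```python
# Hypothetical tuning sketch (not the paper's code): rather than running
# all 2*2*3*2 = 24 factor combinations, fix one factor at a time at its
# best-measured setting, assuming the factors are roughly independent.

FACTORS = {
    "io_mode": ["direct", "stream"],
    "index": ["none", "block"],
    "parser": ["writable", "protobuf", "bdb"],
    "sort": ["string", "fingerprint"],
}

def greedy_tune(benchmark):
    """benchmark(config) -> runtime (lower is better); returns a tuned config."""
    config = {f: opts[0] for f, opts in FACTORS.items()}   # arbitrary start
    for factor, options in FACTORS.items():                # one factor at a time
        best = min(options, key=lambda o: benchmark({**config, factor: o}))
        config[factor] = best                              # freeze the winner
    return config

# Toy multiplicative cost model standing in for real EC2 runs (pure assumption)
COST = {"direct": 0.9, "stream": 1.0, "none": 1.0, "block": 0.5,
        "writable": 1.0, "protobuf": 0.8, "bdb": 0.9,
        "string": 1.0, "fingerprint": 0.8}

def fake_benchmark(cfg):
    cost = 1.0
    for choice in cfg.values():
        cost *= COST[choice]
    return cost

tuned = greedy_tune(fake_benchmark)
```

Greedy needs only 2+2+3+2 = 9 benchmark runs instead of 24, and when the factors really are independent (as in this toy cost model) it finds the same optimum as the exhaustive search.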
15. Introduction
Factors affecting Performance of MR
Pruning search space
Implementation
Benchmark
16. Hadoop 0.19.2 as the code base
Direct I/O
◦ Modification of the data node implementation
Text decoder
◦ Immutable: same as DeWitt et al.
◦ Mutable: implemented by ourselves
Binary decoder
◦ Hadoop
Immutable: the Writable decoder
Mutable: implemented by ourselves using the Hadoop API
◦ Google Protocol Buffers
Built-in compiler -> mutable
Immutable: implemented by ourselves
◦ Berkeley DB
BDB binding API (mutable)
18. Introduction
Factors affecting Performance of MR
Pruning search space
Implementation
Benchmark
19. Results for different I/O modes
◦ Single node
◦ No-op job with a map phase but no reduce phase
20. Results for record parsing
◦ Run in a plain Java process instead of a MapReduce job
◦ Timing starts after the data is loaded into memory
Mutable > immutable
◦ Mutable text > mutable binary
21. Among Hadoop-based systems
◦ Caching is a factor
Between Hadoop-based systems and parallel DBs
◦ Performance is close
23. Aggregation: UserVisits GROUP BY SUBSTR(sourceIP, …)
Parsing: 2x faster
Sorting: 20%-25% faster
◦ Not significant in the small-size aggregation task
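A miniature version of this aggregation illustrates the query shape (hypothetical data; the column names follow the UserVisits schema, and the prefix length is arbitrary):

```python
from collections import defaultdict

# Hypothetical miniature of the aggregation query: sum adRevenue from a
# UserVisits-like table, grouped by a prefix of sourceIP (made-up rows).
rows = [("10.0.0.1", 3.5), ("10.0.1.9", 1.0), ("10.0.0.7", 2.0)]

def sum_revenue_by_prefix(rows, n):
    """GROUP BY a prefix of sourceIP, summing adRevenue per group."""
    totals = defaultdict(float)
    for source_ip, ad_revenue in rows:
        totals[source_ip[:n]] += ad_revenue   # like SUBSTR(sourceIP, 1, n)
    return dict(totals)

out = sum_revenue_by_prefix(rows, 7)
# {"10.0.0.": 5.5, "10.0.1.": 1.0}
```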
24. On the decoding scheme
Comparison of the tuned MR-based system & parallel DB
25. Cons
◦ Changes need to be committed or forked into the Hadoop
source code tree
◦ A complete framework is needed instead of
miscellaneous patches
◦ Various API support is needed: CLI and Web, rather than Java only
Future work
◦ Provide a query parser, optimizer, etc. to build a
complete solution
◦ Elastic power-aware data-intensive Cloud
Benchmark code:
http://www.comp.nus.edu.sg/~epic/download/MapReduceBenchmark.tar.gz