The core idea behind Hadoop is to distribute both the data and user software on individual shards within the cluster. The Bigdata Replay method is drastically different in that it packs user software into batches on a single multicore machine and uses circuit emulation to maximize throughout when bringing data shards for replay. The effect from hotspots, defined as drastically higher access frequency to a small portion of (popular) data, is different in the two platforms. This paper models the difference numerically but in a relative form, which makes it possible to compare the two platforms.
2. Background on Hadoop
• Hadoop performance measurement
◦ creators on performance limits 09
◦ superlinear effect 08
◦ various benchmarks on Hadoop vs Spark 07
◦ inconsistencies in measurements 11
• Hadoop/MapReduce optimization in 14 and a ton of other papers
• the ”Do We (actually) Need Hadoop?” argument in 10 and few recent
papers
09 K.Shvachko+0 ”HDFS scalability: the limits to growth” Usenix Login (2010)
08 N.Gunther+2 ”Hadoop Superlinear Scalability” ACM Queue (2015)
07 J.Shi+6 ”Clash of the Titans: MapReduce vs. Spark for Large Scale Data Analytics” Very Large Data Bases (2015)
11 M.Xia+3 ”Performance Inconsistency in Large Scale Data Processing Clusters” 10th USENIX ICAC (2013)
14 A.Rasooli+1 ”COSHH: A Classiffication and Optimization based Scheduler for Heterogeneous Hadoop Systems” Future Gen.Comp.Sys. (2014)
10 A.Rowstron+1 ”Nobody ever got fired for using Hadoop on a cluster” 1st HotCDP (2012)
M.Zhanikeev – maratishe@gmail.com On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms – bit.do/170920 2/12
2/12
3. Modeling Hadoop Bottlenecks
Network
(NW)
Bulk
Storage
(BS)
Shared
Memory
(SM)
Core Output
Big Data Processing
HPC, Simulators, Modeling
Small
Data
Bulk
Storage
(BS)
On-Chip
Shared
Memory
(hSM)
Numberofparallelaccesses
Network
(NW)
Ability to isolate
Bottleneck
(pipe width)
RAM-based
Shared Memory
(sSM) Bulk
Storage
(BS)
Network
(NW)1
RAM-based
Shared Memory
(sSM)
Parallelaccesses
Ability to isolate
Core Output
Small
Data
M.Zhanikeev – maratishe@gmail.com On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms – bit.do/170920 3/12
3/12
4. Hadoop’s Answer: Rack Awareness
Rack
Switch
Datanode
Datanode
Datanode
…
Rack
Switch
Datanode
Datanode
……
Core
Switch
Client
Client
Logical
Client
Own Rack
Switch
Other Rack Switch
Other Rack Switch
Other Rack Switch
Datanodes
• official Hadoop feature
(not a bug) 12
• some dynamics, goes
off-rack when local
nodes have too many jobs
• sadly, manual
configuration of rack
affiliation (much potential here for
research on virtual network coordinates –
Meridian, Vivaldi...)
12 ”Hadoop: Rack Awareness” https://hadoop.apache.org (2017)
M.Zhanikeev – maratishe@gmail.com On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms – bit.do/170920 4/12
4/12
5. Hadoop vs Bigdata Replay Method
• basic idea similar to 10 but uses circuits 02 to transfer shards and multicore
01 to parallel-process them
Name Node
Storage Node (shard)
file A
file B
file C
…
Hadoop Space
Manager
Hadoop Job
(your code)
Hadoop Job
(your code)
Hadoop Job
(your code)
MapReduce
job (your code)
manymany
Name
Server(s)
Client Machine
Hadoop Client
Your
Code
You
Start Use
Deploy
FindRead/parse
many
Internals (DC)
Users
Storage Node
(shard)
Time-Aware
Sub-Store(s)
Manager
Client Machine
Client
Your
Sketcher
You
Start Use
Schedule
Multicore
Replay
Replay Node
many
10 A.Rowstron+1 ”Nobody ever got fired for using Hadoop on a cluster” 1st HotCDP (2012)
02 myself+0 ”Circuit Emulation for Big Data Transfers in Clouds” Networking for Big Data, CRC (2015)
01 myself+0 ”Streaming Algorithms for Big Data Processing on Multicore” Big Data: Algorithms, Analytics, and Applications, CRC (2015)
M.Zhanikeev – maratishe@gmail.com On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms – bit.do/170920 5/12
5/12
6. Replay Environment is Highly Flexible!
• replay is time-aligned, so jobs can pick any spot on the timeline
• similar to Spark in going beyond key-value datatype but more – the full scope
of streaming algorithms 01
• massively multicore environments 04 with 100+ cores, dynamic re-packing of
job batches, etc.
Core 1
Core 1
Core X
Replay
Manager
Now(replay)
….
Time-Aligned Big Data
Cursor
Time
Direction
One Sketch One SketchOne Sketch
Start End End End
Read/prepare
Shared Memory
Start
….
Time
Now
(buffer head)
Manager
Job
Job
Buffer
tail
pos
pos
Controller
Kill
2 Report
Manage
in realtime
One Replay Batch
One
Buffer
One
Buffer
One
BufferJobs
Jobs
Jobs
Replay at
a scale
1
01 myself+0 ”Streaming Algorithms for Big Data Processing on Multicore” Big Data: Algorithms, Analytics, and Applications, CRC (2015)
04 myself+0 ”Volume and Irregularity Effects on Massively Multicore Packet Processors” APNOMS (2016)
M.Zhanikeev – maratishe@gmail.com On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms – bit.do/170920 6/12
6/12
7. Performance under hotspots
M.Zhanikeev – maratishe@gmail.com On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms – bit.do/170920 7/12
7/12
8. The Hotspot Distribution
0 20 40 60 80 100
Decreasing order
0
0.35
0.7
1.05
1.4
1.75
2.1
2.45
2.8
log(value)
Class A Class B Class C Class D Class E
• models Flash/Hotspot/
Killerapp/Blackswan
events using extreme variance
in popularity
• generation method:
stick-breaking process,
Dirichlet distribution with
parallel beta sources 05
• final step: classify based on
the number of hot/flash items
05 myself+1 ”Popularity-Based Modeling of Flash Events in Synthetic Packet Traces” CQ研 (2012)
M.Zhanikeev – maratishe@gmail.com On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms – bit.do/170920 8/12
8/12
9. The Binary ”Till Contention” Metric
• not a common, but very realistic
way to model performance under load
• note: even more applicable under
hotspot-y input
Rack
Rack
Border
(switch)
Client
Data
Shards
Data
Shards
…
Volume
Contention
Contention -free
to contention -ful
threshold
• example: function of server response
time to load can be expressed as:
T =
1
2
[
(L − n) +
√
(L − n)2 + k
1 − L
]
• ...where T is response time, L is load,
and k is the knee = contention point!
M.Zhanikeev – maratishe@gmail.com On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms – bit.do/170920 9/12
9/12
10. Performance Models
• shard size as S and in-job traffic to shard size ratio r
◦ so, Hadoop jobs generate rS versus always strictly S under Replay
• contention threshold as C (for both contention and/or capacity)
• list of shard hotness (popularity)
{
h1, h2, h3, ..., hn
}
and sizes{
S1, S2, S3, ...., Sn
}
• then we have (job/traffic) volume for Hadoop:
Vhadoop =
∑
i=1..n
rhiSi
• ... and for Replay method:
Vreplay =
∑
i=1..n
Si (1)
M.Zhanikeev – maratishe@gmail.com On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms – bit.do/170920 10/12
10/12
11. Results
A 0.001 B 0.001 C 0.001 D 0.001 E 0.001
hadoopreplay
A 0.005 B 0.005 C 0.005 D 0.005 E 0.005
A 0.01 B 0.01 C 0.01 D 0.01 E 0.01
A 0.05 B 0.05 C 0.05 D 0.05 E 0.05
A 0.1 B 0.1 C 0.1 D 0.1 E 0.1
A 0.2 B 0.2 C 0.2 D 0.2 E 0.2
10
20
50
100
200
500
1000
2000
5000
10000
0.7
1.4
2.1
2.8
3.5
4.2
log(1+timetillcontention)
A 0.5 B 0.5 C 0.5 D 0.5 E 0.5
Replay period (step) is 10
A 0.001 B 0.001 C 0.001 D 0.001 E 0.001
hadoopreplay
A 0.005 B 0.005 C 0.005 D 0.005 E 0.005
A 0.01 B 0.01 C 0.01 D 0.01 E 0.01
A 0.05 B 0.05 C 0.05 D 0.05 E 0.05
A 0.1 B 0.1 C 0.1 D 0.1 E 0.1
A 0.2 B 0.2 C 0.2 D 0.2 E 0.2
10
20
50
100
200
500
1000
2000
5000
10000
0.8
1.6
2.4
3.2
4
4.8
log(1+timetillcontention)
A 0.5 B 0.5 C 0.5 D 0.5 E 0.5
Replay period (step) is 50
A 0.001 B 0.001 C 0.001 D 0.001 E 0.001
hadoopreplay
A 0.005 B 0.005 C 0.005 D 0.005 E 0.005
A 0.01 B 0.01 C 0.01 D 0.01 E 0.01
A 0.05 B 0.05 C 0.05 D 0.05 E 0.05
A 0.1 B 0.1 C 0.1 D 0.1 E 0.1
A 0.2 B 0.2 C 0.2 D 0.2 E 0.2
10
20
50
100
200
500
1000
2000
5000
10000
0.9
1.8
2.7
3.6
4.5
5.4
log(1+timetillcontention)
A 0.5 B 0.5 C 0.5 D 0.5 E 0.5
Replay period (step) is 200
M.Zhanikeev – maratishe@gmail.com On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms – bit.do/170920 11/12
11/12
12. That’s all, thank you ...
M.Zhanikeev – maratishe@gmail.com On Performance Under Hotspots in Hadoop versus Bigdata Replay Platforms – bit.do/170920 12/12
12/12