Micro-architectural behavior is generally consistent between batch and stream processing workloads in Spark when the two differ only in micro-batching. DataFrames show improved instruction retirement and fewer back-end bound stalls than RDDs. Higher data velocities improve CPU utilization and instruction retirement and reduce front-end bound stalls, at the cost of higher bandwidth consumption and more L1-bound stalls. The micro-batch size determines the micro-architectural behavior of stream workloads with similar Spark transformations.
Micro-architectural Characterization of Apache Spark on Batch and Stream Processing Workloads
1. 1
Ahsan Javed Awan
EMJD-DC (KTH-UPC)
(https://www.kth.se/profile/ajawan/)
Mats Brorsson(KTH), Eduard Ayguade(UPC and BSC),
Vladimir Vlassov(KTH)
2. 2
Motivation
Why should we care about architecture support?
*Taken from Babak's slides
Data Growing Faster Than Technology
3. 3
Motivation
Cont...
Our Goal
Improve the node level performance
through architecture support
*Source: http://navcode.info/2012/12/24/cloud-scaling-schemes/
Phoenix++, Metis, Ostrich, etc.
Hadoop, Spark, Flink, etc.
4. 4
Our Approach
● Performance characterization of in-memory data analytics on a
modern cloud server, in 5th International IEEE Conference on Big
Data and Cloud Computing, 2015 (Best Paper Award).
● How Data Volume Affects Spark Based Data Analytics on a
Scale-up Server, in 6th International Workshop on Big Data
Benchmarks, Performance Optimization and Emerging Hardware
(BPOE), held in conjunction with VLDB 2015, Hawaii, USA
– Limited to batch processing workloads only
– Does not consider the velocity aspect of big data
– Experiments are based on an older version of Spark
What are the major performance
bottlenecks?
5. 5
Our Approach
● Does micro-architectural performance remain consistent
across batch and stream processing workloads?
● How do DataFrames compare to RDDs at the micro-architectural level?
● How does data velocity affect micro-architectural performance?
What are the remaining questions?
6. 6
Which Scale-out Framework?
[Picture Courtesy: Amir H. Payberah]
● Tuning of Spark internal Parameters
● Tuning of JVM Parameters (heap size, etc.)
● Micro-architecture Level Analysis using Hardware Performance
Counters.
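As a rough illustration of the first two tuning steps, the sketch below assembles a `spark-submit` command line for a single-node run. The slides do not say which parameters were tuned or to what values, so the application name, core count, heap size, serializer choice, and GC flag here are all hypothetical placeholders. The helper `build_spark_submit` is likewise an illustrative name, not part of Spark.

```python
def build_spark_submit(app_path, cores, heap_gb, extra_jvm_flags=()):
    """Return a spark-submit argument list for a single-node (local mode) run.

    All values passed in are illustrative; the slides do not list the
    settings that were actually used.
    """
    args = [
        "spark-submit",
        "--master", f"local[{cores}]",     # local mode with `cores` worker threads
        "--driver-memory", f"{heap_gb}g",  # JVM heap size of the driver process
        # A Spark-internal parameter (assumed example): use Kryo serialization
        "--conf", "spark.serializer=org.apache.spark.serializer.KryoSerializer",
    ]
    if extra_jvm_flags:
        # Extra JVM options (for example the garbage collector) for the driver JVM
        args += ["--driver-java-options", " ".join(extra_jvm_flags)]
    args.append(app_path)
    return args


# Hypothetical values, chosen only to show the shape of the command
cmd = build_spark_submit(
    "wordcount.py", cores=24, heap_gb=50, extra_jvm_flags=["-XX:+UseParallelGC"]
)
print(" ".join(cmd))
```

In local mode the driver JVM also runs the tasks, which is why the heap size is set through `--driver-memory` in this sketch.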
14. 14
Cont..
Workload | Spark transformations                   | Input data rate | Window size (s) | Working set with 2 s sampling interval
WWc      | FlatMap, Map, ReduceByKeyAndWindow      | 10^4            | 30              | 15 x 10^4
CSpc     | FlatMap, Map, CountByValueAndWindow     | 10^4            | 10              | 5 x 10^4
CErpz    | FlatMap, Map, Window, GroupByKey        | 10^4            | 30              | 15 x 10^4
CAuC     | FlatMap, Map, Window, GroupByKey, Count | 10^4            | 10              | 5 x 10^4
Tpt      | FlatMap, ReduceByKeyAndWindow, Transform | 10^1           | 60              | 30 x 10^1
Micro-batch size determines the micro-architectural behavior of stream processing
workloads with similar Spark transformations
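The working-set column in the table above is consistent with a simple relation: input data rate times window size, divided by the 2 s sampling interval. The slides do not state this formula explicitly, so the sketch below treats it as an inference from the table values. The function name `working_set` is hypothetical.

```python
def working_set(input_rate, window_s, sampling_interval_s=2):
    """Working set inferred as rate * window / sampling interval.

    This relation is an assumption that reproduces the table values;
    the slides do not spell it out.
    """
    return input_rate * window_s // sampling_interval_s


# (workload, input data rate, window size in seconds, working set from the table)
rows = [
    ("WWc", 10**4, 30, 15 * 10**4),
    ("CSpc", 10**4, 10, 5 * 10**4),
    ("CErpz", 10**4, 30, 15 * 10**4),
    ("CAuC", 10**4, 10, 5 * 10**4),
    ("Tpt", 10**1, 60, 30 * 10**1),
]

for name, rate, window, expected in rows:
    computed = working_set(rate, window)
    print(name, computed, computed == expected)
```

Under this reading, WWc and CErpz share a working set, as do CSpc and CAuC, while Tpt has a much smaller working set than the other four.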
15. 15
Do DataFrames perform better than RDDs at the
micro-architectural level?
Compared to RDDs, DataFrames exhibit:
● 25% fewer back-end bound stalls
● 64% fewer DRAM-bound stalled cycles
● 25% less bandwidth consumption
● 10% less starvation of execution resources
DataFrames have better micro-architectural performance than RDDs
16. 16
How does data velocity affect micro-architectural
performance?
Better CPU utilization at higher data velocity
17. 17
Cont..
● Higher instruction retirement at higher data velocity
● Higher L1-bound stalls at higher data velocity
● Less starvation at higher data velocity
● Higher bandwidth consumption at higher data velocity
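Terms such as front-end bound, back-end bound, and instruction retirement match the categories of the Top-Down level-1 breakdown for 4-wide Intel cores. The slides do not say which method or counters were used, so the sketch below is only an assumption about how such fractions could be derived. The parameter names describe counter roles and the example counts are invented.

```python
def top_down_level1(cycles, uops_not_delivered, uops_issued,
                    uops_retired_slots, recovery_cycles, width=4):
    """Split pipeline slots into the four Top-Down level-1 categories.

    Assumes a core that can issue `width` micro-ops per cycle, so the
    total number of pipeline slots is width * cycles.
    """
    slots = width * cycles
    frontend_bound = uops_not_delivered / slots   # slots starved by the front end
    retiring = uops_retired_slots / slots         # slots doing useful work
    bad_speculation = (
        uops_issued - uops_retired_slots + width * recovery_cycles
    ) / slots                                     # slots wasted on wrong-path work
    # Whatever remains is attributed to back-end (memory or core) stalls
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)
    return {
        "frontend_bound": frontend_bound,
        "bad_speculation": bad_speculation,
        "retiring": retiring,
        "backend_bound": backend_bound,
    }


# Invented counter values, for illustration only
breakdown = top_down_level1(
    cycles=1_000_000,
    uops_not_delivered=800_000,
    uops_issued=1_700_000,
    uops_retired_slots=1_600_000,
    recovery_cycles=25_000,
)
for category, fraction in breakdown.items():
    print(f"{category}: {fraction:.2%}")
```

In this scheme, "starvation of execution resources" corresponds to the front-end bound share, and DRAM-bound or L1-bound stalls are deeper subdivisions of the back-end bound share.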
18. 18
Conclusion
● Batch processing and stream processing have the same micro-architectural
behavior in Spark if the only difference between the two implementations is
micro-batching.
● Spark workloads using DataFrames have improved instruction
retirement over workloads using RDDs.
● If the input data rates are small, stream processing workloads are
front-end bound. However, front-end bound stalls are reduced at
larger input data rates and instruction retirement is improved.
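To make "the only difference is micro-batching" concrete, the sketch below runs the same word count once over the whole input and once over fixed-size chunks whose partial results are merged. It is plain Python rather than Spark code, the function names are hypothetical, and it only shows that the computation is identical. It says nothing about hardware behavior.

```python
from collections import Counter


def batch_word_count(lines):
    """Count words over the whole input in one pass (batch style)."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def micro_batch_word_count(lines, batch_size):
    """Apply the same word count to fixed-size micro-batches and merge results."""
    total = Counter()
    for start in range(0, len(lines), batch_size):
        micro_batch = lines[start:start + batch_size]
        total += batch_word_count(micro_batch)  # same per-record work as batch mode
    return total


lines = ["spark stream batch", "batch spark", "stream stream spark"]
print(batch_word_count(lines) == micro_batch_word_count(lines, batch_size=2))
```

The per-record work is the same in both functions, and only the amount of data handled per invocation changes. This matches the role the slides assign to micro-batch size.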
20. 20
List of Papers
● Performance characterization of in-memory data analytics on a
modern cloud server, in 5th International IEEE Conference on Big Data
and Cloud Computing, 2015 (Best Paper Award).
● How Data Volume Affects Spark Based Data Analytics on a Scale-up
Server, in 6th International Workshop on Big Data Benchmarks,
Performance Optimization and Emerging Hardware (BPOE), held in
conjunction with VLDB 2015, Hawaii, USA.
● Micro-architectural Characterization of Apache Spark on Batch and
Stream Processing Workloads (accepted to BDCloud 2016).
● Node Architecture Implications for In-Memory Data Analytics in
Scale-in Clusters (accepted to IEEE BDCAT 2016).
● Implications of In-Memory Data Analytics with Apache Spark on Near
Data Computing Architectures (under submission).