Mikhail Genkin
Performance Architect and Optimization Lead, IBM Spectrum Computing
Michael Feiman (presenter)
Principal Software Architect, IBM Spectrum Computing
Peter Lankford (presenter)
Founder & Director, STAC
Deep Dive Into Spark
Multi-User Performance
Outline
• Introduction
• Factors affecting multi-user performance
• Spark-RM integration architecture
• Analyzing multi-user performance with the Spark Multi-user Benchmark
• Break
• Data and insights
• Q&A
• References
Disclaimer
• STAC and all STAC names are trademarks or registered trademarks of the Securities Technology Analysis Center, LLC.
Introduction
• Most Spark production deployments are multi-user or multi-tenant
• Understanding factors that affect performance of Spark applications under
multi-user or multi-tenant conditions is very important
• In this session we focus on Spark performance with YARN, Mesos and IBM
Spectrum Conductor with Spark
• IBM contracted STAC to perform an independent evaluation of 3 common Spark resource manager configurations:
– Spark 2.0.2 (dev) + YARN 2.7.3
– Spark 2.0.1 + Mesos 1.0.1
– IBM Spectrum Conductor with Spark 2.1.0.1 (includes Spark 2.0.1)
• IBM views this as a first step in providing more data-driven community
engagement on Spark resource management
Factors Affecting Multi-User/Tenant Performance
• Resource manager configuration and behavior
– Resource manager controls:
• how many CPUs and how much memory a given app gets
• how many Spark applications will run concurrently
• these controls can have an order-of-magnitude impact on a given application's performance
• Spark configuration and tuning
– Some Spark tunables, such as spark.executor.memory, affect how many Spark executors and applications will run concurrently
• Number of concurrently running users and applications
– More concurrent applications means fewer resources per application and slower performance
• Cluster infrastructure
– Capacity and presence or absence of bottlenecks
• Data scale used by the applications
– More data usually results in slower performance
• Type of processing performed by the app
– Spark SQL vs. machine learning for example
• Others
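To make the link between spark.executor.memory and concurrency concrete, here is a minimal sketch of the arithmetic. The function name, the node sizes, and the overhead rule (10% of executor memory with a 384 MB floor, modeled on the Spark-on-YARN default) are assumptions for illustration, not figures from the benchmark.

```python
def executors_per_node(node_memory_mb, node_cores, executor_memory_mb,
                       executor_cores, overhead_percent=10, min_overhead_mb=384):
    """Estimate how many executors fit on one worker node.

    Assumption: each executor needs its heap (spark.executor.memory) plus an
    off-heap overhead. The 10% / 384 MB rule is modeled on Spark-on-YARN and
    may not match other resource managers.
    """
    overhead_mb = max(min_overhead_mb, executor_memory_mb * overhead_percent // 100)
    fit_by_memory = node_memory_mb // (executor_memory_mb + overhead_mb)
    fit_by_cores = node_cores // executor_cores
    # The scarcer resource decides how many executors can run side by side.
    return min(fit_by_memory, fit_by_cores)


# Hypothetical 64 GB, 16-core node with 2-core executors.
small = executors_per_node(65536, 16, executor_memory_mb=4096, executor_cores=2)
large = executors_per_node(65536, 16, executor_memory_mb=16384, executor_cores=2)
print(small, large)
```

With these hypothetical numbers, raising executor memory from 4 GB to 16 GB moves the limit from cores to memory and cuts the number of concurrent executors per node.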
Spark-Resource Manager Integration
Spark-RM Integration Architecture
• Interplay between Spark and Resource Manager configurations
• Spark determines:
– Executor memory
– Executor CPU
• Resource manager determines (directly and indirectly):
– How many CPUs and how much memory each application gets initially
– Whether applications can release or gain resources during execution (dynamic allocation)
– Whether resources can be taken from one application to boost another (pre-emption)
– How many Spark executors will be started concurrently
– How many Spark tasks will execute in parallel
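As an illustration of the Spark side of this interplay, the sketch below assembles the standard Spark properties that request dynamic allocation and estimates the resulting task parallelism. The helper names and the example values are hypothetical. What the resource manager actually grants depends on its own configuration.

```python
def dynamic_allocation_conf(min_executors, max_executors, executor_cores, executor_memory):
    """Build Spark properties that ask for dynamic allocation (values are examples)."""
    return {
        "spark.dynamicAllocation.enabled": "true",
        # Dynamic allocation normally relies on the external shuffle service.
        "spark.shuffle.service.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
        "spark.executor.cores": str(executor_cores),
        "spark.executor.memory": executor_memory,
    }


def to_submit_args(conf):
    """Render the properties as spark-submit '--conf key=value' arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


def max_parallel_tasks(executors, executor_cores, task_cpus=1):
    """Upper bound on concurrently running tasks (task_cpus mirrors spark.task.cpus)."""
    return executors * (executor_cores // task_cpus)


conf = dynamic_allocation_conf(2, 20, executor_cores=4, executor_memory="8g")
print(" ".join(to_submit_args(conf)))
print(max_parallel_tasks(executors=20, executor_cores=4))
```

The parallelism figure is only an upper bound: the resource manager decides how many of the requested executors actually start.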
Spark Multi-User Benchmark (SMB)
• Open-source benchmark: www.github.com/IBM/SparkMultiuserBenchmark
– Initially developed at IBM
• SMB is designed to measure performance under realistic workload conditions
– Simulates steady-state and non-steady-state job-arrival patterns
– Mix of TPC-DS-inspired queries and machine learning jobs
– Short-running synchronous interactive queries executed in parallel by multiple users
– Longer-running batch jobs and queries executed in parallel by multiple users
– Mixed workloads – interactive + batch – executed in parallel by multiple users
– Multi-tenant workloads – interactive + batch – executed in parallel by multiple users with different
weighting or QoS constraints
• How it can be used:
– Tune and optimize your Spark applications and cluster hardware infrastructure
– Study resource manager behavior and efficiency
SMB-1 and SMB-2
• Workload patterns:
1. Synchronous interactive workload
• SMB-1 and SMB-2
• Short-running queries – 5 to 20 sec duration
• TPC-DS-inspired queries
2. Asynchronous batch workload
• SMB-2
• Longer-running queries – 30 sec to 5 min duration
• TPC-DS-inspired queries
• K-means machine learning jobs
• Use cases:
1. Synchronous interactive multi-user
• SMB-1 and SMB-2
• Workload pattern 1 only
• All users have the same weight and QoS
2. Asynchronous batch multi-user
• SMB-2 only
• Workload pattern 2 only
• All users have the same weight and QoS
3. Mixed multi-user
• SMB-2 only
• Workload mix:
– 50% interactive users
– 50% batch users
• Same weight for all users
4. Mixed multi-tenant
• SMB-2 only
• Workload mix:
– 50% interactive users
– 50% batch users
• Interactive has priority
• Interactive can use up to 70% of resources
• Batch can use 100% of resources when interactive is not running
• Batch can use no more than 30% when interactive is running
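The Use Case 4 sharing rules can be written down as a small function. This is only a sketch of the policy as stated on the slide, using a hypothetical slot count. It does not show how any of the three resource managers implements or configures it.

```python
def tenant_caps(total_slots, interactive_running):
    """Return the maximum slots each tenant may hold under the Use Case 4 rules."""
    # Interactive has priority and may use up to 70% of the cluster.
    interactive_cap = total_slots * 70 // 100
    # Batch may use everything when interactive is idle, otherwise at most 30%.
    batch_cap = total_slots * 30 // 100 if interactive_running else total_slots
    return {"interactive": interactive_cap, "batch": batch_cap}


print(tenant_caps(100, interactive_running=True))
print(tenant_caps(100, interactive_running=False))
```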
SMB-2 Synchronous Interactive Workload Pattern
SMB-2 Asynchronous Batch Workload Pattern
Interpreting SMB Data
• Coarse metrics
– Total test duration
– Combined query throughput (qph)
– Average weighted Relative Standard Deviation (RSD)
• RSD = (Std. Dev. / Avg.) * 100
• Measures variability
• Detailed per-query metrics
– Query RSD
– Avg. query duration
– Standard deviation for query durations
– Plot of query durations for all queries vs. test duration (pattern shown in the figure)
[Figure: query durations over the test run, with step-up, step-down, and steady-state phases labeled]
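The coarse metrics above reduce to a few lines of arithmetic. The sketch below is one way to compute them. The function names are hypothetical, the sample standard deviation is an assumption (the slides do not say whether sample or population deviation is used), and the weights for the weighted average are left to the caller because the weighting scheme is not spelled out here.

```python
import statistics


def rsd_percent(durations):
    """Relative standard deviation: Std. Dev. / Avg. * 100."""
    return statistics.stdev(durations) / statistics.mean(durations) * 100


def throughput_qph(completed_queries, test_duration_hours):
    """Combined query throughput in queries per hour."""
    return completed_queries / test_duration_hours


def weighted_average(values, weights):
    """Weighted mean, e.g. of per-group RSDs (weights chosen by the caller)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Made-up durations in seconds, purely to exercise the functions.
print(rsd_percent([10, 20, 30]))
print(throughput_qph(300, 1.5))
print(weighted_average([40, 20], [1, 3]))
```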
Break – 10 min
Part 2 – Data and Insights
So what is STAC®?
• STAC facilitates the STAC Benchmark Council:
– ~300 financial firms and ~50 tech vendors
– Establishes standard technology benchmarks & promotes dialog
– Big data, fast data, fast compute, big compute
• STAC performs independent benchmark audits
Visualization – Use Case 1
[Figure: Use Case 1 (sync. interactive multi-user) - Job durations (log scale) for Spark Resource Managers. X axis: job start time (0:00 to 2:52). Y axis: job duration in seconds, log scale (1 to 10000). Series: Apache YARN v2.7.3, IBM Spectrum Conductor with Spark v2.1.0, Apache Mesos v1.0.1. Spark 2.0.1 (Spark v2.0.2 for YARN) / HDFS 2.7.3 / RHEL 7.1 / 11 x Lenovo x3630 M4 Servers. Lower is better. Values below 1 are not displayed.]
Throughput – Use Case 1
• IBM had 56% higher throughput than YARN and 57% higher than Mesos
• YARN and Mesos were nearly the same
Duration and throughput by use case (duration in hours, throughput in jobs/hour)
UC1 = Use Case 1: Synchronous interactive multi-user
UC2 = Use Case 2: Asynchronous batch multi-user
UC3 = Use Case 3: Mixed multi-user
UC4 = Use Case 4: Mixed multi-tenant

Run           UC1 hours   UC1 jobs/h  UC2 hours   UC2 jobs/h  UC3 hours   UC3 jobs/h  UC4 hours   UC4 jobs/h
IBM Spectrum Conductor with Spark v2.1.0
1             1.83        2623        1.53        327         1.29        908         2.05        574
2             1.84        2607        1.53        327         1.31        899         1.92        612
Worst result  1.84        2607        1.53        327         1.31        899         2.05        574
Apache YARN v2.7.3
1             2.87        1673        1.98        253         2.01        586         5.84        201
2             2.64        1816        1.58        316         2.02        582         6.63        177
Worst result  2.87        1673        1.98        253         2.02        582         6.63        177
Apache Mesos v1.0.1
1             2.86        1679        2.40        209         2.29        512         2.57        458
2             2.89        1660        2.47        202         2.46        478         2.30        511
Worst result  2.89        1660        2.47        202         2.46        478         2.57        458
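The percentage comparisons quoted on the throughput slides can be checked against the worst-result throughput figures in the table above. The sketch below assumes the three row groups belong to IBM, YARN, and Mesos in that order, which is the assignment that reproduces the quoted percentages. Because the table values are rounded, the recomputed figures can differ from the slide text by about a percentage point.

```python
# Worst-result throughput (jobs/hour) for Use Cases 1 to 4, taken from the table.
WORST_THROUGHPUT = {
    "IBM": [2607, 327, 899, 574],
    "YARN": [1673, 253, 582, 177],
    "Mesos": [1660, 202, 478, 458],
}


def pct_higher(a, b):
    """How much higher a is than b, in percent."""
    return (a / b - 1) * 100


for i in range(4):
    ibm = WORST_THROUGHPUT["IBM"][i]
    yarn = WORST_THROUGHPUT["YARN"][i]
    mesos = WORST_THROUGHPUT["Mesos"][i]
    print(f"Use Case {i + 1}: IBM vs YARN {pct_higher(ibm, yarn):.0f}%, "
          f"IBM vs Mesos {pct_higher(ibm, mesos):.0f}%")
```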
Variability – Use Case 1
• IBM had 4.3x lower average RSD than YARN and 10.9x lower than Mesos
• YARN had 2.6x lower average RSD than Mesos
Use Case 1: Synchronous interactive multi-user
(Avg = average job duration, SD = standard deviation of job duration, RSD = SD / Avg * 100)

Job                       IBM Avg   IBM SD    IBM RSD   YARN Avg  YARN SD   YARN RSD  Mesos Avg Mesos SD  Mesos RSD
Q19                       17        6         38%       20        50        245%      25        194       766%
Q42                       16        6         40%       19        49        263%      22        150       674%
Q52                       15        7         47%       30        230       777%      15        43        283%
Q55                       15        6         40%       19        49        264%      40        363       916%
Q63                       16        7         43%       17        19        111%      19        136       730%
Q68                       42        13        30%       62        72        116%      47        164       350%
Q73                       18        7         40%       22        21        97%       17        38        223%
Q98                       1         4         293%      2         9         560%      8         175       2278%
Average RSD                                   71%                           304%                          777%

IBM = IBM Spectrum Conductor with Spark v2.1.0, YARN = Apache YARN v2.7.3, Mesos = Apache Mesos v1.0.1
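The Use Case 1 averages and the "x lower" ratios can be recomputed from the per-query RSD values. In the sketch below the average is taken as a plain mean of the per-query RSDs. That is an assumption, though it matches the reported 71%, 304%, and 777% to within rounding.

```python
# Per-query RSD values (percent) for Use Case 1: Q19, Q42, Q52, Q55, Q63, Q68, Q73, Q98.
UC1_RSD = {
    "IBM": [38, 40, 47, 40, 43, 30, 40, 293],
    "YARN": [245, 263, 777, 264, 111, 116, 97, 560],
    "Mesos": [766, 674, 283, 916, 730, 350, 223, 2278],
}


def average_rsd(values):
    """Plain (unweighted) mean of per-query RSDs; assumed, not confirmed by the slides."""
    return sum(values) / len(values)


def times_lower(a, b):
    """How many times lower a is than b."""
    return b / a


averages = {name: average_rsd(values) for name, values in UC1_RSD.items()}
print(averages)
# Ratios computed from the reported (rounded) averages: 71%, 304%, 777%.
print(round(times_lower(71, 304), 1), round(times_lower(71, 777), 1),
      round(times_lower(304, 777), 1))
```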
Visualization – Use Case 2
[Figure: Use Case 2 (async. batch multi-user) - Job durations (log scale) for Spark Resource Managers. X axis: job start time (0:00 to 2:24). Y axis: job duration in seconds, log scale (1 to 10000). Series: Apache YARN v2.7.3, IBM Spectrum Conductor with Spark v2.1.0, Apache Mesos v1.0.1. Spark 2.0.1 (Spark v2.0.2 for YARN) / HDFS 2.7.3 / RHEL 7.1 / 11 x Lenovo x3630 M4 Servers. Lower is better. Values below 1 are not displayed.]
Throughput – Use Case 2
• IBM throughput was 30% higher than YARN and 62% higher than Mesos
• YARN throughput was 25% higher than Mesos
Variability – Use Case 2
• IBM had 1.3x lower average RSD than YARN and 2.0x lower than Mesos
• YARN had 1.5x lower average RSD than Mesos
Use Case 2: Asynchronous batch multi-user
(Avg = average job duration, SD = standard deviation of job duration, RSD = SD / Avg * 100)

Job                       IBM Avg   IBM SD    IBM RSD   YARN Avg  YARN SD   YARN RSD  Mesos Avg Mesos SD  Mesos RSD
KMS                       549       230       42%       400       167       42%       557       181.7     33%
Q3                        168       36        21%       490       177       36%       596       223.6     38%
Q53                       84        19        23%       201       74        37%       339       212.6     63%
Q8                        54        16        30%       123       37        30%       228       186.7     82%
Q89                       89        21        24%       219       82        38%       358       221.6     62%
Average RSD                                   28%                           36%                           55%

IBM = IBM Spectrum Conductor with Spark v2.1.0, YARN = Apache YARN v2.7.3, Mesos = Apache Mesos v1.0.1
Visualization – Use Case 3
[Figure: Use Case 3 (mixed multi-user) - Job durations (log scale) for Spark Resource Managers. X axis: job start time (0:00 to 2:24). Y axis: job duration in seconds, log scale (1 to 10000). Series: Apache YARN v2.7.3, IBM Spectrum Conductor with Spark v2.1.0, Apache Mesos v1.0.1. Spark 2.0.1 (Spark v2.0.2 for YARN) / HDFS 2.7.3 / RHEL 7.1 / 11 x Lenovo x3630 M4 Servers. Lower is better. Values below 1 are not displayed.]
Throughput – Use Case 3
• IBM throughput was 55% higher than YARN and 88% higher than Mesos
• YARN throughput was 22% higher than Mesos
Variability – Use Case 3
• IBM had 2.1x lower average RSD than YARN and 2.6x lower than Mesos
• YARN had 1.2x lower average RSD than Mesos
Use Case 3: Mixed multi-user
(Avg = average job duration, SD = standard deviation of job duration, RSD = SD / Avg * 100)

Job                       IBM Avg   IBM SD    IBM RSD   YARN Avg  YARN SD   YARN RSD  Mesos Avg Mesos SD  Mesos RSD
Q19                       22        7         31%       28        6         21%       25        35        141%
Q42                       20        5         24%       28        11        39%       23        24        101%
Q52                       20        5         25%       26        5         20%       21        16        74%
Q55                       21        5         26%       27        6         22%       21        15        69%
Q63                       21        6         28%       27        5         20%       27        61        225%
Q68                       58        10        18%       91        19        21%       77        57        74%
Q73                       26        6         23%       35        8         23%       28        18        64%
Q98                       2         3         180%      1         1         55%       1         1         112%
Average interactive RSD                       44%                           28%                           107%
KMS                       500       176       35%       1329      912       69%       686       273       40%
Q3                        255       43        17%       2294      1468      64%       694       246       35%
Q53                       115       19        17%       1233      937       76%       422       298       71%
Q8                        71        14        20%       1043      881       84%       310       227       73%
Q89                       117       19        17%       1253      1003      80%       445       290       65%
Average batch RSD                             21%                           75%                           57%
Weighted average RSD                          28%                           60%                           73%

Weighted average RSD covers interactive and batch jobs.
IBM = IBM Spectrum Conductor with Spark v2.1.0, YARN = Apache YARN v2.7.3, Mesos = Apache Mesos v1.0.1
Visualization – Use Case 4
[Figure: Use Case 4 (mixed multi-tenant) - Job durations (log scale) for Spark Resource Managers. X axis: time in test run (0:00 to 6:00). Y axis: job duration in seconds, log scale (1 to 10000). Series: Apache YARN v2.7.3, IBM Spectrum Conductor with Spark v2.1.0, Apache Mesos v1.0.1. Spark 2.0.1 (Spark v2.0.2 for YARN) / HDFS 2.7.3 / RHEL 7.1 / 11 x Lenovo x3630 M4 Servers. Lower is better. Values below 1 are not displayed.]
Throughput – Use Case 4
• IBM throughput was 224% higher than YARN and 25% higher than Mesos
• Mesos throughput was 158% higher than YARN
Variability – Use Case 4
• IBM had 2.5x lower average RSD than YARN and 2.7x lower than Mesos
• YARN had 1.1x lower average RSD than Mesos
Use Case 4: Mixed multi-tenant
(Avg = average job duration, SD = standard deviation of job duration, RSD = SD / Avg * 100)

Job                       IBM Avg   IBM SD    IBM RSD   YARN Avg  YARN SD   YARN RSD  Mesos Avg Mesos SD  Mesos RSD
Q19                       14        5         37%       14        27        187%      18        77        435%
Q42                       12        4         35%       12        23        193%      9         3         35%
Q52                       12        6         47%       11        19        169%      25        108       440%
Q55                       13        6         45%       11        17        159%      9         3         35%
Q63                       12        8         60%       13        21        166%      9         3         35%
Q68                       37        9         25%       40        48        120%      38        66        174%
Q73                       15        5         36%       13        19        142%      12        4         31%
Q98                       1         1         44%       1         2         158%      0         1         159%
Average interactive RSD                       41%                           162%                          168%
KMS                       472       167       35%       329       109       33%       601       258       43%
Q3                        705       117       17%       289       98        34%       1247      707       57%
Q53                       312       93        30%       184       102       55%       435       244       56%
Q8                        199       72        36%       155       92        59%       317       207       65%
Q89                       319       115       36%       199       113       57%       479       274       57%
Average batch RSD                             31%                           48%                           56%
Weighted average RSD                          34%                           84%                           92%

Weighted average RSD covers interactive and batch jobs.
IBM = IBM Spectrum Conductor with Spark v2.1.0, YARN = Apache YARN v2.7.3, Mesos = Apache Mesos v1.0.1
Results Summary
Relative Advantage of IBM Spectrum Conductor with Spark versus YARN and Mesos
Why?
• Dynamic allocation/de-allocation
– The resource manager should allow applications to both release and gain resources during execution, based on scheduling demand from the Spark driver
– IBM Spectrum Conductor with Spark supports both dynamic allocation and dynamic de-allocation
– YARN supports dynamic allocation but appeared to have issues with de-allocation (or re-allocation)
– Mesos supports both dynamic allocation and dynamic de-allocation but appeared to have issues with the offer mechanism and long-running jobs
• Pre-emption
– Pre-emption allows for more even resource distribution among applications, higher throughput and lower variation in performance
– IBM Spectrum Conductor with Spark includes a highly configurable and intelligent pre-emption implementation
– YARN supports pre-emption but it hurt rather than helped when we tried it
– The version of Mesos we used did not support pre-emption
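To illustrate why pre-emption evens out resource distribution, the toy calculation below works out how many slots each application would give up or gain to reach an equal share. It is a deliberately simplified sketch with hypothetical application names and slot counts. It is not the pre-emption algorithm of IBM Spectrum Conductor with Spark, YARN, or Mesos.

```python
def equal_share_adjustments(allocations, total_slots):
    """Slots each app must release (positive) or may gain (negative) for an equal share."""
    share = total_slots // len(allocations)
    return {app: held - share for app, held in allocations.items()}


# A long-running batch job holds the whole cluster when a new query arrives.
# Without pre-emption the query waits; with it, half the slots are reclaimed.
print(equal_share_adjustments({"long_batch": 100, "new_query": 0}, total_slots=100))
```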
Next steps
• Please provide feedback/contribute to the SMB
benchmarks
• Please use the SMB benchmarks and share
what you find
References
Spark Multi-User Benchmark (SMB):
www.github.com/IBM/SparkMultiuserBenchmark
STAC Report:
www.STACresearch.com/news/2017/05/19/IBM170405
SMB-2 Blog:
https://developer.ibm.com/code/2017/05/24/introducing-spark-multiuser-benchmark-version-2/
Results Blog:
https://www.ibm.com/developerworks/community/blogs/281605c9-7369-46dc-ad03-70d9ad377480/entry/Understanding_Multiuser_Spark_Application_Performance_Differences?lang=en
Spark-related materials at STAC:
www.STACresearch.com/spark
Q&A
Thank You.
Mikhail Genkin: genkin@ca.ibm.com
Peter Lankford: peter.lankford@STACresearch.com
Back-Up
What was STAC’s role?
• Provide feedback on the benchmark specs
– Note: This was not a STAC Benchmark™
• Confirm conformance of tests to the specs
• Inspect the system configurations
• Oversee the test execution
• Analyze the results
• Produce the STAC Report™ and detailed STAC
Configuration Disclosure
Limitations (as per STAC Report)
• Sensitivity to tuning
• Sensitivity to workload
• Difficulty of configuring Use Case 4
• Mesos/YARN/Spark configuration puzzles
• Others

Time Series Analytics with Spark: Spark Summit East talk by Simon Ouellette
 
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
Extreme Apache Spark: how in 3 months we created a pipeline that can process ...
 

Deep Dive Into Apache Spark Multi-User Performance Michael Feiman, Mikhail Genkin and Peter Lankford

  • 1. Mikhail Genkin Performance Architect and Optimization Lead, IBM Spectrum Computing Michael Feiman (presenter) Principal Software Architect, IBM Spectrum Computing Peter Lankford (presenter) Founder & Director, STAC Deep Dive Into Spark Multi-User Performance
  • 2. Outline • Introduction • Factors affecting multi-user performance • Spark-RM integration architecture • Analyzing multi-user performance with the Spark Multi-user Benchmark • Break • Data and insights • Q&A • References
  • 3. Disclaimer • STAC and all STAC names are trademarks or registered trademarks of the Securities Technology Analysis Center, LLC.
  • 4. Introduction • Most Spark production deployments are multi-user or multi-tenant • Understanding factors that affect performance of Spark applications under multi-user or multi-tenant conditions is very important • In this session we focus on Spark performance with YARN, Mesos and IBM Spectrum Conductor with Spark • IBM contracted STAC to perform an independent evaluation of 3 common Spark resource manager configurations: – Spark 2.0.2 (dev) + YARN 2.7.3 – Spark 2.0.1 + Mesos 1.0.1 – IBM Spectrum Conductor with Spark 2.1.0.1 (includes Spark 2.0.1) • IBM views this as a first step in providing more data-driven community engagement on Spark resource management
  • 5. Factors Affecting Multi-User/Tenant Performance • Resource manager configuration and behavior – Resource manager controls: • how many CPUs and how much memory a given app gets • how many Spark applications will run concurrently • can have order-of-magnitude impact on a given application's performance • Spark configuration and tuning – Some Spark tunables, such as spark.executor.memory, affect how many Spark executors and applications will run concurrently • Number of concurrently running users and applications – More concurrent applications means fewer resources per application and slower performance • Cluster infrastructure – Capacity and presence or absence of bottlenecks • Data scale used by the applications – More data usually results in slower performance • Type of processing performed by the app – Spark SQL vs. machine learning for example • Others
  • 7. Spark-RM Integration Architecture • Interplay between Spark and Resource Manager configurations • Spark determines: – Executor memory – Executor CPU • Resource manager determines (directly and indirectly): – How many CPUs and how much memory each application gets initially – Whether applications can release or gain resources during execution (dynamic allocation) – Whether resources can be taken from one application to boost another (pre-emption) – How many Spark executors will be started concurrently – How many Spark tasks will execute in parallel
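    The interplay described on this slide can be made concrete with some rough arithmetic. The sketch below is only an illustration: the node size, the executor sizes and the `overhead_mb` parameter are hypothetical values, not the configuration used in the benchmark, and real resource managers apply their own rounding and reservation rules. It shows how the executor memory and CPU chosen on the Spark side bound the number of executors a node can host, and therefore the number of tasks that can run in parallel.

```python
from dataclasses import dataclass


@dataclass
class Node:
    cores: int       # CPU cores offered to the resource manager (hypothetical)
    memory_mb: int   # memory offered to the resource manager (hypothetical)


def executors_per_node(node: Node, executor_cores: int,
                       executor_memory_mb: int, overhead_mb: int = 0) -> int:
    """Upper bound on executors per node, limited by the scarcer resource."""
    by_cpu = node.cores // executor_cores
    by_memory = node.memory_mb // (executor_memory_mb + overhead_mb)
    return min(by_cpu, by_memory)


def max_parallel_tasks(num_executors: int, executor_cores: int,
                       task_cpus: int = 1) -> int:
    """Each executor can run executor_cores // task_cpus tasks at once."""
    return num_executors * (executor_cores // task_cpus)


# Hypothetical 16-core, 64 GB node.
node = Node(cores=16, memory_mb=65536)

# Small executors are CPU-bound, large executors are memory-bound.
small = executors_per_node(node, executor_cores=2, executor_memory_mb=4096)
large = executors_per_node(node, executor_cores=2, executor_memory_mb=16384)

print(small, max_parallel_tasks(small, executor_cores=2))
print(large, max_parallel_tasks(large, executor_cores=2))
```

    Under these assumptions, quadrupling the executor memory halves the number of executors the node can host, which is the kind of effect the slide attributes to tunables such as spark.executor.memory.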
  • 8. Spark Multi-User Benchmark (SMB) • Open-source benchmark: www.github.com/IBM/SparkMultiuserBenchmark – Initially developed at IBM • SMB is designed to measure performance under very realistic workload conditions – Simulates steady-state, non-steady state job-arrival patterns – Mix of TPC-DS-inspired queries and machine learning jobs – Short-running synchronous interactive queries executed in parallel by multiple users – Longer-running batch jobs and queries executed in parallel by multiple users – Mixed workloads – interactive + batch – executed in parallel by multiple users – Multi-tenant workloads – interactive + batch – executed in parallel by multiple users with different weighting or QoS constraints • How it can be used: – Tune and optimize your Spark applications and cluster hardware infrastructure – Study resource manager behavior and efficiency
  • 9. SMB-1 and SMB-2 • Workload patterns: 1. Synchronous interactive workload • SMB-1 and SMB-2 • Short running queries – 5 to 20 sec duration • TPC-DS-inspired queries 2. Asynchronous batch workload • SMB-2 • Longer running queries – 30 sec to 5 min duration • TPC-DS-inspired queries • K-means machine learning jobs • Use cases: 1. Synchronous interactive multi-user • SMB-1 and SMB-2 • Workload pattern 1 only • All users have the same weight and QoS 2. Asynchronous batch multi-user • SMB-2 only • Workload pattern 2 only • All users have the same weight and QoS 3. Mixed multi-user • SMB-2 only • Workload mix: – 50% interactive users – 50% batch users • Same weight for all users 4. Mixed multi-tenant • SMB-2 only • Workload mix: – 50% interactive users – 50% batch users • Interactive has priority • Interactive can use up to 70% of resources • Batch can use 100% of resources when interactive is not running • Batch can use not more than 30% when interactive is running
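    The Use Case 4 sharing rules can be written down as a small function. This is a hedged sketch of the policy as stated on the slide, not of how YARN, Mesos or IBM Spectrum Conductor with Spark enforce it; the notion of "slots" and the function name `grant_slots` are hypothetical.

```python
def grant_slots(total_slots: int, interactive_demand: int,
                batch_demand: int) -> dict:
    """Split slots between tenants following the Use Case 4 limits.

    Interactive has priority and may use up to 70% of the resources.
    Batch may use 100% when interactive is idle, otherwise at most 30%.
    """
    interactive_cap = total_slots * 70 // 100
    interactive = min(interactive_demand, interactive_cap)

    if interactive == 0:
        batch_cap = total_slots
    else:
        batch_cap = total_slots * 30 // 100
    batch = min(batch_demand, batch_cap, total_slots - interactive)

    return {"interactive": interactive, "batch": batch}


print(grant_slots(100, interactive_demand=0, batch_demand=200))
print(grant_slots(100, interactive_demand=200, batch_demand=200))
print(grant_slots(100, interactive_demand=10, batch_demand=200))
```

    The third call illustrates a consequence of the stated limits: when interactive demand is light, batch is still capped at 30%, so part of the cluster stays idle unless the resource manager relaxes the cap.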
  • 10. SMB-2 Synchronous Interactive Workload Pattern
  • 11. SMB-2 Asynchronous Batch Workload Pattern
  • 12. Interpreting SMB Data • Coarse metrics – Total test duration – Combined query throughput (qph) – Average weighted Relative Standard Deviation (RSD) • RSD = Std.Dev./Avg. *100 • Measures variability • Detailed per-query metrics – Query RSD – Avg. query duration – Standard deviation for query durations – Plot of query durations for all queries vs. test duration [Figure: example plot showing step-up, steady-state and step-down phases]
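    A minimal sketch of the two simplest metrics on this slide, assuming a list of per-query durations is available. The slide does not say whether the population or the sample standard deviation is used, nor how the weighted average RSD is weighted, so the choice of `pstdev` here is an assumption and the weighted average is left out.

```python
from statistics import mean, pstdev


def rsd_percent(durations):
    """Relative Standard Deviation: Std.Dev. / Avg. * 100."""
    avg = mean(durations)
    return pstdev(durations) / avg * 100


def throughput_qph(num_queries, test_duration_hours):
    """Combined query throughput in queries per hour."""
    return num_queries / test_duration_hours


# Hypothetical durations (seconds) of one query across repeated executions.
steady = [15, 15, 15, 15]
noisy = [10, 20, 30]

print(rsd_percent(steady))           # no variation between executions
print(round(rsd_percent(noisy), 1))  # larger spread gives a higher RSD
print(throughput_qph(100, 2))
```

    A low RSD means a user sees roughly the same response time on every execution of a query; a high RSD means response times are unpredictable even if the average looks acceptable.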
  • 13. Break – 10 min Part 2 – Data and Insights
  • 14. So what is STAC®? • STAC facilitates the STAC Benchmark Council: – ~300 financial firms and ~50 tech vendors – Establishes standard technology benchmarks & promotes dialog – Big data, fast data, fast compute, big compute • STAC performs independent benchmark audits
  • 15. Visualization – Use Case 1 [Chart: Use Case 1 (sync. interactive multi-user), job durations in seconds (log scale) vs. job start time for Apache YARN v2.7.3, IBM Spectrum Conductor with Spark v2.1.0 and Apache Mesos v1.0.1. Spark 2.0.1 (Spark v2.0.2 for YARN) / HDFS 2.7.3 / RHEL 7.1 / 11 x Lenovo x3630 M4 servers. Lower is better; values below 1 are not displayed.]
  • 16. Throughput – Use Case 1 • IBM had 56% higher throughput than YARN and 57% higher than Mesos • YARN and Mesos were nearly the same
    Duration (h = hours) and throughput (j/h = jobs/hour) per run. UC1: Synchronous interactive multi-user; UC2: Asynchronous batch multi-user; UC3: Mixed multi-user; UC4: Mixed multi-tenant.
    Resource manager                          Run    UC1 h  UC1 j/h  UC2 h  UC2 j/h  UC3 h  UC3 j/h  UC4 h  UC4 j/h
    IBM Spectrum Conductor with Spark v2.1.0  1      1.83   2623     1.53   327      1.29   908      2.05   574
    IBM Spectrum Conductor with Spark v2.1.0  2      1.84   2607     1.53   327      1.31   899      1.92   612
    IBM Spectrum Conductor with Spark v2.1.0  Worst  1.84   2607     1.53   327      1.31   899      2.05   574
    Apache YARN v2.7.3                        1      2.87   1673     1.98   253      2.01   586      5.84   201
    Apache YARN v2.7.3                        2      2.64   1816     1.58   316      2.02   582      6.63   177
    Apache YARN v2.7.3                        Worst  2.87   1673     1.98   253      2.02   582      6.63   177
    Apache Mesos v1.0.1                       1      2.86   1679     2.40   209      2.29   512      2.57   458
    Apache Mesos v1.0.1                       2      2.89   1660     2.47   202      2.46   478      2.30   511
    Apache Mesos v1.0.1                       Worst  2.89   1660     2.47   202      2.46   478      2.57   458
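    The percentages quoted on this slide can be reproduced from the worst-result Use Case 1 throughputs, taking them as 2607 jobs/hour for IBM Spectrum Conductor with Spark, 1673 for YARN and 1660 for Mesos. The helper name `percent_higher` is hypothetical; the sketch only shows the arithmetic.

```python
def percent_higher(a: float, b: float) -> float:
    """How much higher a is than b, in percent."""
    return (a / b - 1) * 100


# Worst-result throughput for Use Case 1, in jobs/hour.
worst_uc1 = {"IBM": 2607, "YARN": 1673, "Mesos": 1660}

ibm_vs_yarn = round(percent_higher(worst_uc1["IBM"], worst_uc1["YARN"]))
ibm_vs_mesos = round(percent_higher(worst_uc1["IBM"], worst_uc1["Mesos"]))
yarn_vs_mesos = round(percent_higher(worst_uc1["YARN"], worst_uc1["Mesos"]))

print(ibm_vs_yarn, ibm_vs_mesos, yarn_vs_mesos)
```

    For the other use cases the same calculation can land a point away from the quoted figure, presumably because the table values are themselves rounded.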
  • 17. Variability – Use Case 1 • IBM had 4.3x lower average RSD than YARN and 10.9x lower than Mesos • YARN had 2.6x lower average RSD than Mesos
    Use Case 1 (synchronous interactive multi-user): Avg / Std Dev / RSD of job duration. Columns: IBM Spectrum Conductor with Spark v2.1.0, Apache YARN v2.7.3, Apache Mesos v1.0.1.
    Job     IBM Spectrum Conductor  Apache YARN             Apache Mesos
    Q19     17 / 6 / 38%            20 / 50 / 245%          25 / 194 / 766%
    Q42     16 / 6 / 40%            19 / 49 / 263%          22 / 150 / 674%
    Q52     15 / 7 / 47%            30 / 230 / 777%         15 / 43 / 283%
    Q55     15 / 6 / 40%            19 / 49 / 264%          40 / 363 / 916%
    Q63     16 / 7 / 43%            17 / 19 / 111%          19 / 136 / 730%
    Q68     42 / 13 / 30%           62 / 72 / 116%          47 / 164 / 350%
    Q73     18 / 7 / 40%            22 / 21 / 97%           17 / 38 / 223%
    Q98     1 / 4 / 293%            2 / 9 / 560%            8 / 175 / 2278%
    Average RSD: IBM 71%, YARN 304%, Mesos 777%
  • 18. Visualization – Use Case 2 [Chart: Use Case 2 (async. batch multi-user), job durations in seconds (log scale) vs. job start time for Apache YARN v2.7.3, IBM Spectrum Conductor with Spark v2.1.0 and Apache Mesos v1.0.1. Spark 2.0.1 (Spark v2.0.2 for YARN) / HDFS 2.7.3 / RHEL 7.1 / 11 x Lenovo x3630 M4 servers. Lower is better; values below 1 are not displayed.]
  • 19. Throughput – Use Case 2 • IBM throughput was 30% higher than YARN and 62% higher than Mesos • YARN throughput was 25% higher than Mesos
    Use Case 2 (asynchronous batch multi-user): duration in hours / throughput in jobs/hour
    Resource manager                          Run 1         Run 2         Worst result
    IBM Spectrum Conductor with Spark v2.1.0  1.53 / 327    1.53 / 327    1.53 / 327
    Apache YARN v2.7.3                        1.98 / 253    1.58 / 316    1.98 / 253
    Apache Mesos v1.0.1                       2.40 / 209    2.47 / 202    2.47 / 202
  • 20. Variability – Use Case 2 • IBM had 1.3x lower average RSD than YARN and 2.0x lower than Mesos • YARN had 1.5x lower average RSD than Mesos
    Use Case 2 (asynchronous batch multi-user): Avg / Std Dev / RSD of job duration. Columns: IBM Spectrum Conductor with Spark v2.1.0, Apache YARN v2.7.3, Apache Mesos v1.0.1.
    Job     IBM Spectrum Conductor  Apache YARN             Apache Mesos
    KMS     549 / 230 / 42%         400 / 167 / 42%         557 / 181.7 / 33%
    Q3      168 / 36 / 21%          490 / 177 / 36%         596 / 223.6 / 38%
    Q53     84 / 19 / 23%           201 / 74 / 37%          339 / 212.6 / 63%
    Q8      54 / 16 / 30%           123 / 37 / 30%          228 / 186.7 / 82%
    Q89     89 / 21 / 24%           219 / 82 / 38%          358 / 221.6 / 62%
    Average RSD: IBM 28%, YARN 36%, Mesos 55%
  • 21. Visualization – Use Case 3 [Chart: Use Case 3 (mixed multi-user), job durations in seconds (log scale) vs. job start time for Apache YARN v2.7.3, IBM Spectrum Conductor with Spark v2.1.0 and Apache Mesos v1.0.1. Spark 2.0.1 (Spark v2.0.2 for YARN) / HDFS 2.7.3 / RHEL 7.1 / 11 x Lenovo x3630 M4 servers. Lower is better; values below 1 are not displayed.]
  • 22. Throughput – Use Case 3 • IBM throughput was 55% higher than YARN and 88% higher than Mesos • YARN throughput was 22% higher than Mesos
    Use Case 3 (mixed multi-user): duration in hours / throughput in jobs/hour
    Resource manager                          Run 1         Run 2         Worst result
    IBM Spectrum Conductor with Spark v2.1.0  1.29 / 908    1.31 / 899    1.31 / 899
    Apache YARN v2.7.3                        2.01 / 586    2.02 / 582    2.02 / 582
    Apache Mesos v1.0.1                       2.29 / 512    2.46 / 478    2.46 / 478
  • 23. Variability – Use Case 3 • IBM had 2.1x lower average RSD than YARN and 2.6x lower than Mesos • YARN had 1.2x lower average RSD than Mesos
    Use Case 3 (mixed multi-user): Avg / Std Dev / RSD of job duration. Columns: IBM Spectrum Conductor with Spark v2.1.0, Apache YARN v2.7.3, Apache Mesos v1.0.1.
    Job     IBM Spectrum Conductor  Apache YARN             Apache Mesos
    Q19     22 / 7 / 31%            28 / 6 / 21%            25 / 35 / 141%
    Q42     20 / 5 / 24%            28 / 11 / 39%           23 / 24 / 101%
    Q52     20 / 5 / 25%            26 / 5 / 20%            21 / 16 / 74%
    Q55     21 / 5 / 26%            27 / 6 / 22%            21 / 15 / 69%
    Q63     21 / 6 / 28%            27 / 5 / 20%            27 / 61 / 225%
    Q68     58 / 10 / 18%           91 / 19 / 21%           77 / 57 / 74%
    Q73     26 / 6 / 23%            35 / 8 / 23%            28 / 18 / 64%
    Q98     2 / 3 / 180%            1 / 1 / 55%             1 / 1 / 112%
    Average interactive RSD: IBM 44%, YARN 28%, Mesos 107%
    KMS     500 / 176 / 35%         1329 / 912 / 69%        686 / 273 / 40%
    Q3      255 / 43 / 17%          2294 / 1468 / 64%       694 / 246 / 35%
    Q53     115 / 19 / 17%          1233 / 937 / 76%        422 / 298 / 71%
    Q8      71 / 14 / 20%           1043 / 881 / 84%        310 / 227 / 73%
    Q89     117 / 19 / 17%          1253 / 1003 / 80%       445 / 290 / 65%
    Average batch RSD: IBM 21%, YARN 75%, Mesos 57%
    Weighted average RSD (interactive & batch): IBM 28%, YARN 60%, Mesos 73%
  • 24. Visualization – Use Case 4 [Chart: Use Case 4 (mixed multi-tenant), job durations in seconds (log scale) vs. time in test run for Apache YARN v2.7.3, IBM Spectrum Conductor with Spark v2.1.0 and Apache Mesos v1.0.1. Spark 2.0.1 (Spark v2.0.2 for YARN) / HDFS 2.7.3 / RHEL 7.1 / 11 x Lenovo x3630 M4 servers. Lower is better; values below 1 are not displayed.]
  • 25. Throughput – Use Case 4 • IBM throughput was 224% higher than YARN and 25% higher than Mesos • Mesos throughput was 158% higher than YARN
    Use Case 4 (mixed multi-tenant): duration in hours / throughput in jobs/hour
    Resource manager                          Run 1         Run 2         Worst result
    IBM Spectrum Conductor with Spark v2.1.0  2.05 / 574    1.92 / 612    2.05 / 574
    Apache YARN v2.7.3                        5.84 / 201    6.63 / 177    6.63 / 177
    Apache Mesos v1.0.1                       2.57 / 458    2.30 / 511    2.57 / 458
  • 26. Variability – Use Case 4 • IBM had 2.5x lower average RSD than YARN and 2.7x lower than Mesos • YARN had 1.1x lower average RSD than Mesos
    Use Case 4 (mixed multi-tenant): Avg / Std Dev / RSD of job duration. Columns: IBM Spectrum Conductor with Spark v2.1.0, Apache YARN v2.7.3, Apache Mesos v1.0.1.
    Job     IBM Spectrum Conductor  Apache YARN             Apache Mesos
    Q19     14 / 5 / 37%            14 / 27 / 187%          18 / 77 / 435%
    Q42     12 / 4 / 35%            12 / 23 / 193%          9 / 3 / 35%
    Q52     12 / 6 / 47%            11 / 19 / 169%          25 / 108 / 440%
    Q55     13 / 6 / 45%            11 / 17 / 159%          9 / 3 / 35%
    Q63     12 / 8 / 60%            13 / 21 / 166%          9 / 3 / 35%
    Q68     37 / 9 / 25%            40 / 48 / 120%          38 / 66 / 174%
    Q73     15 / 5 / 36%            13 / 19 / 142%          12 / 4 / 31%
    Q98     1 / 1 / 44%             1 / 2 / 158%            0 / 1 / 159%
    Average interactive RSD: IBM 41%, YARN 162%, Mesos 168%
    KMS     472 / 167 / 35%         329 / 109 / 33%         601 / 258 / 43%
    Q3      705 / 117 / 17%         289 / 98 / 34%          1247 / 707 / 57%
    Q53     312 / 93 / 30%          184 / 102 / 55%         435 / 244 / 56%
    Q8      199 / 72 / 36%          155 / 92 / 59%          317 / 207 / 65%
    Q89     319 / 115 / 36%         199 / 113 / 57%         479 / 274 / 57%
    Average batch RSD: IBM 31%, YARN 48%, Mesos 56%
    Weighted average RSD (interactive & batch): IBM 34%, YARN 84%, Mesos 92%
  • 27. Results Summary Relative Advantage of IBM Spectrum Conductor for Spark versus YARN and Mesos
  • 28. Why? • Dynamic allocation/de-allocation – The resource manager should allow applications to both release and gain resources during execution, based on scheduling demand from the Spark driver – IBM Spectrum Conductor with Spark supports both dynamic allocation and dynamic de-allocation – YARN supports dynamic allocation but appeared to have issues with de-allocation (or re-allocation) – Mesos supports both dynamic allocation and dynamic de-allocation but appeared to have issues with the offer mechanism and long-running jobs • Pre-emption – Pre-emption allows for more even resource distribution among applications, higher throughput and lower variation in performance – IBM Spectrum Conductor with Spark includes a highly configurable and intelligent pre-emption implementation – YARN supports pre-emption but it hurt rather than helped when we tried it – The version of Mesos we used did not support pre-emption
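    On the Spark side, dynamic allocation is switched on through configuration properties. The sketch below assembles a set of such properties into `spark-submit` flags. The property names are standard Spark settings, but every value is a hypothetical placeholder and not the configuration used in the tests; the helper `to_submit_flags` is likewise only for illustration.

```python
def to_submit_flags(conf: dict) -> list:
    """Turn a property dict into spark-submit '--conf key=value' arguments."""
    flags = []
    for key in sorted(conf):
        flags += ["--conf", f"{key}={conf[key]}"]
    return flags


# Hypothetical values; tune for the cluster and resource manager in use.
dynamic_allocation_conf = {
    # Let the application grow and shrink its executor count at run time.
    "spark.dynamicAllocation.enabled": "true",
    # External shuffle service, which Spark's documentation lists as a
    # prerequisite so executors can be released without losing shuffle files.
    "spark.shuffle.service.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "1",
    "spark.dynamicAllocation.maxExecutors": "20",
    # Idle executors are released after this long (de-allocation).
    "spark.dynamicAllocation.executorIdleTimeout": "60s",
}

flags = to_submit_flags(dynamic_allocation_conf)
print(" ".join(flags))
```

    These settings only express what the application asks for. Whether released resources actually reach other applications promptly depends on the resource manager, which is the behaviour this slide contrasts.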
  • 29. Next steps • Please provide feedback/contribute to the SMB benchmarks • Please use the SMB benchmarks and share what you find
  • 30. References • Spark Multi-User Benchmark (SMB): www.github.com/IBM/SparkMultiuserBenchmark • STAC Report: www.STACresearch.com/news/2017/05/19/IBM170405 • SMB-2 Blog: https://developer.ibm.com/code/2017/05/24/introducing-spark-multiuser-benchmark-version-2/ • Results Blog: https://www.ibm.com/developerworks/community/blogs/281605c9-7369-46dc-ad03-70d9ad377480/entry/Understanding_Multiuser_Spark_Application_Performance_Differences?lang=en • Spark-related materials at STAC: www.STACresearch.com/spark
  • 31. Q&A
  • 32. Thank You. Mikhail Genkin: genkin@ca.ibm.com Peter Lankford: peter.lankford@STACresearch.com
  • 36. What was STAC’s role? • Provide feedback on the benchmark specs – Note: This was not a STAC Benchmark™ • Confirm conformance of tests to the specs • Inspect the system configurations • Oversee the test execution • Analyze the results • Produce the STAC Report™ and detailed STAC Configuration Disclosure
  • 37. Limitations (as per STAC Report) • Sensitivity to tuning • Sensitivity to workload • Difficulty of configuring Use Case 4 • Mesos/YARN/Spark configuration puzzles • Others