An Experiment-Driven Performance Model of Stream
Processing Operators in Fog Computing Environments
Hamidreza Arkian (1), Guillaume Pierre (1), Johan Tordsson (2), Erik Elmroth (2)
(1) University of Rennes 1/IRISA, France
(2) Elastisys AB, Sweden
SAC’20 - March 30-April 3, 2020 - Brno, Czech Republic
2/16
IoT-to-Cloud basic architecture
3/16
Cloud-based stream processing
Apache Flink
4/16
Challenges
Apache Flink
Low Throughput!!
Low Bandwidth!!
Cost!!
Data streams generated continuously at a high rate
Latency-sensitive applications
5/16
Fog-based stream processing
6/16
Stream processing in Fog environment
[Figure: Logical graph of DSP (Source, Operators 1 to 4, Sink) and the corresponding Workflow execution model]
7/16
Stream processing in geo-distributed environments
[Figure: Logical graph of DSP (Source, Operators 1 to 4, Sink); Workflow execution model; Deployment in Fog geo-distributed environment (Sources, Operator 1, Op2 Replicas 1 to 3, Operator 3, Operator 4, Sink)]
8/16
Challenges
➢ Understanding the performance of a geo-distributed stream processing application is
difficult.
➢ Any configuration decision can have a significant impact on performance.
9/16
Experimental setup
➢ Emulation of a real fog platform
o 32-core server ≈ 16 fog nodes (2 cores/node)
o Emulated network latencies
o Apache Flink
➢ Test Application
o Input stream of 100,000 Tuple2 records
o The operator calls the Fibonacci function
Fib(24) upon every processed record
➢ Performance metric:
o Processing Time (PT)
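A minimal sketch of the test workload described above, written in Python rather than as an actual Flink job. The naive recursive `fib`, the pass-through `process_record` and the `process_stream` timing helper are illustrative assumptions; the slides only state that the operator calls Fib(24) on every record and that the metric is the processing time.

```python
import time
from typing import Iterable, List, Tuple

Record = Tuple[int, int]  # stand-in for a Flink Tuple2 record (assumption)


def fib(n: int) -> int:
    """Naive recursive Fibonacci, used only to burn a fixed amount of CPU."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def process_record(record: Record) -> Record:
    """Hypothetical operator body: constant CPU cost per record, record passed through."""
    fib(24)
    return record


def process_stream(records: Iterable[Record]) -> Tuple[List[Record], float]:
    """Process all records and return (outputs, processing time in seconds)."""
    start = time.perf_counter()
    outputs = [process_record(r) for r in records]
    return outputs, time.perf_counter() - start
```

Calling `process_stream` on 100,000 two-element tuples would mimic the input size from the slide on a single node; the absolute timings will not match the Flink experiments.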
10/16
Modeling operator replication
➢ n operator replicas should in principle process data n times faster than a single replica
➢ α represents the computation capacity of a single node.
➢ We can determine the value of α based on one measurement
[Figure: Experiment vs. Model]
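A minimal sketch of the one-parameter model, assuming the form PT(n) = 1 / (α · n). The slide's equation did not survive as text, so this form is an assumption chosen to match the bullets (n replicas are n times faster, α is the capacity of one node). The numbers in the usage note are illustrative, not measurements from the experiments.

```python
def calibrate_alpha(n: int, pt: float) -> float:
    """Derive alpha from a single measurement (n replicas, processing time pt).

    Assumed model: pt = 1 / (alpha * n).
    """
    if n < 1 or pt <= 0:
        raise ValueError("need n >= 1 and pt > 0")
    return 1.0 / (n * pt)


def predict_pt(alpha: float, n: int) -> float:
    """Predicted processing time with n replicas under the assumed model."""
    return 1.0 / (alpha * n)
```

For example, a hypothetical measurement of 200 s with one replica gives α = 0.005, and `predict_pt(0.005, 4)` then predicts 50 s with four replicas.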
11/16
Considering heterogeneous network delays
➢ Network delays between data sources and operator replicas slow down the whole system.
➢ When the network delays are heterogeneous, the dominating one is the greatest one (ND_max).
➢ γ represents the impact of network delays on overall performance.
➢ We can determine both α and γ based on two measurements
[Figure: Experiment vs. Model]
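A sketch of how α and γ could be obtained from two measurements, assuming the additive form PT = 1 / (α · n) + γ · ND_max. The form is an assumption consistent with the bullets, not the equation from the slide, and the function names are hypothetical. Each measurement is a tuple (n, ND_max, PT).

```python
from typing import Sequence, Tuple

Measurement = Tuple[int, float, float]  # (replicas, ND_max, processing time)


def nd_max(delays: Sequence[float]) -> float:
    """The dominating delay is the largest source-to-replica delay."""
    return max(delays)


def calibrate_alpha_gamma(m1: Measurement, m2: Measurement) -> Tuple[float, float]:
    """Solve the 2x2 linear system in a = 1/alpha and gamma (Cramer's rule)."""
    (n1, nd1, pt1), (n2, nd2, pt2) = m1, m2
    x1, x2 = 1.0 / n1, 1.0 / n2
    det = x1 * nd2 - x2 * nd1
    if abs(det) < 1e-12:
        raise ValueError("measurements are not independent enough")
    a = (pt1 * nd2 - pt2 * nd1) / det
    gamma = (x1 * pt2 - x2 * pt1) / det
    return 1.0 / a, gamma


def predict_pt(alpha: float, gamma: float, n: int, delays: Sequence[float]) -> float:
    """Predicted processing time under the assumed model."""
    return 1.0 / (alpha * n) + gamma * nd_max(delays)
```

The two measurements must differ in replica count or delay in a way that keeps the system solvable; two runs with the same n and the same ND_max carry no extra information.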
12/16
Improving the model’s accuracy
➢ Operator replication incurs some amount of parallelization inefficiency
➢ The speedup with n nodes is usually a little less than n
➢ 𝛽 represents Flink’s parallelization inefficiency
➢ We can determine α, 𝛽 and γ based on three or more measurements
[Figure: Experiment vs. Model]
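A sketch of fitting all three parameters from three or more measurements, assuming the form PT = 1 / (α · n^β) + γ · ND_max. Both the form and the fitting procedure (a grid search over β combined with linear least squares for 1/α and γ) are assumptions made for illustration; the slides do not say how the parameters were fitted.

```python
from typing import List, Optional, Sequence, Tuple

Measurement = Tuple[int, float, float]  # (replicas, ND_max, processing time)


def predict_pt(alpha: float, beta: float, gamma: float, n: int, nd: float) -> float:
    """Predicted processing time under the assumed three-parameter model."""
    return 1.0 / (alpha * n ** beta) + gamma * nd


def fit_model(data: Sequence[Measurement],
              beta_grid: Optional[List[float]] = None) -> Tuple[float, float, float]:
    """Return (alpha, beta, gamma) minimising the squared error over the grid."""
    if beta_grid is None:
        beta_grid = [i / 1000 for i in range(500, 1001)]  # beta in [0.5, 1.0]
    best = None
    for beta in beta_grid:
        xs = [n ** -beta for n, _, _ in data]
        ds = [nd for _, nd, _ in data]
        ys = [pt for _, _, pt in data]
        # Normal equations for y ~ a*x + g*d with a = 1/alpha, g = gamma.
        sxx = sum(x * x for x in xs)
        sxd = sum(x * d for x, d in zip(xs, ds))
        sdd = sum(d * d for d in ds)
        sxy = sum(x * y for x, y in zip(xs, ys))
        sdy = sum(d * y for d, y in zip(ds, ys))
        det = sxx * sdd - sxd * sxd
        if abs(det) < 1e-12:
            continue
        a = (sxy * sdd - sxd * sdy) / det
        g = (sxx * sdy - sxd * sxy) / det
        if a <= 0:
            continue  # a non-positive 1/alpha has no physical meaning here
        sse = sum((a * x + g * d - y) ** 2 for x, d, y in zip(xs, ds, ys))
        if best is None or sse < best[0]:
            best = (sse, a, beta, g)
    if best is None:
        raise ValueError("could not fit the model to these measurements")
    _, a, beta, g = best
    return 1.0 / a, beta, g
```

Restricting β to at most 1 encodes the bullet that the speedup with n nodes is a little less than n; the lower bound of 0.5 is an arbitrary choice for this sketch.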
13/16
Prediction accuracy
Accuracy metric: MAPE (mean absolute percentage error)
With 4 measurements, the model reaches a MAPE of 2.0%
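For reference, MAPE (mean absolute percentage error) can be computed as in the sketch below. The function name and the sample values in the usage note are illustrative and are not taken from the experiments.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

With measured processing times of 100 s and 200 s and predictions of 98 s and 204 s, each prediction is off by 2%, so the MAPE is 2%.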
14/16
What about modeling an entire (simple) workflow?
➢ The throughput of an entire workflow is determined by the slowest operator
Π_{Workflow} = max(Π_{Map+KeyBy}, Π_{Reduce})
[Figure: Experiment vs. Model, entire workflow]
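The composition rule above can be sketched as follows: the workflow's Π is the largest per-operator Π, and the operator that attains it is the bottleneck. The function name and the values in the usage note are hypothetical.

```python
from typing import Dict, Tuple


def workflow_pi(operator_pi: Dict[str, float]) -> Tuple[str, float]:
    """Return (bottleneck operator, workflow Pi) as the max over operators."""
    if not operator_pi:
        raise ValueError("need at least one operator")
    name = max(operator_pi, key=operator_pi.get)
    return name, operator_pi[name]
```

For instance, `workflow_pi({"Map+KeyBy": 120.0, "Reduce": 95.0})` returns `("Map+KeyBy", 120.0)`.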
15/16
Can we reuse the parameters instead of multiple measurements?
➢ 𝛼 cannot be reused because it is specific to the
computation complexity of one operator.
➢ β and γ capture properties that are independent of
the nature of the computation carried out by the
operator.
➢ β and γ values of one operator’s model might be reused
for other operators’ models.
Calibrated model for Operator 1: α1, β1, γ1
Uncalibrated model for Operator 2: α2, β1, γ1
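A sketch of the reuse idea: β1 and γ1 are taken from Operator 1's calibrated model, so a single measurement of Operator 2 is enough to derive α2. This assumes the form PT = 1 / (α · n^β) + γ · ND_max, which is an assumption and not the slide's equation; the function names are hypothetical.

```python
def predict_pt(alpha: float, beta: float, gamma: float, n: int, nd_max: float) -> float:
    """Predicted processing time under the assumed three-parameter model."""
    return 1.0 / (alpha * n ** beta) + gamma * nd_max


def calibrate_alpha_reusing(beta1: float, gamma1: float,
                            n: int, nd_max: float, pt: float) -> float:
    """Derive alpha2 from one measurement, reusing beta1 and gamma1."""
    compute_part = pt - gamma1 * nd_max  # time left after the network term
    if compute_part <= 0:
        raise ValueError("measurement is inconsistent with the reused gamma")
    return 1.0 / (compute_part * n ** beta1)
```

Once α2 is known, `predict_pt(alpha2, beta1, gamma1, n, nd_max)` gives predictions for Operator 2 at other replica counts and delays.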
16/16
Conclusions
➢ Heterogeneous network characteristics make it difficult to understand the
performance of stream processing engines in geo-distributed environments.
➢ A predictive performance model for Apache Flink operators that is backed by
experimental measurements and evaluations was proposed.
➢ The model predictions are accurate within ±2% of the actual values.
Hamidreza Arkian
hamidreza.arkian@irisa.fr
Acknowledgment
This work is part of a project that has received funding from the European Union’s
Horizon 2020 research and innovation programme under the Marie
Skłodowska-Curie grant agreement No 765452. The information and views set out
in this publication are those of the author(s) and do not necessarily reflect the
official opinion of the European Union. Neither the European Union institutions
and bodies nor any person acting on their behalf may be held responsible for the
use which may be made of the information contained therein.
Training the next generation of European
Fog computing experts
http://www.fogguru.eu/

 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

Experiment-Driven Performance Model of Stream Processing Operators in Fog

  • 1. An Experiment-Driven Performance Model of Stream Processing Operators in Fog Computing Environments. Hamidreza Arkian (1), Guillaume Pierre (1), Johan Tordsson (2), Erik Elmroth (2). (1) University of Rennes 1/IRISA, France; (2) Elastisys AB, Sweden. SAC’20 - March 30-April 3, 2020 - Brno, Czech Republic
  • 4. 4/16 Challenges (cloud-based stream processing with Apache Flink): low throughput, low bandwidth, and cost, with sources continuously generating high-rate data streams for latency-sensitive applications.
  • 6. 6/16 Stream processing in Fog environment [Figure: logical graph of DSP; workflow execution model]
  • 7. 7/16 Stream processing in geo-distributed environments [Figure: logical graph of DSP; workflow execution model; deployment in Fog geo-distributed environment, with Operator 2 deployed as Op2 Replica1, Op2 Replica2 and Op2 Replica3]
  • 8. 8/16 Challenges ➢ Understanding the performance of a geo-distributed stream processing application is difficult. ➢ Any configuration decision can have a significant impact on performance.
  • 9. 9/16 Experimental setup ➢ Emulation of a real fog platform o 32-core server ≈ 16 fog nodes (2 cores/node) o Emulated network latencies o Apache Flink ➢ Test Application o Input stream of 100,000 Tuple2 records o The operator calls the Fibonacci function Fib(24) upon every processed record ➢ Performance metric: o Processing Time (PT)
  • 10. 10/16 Modeling operator replication ➢ n operator replicas should in principle process data n times faster than a single replica. ➢ α represents the computation capacity of a single node. ➢ We can determine the value of α based on one measurement. [Figure: experiment vs. model]
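The model equation on this slide did not survive in the transcript. The sketch below assumes the simplest form consistent with the statement that n replicas process data n times faster, namely PT(n) = α / n, so that α equals the processing time of a single replica. The function names and the sample measurement are hypothetical and are not taken from the slides.

```python
def calibrate_alpha(n_replicas: int, measured_pt: float) -> float:
    """Derive alpha from one (n, PT) measurement, assuming PT = alpha / n."""
    if n_replicas < 1:
        raise ValueError("n_replicas must be >= 1")
    return measured_pt * n_replicas


def predict_pt(alpha: float, n_replicas: int) -> float:
    """Predict processing time for n replicas under the ideal-scaling assumption."""
    if n_replicas < 1:
        raise ValueError("n_replicas must be >= 1")
    return alpha / n_replicas


# Hypothetical measurement: 2 replicas needed 120 s to process the input stream.
alpha = calibrate_alpha(2, 120.0)
predictions = {n: predict_pt(alpha, n) for n in (1, 2, 4, 8)}
```

With a single calibration point, every other replica count is predicted by pure 1/n scaling, which is why one measurement is enough at this stage.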
  • 11. 11/16 Considering heterogeneous network delays ➢ Network delays between data sources and operator replicas slow down the whole system. ➢ When the network delays are heterogeneous, the dominating one is the greatest one (ND_max). ➢ γ represents the impact of network delays on overall performance. ➢ We can determine both α and γ based on two measurements. [Figure: experiment vs. model]
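The exact equation is not preserved in the transcript. As a sketch, assume an additive form PT = α / n + γ · ND_max, which is linear in α and γ, so two measurements give a 2x2 linear system. The additive form, the function names, and the sample numbers are assumptions made for illustration only.

```python
def calibrate_alpha_gamma(m1, m2):
    """Solve for (alpha, gamma) from two measurements (n, nd_max, pt).

    Assumes PT = alpha / n + gamma * nd_max (an assumed form, see lead-in).
    """
    (n1, d1, pt1), (n2, d2, pt2) = m1, m2
    a1, a2 = 1.0 / n1, 1.0 / n2
    det = a1 * d2 - a2 * d1  # determinant of the 2x2 system
    if abs(det) < 1e-12:
        raise ValueError("measurements do not determine alpha and gamma uniquely")
    alpha = (pt1 * d2 - pt2 * d1) / det  # Cramer's rule
    gamma = (a1 * pt2 - a2 * pt1) / det
    return alpha, gamma


def predict_pt(alpha, gamma, n_replicas, delays):
    """Predict PT; only the largest source-to-replica delay is used."""
    return alpha / n_replicas + gamma * max(delays)


# Hypothetical measurements: (replicas, largest delay in ms, processing time in s).
alpha, gamma = calibrate_alpha_gamma((1, 20.0, 250.0), (4, 100.0, 110.0))
pt_8 = predict_pt(alpha, gamma, 8, [10.0, 40.0, 25.0])
```

The two calibration runs need to differ in replica count or maximum delay; otherwise the system is singular and the sketch raises an error.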
  • 12. 12/16 Improving the model’s accuracy ➢ Operator replication incurs some amount of parallelization inefficiency. ➢ The speedup with n nodes is usually a little less than n. ➢ β represents Flink’s parallelization inefficiency. ➢ We can determine α, β and γ based on three or more measurements. [Figure: experiment vs. model]
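The slide's equation is missing from the transcript, so the following is only a sketch under an assumed form, PT = α / n^β + γ · ND_max, where β slightly below 1 models a speedup a little less than n. For a fixed β the model is linear in α and γ, so the sketch scans a grid of β values and solves a small least-squares problem for each. The fitting procedure, the grid, and all names and data are illustrative assumptions, not the authors' method.

```python
def fit_alpha_beta_gamma(measurements, beta_grid=None):
    """Fit (alpha, beta, gamma) to measurements [(n, nd_max, pt), ...].

    Assumes PT = alpha / n**beta + gamma * nd_max (an assumed form).
    """
    if len(measurements) < 3:
        raise ValueError("need at least three measurements")
    if beta_grid is None:
        beta_grid = [i / 1000 for i in range(500, 1001)]  # 0.500 .. 1.000
    best = None
    for beta in beta_grid:
        xs = [n ** -beta for n, _, _ in measurements]
        ds = [d for _, d, _ in measurements]
        ys = [pt for _, _, pt in measurements]
        # Normal equations of the least-squares fit y ~ alpha * x + gamma * d.
        sxx = sum(x * x for x in xs)
        sxd = sum(x * d for x, d in zip(xs, ds))
        sdd = sum(d * d for d in ds)
        sxy = sum(x * y for x, y in zip(xs, ys))
        sdy = sum(d * y for d, y in zip(ds, ys))
        det = sxx * sdd - sxd * sxd
        if abs(det) < 1e-12:
            continue
        alpha = (sxy * sdd - sxd * sdy) / det
        gamma = (sxx * sdy - sxd * sxy) / det
        err = sum((alpha * x + gamma * d - y) ** 2
                  for x, d, y in zip(xs, ds, ys))
        if best is None or err < best[0]:
            best = (err, alpha, beta, gamma)
    if best is None:
        raise ValueError("measurements do not constrain the model")
    return best[1], best[2], best[3]


# Synthetic data generated from alpha=240, beta=0.9, gamma=0.5 (hypothetical).
data = [(n, d, 240.0 / n ** 0.9 + 0.5 * d)
        for n, d in [(1, 20.0), (2, 50.0), (4, 100.0), (8, 10.0)]]
alpha, beta, gamma = fit_alpha_beta_gamma(data)
```

A grid scan keeps the example dependency-free; a nonlinear least-squares routine would serve the same purpose.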
  • 13. 13/16 Prediction accuracy. Accuracy metric: MAPE (mean absolute percentage error). With 4 measurements: 2.0% accuracy.
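For reference, MAPE is the mean of the absolute relative errors, expressed as a percentage. A minimal sketch follows; the sample values are hypothetical and are not the measurements from the talk.

```python
def mape(actual, predicted):
    """Mean absolute percentage error between measured and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical measured vs. predicted processing times (seconds).
measured = [100.0, 200.0, 50.0, 80.0]
modelled = [98.0, 204.0, 51.0, 80.0]
error_pct = mape(measured, modelled)
```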
  • 14. 14/16 What about modeling an entire (simple) workflow? ➢ The throughput of an entire workflow is determined by the slowest operator: $\Pi_{\text{Workflow}} = \max(\Pi_{\text{Map+KeyBy}}, \Pi_{\text{Reduce}})$ [Figure: experiment vs. workflow model]
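The max rule can be sketched as follows, reading Π as the predicted processing time of each operator (an interpretation, since the transcript does not define the symbol). The operator values are hypothetical.

```python
def workflow_pt(operator_pts):
    """Return (bottleneck operator, workflow PT) as the max over operators."""
    if not operator_pts:
        raise ValueError("at least one operator is required")
    bottleneck = max(operator_pts, key=operator_pts.get)
    return bottleneck, operator_pts[bottleneck]


# Hypothetical per-operator predictions in seconds.
predicted = {"Map+KeyBy": 95.0, "Reduce": 62.5}
bottleneck, total_pt = workflow_pt(predicted)
```

Returning the bottleneck's name alongside the value shows which operator would benefit from additional replicas.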
  • 15. 15/16 Can we reuse the parameters instead of multiple measurements? ➢ α cannot be reused because it is specific to the computation complexity of one operator. ➢ β and γ capture properties that are independent of the nature of the computation carried out by the operator. ➢ The β and γ values of one operator’s model might be reused for other operators’ models. Calibrated model for Operator 1: (α1, β1, γ1). Uncalibrated model for Operator 2: (α2, β1, γ1).
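Parameter reuse can be sketched as follows, again under an assumed model form PT = α / n^β + γ · ND_max that the transcript does not confirm. With β1 and γ1 carried over from Operator 1, a single measurement of Operator 2 is enough to solve for α2. All names and numbers are hypothetical.

```python
def calibrate_alpha_with_reuse(beta1, gamma1, n_replicas, nd_max, measured_pt):
    """Solve PT = alpha2 / n**beta1 + gamma1 * nd_max for alpha2."""
    if n_replicas < 1:
        raise ValueError("n_replicas must be >= 1")
    return (measured_pt - gamma1 * nd_max) * n_replicas ** beta1


def predict_pt(alpha, beta, gamma, n_replicas, nd_max):
    """Predict PT under the assumed three-parameter form."""
    return alpha / n_replicas ** beta + gamma * nd_max


# beta1 and gamma1 come from Operator 1's calibrated model (hypothetical values).
beta1, gamma1 = 0.9, 0.5
# One hypothetical measurement of Operator 2: 4 replicas, 100 ms largest delay.
pt_measured = 600.0 / 4 ** 0.9 + 0.5 * 100.0
alpha2 = calibrate_alpha_with_reuse(beta1, gamma1, 4, 100.0, pt_measured)
pt_8 = predict_pt(alpha2, beta1, gamma1, 8, 100.0)
```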
  • 16. 16/16 Conclusions ➢ Heterogeneous network characteristics make it difficult to understand the performance of stream processing engines in geo-distributed environments. ➢ A predictive performance model for Apache Flink operators that is backed by experimental measurements and evaluations was proposed. ➢ The model predictions are accurate within ±2% of the actual values. Hamidreza Arkian hamidreza.arkian@irisa.fr Acknowledgment This work is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 765452. The information and views set out in this publication are those of the author(s) and do not necessarily reflect the official opinion of the European Union. Neither the European Union institutions and bodies nor any person acting on their behalf may be held responsible for the use which may be made of the information contained therein. Training the next generation of European Fog computing experts http://www.fogguru.eu/