SlideShare uma empresa Scribd logo
1 de 18
Baixar para ler offline
Workflow Overhead Analysis
    and Optimizations
     Weiwei Chen, Ewa Deelman
        Information Sciences Institute
       University of Southern California
          {wchen,deelman}@isi.edu
      WORKS11, Nov 14 2011, Seattle WA
Outline

•    Introduction
•    Overhead modeling
•    Cumulative overhead
•    Experiments and evaluations
•    Conclusions and future work
Introduction
• Workflow Optimization
   • Scheduling
   • Reducing Runtime
   • Reducing and Overlapping
   Overheads
• Overheads
• Benefits
   • Workflow modeling and
   simulation                   Fig	
  1	
  System	
  Overview	
  
   • Performance evaluation
   • New optimization methods
Outline


•    Introduction
•    Overhead modeling
•    Cumulative overhead
•    Experiments and evaluations
•    Conclusions and future work
Modeling Overheads




Job Events:                                                 Workflow Events:
•    Job Release
                            Workflow engine delay •  Workflow Engine Start
•    Job Submit                                                                                                             Makespan
                            Queue delay           •  Workflow Engine Finished
•    Job Execute
                            Runtime
•    Job Terminate
•    Postscript Start
                            Postscript delay
•    Postscript Terminate


                                           	
  1	
  h1p://pegasus.isi.edu/wms/docs/3.1/monitoring_	
  debugging_stats.php#ploAng_staBsBcs	
  
Outline

•    Introduction
•    Overhead modeling
•    Cumulative overhead
•    Experiments and evaluations
•    Conclusions and future work
Cumulative Overhead (O1)
•  O1 simply adds up a similar type of
overheads of all jobs.




     O1(workflow engine delay)=10+10+10=30
     O1(queue delay)=10+20+10=40
     O1(data transfer delay)=10
     O1(postscript delay)=10+20+10=40
Cumulative Overhead (O2)
•  O2 subtracts from O1 the overlaps of the same type of
overhead.




  O2(workflow engine delay)=20	
     O2(data transfer delay)=10.
  O2(queue delay)=30.                O2(postscript delay)=40
Cumulative Overhead (O3)
    •  O3 subtracts the overlap of dissimilar overheads from O2




O3(workflow engine delay)=20	
     O3(data transfer delay)=10
O3(queue delay)=20	
               O3(postscript delay)=30	
  
Outline

•    Introduction
•    Overhead modeling
•    Cumulative overhead
•    Experiments and evaluations
•    Conclusions and future work
Experiments
•  Environments:
          •  Amazon EC2                                                        •  HPCC
          •  FutureGrid                                                        •  Other clusters
•  Applications:
          •          Biology: Epigenomics, Proteomics, SIPHT
          •          Earthquake science: Broadband, CyberShake
          •          Astronomy: Montage
          •          Physics: LIGO
•  Optimizations:
          •          Job Clustering                                            •  Data Pre-Staging
          •          Resource Provisioning                                     •  Throttling

Data	
  are	
  available	
  at	
  h1p://pegasus.isi.edu/workflow_gallery/	
  
Experiments
Distribution of Overheads
Job Clustering
•  Merging small jobs into a clustered job	
  




        without with            without with            without    with
        clustering clustering   clustering clustering   clustering clustering
   Percentage(%)=cumulative overhead(seconds) / makspan(seconds)

   With job clustering, the cumulative overheads decrease
   greatly due to the decreased number of jobs.
Resource Provisioning
         •  Deploy pilot jobs as placeholders	
  




  with         without        with         without        with         without
  provisioning provisioning   provisioning provisioning   provisioning provisioning



O3 and O2 have shown more obviously that the
portion of runtime has been increased than O1.
Outline

•    Introduction
•    Overhead modeling
•    Cumulative overhead
•    Experiments and evaluations
•    Conclusions and future work
Conclusions and Future Work

Conclusions
  •    Overhead Analysis
  •    A complete view of these three metrics

Future Work
  •  More optimization methods.
  •  Dynamic provisioning
Q&A

•  Pegasus Group: http://pegasus.isi.edu/
•  FutureGrid: https://portal.futuregrid.org/
•  Scripts are available at
      http://isi.edu/~wchen/techniques.html
•  Data are available at
      http://pegasus.isi.edu/workflow_gallery/

Mais conteúdo relacionado

Mais procurados

Spark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van HovellSpark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van HovellSpark Summit
 
Your Guide to Streaming - The Engineer's Perspective
Your Guide to Streaming - The Engineer's PerspectiveYour Guide to Streaming - The Engineer's Perspective
Your Guide to Streaming - The Engineer's PerspectiveIlya Ganelin
 
Briefing - Dynamic Workers for Scheduling
Briefing - Dynamic Workers for SchedulingBriefing - Dynamic Workers for Scheduling
Briefing - Dynamic Workers for SchedulingBernie Chiu
 
Spark Summit EU talk by Elena Lazovik
Spark Summit EU talk by Elena LazovikSpark Summit EU talk by Elena Lazovik
Spark Summit EU talk by Elena LazovikSpark Summit
 
Spark Summit EU talk by Kent Buenaventura and Willaim Lau
Spark Summit EU talk by Kent Buenaventura and Willaim LauSpark Summit EU talk by Kent Buenaventura and Willaim Lau
Spark Summit EU talk by Kent Buenaventura and Willaim LauSpark Summit
 
Spark Summit EU talk by Oscar Castaneda
Spark Summit EU talk by Oscar CastanedaSpark Summit EU talk by Oscar Castaneda
Spark Summit EU talk by Oscar CastanedaSpark Summit
 
Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...
Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...
Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...Spark Summit
 
Object Detection with Transformers
Object Detection with TransformersObject Detection with Transformers
Object Detection with TransformersDatabricks
 
Ray: Enterprise-Grade, Distributed Python
Ray: Enterprise-Grade, Distributed PythonRay: Enterprise-Grade, Distributed Python
Ray: Enterprise-Grade, Distributed PythonDatabricks
 
Optimizing the Catalyst Optimizer for Complex Plans
Optimizing the Catalyst Optimizer for Complex PlansOptimizing the Catalyst Optimizer for Complex Plans
Optimizing the Catalyst Optimizer for Complex PlansDatabricks
 
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache ZeppelinMoon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache ZeppelinFlink Forward
 
“TinyML Isn’t Thinking Big Enough,” a Presentation from Perceive
“TinyML Isn’t Thinking Big Enough,” a Presentation from Perceive“TinyML Isn’t Thinking Big Enough,” a Presentation from Perceive
“TinyML Isn’t Thinking Big Enough,” a Presentation from PerceiveEdge AI and Vision Alliance
 
Deploying MLlib for Scoring in Structured Streaming with Joseph Bradley
Deploying MLlib for Scoring in Structured Streaming with Joseph BradleyDeploying MLlib for Scoring in Structured Streaming with Joseph Bradley
Deploying MLlib for Scoring in Structured Streaming with Joseph BradleyDatabricks
 
Spark Summit EU talk by Ahsan Javed Awan
Spark Summit EU talk by Ahsan Javed AwanSpark Summit EU talk by Ahsan Javed Awan
Spark Summit EU talk by Ahsan Javed AwanSpark Summit
 
Building Robust Pipelines with Airflow
Building Robust Pipelines with AirflowBuilding Robust Pipelines with Airflow
Building Robust Pipelines with AirflowErin Shellman
 
Machine Learning as a Service: Apache Spark MLlib Enrichment and Web-Based Co...
Machine Learning as a Service: Apache Spark MLlib Enrichment and Web-Based Co...Machine Learning as a Service: Apache Spark MLlib Enrichment and Web-Based Co...
Machine Learning as a Service: Apache Spark MLlib Enrichment and Web-Based Co...Databricks
 
Spark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
Spark Summit EU talk by Mikhail Semeniuk Hollin WilkinsSpark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
Spark Summit EU talk by Mikhail Semeniuk Hollin WilkinsSpark Summit
 
Spark Autotuning: Spark Summit East talk by Lawrence Spracklen
Spark Autotuning: Spark Summit East talk by Lawrence SpracklenSpark Autotuning: Spark Summit East talk by Lawrence Spracklen
Spark Autotuning: Spark Summit East talk by Lawrence SpracklenSpark Summit
 
Choose Your Weapon: Comparing Spark on FPGAs vs GPUs
Choose Your Weapon: Comparing Spark on FPGAs vs GPUsChoose Your Weapon: Comparing Spark on FPGAs vs GPUs
Choose Your Weapon: Comparing Spark on FPGAs vs GPUsDatabricks
 

Mais procurados (20)

Spark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van HovellSpark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van Hovell
 
Your Guide to Streaming - The Engineer's Perspective
Your Guide to Streaming - The Engineer's PerspectiveYour Guide to Streaming - The Engineer's Perspective
Your Guide to Streaming - The Engineer's Perspective
 
Briefing - Dynamic Workers for Scheduling
Briefing - Dynamic Workers for SchedulingBriefing - Dynamic Workers for Scheduling
Briefing - Dynamic Workers for Scheduling
 
Spark Summit EU talk by Elena Lazovik
Spark Summit EU talk by Elena LazovikSpark Summit EU talk by Elena Lazovik
Spark Summit EU talk by Elena Lazovik
 
Spark Summit EU talk by Kent Buenaventura and Willaim Lau
Spark Summit EU talk by Kent Buenaventura and Willaim LauSpark Summit EU talk by Kent Buenaventura and Willaim Lau
Spark Summit EU talk by Kent Buenaventura and Willaim Lau
 
Spark Summit EU talk by Oscar Castaneda
Spark Summit EU talk by Oscar CastanedaSpark Summit EU talk by Oscar Castaneda
Spark Summit EU talk by Oscar Castaneda
 
Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...
Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...
Apache Spark the Hard Way: Challenges with Building an On-Prem Spark Analytic...
 
Object Detection with Transformers
Object Detection with TransformersObject Detection with Transformers
Object Detection with Transformers
 
Ray: Enterprise-Grade, Distributed Python
Ray: Enterprise-Grade, Distributed PythonRay: Enterprise-Grade, Distributed Python
Ray: Enterprise-Grade, Distributed Python
 
Optimizing the Catalyst Optimizer for Complex Plans
Optimizing the Catalyst Optimizer for Complex PlansOptimizing the Catalyst Optimizer for Complex Plans
Optimizing the Catalyst Optimizer for Complex Plans
 
Univa Presentation at DAC 2020
Univa Presentation at DAC 2020 Univa Presentation at DAC 2020
Univa Presentation at DAC 2020
 
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache ZeppelinMoon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
Moon soo Lee – Data Science Lifecycle with Apache Flink and Apache Zeppelin
 
“TinyML Isn’t Thinking Big Enough,” a Presentation from Perceive
“TinyML Isn’t Thinking Big Enough,” a Presentation from Perceive“TinyML Isn’t Thinking Big Enough,” a Presentation from Perceive
“TinyML Isn’t Thinking Big Enough,” a Presentation from Perceive
 
Deploying MLlib for Scoring in Structured Streaming with Joseph Bradley
Deploying MLlib for Scoring in Structured Streaming with Joseph BradleyDeploying MLlib for Scoring in Structured Streaming with Joseph Bradley
Deploying MLlib for Scoring in Structured Streaming with Joseph Bradley
 
Spark Summit EU talk by Ahsan Javed Awan
Spark Summit EU talk by Ahsan Javed AwanSpark Summit EU talk by Ahsan Javed Awan
Spark Summit EU talk by Ahsan Javed Awan
 
Building Robust Pipelines with Airflow
Building Robust Pipelines with AirflowBuilding Robust Pipelines with Airflow
Building Robust Pipelines with Airflow
 
Machine Learning as a Service: Apache Spark MLlib Enrichment and Web-Based Co...
Machine Learning as a Service: Apache Spark MLlib Enrichment and Web-Based Co...Machine Learning as a Service: Apache Spark MLlib Enrichment and Web-Based Co...
Machine Learning as a Service: Apache Spark MLlib Enrichment and Web-Based Co...
 
Spark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
Spark Summit EU talk by Mikhail Semeniuk Hollin WilkinsSpark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
Spark Summit EU talk by Mikhail Semeniuk Hollin Wilkins
 
Spark Autotuning: Spark Summit East talk by Lawrence Spracklen
Spark Autotuning: Spark Summit East talk by Lawrence SpracklenSpark Autotuning: Spark Summit East talk by Lawrence Spracklen
Spark Autotuning: Spark Summit East talk by Lawrence Spracklen
 
Choose Your Weapon: Comparing Spark on FPGAs vs GPUs
Choose Your Weapon: Comparing Spark on FPGAs vs GPUsChoose Your Weapon: Comparing Spark on FPGAs vs GPUs
Choose Your Weapon: Comparing Spark on FPGAs vs GPUs
 

Semelhante a Overhead Supercomputing 2011

Predicting Optimal Parallelism for Data Analytics
Predicting Optimal Parallelism for Data AnalyticsPredicting Optimal Parallelism for Data Analytics
Predicting Optimal Parallelism for Data AnalyticsDatabricks
 
PAC 2019 virtual Alexander Podelko
PAC 2019 virtual Alexander Podelko PAC 2019 virtual Alexander Podelko
PAC 2019 virtual Alexander Podelko Neotys
 
Beyond DevOps - How Netflix Bridges the Gap
Beyond DevOps - How Netflix Bridges the GapBeyond DevOps - How Netflix Bridges the Gap
Beyond DevOps - How Netflix Bridges the GapJosh Evans
 
Distributed Model Validation with Epsilon
Distributed Model Validation with EpsilonDistributed Model Validation with Epsilon
Distributed Model Validation with EpsilonSina Madani
 
Alex mang patterns for scalability in microsoft azure application
Alex mang   patterns for scalability in microsoft azure applicationAlex mang   patterns for scalability in microsoft azure application
Alex mang patterns for scalability in microsoft azure applicationCodecamp Romania
 
Adventures in Azure Machine Learning from NE Bytes
Adventures in Azure Machine Learning from NE BytesAdventures in Azure Machine Learning from NE Bytes
Adventures in Azure Machine Learning from NE BytesDerek Graham
 
Webinar: How We Evaluated MongoDB as a Relational Database Replacement
Webinar: How We Evaluated MongoDB as a Relational Database ReplacementWebinar: How We Evaluated MongoDB as a Relational Database Replacement
Webinar: How We Evaluated MongoDB as a Relational Database ReplacementMongoDB
 
Adam shiwa summerschool 2012
Adam shiwa summerschool 2012Adam shiwa summerschool 2012
Adam shiwa summerschool 2012aszbel
 
Automated functional size measurement for three tier object relational mappin...
Automated functional size measurement for three tier object relational mappin...Automated functional size measurement for three tier object relational mappin...
Automated functional size measurement for three tier object relational mappin...IWSM Mensura
 
Performance Tuning with Execution Plans
Performance Tuning with Execution PlansPerformance Tuning with Execution Plans
Performance Tuning with Execution PlansGrant Fritchey
 
“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOpsRui Quintino
 
Benchmarking at Parse
Benchmarking at ParseBenchmarking at Parse
Benchmarking at ParseTravis Redman
 
Advanced Benchmarking at Parse
Advanced Benchmarking at ParseAdvanced Benchmarking at Parse
Advanced Benchmarking at ParseMongoDB
 
Workflow Engines for Hadoop
Workflow Engines for HadoopWorkflow Engines for Hadoop
Workflow Engines for HadoopJoe Crobak
 
Stream Computing (The Engineer's Perspective)
Stream Computing (The Engineer's Perspective)Stream Computing (The Engineer's Perspective)
Stream Computing (The Engineer's Perspective)Ilya Ganelin
 
Testing for Logic App Solutions | Integration Monday
Testing for Logic App Solutions | Integration MondayTesting for Logic App Solutions | Integration Monday
Testing for Logic App Solutions | Integration MondayBizTalk360
 
Introduction to Django-Celery and Supervisor
Introduction to Django-Celery and SupervisorIntroduction to Django-Celery and Supervisor
Introduction to Django-Celery and SupervisorSuresh Kumar
 
Database & Technology 1 _ Craig Shallahamer _ Unit of work time based perform...
Database & Technology 1 _ Craig Shallahamer _ Unit of work time based perform...Database & Technology 1 _ Craig Shallahamer _ Unit of work time based perform...
Database & Technology 1 _ Craig Shallahamer _ Unit of work time based perform...InSync2011
 
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)Spark Summit
 
Deep Learning Automated Helpdesk
Deep Learning Automated HelpdeskDeep Learning Automated Helpdesk
Deep Learning Automated HelpdeskPranav Sharma
 

Semelhante a Overhead Supercomputing 2011 (20)

Predicting Optimal Parallelism for Data Analytics
Predicting Optimal Parallelism for Data AnalyticsPredicting Optimal Parallelism for Data Analytics
Predicting Optimal Parallelism for Data Analytics
 
PAC 2019 virtual Alexander Podelko
PAC 2019 virtual Alexander Podelko PAC 2019 virtual Alexander Podelko
PAC 2019 virtual Alexander Podelko
 
Beyond DevOps - How Netflix Bridges the Gap
Beyond DevOps - How Netflix Bridges the GapBeyond DevOps - How Netflix Bridges the Gap
Beyond DevOps - How Netflix Bridges the Gap
 
Distributed Model Validation with Epsilon
Distributed Model Validation with EpsilonDistributed Model Validation with Epsilon
Distributed Model Validation with Epsilon
 
Alex mang patterns for scalability in microsoft azure application
Alex mang   patterns for scalability in microsoft azure applicationAlex mang   patterns for scalability in microsoft azure application
Alex mang patterns for scalability in microsoft azure application
 
Adventures in Azure Machine Learning from NE Bytes
Adventures in Azure Machine Learning from NE BytesAdventures in Azure Machine Learning from NE Bytes
Adventures in Azure Machine Learning from NE Bytes
 
Webinar: How We Evaluated MongoDB as a Relational Database Replacement
Webinar: How We Evaluated MongoDB as a Relational Database ReplacementWebinar: How We Evaluated MongoDB as a Relational Database Replacement
Webinar: How We Evaluated MongoDB as a Relational Database Replacement
 
Adam shiwa summerschool 2012
Adam shiwa summerschool 2012Adam shiwa summerschool 2012
Adam shiwa summerschool 2012
 
Automated functional size measurement for three tier object relational mappin...
Automated functional size measurement for three tier object relational mappin...Automated functional size measurement for three tier object relational mappin...
Automated functional size measurement for three tier object relational mappin...
 
Performance Tuning with Execution Plans
Performance Tuning with Execution PlansPerformance Tuning with Execution Plans
Performance Tuning with Execution Plans
 
“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps“Houston, we have a model...” Introduction to MLOps
“Houston, we have a model...” Introduction to MLOps
 
Benchmarking at Parse
Benchmarking at ParseBenchmarking at Parse
Benchmarking at Parse
 
Advanced Benchmarking at Parse
Advanced Benchmarking at ParseAdvanced Benchmarking at Parse
Advanced Benchmarking at Parse
 
Workflow Engines for Hadoop
Workflow Engines for HadoopWorkflow Engines for Hadoop
Workflow Engines for Hadoop
 
Stream Computing (The Engineer's Perspective)
Stream Computing (The Engineer's Perspective)Stream Computing (The Engineer's Perspective)
Stream Computing (The Engineer's Perspective)
 
Testing for Logic App Solutions | Integration Monday
Testing for Logic App Solutions | Integration MondayTesting for Logic App Solutions | Integration Monday
Testing for Logic App Solutions | Integration Monday
 
Introduction to Django-Celery and Supervisor
Introduction to Django-Celery and SupervisorIntroduction to Django-Celery and Supervisor
Introduction to Django-Celery and Supervisor
 
Database & Technology 1 _ Craig Shallahamer _ Unit of work time based perform...
Database & Technology 1 _ Craig Shallahamer _ Unit of work time based perform...Database & Technology 1 _ Craig Shallahamer _ Unit of work time based perform...
Database & Technology 1 _ Craig Shallahamer _ Unit of work time based perform...
 
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
Towards True Elasticity of Spark-(Michael Le and Min Li, IBM)
 
Deep Learning Automated Helpdesk
Deep Learning Automated HelpdeskDeep Learning Automated Helpdesk
Deep Learning Automated Helpdesk
 

Último

#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 

Último (20)

#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 

Overhead Supercomputing 2011

  • 1. Workflow Overhead Analysis and Optimizations Weiwei Chen, Ewa Deelman Information Sciences Institute University of Southern California {wchen,deelman}@isi.edu WORKS11, Nov 14 2011, Seattle WA
  • 2. Outline •  Introduction •  Overhead modeling •  Cumulative overhead •  Experiments and evaluations •  Conclusions and future work
  • 3. Introduction • Workflow Optimization • Scheduling • Reducing Runtime • Reducing and Overlapping Overheads • Overheads • Benefits • Workflow modeling and simulation Fig  1  System  Overview   • Performance evaluation • New optimization methods
  • 4. Outline •  Introduction •  Overhead modeling •  Cumulative overhead •  Experiments and evaluations •  Conclusions and future work
  • 5. Modeling Overheads Job Events: Workflow Events: •  Job Release Workflow engine delay •  Workflow Engine Start •  Job Submit Makespan Queue delay •  Workflow Engine Finished •  Job Execute Runtime •  Job Terminate •  Postscript Start Postscript delay •  Postscript Terminate  1  h1p://pegasus.isi.edu/wms/docs/3.1/monitoring_  debugging_stats.php#ploAng_staBsBcs  
  • 6. Outline •  Introduction •  Overhead modeling •  Cumulative overhead •  Experiments and evaluations •  Conclusions and future work
  • 7. Cumulative Overhead (O1) •  O1 simply adds up a similar type of overheads of all jobs. O1(workflow engine delay)=10+10+10=30 O1(queue delay)=10+20+10=40 O1(data transfer delay)=10 O1(postscript delay)=10+20+10=40
  • 8. Cumulative Overhead (O2) •  O2 subtracts from O1 the overlaps of the same type of overhead. O2(workflow engine delay)=20   O2(data transfer delay)=10. O2(queue delay)=30. O2(postscript delay)=40
  • 9. Cumulative Overhead (O3) •  O3 subtracts the overlap of dissimilar overheads from O2 O3(workflow engine delay)=20   O3(data transfer delay)=10 O3(queue delay)=20   O3(postscript delay)=30  
  • 10. Outline •  Introduction •  Overhead modeling •  Cumulative overhead •  Experiments and evaluations •  Conclusions and future work
  • 11. Experiments •  Environments: •  Amazon EC2 •  HPCC •  FutureGrid •  Other clusters •  Applications: •  Biology: Epigenomics, Proteomics, SIPHT •  Earthquake science: Broadband, CyberShake •  Astronomy: Montage •  Physics: LIGO •  Optimizations: •  Job Clustering •  Data Pre-Staging •  Resource Provisioning •  Throttling Data  are  available  at  h1p://pegasus.isi.edu/workflow_gallery/  
  • 14. Job Clustering •  Merging small jobs into a clustered job   without with without with without with clustering clustering clustering clustering clustering clustering Percentage(%)=cumulative overhead(seconds) / makspan(seconds) With job clustering, the cumulative overheads decrease greatly due to the decreased number of jobs.
  • 15. Resource Provisioning •  Deploy pilot jobs as placeholders   with without with without with without provisioning provisioning provisioning provisioning provisioning provisioning O3 and O2 have shown more obviously that the portion of runtime has been increased than O1.
  • 16. Outline •  Introduction •  Overhead modeling •  Cumulative overhead •  Experiments and evaluations •  Conclusions and future work
  • 17. Conclusions and Future Work Conclusions •  Overhead Analysis •  A complete view of these three metrics Future Work •  More optimization methods. •  Dynamic provisioning
  • 18. Q&A •  Pegasus Group: http://pegasus.isi.edu/ •  FutureGrid: https://portal.futuregrid.org/ •  Scripts are available at http://isi.edu/~wchen/techniques.html •  Data are available at http://pegasus.isi.edu/workflow_gallery/