Overhead Supercomputing 2011
1. Workflow Overhead Analysis and Optimizations
Weiwei Chen, Ewa Deelman
Information Sciences Institute
University of Southern California
{wchen,deelman}@isi.edu
WORKS11, Nov 14 2011, Seattle WA
2. Outline
• Introduction
• Overhead modeling
• Cumulative overhead
• Experiments and evaluations
• Conclusions and future work
3. Introduction
• Workflow Optimization
• Scheduling
• Reducing Runtime
• Reducing and Overlapping Overheads
• Overheads
• Benefits
• Workflow modeling and simulation
[Figure 1: System Overview]
• Performance evaluation
• New optimization methods
7. Cumulative Overhead (O1)
• O1 simply sums the overheads of the same type across all jobs.
O1(workflow engine delay)=10+10+10=30
O1(queue delay)=10+20+10=40
O1(data transfer delay)=10
O1(postscript delay)=10+20+10=40
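The O1 sums above can be reproduced with a minimal Python sketch. The per-job durations are the ones listed on the slide; the dictionary layout and the function name `cumulative_o1` are illustrative assumptions, not part of the authors' scripts.

```python
# Per-job overhead durations (seconds), copied from the O1 sums on the slide.
overheads = {
    "workflow engine delay": [10, 10, 10],
    "queue delay": [10, 20, 10],
    "data transfer delay": [10],
    "postscript delay": [10, 20, 10],
}


def cumulative_o1(durations_by_type):
    """O1: sum the durations of each overhead type across all jobs."""
    return {kind: sum(durations) for kind, durations in durations_by_type.items()}


o1 = cumulative_o1(overheads)
for kind, total in o1.items():
    print(f"O1({kind}) = {total}")
```

Because O1 ignores when each overhead occurs, it only needs durations, not start and end times.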
8. Cumulative Overhead (O2)
• O2 subtracts from O1 the overlaps between overheads of the same type.
O2(workflow engine delay)=20
O2(data transfer delay)=10
O2(queue delay)=30
O2(postscript delay)=40
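One way to compute O2 is as the length of the union of all intervals of a given overhead type, which equals O1 minus the same-type overlaps. The sketch below assumes each overhead is recorded as a `(start, end)` pair in seconds. The interval endpoints are hypothetical, since the slide's timeline figure is not part of this transcript; they are chosen only so that the workflow engine delay gives O1 = 30 and O2 = 20 as on the slides.

```python
def union_length(intervals):
    """Total time covered by at least one (start, end) interval."""
    covered = 0
    current_start, current_end = None, None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Disjoint from the running interval: close it out and start anew.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching: extend the running interval.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered


# Hypothetical timeline: two workflow engine delays coincide, one is separate.
workflow_engine_delays = [(0, 10), (0, 10), (40, 50)]

o1_value = sum(end - start for start, end in workflow_engine_delays)
o2_value = union_length(workflow_engine_delays)
print(o1_value, o2_value)
```

Sorting first lets a single pass merge overlapping intervals, so the 10 seconds shared by the two coinciding delays are counted once.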
9. Cumulative Overhead (O3)
• O3 subtracts the overlap of dissimilar overheads from O2
O3(workflow engine delay)=20
O3(data transfer delay)=10
O3(queue delay)=20
O3(postscript delay)=30
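A possible reading of O3, sketched below, is the time during which exactly one overhead type is active: any moment shared by two different types is subtracted from both. This interpretation is an assumption based on the slide's one-line definition. The timeline is hypothetical, built so that a queue delay and a postscript delay overlap for 10 seconds, which matches the drop from O2 to O3 on the slides for those two types.

```python
def cumulative_o3(intervals_by_type):
    """Per type, the time during which that type is the only active overhead."""
    # Every start/end point splits the timeline into elementary segments.
    points = sorted(
        {p for ivs in intervals_by_type.values() for iv in ivs for p in iv}
    )
    o3 = {kind: 0 for kind in intervals_by_type}
    for left, right in zip(points, points[1:]):
        # Types with at least one interval fully covering this segment.
        active = [
            kind
            for kind, ivs in intervals_by_type.items()
            if any(start <= left and right <= end for start, end in ivs)
        ]
        if len(active) == 1:
            o3[active[0]] += right - left
    return o3


# Hypothetical timeline (seconds); only queue and postscript delays overlap.
timeline = {
    "workflow engine delay": [(100, 120)],
    "data transfer delay": [(130, 140)],
    "queue delay": [(0, 30)],
    "postscript delay": [(20, 60)],
}

o3 = cumulative_o3(timeline)
print(o3)
```

Segments covered by several intervals of the same type still count once for that type, so the same-type overlap removed by O2 stays removed here.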
11. Experiments
• Environments:
• Amazon EC2
• HPCC
• FutureGrid
• Other clusters
• Applications:
• Biology: Epigenomics, Proteomics, SIPHT
• Earthquake science: Broadband, CyberShake
• Astronomy: Montage
• Physics: LIGO
• Optimizations:
• Job Clustering
• Data Pre-Staging
• Resource Provisioning
• Throttling
Data are available at http://pegasus.isi.edu/workflow_gallery/
14. Job Clustering
• Merging small jobs into a clustered job
[Figure: three panels comparing runs without clustering and with clustering]
Percentage (%) = cumulative overhead (seconds) / makespan (seconds)
With job clustering, the cumulative overheads decrease
greatly due to the decreased number of jobs.
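The percentage metric and the effect of clustering can be illustrated with a toy Python model. Everything here is an assumption for illustration: the function names, the factor of 100 used to express the ratio as a percentage, the fixed per-job overhead, and the rule that each clustered job pays that overhead once. It is not the authors' measurement code, and the numbers are not experimental results.

```python
import math


def overhead_percentage(cumulative_overhead_s, makespan_s):
    """Cumulative overhead as a percentage of the makespan (both in seconds)."""
    return 100.0 * cumulative_overhead_s / makespan_s


def o1_with_clustering(num_jobs, per_job_overhead_s, cluster_size):
    """Toy O1: every clustered job pays the per-job overhead exactly once."""
    clustered_jobs = math.ceil(num_jobs / cluster_size)
    return clustered_jobs * per_job_overhead_s


# Hypothetical workflow: 100 small jobs, 10 s of overhead per submitted job.
without_clustering = o1_with_clustering(100, 10, cluster_size=1)
with_clustering = o1_with_clustering(100, 10, cluster_size=4)
print(without_clustering, with_clustering)
print(overhead_percentage(30, 120))
```

Since O1 counts overlapping overheads of parallel jobs separately, the O1 percentage can exceed 100% of the makespan.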
15. Resource Provisioning
• Deploy pilot jobs as placeholders
[Figure: three panels comparing runs with provisioning and without provisioning]
O2 and O3 show more clearly than O1 that the portion of runtime
has increased.
17. Conclusions and Future Work
Conclusions
• Overhead Analysis
• A complete view of these three metrics
Future Work
• More optimization methods.
• Dynamic provisioning
18. Q&A
• Pegasus Group: http://pegasus.isi.edu/
• FutureGrid: https://portal.futuregrid.org/
• Scripts are available at
http://isi.edu/~wchen/techniques.html
• Data are available at
http://pegasus.isi.edu/workflow_gallery/