
A data-aware workflow scheduling algorithm for heterogeneous distributed systems

  • Louisiana State University

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

The workflow scheduling problem in heterogeneous distributed systems is hard to solve because both the intermediate data transfer time and the computation time of each task must be considered. The heterogeneity of the computing power of the distributed computational sites, and of the bandwidth between them, makes the scheduling problem challenging. In this study, we improve a heuristic-based data-aware algorithm to find the optimal schedule, so that the turnaround time of the workflow is minimized. Our improved algorithm outperforms the existing algorithms in both performance and time efficiency in most cases. We also extend our algorithm to solve the co-scheduling problem, in which each task of the workflow can request data from a remote data site before its execution and store important intermediate data to a remote data site after its execution. The results show that the turnaround time of the workflow can be shortened significantly using our data-aware algorithm compared to the existing optimal algorithms.
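The abstract does not describe the heuristic itself, so the following is only a minimal sketch of the general problem setting, not the authors' algorithm. It assumes a plain greedy earliest-finish-time rule, and every name in it (`Task`, `schedule`, the `speed` and `bandwidth` tables, the site names) is hypothetical. It illustrates how a placement decision can weigh a site's computing power against the time needed to move intermediate data to that site.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    work: float  # abstract compute units
    # parent task name -> size of the intermediate data this task needs from it
    inputs: dict = field(default_factory=dict)


def schedule(tasks, speed, bandwidth):
    """Greedy earliest-finish-time placement (illustrative assumption only).

    tasks:     list of Task, already in topological order
    speed:     site -> compute units per second
    bandwidth: (src_site, dst_site) -> data units per second, for src != dst
    Returns (placement, turnaround_time).
    """
    placement, finish = {}, {}
    site_free = {site: 0.0 for site in speed}  # when each site becomes idle
    for task in tasks:
        best = None
        for site in speed:
            # The task can start only after every parent's data has arrived.
            ready = 0.0
            for parent, size in task.inputs.items():
                src = placement[parent]
                transfer = 0.0 if src == site else size / bandwidth[(src, site)]
                ready = max(ready, finish[parent] + transfer)
            start = max(ready, site_free[site])
            end = start + task.work / speed[site]
            if best is None or end < best[0]:
                best = (end, site)
        end, site = best
        placement[task.name] = site
        finish[task.name] = end
        site_free[site] = end
    return placement, max(finish.values())


# Hypothetical two-site system and three-task workflow.
speed = {"fast": 2.0, "slow": 1.0}
bandwidth = {("fast", "slow"): 1.0, ("slow", "fast"): 1.0}
tasks = [
    Task("A", work=4.0),
    Task("B", work=6.0, inputs={"A": 10.0}),  # large input: stays near A
    Task("C", work=2.0, inputs={"A": 1.0}),   # small input: cheap to move
]
placement, turnaround = schedule(tasks, speed, bandwidth)
print(placement, turnaround)
```

In this toy instance, task B stays on the site that produced its large input, while task C is moved to the slower site because its small input is cheap to transfer and the faster site is busy. A greedy rule like this carries no optimality guarantee.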

Original language: English
Title of host publication: Proceedings of the 2011 International Conference on High Performance Computing and Simulation, HPCS 2011
Pages: 114-120
Number of pages: 7
State: Published - 2011
Event: 2011 International Conference on High Performance Computing and Simulation, HPCS 2011 - Istanbul, Turkey
Duration: Jul 4 2011 to Jul 8 2011

Publication series

Name: Proceedings of the 2011 International Conference on High Performance Computing and Simulation, HPCS 2011

Conference

Conference: 2011 International Conference on High Performance Computing and Simulation, HPCS 2011
Country/Territory: Turkey
City: Istanbul
Period: 07/4/11 to 07/8/11

Keywords

  • Data intensive supercomputing
  • Grid and cluster computing
  • Large scale scientific computing
  • Large scale systems
  • Workflow scheduling
