Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

A dynamic data-throttling approach to minimize workflow imbalance

Rodríguez, Ricardo J., Tolosana-Calasanz, Rafael and Rana, Omer F. ORCID: https://orcid.org/0000-0003-3597-2646 2019. A dynamic data-throttling approach to minimize workflow imbalance. ACM Transactions on Internet Technology 19 (3), 32. 10.1145/3278720

Full text not available from this repository.

Abstract

Scientific workflows enable scientists to undertake analysis on large datasets and perform complex scientific simulations. These workflows are often mapped onto distributed and parallel computational infrastructures to speed up their execution. Prior to its execution, a workflow structure may undergo transformations to accommodate the computing infrastructure, normally involving task clustering and partitioning. However, these transformations may cause workflow imbalance because of differences between task execution times (runtime imbalance) or because of unconsidered data dependencies that lead to data locality issues (data imbalance). In this article, to mitigate these imbalances, we enhance the workflow lifecycle process in use by introducing a workflow imbalance phase that quantifies workflow imbalance after the transformations. Our technique is based on structural analysis of Petri nets, obtained by model transformation of a data-intensive workflow, and Linear Programming techniques. Our analysis can be used to assist workflow practitioners in finding more efficient ways of transforming and scheduling their workflows. Moreover, based on our analysis, we also propose a technique to mitigate workflow imbalance by data throttling. Our approach is based on autonomic computing principles that determine how data transmission must be throttled throughout workflow jobs. Our autonomic data-throttling approach monitors the execution of the workflow and recomputes data-throttling values when certain watchpoints are reached and time derivation is observed. We validate our approach by a formal proof and by simulations with the Montage workflow. Our findings show that a dynamic data-throttling approach is feasible, does not introduce a significant overhead, and minimizes the usage of input buffers and network bandwidth.

Item Type: Article
Date Type: Publication
Status: Published
Schools: Computer Science & Informatics
Publisher: Association for Computing Machinery (ACM)
ISSN: 1533-5399
Date of First Compliant Deposit: 11 May 2019
Date of Acceptance: 3 May 2019
Last Modified: 04 Nov 2022 12:14
URI: https://orca.cardiff.ac.uk/id/eprint/122358
