Scientific workflow mining in clouds

Wei Song, Fangfei Chen, Hans Arno Jacobsen, Xiaoxu Xia, Chunyang Ye, Xiaoxing Ma

Research output: Contribution to journal › Article › peer-review

34 Scopus citations

Abstract

Computing clouds have become the platform of choice for the deployment and execution of scientific workflows. Due to the uncertainty and unpredictability of scientific exploration, the execution plan for a scientific workflow may vary from the definition. It is therefore of great significance to be able to discover actual workflows from execution histories (event logs) to reproduce experimental results and to establish provenance. However, most existing process mining techniques focus on discovering control flow-oriented business processes in a centralized environment, and thus, they are mostly inapplicable to the discovery of data flow-oriented, unstructured scientific workflows in distributed cloud environments. In this paper, we present Scientific Workflow Mining as a Service (SWMaaS) to support both intra-cloud and inter-cloud scientific workflow mining. The approach is implemented as a ProM plug-in and is evaluated on event logs derived from real-world scientific workflows. Through experimental results, we demonstrate the effectiveness and efficiency of our approach.

Original language: English
Article number: 7907335
Pages (from-to): 2979-2992
Number of pages: 14
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 28
Issue number: 10
State: Published - 1 Oct 2017
Externally published: Yes

Keywords

  • Scientific workflow
  • direct precedence
  • event log
  • inter-cloud
  • workflow mining
