Abstract
Computing clouds have become the platform of choice for the deployment and execution of scientific workflows. Due to the uncertainty and unpredictability of scientific exploration, the execution plan for a scientific workflow may vary from the definition. It is therefore of great significance to be able to discover actual workflows from execution histories (event logs) to reproduce experimental results and to establish provenance. However, most existing process mining techniques focus on discovering control flow-oriented business processes in a centralized environment, and thus, they are mostly inapplicable to the discovery of data flow-oriented, unstructured scientific workflows in distributed cloud environments. In this paper, we present Scientific Workflow Mining as a Service (SWMaaS) to support both intra-cloud and inter-cloud scientific workflow mining. The approach is implemented as a ProM plug-in and is evaluated on event logs derived from real-world scientific workflows. Through experimental results, we demonstrate the effectiveness and efficiency of our approach.
Original language | English |
---|---|
Article number | 7907335 |
Pages (from-to) | 2979-2992 |
Number of pages | 14 |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Volume | 28 |
Issue number | 10 |
DOIs | |
State | Published - 1 Oct 2017 |
Externally published | Yes |
Keywords
- Scientific workflow
- direct precedence
- event log
- inter-cloud
- workflow mining