TY - GEN
T1 - Towards reliable predictive process monitoring
AU - Klinkmüller, Christopher
AU - van Beest, Nick R.T.P.
AU - Weber, Ingo
N1 - Publisher Copyright: © Springer International Publishing AG, part of Springer Nature 2018.
PY - 2018
Y1 - 2018
N2 - Predictive process monitoring is concerned with anticipating the future behavior of running process instances. Prior work primarily focused on the performance of monitoring approaches and spent little effort on understanding other aspects such as reliability. This limits the potential to reuse the approaches across scenarios. From this starting point, we discuss how synthetic data can facilitate a better understanding of approaches and then use synthetic data in two experiments. We focus on prediction as classification of process instances during execution, solely considering the discrete event behavior. First, we compare different feature representations and reveal that sub-trace occurrence can cover a broader variety of relationships in the data than other representations. Second, we present evidence that the popular strategy of cutting traces to certain prefix lengths to learn prediction models for ongoing instances is prone to yield unreliable models and that the underlying problem can be avoided by using approaches that learn from complete traces. Our experiments provide a basis for future research and highlight that an evaluation solely targeting performance incurs the risk of incorrectly assessing benefits and limitations.
KW - Behavioral classification
KW - Machine learning
KW - Predictive process monitoring
KW - Process mining
UR - http://www.scopus.com/inward/record.url?scp=85048614318&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-92901-9_15
DO - 10.1007/978-3-319-92901-9_15
M3 - Conference contribution
AN - SCOPUS:85048614318
SN - 9783319929002
T3 - Lecture Notes in Business Information Processing
SP - 163
EP - 181
BT - Information Systems in the Big Data Era - CAiSE Forum 2018, Proceedings
A2 - Mendling, Jan
A2 - Mouratidis, Haralambos
PB - Springer Verlag
T2 - CAiSE Forum 2018 held as part of the 30th International Conference on Advanced Information Systems Engineering, CAiSE 2018
Y2 - 11 June 2018 through 15 June 2018
ER -