TY - GEN
T1 - Generating Reliable Process Event Streams and Time Series Data Based on Neural Networks
AU - Herbert, Tobias
AU - Mangler, Juergen
AU - Rinderle-Ma, Stefanie
N1 - Publisher Copyright: © 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Domains such as manufacturing and medicine crave continuous monitoring and analysis of their processes, especially in combination with time series as produced by sensors. Time series data can be exploited to, for example, explain and predict concept drifts during runtime. Generally, a certain data volume is required in order to produce meaningful analysis results. However, reliable data sets are often missing, for example, if event streams and time series data are collected separately, in the case of a new process, or if it is too expensive to obtain a sufficient data volume. Additional challenges arise with preparing time series data from multiple event sources, variations in data collection frequency, and concept drift. This paper proposes the GENLOG approach to generate reliable event and time series data that follow the distribution of the underlying input data set. GENLOG employs data resampling and enables the user to select different parts of the log data to orchestrate the training of a recurrent neural network for stream generation. The generated data are sampled back to their original sample rate and embedded into the originating log data file. Overall, GENLOG can boost small data sets and consequently the application of online process mining.
KW - Deep learning
KW - Recurrent neural network
KW - Reliable dataset boosting
KW - Synthetic log data
KW - Time series generation
UR - http://www.scopus.com/inward/record.url?scp=85111854700&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-79186-5_6
DO - 10.1007/978-3-030-79186-5_6
M3 - Conference contribution
AN - SCOPUS:85111854700
SN - 9783030791858
T3 - Lecture Notes in Business Information Processing
SP - 81
EP - 95
BT - Enterprise, Business-Process and Information Systems Modeling - 22nd International Conference, BPMDS 2021, and 26th International Conference, EMMSAD 2021, Held at CAiSE 2021, Proceedings
A2 - Augusto, Adriano
A2 - Gill, Asif
A2 - Nurcan, Selmin
A2 - Reinhartz-Berger, Iris
A2 - Schmidt, Rainer
A2 - Zdravkovic, Jelena
PB - Springer Science and Business Media Deutschland GmbH
T2 - 22nd International Conference on Business Process Modeling, Development and Support, BPMDS 2021 and 26th International Conference on Exploring Modeling Methods for Systems Analysis and Development, EMMSAD 2021 Held at CAiSE 2021
Y2 - 28 June 2021 through 29 June 2021
ER -