TY - JOUR
T1 - Machine Learning Driven Design Of Experiments For Predictive Models In Production Systems
AU - Maier, Sebastian
AU - Zimmermann, Patrick
AU - Daub, Rüdiger
N1 - Publisher Copyright: © 2023, Publish-Ing in cooperation with TIB - Leibniz Information Centre for Science and Technology University Library. All rights reserved.
PY - 2023
Y1 - 2023
N2 - Machine learning (ML) describes the ability of algorithms to structure and interpret data independently or to learn correlations. The use of ML is steadily increasing in companies of all sizes. However, insufficient market readiness of many ML solutions inhibits their application, especially in production systems. Predictive models apply ML to understand the complex behavior of a system through regression from operational data. This enables determining the relationship between factors and target variables. Accurate predictions of these models for production systems are essential for their application, as even minor variations can significantly affect the process. This accuracy depends on the available data to train the ML model. Production data usually shows a high epistemic uncertainty, leading to inaccurate predictions unfit for real-world applications. This paper presents ML-driven, data-centric Design of Experiments (DoE) to create a process-specific dataset with low epistemic uncertainty. This leads to improved accuracy of the predictive models, ultimately making them feasible for production systems. Our approach focuses on determining epistemic uncertainty in historical data of a production system to find data points of high value to the ML model in the factor space. To identify an efficient set of experiments, we cluster these data points weighted by feature importance. We evaluate the model by running these experiments and using the collected data for further training of a prediction model. Our approach achieves a significantly higher increase in accuracy compared to continuing the training of the prediction model with the same amount of regular operating data.
KW - Design of Experiments
KW - Epistemic Uncertainty
KW - Machine Learning
KW - Predictive Models
KW - Production Systems
UR - http://www.scopus.com/inward/record.url?scp=85187986160&partnerID=8YFLogxK
U2 - 10.15488/15289
DO - 10.15488/15289
M3 - Conference article
AN - SCOPUS:85187986160
SN - 2701-6277
SP - 110
EP - 118
JO - Proceedings of the Conference on Production Systems and Logistics
JF - Proceedings of the Conference on Production Systems and Logistics
T2 - 5th Conference on Production Systems and Logistics, CPSL 2023
Y2 - 14 November 2023 through 17 November 2023
ER -