Estimation of Missing Values in Incomplete Industrial Process Data Sets Using ECM Algorithm

Mina Fahimi Pirehgalin, Birgit Vogel-Heuser

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Estimating missing values is an essential data pre-processing step that raises data quality for subsequent data mining. It matters for industrial data sets because different operational situations cannot be described properly when the data contain missing values. In this paper, Expectation Conditional Maximization (ECM) is used to fit an approximate model to the data based on a Gaussian distribution. In the Expectation step, the Sweep operation is then used to obtain the regression model of the missing values on the observed values and to estimate the missing values from the observed data. To evaluate the results, a process data set from a real industrial production system is considered. Missing values are simulated by randomly removing data from the variables. Finally, the accuracy of the proposed method in estimating missing values is discussed, as well as the effect of imputing missing values on further data analysis.
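The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain EM for a multivariate Gaussian rather than the ECM variant named in the paper, and the function names `sweep` and `impute_gaussian_em`, the use of `None` to mark a missing entry, and the identity-matrix starting covariance are all assumptions made for illustration. Sweeping the covariance matrix on the observed variables leaves the regression coefficients of the missing variables on the observed ones, plus the residual covariance, in the remaining blocks.

```python
def sweep(G, k):
    """Return a copy of the symmetric matrix G swept on pivot index k."""
    n = len(G)
    d = G[k][k]
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == k and j == k:
                H[i][j] = -1.0 / d
            elif i == k or j == k:
                H[i][j] = G[i][j] / d
            else:
                H[i][j] = G[i][j] - G[i][k] * G[k][j] / d
    return H


def impute_gaussian_em(rows, n_iter=50):
    """Fill None entries with conditional means under a fitted Gaussian.

    Illustrative sketch only: assumes every column has at least one
    observed value and that the pivots met during sweeping stay non-zero.
    """
    n, p = len(rows), len(rows[0])
    # Initial guess: means of the observed entries, identity covariance.
    mu = []
    for j in range(p):
        observed = [r[j] for r in rows if r[j] is not None]
        mu.append(sum(observed) / len(observed))
    sigma = [[1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    filled = [list(r) for r in rows]

    for _ in range(n_iter):
        # E-step: regress missing entries on observed entries, row by row.
        extra = [[0.0] * p for _ in range(p)]  # summed residual covariances
        for r, row in enumerate(rows):
            mis = [j for j in range(p) if row[j] is None]
            if not mis:
                continue
            obs = [j for j in range(p) if row[j] is not None]
            swept = sigma
            for k in obs:
                swept = sweep(swept, k)
            for m in mis:
                # swept[m][o] holds the regression coefficient of m on o.
                filled[r][m] = mu[m] + sum(
                    swept[m][o] * (row[o] - mu[o]) for o in obs
                )
                for m2 in mis:
                    extra[m][m2] += swept[m][m2]
        # M-step: re-estimate mean and covariance from the completed data.
        mu = [sum(f[j] for f in filled) / n for j in range(p)]
        sigma = [
            [
                (sum((f[i] - mu[i]) * (f[j] - mu[j]) for f in filled)
                 + extra[i][j]) / n
                for j in range(p)
            ]
            for i in range(p)
        ]
    return filled, mu, sigma


# Hypothetical two-variable process data; the last row lacks its second value.
data = [[1.0, 3.0], [2.0, 5.0], [3.0, 7.0], [4.0, 9.0], [5.0, 11.0], [6.0, None]]
completed, mean, cov = impute_gaussian_em(data)
```

Observed entries are never modified. Only the `None` positions are replaced, each by the conditional mean given the observed values in the same row.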

Original language: English
Title of host publication: Proceedings - IEEE 16th International Conference on Industrial Informatics, INDIN 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 251-257
Number of pages: 7
ISBN (Electronic): 9781538648292
State: Published - 24 Sep 2018
Event: 16th IEEE International Conference on Industrial Informatics, INDIN 2018 - Porto, Portugal
Duration: 18 Jul 2018 to 20 Jul 2018

Publication series

Name: Proceedings - IEEE 16th International Conference on Industrial Informatics, INDIN 2018

Conference

Conference: 16th IEEE International Conference on Industrial Informatics, INDIN 2018
Country/Territory: Portugal
City: Porto
Period: 18/07/18 to 20/07/18

Keywords

  • Expectation Conditional Maximization
  • Likelihood Inference
  • Missing Data
  • Multivariate Gaussian Distribution
  • Sweep Matrix
