Making kernel density estimation robust towards missing values in highly incomplete multivariate data without imputation

Richard Leibrandt, Stephan Günnemann

Research output: Contribution to conference › Paper › peer-review

6 Scopus citations

Abstract

Density estimation is one of the most frequently used data analytics techniques. A major challenge of real-world datasets is missing values, originating e.g. from sampling errors or data loss. The recovery of these is often impossible or too expensive. Missing values are not necessarily limited to a few features or samples, rendering methods based on complete auxiliary variables unsuitable. In this paper we introduce three models able to deal with such datasets. They are based on the new concept of virtual objects. Additionally, we present a computationally efficient approximation. Generalizing KDE, our methods are called Warp-KDE. Experiments with incomplete datasets show that Warp-KDE methods are superior to established imputation methods.
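For orientation, the kind of imputation-then-KDE baseline the abstract compares against can be sketched as follows. This is not Warp-KDE, whose virtual-object models are not described in this record. It is only a minimal illustration that assumes mean imputation as the imputation step and an isotropic Gaussian product kernel. The names `mean_impute` and `gaussian_kde`, the toy data, and the bandwidth value are hypothetical.

```python
import math

def mean_impute(rows):
    """Replace None entries with the column mean of the observed values.

    Assumption: mean imputation stands in for an "established imputation
    method"; the record does not say which methods the paper evaluated.
    """
    dims = len(rows[0])
    means = []
    for j in range(dims):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(dims)]
            for r in rows]

def gaussian_kde(query, samples, bandwidth):
    """Standard KDE with an isotropic Gaussian product kernel."""
    d = len(query)
    norm = (math.sqrt(2.0 * math.pi) * bandwidth) ** d
    total = 0.0
    for s in samples:
        sq_dist = sum((q - x) ** 2 for q, x in zip(query, s))
        total += math.exp(-0.5 * sq_dist / bandwidth ** 2)
    return total / (len(samples) * norm)

# Toy data (hypothetical): None marks a missing value, and the gaps are
# spread over both features rather than confined to one.
data = [[1.0, 2.0], [None, 2.5], [3.0, None], [2.0, 3.0]]
completed = mean_impute(data)
density = gaussian_kde([2.0, 2.5], completed, bandwidth=0.5)
```

In this sketch every imputed entry lands exactly on a column mean, so the estimate gains artificial mass there. That distortion is one reason to look for approaches that avoid imputation altogether.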

Original language: English
Pages: 747-755
Number of pages: 9
State: Published - 2018
Event: 2018 SIAM International Conference on Data Mining, SDM 2018 - San Diego, United States
Duration: 3 May 2018 - 5 May 2018

Conference

Conference: 2018 SIAM International Conference on Data Mining, SDM 2018
Country/Territory: United States
City: San Diego
Period: 3/05/18 - 5/05/18
