Abstract
Density estimation is one of the most frequently used data analytics techniques. A major challenge of real-world datasets is missing values, which arise, for example, from sampling errors or data loss. Recovering them is often impossible or too expensive. Missing values are not necessarily limited to a few features or samples, which renders methods based on complete auxiliary variables unsuitable. In this paper, we introduce three models that can deal with such datasets. They are based on the new concept of virtual objects. Additionally, we present a computationally efficient approximation. Because they generalize kernel density estimation (KDE), we call our methods Warp-KDE. Experiments with incomplete datasets show that Warp-KDE methods are superior to established imputation methods.
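For orientation, the kind of baseline the abstract compares against can be sketched as follows: missing values are imputed first, and a standard Gaussian KDE is then fitted to the completed data. This is not Warp-KDE and does not use virtual objects; the one-dimensional setting, the mean-imputation strategy, the bandwidth value, and the function names `gaussian_kde_1d` and `mean_impute` are all assumptions made purely for illustration.

```python
import numpy as np

def gaussian_kde_1d(samples, query, bandwidth):
    """Plain 1-D Gaussian KDE evaluated at the query points (illustrative helper)."""
    samples = np.asarray(samples, dtype=float)
    query = np.asarray(query, dtype=float)
    # Pairwise scaled differences, shape (n_query, n_samples).
    z = (query[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    # Average the kernels over all samples and rescale by the bandwidth.
    return kernel.mean(axis=1) / bandwidth

def mean_impute(samples):
    """Replace NaN entries by the mean of the observed entries (assumed baseline)."""
    samples = np.asarray(samples, dtype=float)
    filled = samples.copy()
    filled[np.isnan(filled)] = np.nanmean(samples)
    return filled

# Toy data with two missing values, marked as NaN.
data = np.array([1.0, 2.0, np.nan, 4.0, np.nan])
grid = np.linspace(-10.0, 15.0, 2001)
density = gaussian_kde_1d(mean_impute(data), grid, bandwidth=0.5)
```

Mean imputation places every missing value at the same point, so the resulting estimate gains an artificial peak at the observed mean. Distortions of this kind are one reason to look for density estimators that handle incomplete data directly.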
Original language | English |
---|---|
Pages | 747-755 |
Number of pages | 9 |
State | Published - 2018 |
Event | 2018 SIAM International Conference on Data Mining, SDM 2018, San Diego, United States (3 May 2018 → 5 May 2018) |
Conference
Conference | 2018 SIAM International Conference on Data Mining, SDM 2018 |
---|---|
Country/Territory | United States |
City | San Diego |
Period | 3/05/18 → 5/05/18 |