Subspace clustering for uncertain data

Stephan Günnemann, Hardy Kremer, Thomas Seidl

Research output: Contribution to conference › Paper › peer-review

28 Scopus citations

Abstract

Analyzing uncertain databases is a challenge in data mining research. Usually, data mining methods rely on precise values. In scenarios where uncertain values occur, e.g. due to noisy sensor readings, these algorithms cannot deliver high-quality patterns. Besides uncertainty, data mining methods face another problem: high-dimensional data. For finding object groupings with locally relevant dimensions in such data, subspace clustering was introduced. For high-dimensional uncertain data, however, deciding whether dimensions are relevant for a subspace cluster is even more challenging; thus, approaches for effective subspace clustering on uncertain databases are needed. In this paper, we develop a method for subspace clustering of uncertain data that delivers high-quality patterns; the information provided by the individual distributions of objects is used in an effective manner. Because a strict assignment of objects to single clusters is not appropriate in uncertain scenarios, we enrich our model with the concept of membership degree. Subspace clustering for uncertain data is computationally expensive; thus, we propose an efficient algorithm. In thorough experiments, we show the effectiveness and efficiency of our new subspace clustering method.

Original language: English
Pages: 385-396
Number of pages: 12
State: Published - 2010
Externally published: Yes
Event: 10th SIAM International Conference on Data Mining, SDM 2010 - Columbus, OH, United States
Duration: 29 Apr 2010 to 1 May 2010

Conference

Conference: 10th SIAM International Conference on Data Mining, SDM 2010
Country/Territory: United States
City: Columbus, OH
Period: 29/04/10 to 1/05/10
