Abstract
Analyzing uncertain databases is a challenge in data mining research. Usually, data mining methods rely on precise values. In scenarios where uncertain values occur, e.g., due to noisy sensor readings, these algorithms cannot deliver high-quality patterns. Besides uncertainty, data mining methods face another problem: high-dimensional data. For finding object groupings with locally relevant dimensions in this data, subspace clustering was introduced. For high-dimensional uncertain data, however, deciding whether dimensions are relevant for a subspace cluster is even more challenging; thus, approaches for effective subspace clustering on uncertain databases are needed. In this paper, we develop a method for subspace clustering for uncertain data that delivers high-quality patterns; the information provided by the individual distributions of objects is used in an effective manner. Because in uncertain scenarios a strict assignment of objects to single clusters is not appropriate, we enrich our model with the concept of membership degree. Subspace clustering for uncertain data is computationally expensive; thus, we propose an efficient algorithm. In thorough experiments we show the effectiveness and efficiency of our new subspace clustering method.
Original language | English |
---|---|
Pages | 385-396 |
Number of pages | 12 |
State | Published - 2010 |
Externally published | Yes |
Event | 10th SIAM International Conference on Data Mining, SDM 2010 - Columbus, OH, United States. Duration: 29 Apr 2010 → 1 May 2010 |
Conference
Conference | 10th SIAM International Conference on Data Mining, SDM 2010 |
---|---|
Country/Territory | United States |
City | Columbus, OH |
Period | 29 Apr 2010 → 1 May 2010 |