TY - GEN
T1 - Subspace correlation clustering
T2 - 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2012
AU - Günnemann, Stephan
AU - Färber, Ines
AU - Virochsiri, Kittipat
AU - Seidl, Thomas
PY - 2012
Y1 - 2012
AB - The need to analyze subspace projections of complex data is well known in the clustering community. While the full space may be obfuscated by overlapping patterns and irrelevant dimensions, certain subspaces can reveal the clustering structure. Subspace clustering discards irrelevant dimensions and allows objects to belong to multiple, overlapping clusters, since each set of objects has its own subspace projection. As we will demonstrate, the observations that motivate subspace projections for traditional clustering also apply to correlation analysis. In this work, we introduce the novel paradigm of subspace correlation clustering: we analyze subspace projections to find subsets of objects that show linear correlations within a subset of dimensions. In contrast to existing techniques, which determine correlations based on the full space, our method is able to exclude locally irrelevant dimensions, enabling more precise detection of the correlated features. Since we analyze subspace projections, each object can contribute to several correlations. Our model allows multiple overlapping clusters in general but simultaneously avoids redundant clusters deducible from already known correlations. We introduce the algorithm SSCC, which exploits different pruning techniques to efficiently generate a subspace correlation clustering. In thorough experiments, we demonstrate the strength of our novel paradigm in comparison to existing methods.
KW - linear correlations
KW - overlapping clusters
UR - https://www.scopus.com/pages/publications/84866030149
U2 - 10.1145/2339530.2339588
DO - 10.1145/2339530.2339588
M3 - Conference contribution
AN - SCOPUS:84866030149
SN - 9781450314626
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 352
EP - 360
BT - KDD'12 - 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Y2 - 12 August 2012 through 16 August 2012
ER -