Scalable density-based subspace clustering

Emmanuel Müller, Ira Assent, Stephan Günnemann, Thomas Seidl

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

21 Scopus citations

Abstract

For knowledge discovery in high dimensional databases, subspace clustering detects clusters in arbitrary subspace projections. Scalability is a crucial issue, as the number of possible projections is exponential in the number of dimensions. We propose a scalable density-based subspace clustering method that steers mining to few selected subspace clusters. Our novel steering technique reduces subspace processing by identifying and clustering promising subspaces and their combinations directly. Thereby, it narrows down the search space while maintaining accuracy. Thorough experiments on real and synthetic databases show that steering is efficient and scalable, with high quality results. For future work, our steering paradigm for density-based subspace clustering opens research potential for speeding up other subspace clustering approaches as well.
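
The scalability concern in the abstract can be illustrated with a short sketch. This is not the authors' steering technique. Assuming axis-parallel subspaces (a simplification made here), it only shows how quickly the number of candidate projections grows with the number of dimensions d, namely 2^d - 1 non-empty subsets. The function names `count_subspaces` and `enumerate_subspaces` are hypothetical and chosen for this example.

```python
from itertools import combinations


def count_subspaces(d: int) -> int:
    """Number of non-empty subsets of d dimensions (assumed axis-parallel): 2^d - 1."""
    return 2 ** d - 1


def enumerate_subspaces(d: int):
    """Yield every non-empty subset of dimension indices 0..d-1 as a tuple,
    ordered by increasing subspace dimensionality."""
    for k in range(1, d + 1):
        yield from combinations(range(d), k)


if __name__ == "__main__":
    # Exhaustive enumeration is only feasible for very small d.
    print(list(enumerate_subspaces(3)))
    # The count alone shows why exhaustive search does not scale.
    for d in (3, 10, 20, 50):
        print(d, count_subspaces(d))
```

With d = 20 there are already 1,048,575 candidate subspaces, which is why a method that clusters only a few selected subspaces, as the abstract proposes, avoids most of the search space.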

Original language: English
Title of host publication: CIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management
Pages: 1077-1086
Number of pages: 10
State: Published - 2011
Externally published: Yes
Event: 20th ACM Conference on Information and Knowledge Management, CIKM'11 - Glasgow, United Kingdom
Duration: 24 Oct 2011 to 28 Oct 2011

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings

Conference

Conference: 20th ACM Conference on Information and Knowledge Management, CIKM'11
Country/Territory: United Kingdom
City: Glasgow
Period: 24/10/11 to 28/10/11

Keywords

  • data mining
  • density-based clustering
  • high dimensional data
  • scalability
  • subspace clustering
