Substructure clustering: A novel mining paradigm for arbitrary data types

Stephan Günnemann, Brigitte Boden, Thomas Seidl

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation


Subspace clustering is an established mining task for grouping objects that are represented by vector data. By considering subspace projections of the data, the problem of full-space clustering is avoided: objects are often not similar w.r.t. all of their attributes but only w.r.t. subsets of their characteristics. This effect is not limited to vector data but can be observed in several other scientific domains, including graphs, where only some subgraphs are similar, and time series, where only shorter subsequences show the same behavior. In each scenario, using the whole representation of the objects for clustering is futile; we need to find clusters of similar substructures. However, none of the existing substructure mining paradigms, such as subspace clustering, frequent subgraph mining, or motif discovery, is able to solve this task entirely, since each tackles only a few of the challenges and is restricted to a specific type of data. In this work, we unify and generalize existing substructure mining tasks into the novel paradigm of substructure clustering, which is applicable to data of an arbitrary type. As a proof of concept showing the feasibility of this paradigm, we present a specific instantiation for the task of subgraph clustering. By integrating ideas from different research areas into a single paradigm, our paper aims to inspire future research directions in the individual areas.

Original language: English
Title of host publication: Scientific and Statistical Database Management - 24th International Conference, SSDBM 2012, Proceedings
Number of pages: 18
State: Published - 2012
Externally published: Yes
Event: 24th International Conference on Scientific and Statistical Database Management, SSDBM 2012 - Chania, Crete, Greece
Duration: 25 Jun 2012 to 27 Jun 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7338 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 24th International Conference on Scientific and Statistical Database Management, SSDBM 2012
City: Chania, Crete

