TY - GEN
T1 - Partition-based clustering in object bases
T2 - 4th International Conference on Foundations of Data Organization and Algorithms, FODO 1993
AU - Gerlhof, Carsten
AU - Kemper, Alfons
AU - Kilger, Christoph
AU - Moerkotte, Guido
N1 - Publisher Copyright:
© 1993, Springer Verlag. All rights reserved.
PY - 1993
Y1 - 1993
N2 - We classify clustering algorithms into sequence-based techniques-which transform the object net into a linear sequence—and partition-based clustering algorithms. Tsangaris and Naughton [TN91, TN92] have shown that the partition-based techniques are superior. However, their work is based on a single partitioning algorithm, the Kernighart and Lin heuristics, which is not applicable to realistically large object bases because of its high running-time complexity. The contribution of this paper is two-fold: (1) we devise a new class of greedy object graph partitioning algorithms (GGP) whose running-time complexity is moderate while still yielding good quality results. (2) Our extensive quantitative analysis of all well-known partitioning algorithms indicates that no one algorithm performs superior for all object net characteristics. Therefore, we propose an adaptable clustering strategy according to a multi-dimensional grid: the dimensions correspond to particular characteristics of the object base-given by, e.g., number and size of objects, degree of object sharing-and the grid entries indicate the most suitable clustering algorithm for the particular configuration.
AB - We classify clustering algorithms into sequence-based techniques-which transform the object net into a linear sequence—and partition-based clustering algorithms. Tsangaris and Naughton [TN91, TN92] have shown that the partition-based techniques are superior. However, their work is based on a single partitioning algorithm, the Kernighart and Lin heuristics, which is not applicable to realistically large object bases because of its high running-time complexity. The contribution of this paper is two-fold: (1) we devise a new class of greedy object graph partitioning algorithms (GGP) whose running-time complexity is moderate while still yielding good quality results. (2) Our extensive quantitative analysis of all well-known partitioning algorithms indicates that no one algorithm performs superior for all object net characteristics. Therefore, we propose an adaptable clustering strategy according to a multi-dimensional grid: the dimensions correspond to particular characteristics of the object base-given by, e.g., number and size of objects, degree of object sharing-and the grid entries indicate the most suitable clustering algorithm for the particular configuration.
UR - http://www.scopus.com/inward/record.url?scp=85028756510&partnerID=8YFLogxK
U2 - 10.1007/3-540-57301-1_20
DO - 10.1007/3-540-57301-1_20
M3 - Conference contribution
AN - SCOPUS:85028756510
SN - 9783540573012
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 301
EP - 316
BT - Foundations of Data Organization and Algorithms - 4th International Conference, FODO 1993, Proceedings
A2 - Lomet, David B.
PB - Springer Verlag
Y2 - 13 October 1993 through 15 October 1993
ER -