TY - GEN
T1 - Multi-view clustering using mixture models in subspace projections
AU - Günnemann, Stephan
AU - Färber, Ines
AU - Seidl, Thomas
PY - 2012
Y1 - 2012
N2 - Detecting multiple clustering solutions is an emerging research field. While data is often multi-faceted in its very nature, traditional clustering methods are restricted to find just a single grouping. To overcome this limitation, methods aiming at the detection of alternative and multiple clustering solutions have been proposed. In this work, we present a Bayesian framework to tackle the problem of multi-view clustering. We provide multiple generalizations of the data by using multiple mixture models. Each mixture describes a specific view on the data by using a mixture of Beta distributions in subspace projections. Since a mixture summarizes the clusters located in similar subspace projections, each view highlights specific aspects of the data. In addition, our model handles overlapping views, where the mixture components compete against each other in the data generation process. For efficiently learning the distributions, we propose the algorithm MVGen that exploits the ICM principle and uses Bayesian model selection to trade-off the cluster model's complexity against its goodness of fit. With experiments on various real-world data sets, we demonstrate the high potential of MVGen to detect multiple, overlapping clustering views in subspace projections of the data.
AB - Detecting multiple clustering solutions is an emerging research field. While data is often multi-faceted in its very nature, traditional clustering methods are restricted to find just a single grouping. To overcome this limitation, methods aiming at the detection of alternative and multiple clustering solutions have been proposed. In this work, we present a Bayesian framework to tackle the problem of multi-view clustering. We provide multiple generalizations of the data by using multiple mixture models. Each mixture describes a specific view on the data by using a mixture of Beta distributions in subspace projections. Since a mixture summarizes the clusters located in similar subspace projections, each view highlights specific aspects of the data. In addition, our model handles overlapping views, where the mixture components compete against each other in the data generation process. For efficiently learning the distributions, we propose the algorithm MVGen that exploits the ICM principle and uses Bayesian model selection to trade-off the cluster model's complexity against its goodness of fit. With experiments on various real-world data sets, we demonstrate the high potential of MVGen to detect multiple, overlapping clustering views in subspace projections of the data.
KW - generative model
KW - graphical model
KW - model-based clustering
UR - http://www.scopus.com/inward/record.url?scp=84866022951&partnerID=8YFLogxK
U2 - 10.1145/2339530.2339553
DO - 10.1145/2339530.2339553
M3 - Conference contribution
AN - SCOPUS:84866022951
SN - 9781450314626
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 132
EP - 140
BT - KDD'12 - 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
T2 - 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2012
Y2 - 12 August 2012 through 16 August 2012
ER -