TY - JOUR
T1 - An alternative to EM for Gaussian mixture models
T2 - batch and stochastic Riemannian optimization
AU - Hosseini, Reshad
AU - Sra, Suvrit
N1 - Publisher Copyright: © 2019, Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society.
PY - 2020/5/1
Y1 - 2020/5/1
N2 - We consider maximum likelihood estimation for Gaussian mixture models (GMMs). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which its ability to fulfill positive definiteness constraints in closed form is of key importance. We propose an alternative to EM grounded in the Riemannian geometry of positive definite matrices, using which we cast GMM parameter estimation as a Riemannian optimization problem. Surprisingly, such an out-of-the-box Riemannian formulation completely fails and proves much inferior to EM. This motivates us to take a closer look at the problem geometry and derive a better formulation that is much more amenable to Riemannian optimization. We then develop Riemannian batch and stochastic gradient algorithms that outperform EM, often substantially. We provide a non-asymptotic convergence analysis for our stochastic method, which is also the first (to our knowledge) such global analysis for Riemannian stochastic gradient. Numerous empirical results are included to demonstrate the effectiveness of our methods.
AB - We consider maximum likelihood estimation for Gaussian mixture models (GMMs). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which its ability to fulfill positive definiteness constraints in closed form is of key importance. We propose an alternative to EM grounded in the Riemannian geometry of positive definite matrices, using which we cast GMM parameter estimation as a Riemannian optimization problem. Surprisingly, such an out-of-the-box Riemannian formulation completely fails and proves much inferior to EM. This motivates us to take a closer look at the problem geometry and derive a better formulation that is much more amenable to Riemannian optimization. We then develop Riemannian batch and stochastic gradient algorithms that outperform EM, often substantially. We provide a non-asymptotic convergence analysis for our stochastic method, which is also the first (to our knowledge) such global analysis for Riemannian stochastic gradient. Numerous empirical results are included to demonstrate the effectiveness of our methods.
KW - Gaussian mixture models
KW - Non-asymptotic rate of convergence
KW - Positive definite matrices
KW - Retraction
KW - Riemannian optimization
KW - Stochastic optimization
UR - http://www.scopus.com/inward/record.url?scp=85063224864&partnerID=8YFLogxK
U2 - 10.1007/s10107-019-01381-4
DO - 10.1007/s10107-019-01381-4
M3 - Article
AN - SCOPUS:85063224864
SN - 0025-5610
VL - 181
SP - 187
EP - 223
JO - Mathematical Programming
JF - Mathematical Programming
IS - 1
ER -