On variance reduction in stochastic gradient descent and its asynchronous variants

Sashank J. Reddi, Ahmed Hefny, Barnabás Póczos, Sra Suvrit, Alex Smola

Research output: Contribution to journalConference articlepeer-review

101 Scopus citations

Abstract

We study optimization algorithms based on variance reduction for stochastic gradient descent (SGD). Remarkable recent progress has been made in this direction through development of algorithms like SAG, SVRG, SAGA. These algorithms have been shown to outperform SGD, both theoretically and empirically. However, asynchronous versions of these algorithms-A crucial requirement for modern large-scale applications-have not been studied. We bridge this gap by presenting a unifying framework for many variance reduction techniques. Subsequently, we propose an asynchronous algorithm grounded in our framework, and prove its fast convergence. An important consequence of our general approach is that it yields asynchronous versions of variance reduction algorithms such as SVRG and SAGA as a byproduct. Our method achieves near linear speedup in sparse settings common to machine learning. We demonstrate the empirical performance of our method through a concrete realization of asynchronous SVRG.

Original languageEnglish
Pages (from-to)2647-2655
Number of pages9
JournalAdvances in Neural Information Processing Systems
Volume2015-January
StatePublished - 2015
Externally publishedYes
Event29th Annual Conference on Neural Information Processing Systems, NIPS 2015 - Montreal, Canada
Duration: 7 Dec 201512 Dec 2015

Fingerprint

Dive into the research topics of 'On variance reduction in stochastic gradient descent and its asynchronous variants'. Together they form a unique fingerprint.

Cite this