Fast approximation of the Gauss-Newton Hessian matrix for the multilayer perceptron

Chao Chen, Severin Reiz, Chenhan D. Yu, Hans Joachim Bungartz, George Biros

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

We introduce a fast algorithm for entrywise evaluation of the Gauss-Newton Hessian (GNH) matrix for the fully connected feed-forward neural network. The algorithm has a precomputation step and a sampling step. While it generally requires O(Nn) work to compute an entry (and the entire column) of the GNH matrix for a neural network with N parameters and n data points, our fast sampling algorithm reduces the cost to O(n + d/ε²) work, where d is the output dimension of the network and ε is a prescribed accuracy (independent of N). One application of our algorithm is constructing the hierarchical-matrix (H-matrix) approximation of the GNH matrix for solving linear systems and eigenvalue problems. It generally requires O(N²) memory and O(N³) work to store and factorize the GNH matrix, respectively. The H-matrix approximation requires only an O(N·r_o) memory footprint and O(N·r_o²) work to factorize, where r_o ≪ N is the maximum rank of the off-diagonal blocks of the GNH matrix. We demonstrate the performance of our fast algorithm and the H-matrix approximation on classification and autoencoder neural networks.
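To make the quantity being approximated concrete, here is a minimal JAX sketch (our own illustration, not the authors' code) for the simplest case of a least-squares loss, where the GNH reduces to G = (1/n) Σᵢ Jᵢᵀ Jᵢ with Jᵢ the d×N Jacobian of the network output with respect to the parameters at data point i. The function names `mlp`, `gnh_column`, and `gnh_entry_sampled` are hypothetical. The sketch forms one column G·e_k with a Jacobian-vector product followed by a vector-Jacobian product per data point, which matches the O(Nn) baseline cost quoted in the abstract; the naive uniform subsampling at the end is only meant to convey the Monte Carlo flavor, since the paper's sampler uses a precomputation step to make the per-entry cost independent of N.

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

def mlp(params, x):
    """Fully connected feed-forward network with tanh activations."""
    for W, b in params[:-1]:
        x = jnp.tanh(W @ x + b)
    W, b = params[-1]
    return W @ x + b  # output of dimension d

def gnh_column(params, xs, k):
    """Column G e_k of the GNH G = (1/n) sum_i J_i^T J_i for a
    least-squares loss (identity output Hessian). Cost: O(Nn)."""
    flat, unravel = ravel_pytree(params)
    e_k = jnp.zeros_like(flat).at[k].set(1.0)

    def per_point(x):
        f = lambda p: mlp(unravel(p), x)
        _, Jv = jax.jvp(f, (flat,), (e_k,))  # J_i e_k, a d-vector
        _, vjp = jax.vjp(f, flat)
        return vjp(Jv)[0]                    # J_i^T (J_i e_k)

    return jnp.mean(jax.vmap(per_point)(xs), axis=0)

def gnh_entry_sampled(params, xs, j, k, key, num_samples=64):
    """Naive Monte Carlo estimate of G[j, k] from a uniform subsample of
    data points -- illustrative only; the paper's sampler instead uses a
    precomputation step so the per-entry cost does not depend on N."""
    idx = jax.random.choice(key, xs.shape[0], (num_samples,), replace=True)
    return gnh_column(params, xs[idx], k)[j]

# Tiny usage example on a random 4 -> 8 -> 2 network.
key = jax.random.PRNGKey(0)
params = [(jax.random.normal(key, (8, 4)), jnp.zeros(8)),
          (jax.random.normal(key, (2, 8)), jnp.zeros(2))]
xs = jax.random.normal(key, (32, 4))
col = gnh_column(params, xs, k=3)
est = gnh_entry_sampled(params, xs, j=5, k=3, key=key)
```

Using a JVP followed by a VJP avoids ever materializing the d×N Jacobian, which is the standard matrix-free way to apply Jᵢᵀ Jᵢ to a vector.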

Original language: English
Pages (from-to): 165-184
Number of pages: 20
Journal: SIAM Journal on Matrix Analysis and Applications
Volume: 42
Issue number: 1
DOIs
State: Published - Feb 2021

Keywords

  • Fast Monte Carlo sampling
  • Gauss-Newton Hessian
  • Hierarchical matrix
  • Multilayer perceptron
  • Second-order optimization
