Abstract
We introduce a fast algorithm for entrywise evaluation of the Gauss-Newton Hessian (GNH) matrix for fully connected feed-forward neural networks. The algorithm has a precomputation step and a sampling step. While it generally requires O(Nn) work to compute an entry (and the entire column) in the GNH matrix for a neural network with N parameters and n data points, our fast sampling algorithm reduces the cost to O(n + d/ϵ²) work, where d is the output dimension of the network and ϵ is a prescribed accuracy (independent of N). One application of our algorithm is constructing the hierarchical-matrix (H-matrix) approximation of the GNH matrix for solving linear systems and eigenvalue problems. It generally requires O(N²) memory and O(N³) work to store and factorize the GNH matrix, respectively. The H-matrix approximation requires only an O(N r_o) memory footprint and O(N r_o²) work to be factorized, where r_o ≪ N is the maximum rank of off-diagonal blocks in the GNH matrix. We demonstrate the performance of our fast algorithm and the H-matrix approximation on classification and autoencoder neural networks.
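To illustrate the structure being exploited, recall that the GNH decomposes as a sum over data points, G = Σ_k J_kᵀ H_k J_k, where J_k is the d × N output Jacobian for data point k and H_k is the d × d Hessian of the loss with respect to the network output. An entry G(i, j) is therefore a sum of n cheap terms, which makes it amenable to Monte Carlo estimation. The sketch below uses plain uniform sampling over data points for simplicity; it is an illustration of the idea, not the paper's algorithm, whose precomputation step and sampling scheme achieve the stated O(n + d/ϵ²) cost.

```python
import numpy as np

def gnh_entry_exact(J_list, H_list, i, j):
    """Exact (i, j) entry of the Gauss-Newton Hessian:
    sum_k J_k[:, i]^T H_k J_k[:, j], where J_k is the d x N output
    Jacobian for data point k and H_k is the d x d loss Hessian
    w.r.t. the network output. Cost is O(n d^2) per entry."""
    return sum(J[:, i] @ H @ J[:, j] for J, H in zip(J_list, H_list))

def gnh_entry_sampled(J_list, H_list, i, j, s, rng):
    """Monte Carlo estimate of the same entry from s data points
    drawn uniformly with replacement; each sampled term is rescaled
    by n so the estimator is unbiased."""
    n = len(J_list)
    idx = rng.integers(0, n, size=s)
    vals = [J_list[k][:, i] @ H_list[k] @ J_list[k][:, j] for k in idx]
    return n * float(np.mean(vals))
```

With importance sampling driven by precomputed per-point weights (as opposed to the uniform draws above), the variance, and hence the number of samples needed for a given accuracy ϵ, can be reduced further.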
| Original language | English |
|---|---|
| Pages (from-to) | 165-184 |
| Number of pages | 20 |
| Journal | SIAM Journal on Matrix Analysis and Applications |
| Volume | 42 |
| Issue number | 1 |
| DOIs | |
| State | Published - Feb 2021 |
Keywords
- Fast Monte Carlo sampling
- Gauss-Newton Hessian
- Hierarchical matrix
- Multilayer perceptron
- Second-order optimization