Abstract
In this paper we consider a measure-theoretical formulation of the training of NeurODEs in the form of a mean-field optimal control with L²-regularization of the control. We derive first-order optimality conditions for the NeurODE training problem in the form of a mean-field maximum principle, and show that it admits a unique control solution, which is Lipschitz continuous in time. As a consequence of this uniqueness property, the mean-field maximum principle also provides a strong quantitative generalization error for finite sample approximations, yielding a rigorous justification of a phenomenon that we call coupled descent, indicating the simultaneous decrease of generalization and training errors. We consider two approaches to the derivation of the mean-field maximum principle, including one that is based on a generalized Lagrange multiplier theorem on convex sets of spaces of measures, which is arguably much simpler than those currently available in the literature for mean-field optimal control problems. The latter is also new, and can be considered as a result of independent interest.
| Original language | English |
|---|---|
| Article number | 113161 |
| Journal | Nonlinear Analysis, Theory, Methods and Applications |
| Volume | 227 |
| DOIs | |
| State | Published - Feb 2023 |
Keywords
- Lagrange Multiplier Theorem
- Mean-field maximum principle
- Mean-field optimal control
- NeurODEs
Fingerprint
Dive into the research topics of 'A measure theoretical approach to the mean-field maximum principle for training NeurODEs'. Together they form a unique fingerprint.