TY - JOUR
T1 - A measure theoretical approach to the mean-field maximum principle for training NeurODEs
AU - Bonnet, Benoît
AU - Cipriani, Cristina
AU - Fornasier, Massimo
AU - Huang, Hui
N1 - Publisher Copyright: © 2022 Elsevier Ltd
PY - 2023/2
Y1 - 2023/2
AB - In this paper we consider a measure-theoretical formulation of the training of NeurODEs in the form of a mean-field optimal control with L2-regularization of the control. We derive first order optimality conditions for the NeurODE training problem in the form of a mean-field maximum principle, and show that it admits a unique control solution, which is Lipschitz continuous in time. As a consequence of this uniqueness property, the mean-field maximum principle also provides a strong quantitative generalization error for finite sample approximations, yielding a rigorous justification of a phenomenon that we call coupled descent, indicating the simultaneous decrease of generalization and training errors. We consider two approaches to the derivation of the mean-field maximum principle, including one that is based on a generalized Lagrange multiplier theorem on convex sets of spaces of measures, which is arguably much simpler than those currently available in the literature for mean-field optimal control problems. The latter is also new, and can be considered as a result of independent interest.
KW - Lagrange Multiplier Theorem
KW - Mean-field maximum principle
KW - Mean-field optimal control
KW - NeurODEs
UR - http://www.scopus.com/inward/record.url?scp=85142163326&partnerID=8YFLogxK
U2 - 10.1016/j.na.2022.113161
DO - 10.1016/j.na.2022.113161
M3 - Article
AN - SCOPUS:85142163326
SN - 0362-546X
VL - 227
JO - Nonlinear Analysis, Theory, Methods and Applications
JF - Nonlinear Analysis, Theory, Methods and Applications
M1 - 113161
ER -