Stochastic Mirror Descent on Overparameterized Nonlinear Models.

Journal

IEEE transactions on neural networks and learning systems

ISSN: 2162-2388

Abbreviated title: IEEE Trans Neural Netw Learn Syst

Country: United States

NLM ID: 101616214

Publication information

Publication date:
Dec 2022

History:

pubmed: 17 Jul 2021

medline: 17 Jul 2021

entrez: 16 Jul 2021

Status: ppublish

Abstract

Most modern learning problems are highly overparameterized, i.e., have many more model parameters than the number of training data points. As a result, the training loss may have infinitely many global minima (parameter vectors that perfectly "interpolate" the training data). It is therefore imperative to understand which interpolating solutions we converge to, how they depend on the initialization and learning algorithm, and whether they yield different test errors. In this article, we study these questions for the family of stochastic mirror descent (SMD) algorithms, of which stochastic gradient descent (SGD) is a special case. Recently, it has been shown that for overparameterized linear models, SMD converges to the global minimum closest to the initialization point, where closeness is measured by the Bregman divergence corresponding to the potential function of the mirror descent. With appropriate initialization, this yields convergence to the minimum-potential interpolating solution, a phenomenon referred to as implicit regularization. On the theory side, we show that for sufficiently overparameterized nonlinear models, SMD with a (small enough) fixed step size converges to a global minimum that is "very close" (in Bregman divergence) to the minimum-potential interpolating solution, thus attaining approximate implicit regularization. On the empirical side, our experiments on the MNIST and CIFAR-10 datasets consistently confirm that the above phenomenon occurs in practical scenarios. They further indicate a clear difference in the generalization performance of different SMD algorithms: experiments on the CIFAR-10 dataset with different regularizers, l
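To make the linear-model statement in the abstract concrete, the following is a minimal sketch of SMD on an overparameterized linear regression problem. It is not the authors' code. The potential psi(w) = (1/q) * sum_j |w_j|^q, the squared loss, the problem sizes, the step size, and all function names are assumptions chosen for illustration. With q = 2 the update reduces to SGD, and starting from zero the iterates should approach the minimum l2-norm interpolating solution.

```python
import numpy as np


def mirror_map(w, q):
    """Gradient of the assumed potential psi(w) = (1/q) * sum_j |w_j|**q."""
    return np.sign(w) * np.abs(w) ** (q - 1)


def inverse_mirror_map(z, q):
    """Inverse of mirror_map: recovers w from z = grad psi(w)."""
    return np.sign(z) * np.abs(z) ** (1.0 / (q - 1))


def smd_linear(X, y, q=2.0, step_size=0.02, num_steps=20000, seed=0):
    """Run SMD on the per-sample loss 0.5 * (x_i @ w - y_i)**2, starting at w = 0."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    z = np.zeros(d)  # dual (mirror) variable; grad psi(0) = 0
    for _ in range(num_steps):
        i = rng.integers(n)                  # sample one training point
        w = inverse_mirror_map(z, q)         # map back to the primal space
        residual = X[i] @ w - y[i]
        z = z - step_size * residual * X[i]  # gradient step in the dual space
    return inverse_mirror_map(z, q)


# Hypothetical toy problem: 5 data points, 20 parameters (overparameterized).
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 20))
y = rng.standard_normal(5)

w_sgd = smd_linear(X, y, q=2.0)
w_min_norm = np.linalg.pinv(X) @ y  # minimum l2-norm interpolating solution

print("training residual:", np.linalg.norm(X @ w_sgd - y))
print("distance to min-norm solution:", np.linalg.norm(w_sgd - w_min_norm))
```

Changing `q` changes the mirror map and therefore which interpolating solution is favored. Other values of `q` may need a smaller step size to converge, and this sketch does not verify that case.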

Identifiers

DOI: 10.1109/TNNLS.2021.3087480

PMID: 34270431

Publication types

Journal Article

Languages

eng

Pagination

7717-7727

Authors

Navid Azizan

Sahin Lale

Babak Hassibi
