Conference paper, 2022

Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation

Abstract

Understanding the implicit bias of training algorithms is of crucial importance for explaining the success of overparametrised neural networks. In this paper, we study the role of label noise in the training dynamics of a quadratically parametrised model through its continuous-time version. We explicitly characterise the solution chosen by the stochastic flow and prove that it implicitly solves a Lasso program. To complete our analysis, we provide non-asymptotic convergence guarantees for the dynamics as well as conditions for support recovery. We also give experimental results that support our theoretical claims. Our findings highlight the fact that structured noise can induce better generalisation and help explain the superior performance of stochastic dynamics observed in practice.
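To make the abstract's claim concrete, here is a minimal numerical sketch (not the authors' code): SGD with label noise on a linear model under a quadratic parametrisation of the predictor, beta = u*u - v*v, tends to recover a sparse, Lasso-like solution in an overparametrised sparse-regression problem. The specific parametrisation, data-generating process, and hyperparameters below are illustrative assumptions, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

# Overparametrised sparse regression: fewer samples than dimensions.
n, d, s = 20, 50, 3
X = rng.standard_normal((n, d))
beta_star = np.zeros(d)
beta_star[:s] = 1.0                    # sparse ground-truth signal
y = X @ beta_star                      # clean labels; noise is injected per step

u = np.full(d, 0.1)                    # small initialisation
v = np.full(d, 0.1)
lr, sigma, steps = 5e-3, 0.5, 200_000  # step size, label-noise level, iterations

for _ in range(steps):
    i = rng.integers(n)
    eps = sigma * rng.standard_normal()   # fresh label noise at every step
    beta = u * u - v * v                  # quadratic parametrisation of the predictor
    residual = X[i] @ beta - (y[i] + eps)
    grad_beta = residual * X[i]           # gradient of the squared loss w.r.t. beta
    u -= lr * grad_beta * 2 * u           # chain rule through beta = u*u - v*v
    v += lr * grad_beta * 2 * v

beta_hat = u * u - v * v
print("true support:     ", np.flatnonzero(beta_star))
print("recovered support:", np.flatnonzero(np.abs(beta_hat) > 1e-2))

Consistent with the abstract's message that structured noise acts as a regulariser, raising sigma in this sketch should correspond to a stronger effective Lasso penalty and a sparser beta_hat; with sigma = 0 the dynamics reduce to plain SGD on the same parametrisation.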

Dates and versions

hal-03701409, version 1 (22-06-2022)


Cite

Loucas Pillaud-Vivien, Julien Reygner, Nicolas Flammarion. Label noise (stochastic) gradient descent implicitly solves the Lasso for quadratic parametrisation. Thirty Fifth Conference on Learning Theory (COLT), Jul 2022, London, United Kingdom. pp. 2127-2159. ⟨hal-03701409⟩