Bidirectionally self-normalizing neural networks.

Neural networks; Optimization; Training; Vanishing/exploding gradient problem

Journal

Neural networks: the official journal of the International Neural Network Society
ISSN: 1879-2782
Abbreviated title: Neural Netw
Country: United States
NLM ID: 8805018

Publication information

Publication date:
Oct 2023
History:
received: 11 10 2022
revised: 09 08 2023
accepted: 11 08 2023
medline: 23 10 2023
pubmed: 5 9 2023
entrez: 4 9 2023
Status: ppublish

Abstract

The problem of vanishing and exploding gradients has been a long-standing obstacle to the effective training of neural networks. Although various tricks and techniques have been employed to alleviate the problem in practice, satisfactory theories and provable solutions are still lacking. In this paper, we address the problem from the perspective of high-dimensional probability theory. We provide a rigorous result that shows, under mild conditions, how the vanishing/exploding gradients problem disappears with high probability if the neural networks have sufficient width. Our main idea is to constrain both forward and backward signal propagation in a nonlinear neural network through a new class of activation functions, namely Gaussian-Poincaré normalized functions, and orthogonal weight matrices. Experiments on both synthetic and real-world data validate our theory and confirm its effectiveness on very deep neural networks when applied in practice.
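
The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' code. The abstract does not define "Gaussian-Poincaré normalized", so the sketch assumes it means a function g with E[g(x)^2] = 1 and E[g'(x)^2] = 1 for x ~ N(0, 1), obtained here by an affine rescaling a * phi(x) + b of a base activation. The function names (gaussian_expectation, gpn_normalize, random_orthogonal, forward_norms), the choice of tanh, and the width and depth values are all hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def gaussian_expectation(f, n_nodes=100):
    """Approximate E[f(x)] for x ~ N(0, 1) with Gauss-Hermite quadrature."""
    x, w = hermegauss(n_nodes)  # weight function exp(-x^2 / 2)
    return np.sum(w * f(x)) / np.sqrt(2.0 * np.pi)


def gpn_normalize(phi, dphi):
    """Return (a, b) such that g(x) = a * phi(x) + b satisfies
    E[g(x)^2] = 1 and E[g'(x)^2] = 1 under x ~ N(0, 1) (assumed definition)."""
    mean = gaussian_expectation(phi)
    var = gaussian_expectation(lambda x: phi(x) ** 2) - mean ** 2
    # Scale so the derivative has unit second moment.
    a = 1.0 / np.sqrt(gaussian_expectation(lambda x: dphi(x) ** 2))
    # Gaussian Poincare inequality: Var[phi] <= E[phi'^2], so 1 - a^2 * var >= 0.
    b = -a * mean + np.sqrt(max(0.0, 1.0 - a ** 2 * var))
    return a, b


def random_orthogonal(n, rng):
    """Draw a random n x n orthogonal matrix via QR of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))  # fix column signs


def forward_norms(depth=50, width=256, seed=0):
    """Propagate one input through `depth` layers h <- g(W h) and
    return the normalized squared norm ||h||^2 / width after each layer."""
    rng = np.random.default_rng(seed)
    dphi = lambda x: 1.0 / np.cosh(x) ** 2  # derivative of tanh
    a, b = gpn_normalize(np.tanh, dphi)
    h = rng.standard_normal(width)
    h *= np.sqrt(width) / np.linalg.norm(h)  # start with ||h||^2 = width
    norms = []
    for _ in range(depth):
        h = a * np.tanh(random_orthogonal(width, rng) @ h) + b
        norms.append(float(h @ h) / width)
    return norms


if __name__ == "__main__":
    print(forward_norms()[-5:])
```

Under these assumptions, the printed values show how the normalized squared norm of the forward signal evolves in the last few layers; the backward signal could be tracked analogously through the layer Jacobians.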

Identifiers

pubmed: 37666186
pii: S0893-6080(23)00436-7
doi: 10.1016/j.neunet.2023.08.017

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

283-291

Copyright information

Copyright © 2023 Elsevier Ltd. All rights reserved.

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Yao Lu (Y)

Australian National University, Australia; Peking University, China. Electronic address: yaolubrain@gmail.com.

Stephen Gould (S)

Australian National University, Australia. Electronic address: stephen.gould@anu.edu.au.

Thalaiyasingam Ajanthan (T)

Australian National University, Australia; Amazon. Electronic address: thalaiyasingam.ajanthan@anu.edu.au.
