The Symplectic Adjoint Method: Memory-Efficient Backpropagation of Neural-Network-Based Differential Equations.

Journal

IEEE Transactions on Neural Networks and Learning Systems

ISSN: 2162-2388

Abbreviated title: IEEE Trans Neural Netw Learn Syst

Country: United States

NLM ID: 101616214

Publication information

Publication date:
16 Feb 2023

History:

entrez: 7 4 2023

pubmed: 8 4 2023

medline: 8 4 2023

Status: ahead of print

Abstract

The combination of neural networks and numerical integration can provide highly accurate models of continuous-time dynamical systems and probability distributions. However, if a neural network is used [Formula: see text] times during numerical integration, the whole computation graph can be regarded as a network [Formula: see text] times deeper than the original. The backpropagation algorithm consumes memory in proportion to the number of uses times the network size, causing practical difficulties. This is true even if a checkpointing scheme divides the computation graph into subgraphs. Alternatively, the adjoint method obtains a gradient by numerical integration backward in time; although this method consumes memory only for a single use of the network, the computational cost of suppressing numerical errors is high. The symplectic adjoint method proposed in this study, an adjoint method solved by a symplectic integrator, obtains the exact gradient (up to rounding error) with memory proportional to the number of uses plus the network size. The theoretical analysis shows that it consumes much less memory than the naive backpropagation algorithm and checkpointing schemes. The experiments verify the theory and also demonstrate that the symplectic adjoint method is faster than the adjoint method and more robust to rounding errors.
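The memory scalings stated in the abstract can be made concrete with a small order-of-magnitude sketch. It is not taken from the paper: the function name `memory_units`, its parameters, and the example figures are hypothetical, constant factors are ignored, and `network_size` is assumed to mean the memory needed to retain one use of the network.

```python
def memory_units(num_uses: int, network_size: int) -> dict:
    """Rough memory model following the proportionalities in the abstract.

    Assumptions (illustrative only): constant factors are dropped, and
    network_size is the memory retained for a single use of the network.
    """
    return {
        # Naive backpropagation: number of uses times the network size.
        "backpropagation": num_uses * network_size,
        # Adjoint method: memory for a single use of the network only.
        "adjoint": network_size,
        # Symplectic adjoint method: number of uses plus the network size.
        "symplectic_adjoint": num_uses + network_size,
    }


# Hypothetical figures: 1000 network uses, 50 memory units per use.
costs = memory_units(num_uses=1000, network_size=50)
for name, units in costs.items():
    print(f"{name}: {units}")
```

Under these assumed figures the product term dominates the sum by a wide margin, which is the gap the abstract describes between naive backpropagation and the symplectic adjoint method. The model says nothing about run time or numerical error, where the abstract reports the adjoint method to be costly.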

Identifiers

DOI: 10.1109/TNNLS.2023.3242345 PMID: 37027779

Publication types

Journal Article

Languages

eng

Authors

Takashi Matsubara

Yuto Miyatake

Takaharu Yaguchi

MeSH classifications