Natural Reweighted Wake-Sleep.

Keywords: Helmholtz machine; Information geometry; Natural gradient; Wake–Sleep

Journal

Neural networks : the official journal of the International Neural Network Society
ISSN: 1879-2782
Abbreviated title: Neural Netw
Country: United States
NLM ID: 8805018

Publication information

Publication date:
Nov 2022
History:
received: 01 04 2022
revised: 16 07 2022
accepted: 07 09 2022
pubmed: 9 10 2022
medline: 26 10 2022
entrez: 8 10 2022
Status: ppublish

Abstract

Helmholtz Machines (HMs) are a class of generative models composed of two Sigmoid Belief Networks (SBNs), acting respectively as an encoder and a decoder. These models are commonly trained using a two-step optimization algorithm called Wake-Sleep (WS) and, more recently, by improved versions such as Reweighted Wake-Sleep (RWS) and Bidirectional Helmholtz Machines (BiHM). The locality of the connections in an SBN induces sparsity in the Fisher Information Matrices associated with the probabilistic models, in the form of a fine-grained block-diagonal structure. In this paper we exploit this property to efficiently train SBNs and HMs using the natural gradient. We present a novel algorithm, called Natural Reweighted Wake-Sleep (NRWS), which is the geometric adaptation of standard RWS. Similarly, we introduce the Natural Bidirectional Helmholtz Machine (NBiHM). Unlike previous work, we show that for HMs the natural gradient can be computed efficiently without introducing any approximation in the structure of the Fisher information matrix. Experiments on standard datasets from the literature show a consistent improvement of NRWS and NBiHM, not only over their non-geometric baselines but also over state-of-the-art training algorithms for HMs. The improvement is quantified both in speed of convergence and in the value of the log-likelihood reached after training.
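To illustrate the block-diagonal structure the abstract refers to, the following is a minimal sketch and not the authors' implementation. It assumes a single layer of binary units with p(h_i = 1 | x) = sigmoid(w_i · x), for which the Fisher information has one independent block per unit, F_i = E[s_i (1 - s_i) x xᵀ] with s_i = sigmoid(w_i · x). The function name `natural_gradient_layer`, the batch estimate of each block, and the `damping` term (added only for numerical stability) are assumptions introduced for this example.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def natural_gradient_layer(W, X, H, damping=1e-3):
    """Per-unit natural gradient for one sigmoid layer (illustrative sketch).

    W: (n_units, n_in) weights, one row per unit.
    X: (batch, n_in) samples of the parent variables.
    H: (batch, n_units) binary samples of the layer's units.
    damping: hypothetical stabiliser added to each Fisher block.
    """
    n = X.shape[0]
    P = sigmoid(X @ W.T)                 # (batch, n_units) firing probabilities
    G = (H - P).T @ X / n                # gradient of the mean log-likelihood
    nat = np.empty_like(W, dtype=float)
    eye = np.eye(W.shape[1])
    for i in range(W.shape[0]):
        s = P[:, i] * (1.0 - P[:, i])    # Bernoulli variance of unit i
        F_i = (X * s[:, None]).T @ X / n # batch estimate of the i-th Fisher block
        # Solve one small system per unit instead of inverting the full matrix.
        nat[i] = np.linalg.solve(F_i + damping * eye, G[i])
    return nat


# Usage on random binary data (values are arbitrary).
rng = np.random.default_rng(0)
X_demo = rng.integers(0, 2, size=(64, 5)).astype(float)
H_demo = rng.integers(0, 2, size=(64, 3)).astype(float)
W_demo = np.zeros((3, 5))
step = natural_gradient_layer(W_demo, X_demo, H_demo)
W_demo = W_demo + 0.1 * step             # one ascent step with learning rate 0.1
```

Because each block has the size of a single unit's incoming weights, the cost is a set of small linear solves rather than one solve over all parameters of the layer.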

Identifiers

pubmed: 36208615
pii: S0893-6080(22)00340-9
doi: 10.1016/j.neunet.2022.09.006

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

574-591

Copyright information

Copyright © 2022 Elsevier Ltd. All rights reserved.

Conflict of interest statement

Declaration of Competing Interest: The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Romanian Institute of Science and Technology.

Authors

Csongor Várady (C)

Institute for Data Science Foundations, Hamburg University of Technology, Hamburg, Germany. Electronic address: csongor.varady@tuhh.de.

Riccardo Volpi (R)

Transylvanian Institute of Neuroscience, Cluj-Napoca, Romania; Quaesta AI, Cluj-Napoca, Romania.

Luigi Malagò (L)

Transylvanian Institute of Neuroscience, Cluj-Napoca, Romania; Quaesta AI, Cluj-Napoca, Romania.

Nihat Ay (N)

Institute for Data Science Foundations, Hamburg University of Technology, Hamburg, Germany; Santa Fe Institute, Santa Fe, USA; Leipzig University, Leipzig, Germany.
