Discriminative multimodal learning via conditional priors in generative models.

Keywords: Generative models; Multimodal learning; Representation learning; Variational autoencoder

Journal

Neural networks : the official journal of the International Neural Network Society
ISSN: 1879-2782
Abbreviated title: Neural Netw
Country: United States
NLM ID: 8805018

Publication information

Publication date:
02 Nov 2023
History:
received: 16 08 2022
revised: 15 09 2023
accepted: 30 10 2023
medline: 7 11 2023
pubmed: 7 11 2023
entrez: 6 11 2023
Status: aheadofprint

Abstract

Deep generative models with latent variables have recently been used to learn joint representations and generative processes from multi-modal data, which depict an object from different viewpoints. These two learning mechanisms can, however, conflict with each other, and representations can fail to embed information on the data modalities. This research studies the realistic scenario in which all modalities and class labels are available for model training, e.g. images or handwriting, but where some modalities and labels required for downstream tasks are missing, e.g. text or annotations. We show that, in this scenario, the variational lower bound limits mutual information between joint representations and missing modalities. To counteract these problems, we introduce a novel conditional multi-modal discriminative model that uses an informative prior distribution and optimizes a likelihood-free objective function that maximizes mutual information between joint representations and missing modalities. Extensive experimentation demonstrates the benefits of our proposed model: empirical results show that it achieves state-of-the-art results in representative problems such as downstream classification, acoustic inversion, and image and annotation generation.

Identifiers

pubmed: 37931473
pii: S0893-6080(23)00610-X
doi: 10.1016/j.neunet.2023.10.048

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

417-430

Copyright information

Copyright © 2023 The Author(s). Published by Elsevier Ltd. All rights reserved.

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Rogelio A Mancisidor (RA)

Department of Data Science and Analytics, BI Norwegian Business School, Nydalsveien 37, 0484 Oslo, Norway. Electronic address: rogelio.a.mancisidor@bi.no.

Michael Kampffmeyer (M)

Department of Physics and Technology, Faculty of Science and Technology, UiT The Arctic University of Norway, Hansine Hansens veg 18, 9037 Tromsø, Norway; Norwegian Computing Center, P.O. Box 114 Blindern Oslo, Norway. Electronic address: michael.c.kampffmeyer@uit.no.

Kjersti Aas (K)

Norwegian Computing Center, P.O. Box 114 Blindern Oslo, Norway. Electronic address: kjersti@nr.no.

Robert Jenssen (R)

Department of Physics and Technology, Faculty of Science and Technology, UiT The Arctic University of Norway, Hansine Hansens veg 18, 9037 Tromsø, Norway; Norwegian Computing Center, P.O. Box 114 Blindern Oslo, Norway. Electronic address: robert.jenssen@uit.no.

MeSH classifications