Changing the Geometry of Representations:

attention mechanism; information geometry; word embeddings; α-embeddings

Journal

Entropy (Basel, Switzerland)

ISSN: 1099-4300

Abbreviated title: Entropy (Basel)

Country: Switzerland

NLM ID: 101243874

Publication information

Publication date:
26 Feb 2021

History:

received: 6 Nov 2020

accepted: 23 Nov 2020

entrez: 3 Mar 2021

pubmed: 4 Mar 2021

medline: 4 Mar 2021

Status: epublish

Abstract

Word embeddings based on a conditional model are commonly used in Natural Language Processing (NLP) tasks to embed the words of a dictionary in a low-dimensional linear space. Their computation is based on maximizing the likelihood of a conditional probability distribution for each word of the dictionary. These distributions form a Riemannian statistical manifold, where word embeddings can be interpreted as vectors in the tangent space at a specific reference measure on the manifold. A novel family of word embeddings, called α-embeddings, has recently been introduced, deriving from the geometrical deformation of the simplex of probabilities through a parameter α, using notions from Information Geometry. After introducing the α-embeddings, we show how the deformation of the simplex, controlled by α, provides an extra handle to improve performance on several intrinsic and extrinsic tasks in NLP. We test the α-embeddings on different tasks with models of increasing complexity, showing that the advantages associated with the use of α-embeddings are also present for models with a large number of parameters. Finally, we show that tuning α yields higher performance than using larger models in which a transformation of the embeddings is additionally learned during training, as experimentally verified in attention models.

Identifiers

DOI: 10.3390/e23030287 PMID: 33652911 PMC: PMC7996742

pubmed: 33652911

pii: e23030287

doi: 10.3390/e23030287

pmc: PMC7996742

Publication types

Journal Article

Languages

eng

Grants

Agency: European Regional Development Fund

ID: project ID P_37_71

Authors

Riccardo Volpi (R)

Uddhipan Thakur (U)

Luigi Malagò (L)

MeSH classifications