Learning Numerosity Representations with Transformers: Number Generation Tasks and Out-of-Distribution Generalization.

attention mechanisms; cognitive modeling; deep neural networks; density estimation; numerosity perception

Journal

Entropy (Basel, Switzerland)
ISSN: 1099-4300
Abbreviated title: Entropy (Basel)
Country: Switzerland
NLM ID: 101243874

Publication information

Publication date:
03 Jul 2021
History:
received: 17 05 2021
revised: 23 06 2021
accepted: 29 06 2021
entrez: 6 8 2021
pubmed: 7 8 2021
medline: 7 8 2021
Status: epublish

Abstract

One of the most rapidly advancing areas of deep learning research aims at creating models that learn to disentangle the latent factors of variation from a data distribution. However, modeling joint probability mass functions is usually prohibitive, which motivates the use of conditional models that assume some information is given as input. In the domain of numerical cognition, deep learning architectures have successfully demonstrated that approximate numerosity representations can emerge in multi-layer networks that build latent representations of a set of images with a varying number of items. However, existing models have focused on tasks that require conditionally estimating numerosity information from a

Identifiers

pubmed: 34356398
pii: e23070857
doi: 10.3390/e23070857
pmc: PMC8303966

Publication types

Journal Article

Languages

eng

Grants

Agency: Fondazione Cassa di Risparmio di Padova e Rovigo
ID: Progetti di eccellenza 2017 Numsense

Authors

Tommaso Boccato (T)

Department of General Psychology, University of Padova, Via Venezia 8, 35131 Padova, Italy.

Alberto Testolin (A)

Department of General Psychology, University of Padova, Via Venezia 8, 35131 Padova, Italy.
Department of Information Engineering, University of Padova, Via Gradenigo 6, 35131 Padova, Italy.

Marco Zorzi (M)

Department of General Psychology, University of Padova, Via Venezia 8, 35131 Padova, Italy.
IRCCS San Camillo Hospital, Via Alberoni 70, 30126 Venice-Lido, Italy.

MeSH classifications