Artificial Neural Network Language Models Predict Human Brain Responses to Language Even After a Developmentally Realistic Amount of Training.

ANN-neural data alignment; artificial neural network; development; human behavior; language network

Journal

Neurobiology of language (Cambridge, Mass.)

ISSN: 2641-4368

Abbreviated title: Neurobiol Lang (Camb)

Country: United States

NLM ID: 101763589

Publication information

Publication date:
2024

History:

received: 29 March 2023

accepted: 9 January 2024

medline: 22 April 2024

pubmed: 22 April 2024

entrez: 22 April 2024

Status: epublish

Abstract

Artificial neural networks have emerged as computationally plausible models of human language processing. A major criticism of these models is that the amount of training data they receive far exceeds that of humans during language learning. Here, we use two complementary approaches to ask how the models' ability to capture human fMRI responses to sentences is affected by the amount of training data. First, we evaluate GPT-2 models trained on 1 million, 10 million, 100 million, or 1 billion words against an fMRI benchmark. We consider the 100-million-word model to be developmentally plausible in terms of the amount of training data given that this amount is similar to what children are estimated to be exposed to during the first 10 years of life. Second, we test the performance of a GPT-2 model trained on a 9-billion-token dataset to reach state-of-the-art next-word prediction performance on the human benchmark at different stages during training. Across both approaches, we find that (i) the models trained on a developmentally plausible amount of data already achieve near-maximal performance in capturing fMRI responses to sentences. Further, (ii) lower perplexity (a measure of next-word prediction performance) is associated with stronger alignment with human data, suggesting that models that have received enough training to achieve sufficiently high next-word prediction performance also acquire representations of sentences that are predictive of human fMRI responses. In tandem, these findings establish that although

Identifiers

DOI: 10.1162/nol_a_00137 PMID: 38645622 PMC: PMC11025646

pubmed: 38645622

doi: 10.1162/nol_a_00137

pii: nol_a_00137

pmc: PMC11025646

Publication types

Journal Article

Languages

eng

Pagination

43-63

Copyright information

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Authors

Eghbal A Hosseini (EA)

Martin Schrimpf (M)

Yian Zhang (Y)

Samuel Bowman (S)

Noga Zaslavsky (N)

Evelina Fedorenko (E)

MeSH classifications