Alignment of brain embeddings and artificial contextual embeddings in natural language points to common geometric patterns.

Journal

Nature Communications

ISSN: 2041-1723

Abbreviated title: Nat Commun

Country: England

NLM ID: 101528555

Publication information

Publication date:
30 Mar 2024

History:

Received: 24 Jul 2022

Accepted: 04 Mar 2024

MEDLINE: 30 Mar 2024

PubMed: 30 Mar 2024

Entrez: 30 Mar 2024

Status: epublish

Abstract

Contextual embeddings, derived from deep language models (DLMs), provide a continuous vectorial representation of language. This embedding space differs fundamentally from the symbolic representations posited by traditional psycholinguistics. We hypothesize that language areas in the human brain, similar to DLMs, rely on a continuous embedding space to represent language. To test this hypothesis, we record neural activity patterns in the inferior frontal gyrus (IFG) of three participants using dense intracranial arrays while they listen to a 30-minute podcast. From these fine-grained spatiotemporal neural recordings, we derive a continuous vectorial representation for each word (i.e., a brain embedding) in each patient. Using stringent zero-shot mapping, we demonstrate that brain embeddings in the IFG and the DLM contextual embedding space have common geometric patterns. These common geometric patterns allow us to predict the brain embedding in the IFG of a given left-out word based solely on its geometric relationship to other non-overlapping words in the podcast. Furthermore, we show that contextual embeddings capture the geometry of IFG embeddings better than static word embeddings. The continuous brain embedding space exposes a vector-based neural code for natural language processing in the human brain.
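The zero-shot mapping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the ridge regression, the fold count, the per-dimension correlation score, the synthetic data, and the names `zero_shot_predict` and `columnwise_corr` are all assumptions made for illustration. The one property it aims to show is that every occurrence of a held-out word is excluded from training, so its brain embedding is predicted only from its position relative to other words in the embedding space.

```python
import numpy as np


def zero_shot_predict(X, Y, words, n_folds=10, alpha=1.0, seed=0):
    """Predict brain embeddings for words never seen during fitting.

    X: (n_tokens, d_model) contextual embeddings (assumed given).
    Y: (n_tokens, d_brain) brain embeddings (assumed given).
    words: (n_tokens,) word identity for each token.
    Folds are assigned per unique word, so all tokens of a word
    land in the same fold and never leak into its training set.
    """
    rng = np.random.default_rng(seed)
    unique = np.unique(words)
    fold_of_word = dict(zip(unique, rng.permutation(len(unique)) % n_folds))
    fold = np.array([fold_of_word[w] for w in words])

    Y_hat = np.zeros_like(Y, dtype=float)
    for k in range(n_folds):
        test = fold == k
        train = ~test
        # Center with training statistics only.
        x_mean, y_mean = X[train].mean(0), Y[train].mean(0)
        Xc, Yc = X[train] - x_mean, Y[train] - y_mean
        # Ridge solution: W = (Xc^T Xc + alpha * I)^-1 Xc^T Yc
        W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ Yc)
        Y_hat[test] = (X[test] - x_mean) @ W + y_mean
    return Y_hat, fold


def columnwise_corr(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(0)
    B = B - B.mean(0)
    return (A * B).sum(0) / (np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0))


# Synthetic stand-in data (hypothetical): 50 unique words, 4 tokens each.
rng = np.random.default_rng(1)
vocab = np.array([f"w{i}" for i in range(50)])
words = np.repeat(vocab, 4)
X = rng.normal(size=(200, 8))            # toy contextual embeddings
A_true = rng.normal(size=(8, 5))         # toy linear relation between the spaces
Y = X @ A_true + 0.1 * rng.normal(size=(200, 5))  # toy brain embeddings

Y_hat, fold = zero_shot_predict(X, Y, words)
r = columnwise_corr(Y_hat, Y)
print(r.round(2))
```

Assigning folds by unique word rather than by token is the design choice that makes the test "zero-shot" in this sketch: a fold split by token would let other occurrences of the same word inform the prediction.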

Identifiers

DOI: 10.1038/s41467-024-46631-y

PMID: 38553456

PII: 10.1038/s41467-024-46631-y

Publication types

Journal Article

Languages

English

Pagination

2768

Grants

Agency: Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)

ID: R01MH112566

Agency: Foundation for the National Institutes of Health (Foundation for the National Institutes of Health, Inc.)

ID: R01NS109367-01

Authors

Ariel Goldstein (A)

Avigail Grinstein-Dabush (A)

Mariano Schain (M)

Haocheng Wang (H)

Zhuoqiao Hong (Z)

Bobbi Aubrey (B)

Samuel A Nastase (SA)

Zaid Zada (Z)

Eric Ham (E)

Amir Feder (A)

Harshvardhan Gazula (H)

Eliav Buchnik (E)

Werner Doyle (W)

Sasha Devore (S)

Patricia Dugan (P)

Roi Reichart (R)

Daniel Friedman (D)

Michael Brenner (M)

Avinatan Hassidim (A)

Orrin Devinsky (O)

Adeen Flinker (A)

Uri Hasson (U)
