Computational reconstruction of mental representations using human behavior.

MeSH terms: Humans; Male; Female; Adult; Neural Networks, Computer; Semantics; Young Adult; Visual Perception / physiology; Behavior; Cognition / physiology; Photic Stimulation / methods

Journal

Nature communications

ISSN: 2041-1723

Abbreviated title: Nat Commun

Country: England

NLM ID: 101528555

Publication information

Publication date:
17 May 2024

History:

received: 23 Jul 2023

accepted: 19 Apr 2024

medline: 18 May 2024

pubmed: 18 May 2024

entrez: 17 May 2024

Status: epublish

Abstract

Revealing how the mind represents information is a longstanding goal of cognitive science. However, there is currently no framework for reconstructing the broad range of mental representations that humans possess. Here, we ask participants to indicate what they perceive in images made of random visual features in a deep neural network. We then infer associations between the semantic features of their responses and the visual features of the images. This allows us to reconstruct the mental representations of multiple visual concepts, both those supplied by participants and other concepts extrapolated from the same semantic space. We validate these reconstructions in separate participants and further generalize our approach to predict behavior for new stimuli and in a new task. Finally, we reconstruct the mental representations of individual observers and of a neural network. This framework enables a large-scale investigation of conceptual representations.
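The inference step summarized in the abstract (linking the semantic features of responses to the visual features of the images, then reconstructing a concept from its position in semantic space) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain ridge regression as a stand-in for whatever estimator the paper uses, it runs on synthetic data, and every name in it (fit_semantic_to_visual, reconstruct_visual_features, the simulated observer A) is hypothetical.

```python
import numpy as np


def fit_semantic_to_visual(S, V, alpha=1.0):
    """Ridge regression from semantic features to visual features.

    S: (n_trials, d_sem) semantic embedding of each response (assumed input).
    V: (n_trials, d_vis) random visual features shown on each trial.
    Returns W of shape (d_sem, d_vis). No intercept is fitted, so both
    inputs are assumed to be roughly zero-mean.
    """
    d_sem = S.shape[1]
    return np.linalg.solve(S.T @ S + alpha * np.eye(d_sem), S.T @ V)


def reconstruct_visual_features(W, concept_embedding):
    """Predict the visual features associated with one semantic vector."""
    return concept_embedding @ W


# Synthetic demonstration (all values below are made up for illustration).
rng = np.random.default_rng(0)
n_trials, d_vis, d_sem = 2000, 20, 8

V = rng.standard_normal((n_trials, d_vis))          # random visual features
A = rng.standard_normal((d_vis, d_sem))             # hypothetical observer
S = V @ A + 0.1 * rng.standard_normal((n_trials, d_sem))  # noisy responses

W_hat = fit_semantic_to_visual(S, V)

# Reconstruct a concept that never appeared as a response, by using only
# its coordinates in the same semantic space.
concept = rng.standard_normal(d_sem)
v_hat = reconstruct_visual_features(W_hat, concept)

# Sanity check: showing the reconstruction to the simulated observer
# should evoke a semantic vector close to the target concept.
recovered = v_hat @ A
corr = float(np.corrcoef(recovered, concept)[0, 1])
print(f"correlation between evoked and target semantics: {corr:.3f}")
```

Because the mapping is defined over the whole semantic space rather than over individual response words, the same fitted matrix can be queried with any embedding, which is the sense in which concepts not supplied by participants could be extrapolated in a scheme like this one.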

Identifiers

DOI: 10.1038/s41467-024-48114-6 PMID: 38760341

pubmed: 38760341

doi: 10.1038/s41467-024-48114-6

pii: 10.1038/s41467-024-48114-6

Publication types

Journal Article

Languages

eng

Pagination

4183

Grants

Agency: National Science Foundation (NSF)

ID: CCF 1839308

Authors

Laurent Caplette (L)

Nicholas B Turk-Browne (NB)
