Gromov-Wasserstein unsupervised alignment reveals structural correspondences between the color similarity structures of humans and large language models.

Keywords: Color similarity structures; Gromov–Wasserstein optimal transport; Large language models; Unsupervised alignment

Journal

Scientific Reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
10 Jul 2024
History:
received: 21 01 2024
accepted: 21 06 2024
medline: 11 7 2024
pubmed: 11 7 2024
entrez: 10 7 2024
Status: epublish

Abstract

Large Language Models (LLMs), such as the Generative Pre-trained Transformer (GPT), have shown remarkable performance in various cognitive tasks. However, it remains unclear whether these models can accurately infer human perceptual representations. Previous research has addressed this question by quantifying correlations between the similarity response patterns of humans and LLMs. Correlation provides a measure of similarity, but it relies on pre-defined item labels and does not distinguish category-level from item-level similarity, and so falls short of characterizing the detailed structural correspondence between humans and LLMs. To assess their structural equivalence in more detail, we propose the use of an unsupervised alignment method based on Gromov-Wasserstein optimal transport (GWOT). GWOT allows similarity structures to be compared without relying on pre-defined label correspondences and can reveal fine-grained structural similarities and differences that may not be detected by simple correlation analysis. Using a large dataset of similarity judgments of 93 colors, we compared the color similarity structures of humans (color-neurotypical and color-atypical participants) and two GPT models (GPT-3.5 and GPT-4). Our results show that the similarity structure of color-neurotypical participants can be remarkably well aligned with that of GPT-4 and, to a lesser extent, with that of GPT-3.5. These results contribute to methodological advances in comparing LLMs with human perception, and highlight the potential of unsupervised alignment methods to reveal detailed structural correspondences.
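
To make the idea of label-free alignment concrete, here is a minimal sketch. It is not the authors' method: it replaces GWOT's soft transport plan with a brute-force search over hard one-to-one assignments (permutations), which minimizes the same kind of squared-distortion objective but is only feasible for a handful of items. The toy dissimilarity matrices, the `shuffle` relabeling, and the function names `gw_cost` and `align` are all assumptions introduced for illustration.

```python
from itertools import permutations


def gw_cost(D1, D2, perm):
    """Squared distortion when item i of structure 1 is mapped to item perm[i] of structure 2."""
    n = len(D1)
    return sum(
        (D1[i][j] - D2[perm[i]][perm[j]]) ** 2
        for i in range(n)
        for j in range(n)
    )


def align(D1, D2):
    """Return the one-to-one mapping with the lowest distortion, using no item labels."""
    n = len(D1)
    return min(permutations(range(n)), key=lambda p: gw_cost(D1, D2, p))


# Hypothetical toy data: five items on a line with irregular spacing,
# so the dissimilarity structure has no symmetry and the best mapping is unique.
points = [0.0, 1.0, 2.5, 4.5, 8.0]
D1 = [[abs(a - b) for b in points] for a in points]

# Second structure: the same items, but presented in a shuffled order.
# Item k of structure 2 corresponds to item shuffle[k] of structure 1.
shuffle = [3, 0, 4, 1, 2]
D2 = [[D1[shuffle[a]][shuffle[b]] for b in range(5)] for a in range(5)]

best = align(D1, D2)

# Fraction of items mapped back to their true counterpart.
matching_rate = sum(shuffle[best[i]] == i for i in range(5)) / 5

print(best, gw_cost(D1, D2, best), matching_rate)
```

Because the search uses only the internal pairwise dissimilarities of each structure, it recovers the correspondence even though the item order differs between the two matrices. A label-based correlation would compare `D1[i][j]` with `D2[i][j]` directly and would miss this. For 93 colors an exhaustive search is impossible, which is why an optimization-based optimal transport solver is needed in practice.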

Identifiers

pubmed: 38987348
doi: 10.1038/s41598-024-65604-1
pii: 10.1038/s41598-024-65604-1

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

15917

Copyright information

© 2024. The Author(s).

Authors

Genji Kawakita (G)

Department of Bioengineering, Imperial College London, London, UK. g.kawakita22@imperial.ac.uk.

Ariel Zeleznikow-Johnston (A)

School of Psychological Sciences, Monash University, Melbourne, Australia.
Turner Institute for Brain and Mental Health, Monash University, Melbourne, Australia.

Naotsugu Tsuchiya (N)

School of Psychological Sciences, Monash University, Melbourne, Australia.
Turner Institute for Brain and Mental Health, Monash University, Melbourne, Australia.
Center for Information and Neural Networks (CiNet), National Institute of Information and Communications Technology (NICT), Osaka, Japan.
Department of Qualia Structure, ATR Computational Neuroscience Laboratories, Kyoto, Japan.

Masafumi Oizumi (M)

Graduate School of Arts and Science, The University of Tokyo, Tokyo, Japan. c-oizumi@g.ecc.u-tokyo.ac.jp.
