Asymmetric reinforcement learning facilitates human inference of transitive relations.
Journal
Nature human behaviour
ISSN: 2397-3374
Abbreviated title: Nat Hum Behav
Country: England
NLM ID: 101697750
Publication information
Publication date:
04 2022
History:
received: 1 April 2021
accepted: 25 November 2021
pubmed: 2 February 2022
medline: 28 April 2022
entrez: 1 February 2022
Status:
ppublish
Abstract
Humans and other animals are capable of inferring never-experienced relations (for example, A > C) from other relational observations (for example, A > B and B > C). The processes behind such transitive inference are subject to intense research. Here we demonstrate a new aspect of relational learning, building on previous evidence that transitive inference can be accomplished through simple reinforcement learning mechanisms. We show in simulations that inference of novel relations benefits from an asymmetric learning policy, where observers update only their belief about the winner (or loser) in a pair. Across four experiments (n = 145), we find substantial empirical support for such asymmetries in inferential learning. The learning policy favoured by our simulations and experiments gives rise to a compression of values that is routinely observed in psychophysics and behavioural economics. In other words, a seemingly biased learning strategy that yields well-known cognitive distortions can be beneficial for transitive inferential judgements.
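The asymmetric policy described in the abstract can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not the authors' model: the function names (`train`, `infer`), the difference-based prediction error, and all parameter values are hypothetical. It trains on rank-adjacent pairs only and lets the winner and the loser have separate learning rates, so that setting `alpha_lose=0.0` gives a winner-only update.

```python
import random


def train(n_items=7, n_trials=2000, alpha_win=0.1, alpha_lose=0.0, seed=0):
    """Learn one value per item from comparisons of rank-adjacent pairs only.

    Index 0 is the top-ranked item. All names and the update rule are
    illustrative assumptions, not the model from the paper.
    """
    rng = random.Random(seed)
    values = [0.0] * n_items
    for _ in range(n_trials):
        winner = rng.randrange(n_items - 1)  # pick an adjacent pair at random
        loser = winner + 1
        # Prediction error: the outcome (coded 1) minus the currently
        # expected value difference between winner and loser.
        delta = 1.0 - (values[winner] - values[loser])
        values[winner] += alpha_win * delta  # raise the winner
        values[loser] -= alpha_lose * delta  # lower the loser (no-op if 0)
    return values


def infer(values, i, j):
    """Return the index of the item predicted to win a never-seen pairing."""
    return i if values[i] > values[j] else j


if __name__ == "__main__":
    asymmetric = train(alpha_win=0.1, alpha_lose=0.0)
    symmetric = train(alpha_win=0.1, alpha_lose=0.1)
    print("winner-only values:", [round(v, 2) for v in asymmetric])
    print("symmetric values:  ", [round(v, 2) for v in symmetric])
    print("predicted winner of unseen pair (1, 4):", infer(asymmetric, 1, 4))
```

Because only adjacent pairs are trained, any comparison such as items 1 and 4 is a transitive inference read off the learned values. The toy shows what a winner-only update looks like in code; it does not reproduce the simulations or the value compression reported in the paper.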
Identifiers
pubmed: 35102348
doi: 10.1038/s41562-021-01263-w
pii: 10.1038/s41562-021-01263-w
pmc: PMC9038534
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
555-564
Copyright information
© 2022. The Author(s).