Asymmetric reinforcement learning facilitates human inference of transitive relations.

Animals; Humans; Judgment; Learning; Reinforcement, Psychology

Journal

Nature human behaviour

ISSN: 2397-3374

Abbreviated title: Nat Hum Behav

Country: England

NLM ID: 101697750

Publication information

Publication date:
04 2022

History:

received: 01 04 2021

accepted: 25 11 2021

pubmed: 2 2 2022

medline: 28 4 2022

entrez: 1 2 2022

Status: ppublish

Abstract

Humans and other animals are capable of inferring never-experienced relations (for example, A > C) from other relational observations (for example, A > B and B > C). The processes behind such transitive inference are subject to intense research. Here we demonstrate a new aspect of relational learning, building on previous evidence that transitive inference can be accomplished through simple reinforcement learning mechanisms. We show in simulations that inference of novel relations benefits from an asymmetric learning policy, where observers update only their belief about the winner (or loser) in a pair. Across four experiments (n = 145), we find substantial empirical support for such asymmetries in inferential learning. The learning policy favoured by our simulations and experiments gives rise to a compression of values that is routinely observed in psychophysics and behavioural economics. In other words, a seemingly biased learning strategy that yields well-known cognitive distortions can be beneficial for transitive inferential judgements.
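The asymmetric policy described in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the logistic surprise term, the learning rates, the fixed training order, and the names `train_adjacent_pairs` and `prefers` are all assumptions made for illustration. A learner sees only neighbouring pairs of a hidden ranking (item i always beats item i + 1) and, with `alpha_lose=0.0`, updates only the winner's value.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, used here as the expected probability of the outcome."""
    return 1.0 / (1.0 + math.exp(-x))


def train_adjacent_pairs(n_items=5, n_epochs=200, alpha_win=0.1, alpha_lose=0.0):
    """Learn item values from neighbouring pairs only (item i beats item i + 1).

    alpha_win and alpha_lose are hypothetical learning rates for the winner
    and the loser. alpha_lose=0.0 gives a winner-only (asymmetric) policy.
    """
    q = [0.0] * n_items
    for _ in range(n_epochs):
        for winner in range(n_items - 1):
            loser = winner + 1
            # Surprise is small when the learner already expects this outcome.
            delta = 1.0 - sigmoid(q[winner] - q[loser])
            q[winner] += alpha_win * delta
            q[loser] -= alpha_lose * delta
    return q


def prefers(q, a, b):
    """Infer a relation between two items, including pairs never trained together."""
    return q[a] > q[b]


values = train_adjacent_pairs()
print([round(v, 3) for v in values])
print(prefers(values, 0, 2))  # items 0 and 2 were never shown as a pair
```

In this sketch the learned values preserve the hidden order, and the gaps between neighbouring items shrink toward the top of the ranking, which is one simple way a compressed value scale can arise. Setting `alpha_lose` equal to `alpha_win` gives a symmetric learner for comparison; the sketch makes no claim about which policy performs better.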

Identifiers

DOI: 10.1038/s41562-021-01263-w PMID: 35102348 PMC: PMC9038534

pubmed: 35102348

doi: 10.1038/s41562-021-01263-w

pii: 10.1038/s41562-021-01263-w

pmc: PMC9038534

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Pagination

555-564

Authors

Simon Ciranka (S)

Juan Linde-Domingo (J)

Ivan Padezhki (I)

Clara Wicharz (C)

Charley M Wu (CM)

Bernhard Spitzer (B)
