Empowerment contributes to exploration behaviour in a creative video game.


Journal

Nature human behaviour
ISSN: 2397-3374
Titre abrégé: Nat Hum Behav
Pays: England
ID NLM: 101697750

Informations de publication

Date de publication:
09 2023
Historique:
received: 17 02 2022
accepted: 15 06 2023
medline: 25 9 2023
pubmed: 25 7 2023
entrez: 24 7 2023
Statut: ppublish

Résumé

Studies of human exploration frequently cast people as serendipitously stumbling upon good options. Yet these studies may not capture the richness of exploration strategies that people exhibit in more complex environments. Here we study behaviour in a large dataset of 29,493 players of the richly structured online game 'Little Alchemy 2'. In this game, players start with four elements, which they can combine to create up to 720 complex objects. We find that players are driven not only by external reward signals, such as an attempt to produce successful outcomes, but also by an intrinsic motivation to create objects that empower them to create even more objects. We find that this drive for empowerment is eliminated when playing a game variant that lacks recognizable semantics, indicating that people use their knowledge about the world and its possibilities to guide their exploration. Our results suggest that the drive for empowerment may be a potent source of intrinsic motivation in richly structured domains, particularly those that lack explicit reward signals.

Identifiants

pubmed: 37488401
doi: 10.1038/s41562-023-01661-2
pii: 10.1038/s41562-023-01661-2
doi:

Types de publication

Journal Article Research Support, Non-U.S. Gov't

Langues

eng

Sous-ensembles de citation

IM

Pagination

1481-1489

Informations de copyright

© 2023. The Author(s), under exclusive licence to Springer Nature Limited.

Références

Schulz, E. & Gershman, S. J. The algorithmic architecture of exploration in the human brain. Curr. Opin. Neurobiol. 55, 7–14 (2019).
doi: 10.1016/j.conb.2018.11.003 pubmed: 30529148
Wilson, R. C., Bonawitz, E., Costa, V. D. & Ebitz, R. B. Balancing exploration and exploitation with information and randomization. Curr. Opin. Behav. Sci. 38, 49–56 (2021).
doi: 10.1016/j.cobeha.2020.10.001 pubmed: 33184605
Daw, N. D., O’Doherty, J. P., Dayan, P., Seymour, B. & Dolan, R. J. Cortical substrates for exploratory decisions in humans. Nature 441, 876–879 (2006).
Wilson, R. C., Geana, A., White, J. M., Ludvig, E. A. & Cohen, J. D. Humans use directed and random exploration to solve the explore–exploit dilemma. J. Exp. Psychol. Gen. 143, 155–164 (2014).
doi: 10.1037/a0038199
Speekenbrink, M. & Konstantinidis, E. Uncertainty and exploration in a restless bandit problem. Top. Cogn. Sci. 7, 351–367 (2015).
doi: 10.1111/tops.12145 pubmed: 25899069
Gershman, S. J. Deconstructing the human algorithms for exploration. Cognition 173, 34–42 (2018).
doi: 10.1016/j.cognition.2017.12.014 pubmed: 29289795
Gershman, S. Uncertainty and exploration. Decision 6, 277–286 (2019).
doi: 10.1037/dec0000101 pubmed: 33768122
Brändle, F., Binz, M. & Schulz, E. in The Drive for Knowledge (eds Cogliati Dezza, I. et al) Ch. 7 (Cambridge Univ. Press, 2022).
Chu, J. & Schulz, L. Not playing by the rules: exploratory play, rational action, and efficient search. Open Mind 7, 294–317 (2023).
Gottlieb, J., Oudeyer, P.-Y., Lopes, M. & Baranes, A. Information-seeking, curiosity, and attention: computational and neural mechanisms. Trends Cogn. Sci. 17, 585–593 (2013).
doi: 10.1016/j.tics.2013.09.001 pubmed: 24126129 pmcid: 4193662
Payzan-LeNestour, E. & Bossaerts, P. Risk, unexpected uncertainty, and estimation uncertainty: Bayesian learning in unstable settings. PLoS Comput. Biol. 7, e1001048 (2011).
doi: 10.1371/journal.pcbi.1001048 pubmed: 21283774 pmcid: 3024253
Knox, W. B., Otto, A. R., Stone, P. & Love, B. C. The nature of belief-directed exploratory choice in human decision-making. Front. Psychol. https://doi.org/10.3389/fpsyg.2011.00398 (2012).
Schulz, E., Wu, C. M., Ruggeri, A. & Meder, B. Searching for rewards like a child means less generalization and more directed exploration. Psychol. Sci. 30, 1561–1572 (2019).
doi: 10.1177/0956797619863663 pubmed: 31652093
Little Alchemy 2. Google Play https://play.google.com/store/apps/details?id=com.recloak.littlealchemy2 (2021).
Jiang, M. et al. Wordcraft: an environment for benchmarking commonsense agents. Preprint at arXiv https://doi.org/10.48550/arXiv.2007.09185 (2020).
Schulz, E., Franklin, N. T. & Gershman, S. J. Finding structure in multi-armed bandits. Cogn. Psychol. 119, 101261 (2020).
doi: 10.1016/j.cogpsych.2019.101261 pubmed: 32059133
Schulz, E. et al. Structured, uncertainty-driven exploration in real-world consumer choice. Proc. Natl Acad. Sci. USA 116, 13903–13908 (2019).
doi: 10.1073/pnas.1821028116 pubmed: 31235598 pmcid: 6628813
Klyubin, A. S., Polani, D. & Nehaniv, C. L. All else being equal be empowered. In Proc. 8th European Conference on Advances in Artificial Life (eds Capcarrère, M.S. et al) 744–753 (Springer-Verlag, Berlin, 2005).
Colantonio, J. & Bonawitz, E. Awesome play: awe increases preschooler’s exploration and discovery. In Proc. 40th Annual Conference of the Cognitive Science Society (eds Kalish, C. et al.) 1536–1541 (Cognitive Science Society, Seattle, 2018).
Salge, C., Glackin, C. & Polani, D. in Guided Self-Organization: Inception (ed. Prokopenko, M.) 67–114 (Springer, 2014).
Daw, N. D., Gershman, S. J., Seymour, B., Dayan, P. & Dolan, R. J. Model-based influences on humans’ choices and striatal prediction errors. Neuron 69, 1204–1215 (2011).
doi: 10.1016/j.neuron.2011.02.027 pubmed: 21435563 pmcid: 3077926
Joulin, A. et al. Fasttext. zip: compressing text classification models. Preprint at arXiv https://doi.org/10.48550/arXiv.1612.03651 (2016).
Bhatia, S. Associative judgment and vector space semantics. Psychol. Rev. 124, 1–20 (2017).
Fründ, I., Wichmann, F. A. & Macke, J. H. Quantifying the effect of intertrial dependence on perceptual decisions. J. Vis. https://doi.org/10.1167/14.7.9 (2014).
Schmidhuber, J. Powerplay: training an increasingly general problem solver by continually searching for the simplest still unsolvable problem. Front. Psychol. 4, 313 (2013).
doi: 10.3389/fpsyg.2013.00313 pubmed: 23761771 pmcid: 3675324
Nasiriany, S., Pong, V. H., Lin, S. & Levine, S. Planning with goal-conditioned policies. In Advances in Neural Information Processing Systems 32 (NeurIPS 2019) (eds Wallach, H. et al.) 14843–14854 (Neural Information Processing Systems Foundation, San Diego, 2019).
Campero, A. et al. Learning with AMIGo: adversarially motivated intrinsic goals. Preprint at arXiv https://doi.org/10.48550/arXiv.2006.12122 (2020).
Chitnis, R., Silver, T., Tenenbaum, J., Kaelbling, L. P. & Lozano-Perez, T. GLIB: efficient exploration for relational model-based reinforcement learning via goal-literal babbling. Preprint at arXiv https://doi.org/10.48550/arXiv.2001.08299 (2020).
Pathak, D., Gandhi, D. & Gupta, A. Self-supervised exploration via disagreement. In Proc. 36th International Conference on Machine Learning (eds Chaudhuri, K. & Salakhutdinov, R.) 5062–5071 (PMLR, Cambridge, MA, 2019).
Gottlieb, J. & Oudeyer, P.-Y. Towards a neuroscience of active sampling and curiosity. Nat. Rev. Neurosci. 19, 758–770 (2018).
doi: 10.1038/s41583-018-0078-0 pubmed: 30397322
Chu, J. & Schulz, L. E. Play, curiosity, and cognition. Annu. Rev. Dev. Psychol. 2, 317–343 (2020).
doi: 10.1146/annurev-devpsych-070120-014806
Brändle, F., Stocks, L. J. & Schulz, E. franziskabraendle/alchemy_empowerment. Zenodo https://doi.org/10.5281/zenodo.8010316 (2023).

Auteurs

Franziska Brändle (F)

Max Planck Institute for Biological Cybernetics, Tübingen, Germany. franziska.braendle@tuebingen.mpg.de.

Lena J Stocks (LJ)

Max Planck Institute for Biological Cybernetics, Tübingen, Germany.

Joshua B Tenenbaum (JB)

Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA.
Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA, USA.

Samuel J Gershman (SJ)

Center for Brains, Minds, and Machines, Massachusetts Institute of Technology, Cambridge, MA, USA.
Department of Psychology and Center for Brain Science, Harvard University, Cambridge, MA, USA.

Eric Schulz (E)

Max Planck Institute for Biological Cybernetics, Tübingen, Germany.

Articles similaires

[Redispensing of expensive oral anticancer medicines: a practical application].

Lisanne N van Merendonk, Kübra Akgöl, Bastiaan Nuijen
1.00
Humans Antineoplastic Agents Administration, Oral Drug Costs Counterfeit Drugs

Smoking Cessation and Incident Cardiovascular Disease.

Jun Hwan Cho, Seung Yong Shin, Hoseob Kim et al.
1.00
Humans Male Smoking Cessation Cardiovascular Diseases Female
Humans United States Aged Cross-Sectional Studies Medicare Part C
1.00
Humans Yoga Low Back Pain Female Male

Classifications MeSH