Empowerment contributes to exploration behaviour in a creative video game.
Journal
Nature human behaviour
ISSN: 2397-3374
Abbreviated title: Nat Hum Behav
Country: England
NLM ID: 101697750
Publication information
Publication date:
09 2023
History:
received: 17 02 2022
accepted: 15 06 2023
medline: 25 9 2023
pubmed: 25 7 2023
entrez: 24 7 2023
Status:
ppublish
Abstract
Studies of human exploration frequently cast people as serendipitously stumbling upon good options. Yet these studies may not capture the richness of exploration strategies that people exhibit in more complex environments. Here we study behaviour in a large dataset of 29,493 players of the richly structured online game 'Little Alchemy 2'. In this game, players start with four elements, which they can combine to create up to 720 complex objects. We find that players are driven not only by external reward signals, such as an attempt to produce successful outcomes, but also by an intrinsic motivation to create objects that empower them to create even more objects. We find that this drive for empowerment is eliminated when playing a game variant that lacks recognizable semantics, indicating that people use their knowledge about the world and its possibilities to guide their exploration. Our results suggest that the drive for empowerment may be a potent source of intrinsic motivation in richly structured domains, particularly those that lack explicit reward signals.
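The notion of empowerment described in the abstract can be illustrated with a minimal sketch. It assumes a deliberately simplified reading, in which an element's empowerment is the number of distinct elements it can help create in a single combination. The recipe table `TOY_RECIPES` and the function `empowerment` are hypothetical and chosen for illustration only; they are not the game's actual recipes or the authors' model.

```python
from collections import defaultdict

# Hypothetical toy recipe table (NOT the real game's recipes):
# an unordered pair of ingredients maps to the element it produces.
TOY_RECIPES = {
    frozenset({"water", "fire"}): "steam",
    frozenset({"water", "earth"}): "mud",
    frozenset({"earth", "fire"}): "lava",
    frozenset({"air", "lava"}): "stone",
    frozenset({"fire", "stone"}): "metal",
    frozenset({"mud", "stone"}): "clay",
}


def empowerment(recipes):
    """Map each ingredient to the number of distinct elements it helps create.

    Assumption: a one-step count of possible products stands in for
    empowerment; elements that never act as an ingredient are omitted.
    """
    products = defaultdict(set)
    for ingredients, result in recipes.items():
        for ingredient in ingredients:
            products[ingredient].add(result)
    return {element: len(results) for element, results in products.items()}


scores = empowerment(TOY_RECIPES)
# Rank elements from most to least empowering under this toy definition.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Under this toy definition, a player driven by empowerment would favour creating elements with high scores, since those unlock the most further combinations, even when the immediate outcome carries no explicit reward.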
Identifiers
pubmed: 37488401
doi: 10.1038/s41562-023-01661-2
pii: 10.1038/s41562-023-01661-2
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
1481-1489
Copyright information
© 2023. The Author(s), under exclusive licence to Springer Nature Limited.
References
Schulz, E. & Gershman, S. J. The algorithmic architecture of exploration in the human brain. Curr. Opin. Neurobiol. 55, 7–14 (2019).
doi: 10.1016/j.conb.2018.11.003
pubmed: 30529148
Wilson, R. C., Bonawitz, E., Costa, V. D. & Ebitz, R. B. Balancing exploration and exploitation with information and randomization. Curr. Opin. Behav. Sci. 38, 49–56 (2021).
doi: 10.1016/j.cobeha.2020.10.001
pubmed: 33184605
Daw, N. D., O’Doherty, J. P., Dayan, P., Seymour, B. & Dolan, R. J. Cortical substrates for exploratory decisions in humans. Nature 441, 876–879 (2006).
Wilson, R. C., Geana, A., White, J. M., Ludvig, E. A. & Cohen, J. D. Humans use directed and random exploration to solve the explore–exploit dilemma. J. Exp. Psychol. Gen. 143, 155–164 (2014).
doi: 10.1037/a0038199
Speekenbrink, M. & Konstantinidis, E. Uncertainty and exploration in a restless bandit problem. Top. Cogn. Sci. 7, 351–367 (2015).
doi: 10.1111/tops.12145
pubmed: 25899069
Gershman, S. J. Deconstructing the human algorithms for exploration. Cognition 173, 34–42 (2018).
doi: 10.1016/j.cognition.2017.12.014
pubmed: 29289795
Gershman, S. Uncertainty and exploration. Decision 6, 277–286 (2019).
doi: 10.1037/dec0000101
pubmed: 33768122
Brändle, F., Binz, M. & Schulz, E. in The Drive for Knowledge (eds Cogliati Dezza, I. et al.) Ch. 7 (Cambridge Univ. Press, 2022).
Chu, J. & Schulz, L. Not playing by the rules: exploratory play, rational action, and efficient search. Open Mind 7, 294–317 (2023).
Gottlieb, J., Oudeyer, P.-Y., Lopes, M. & Baranes, A. Information-seeking, curiosity, and attention: computational and neural mechanisms. Trends Cogn. Sci. 17, 585–593 (2013).
doi: 10.1016/j.tics.2013.09.001
pubmed: 24126129
pmcid: 4193662
Payzan-LeNestour, E. & Bossaerts, P. Risk, unexpected uncertainty, and estimation uncertainty: Bayesian learning in unstable settings. PLoS Comput. Biol. 7, e1001048 (2011).
doi: 10.1371/journal.pcbi.1001048
pubmed: 21283774
pmcid: 3024253
Knox, W. B., Otto, A. R., Stone, P. & Love, B. C. The nature of belief-directed exploratory choice in human decision-making. Front. Psychol. https://doi.org/10.3389/fpsyg.2011.00398 (2012).
Schulz, E., Wu, C. M., Ruggeri, A. & Meder, B. Searching for rewards like a child means less generalization and more directed exploration. Psychol. Sci. 30, 1561–1572 (2019).
doi: 10.1177/0956797619863663
pubmed: 31652093
Little Alchemy 2. Google Play https://play.google.com/store/apps/details?id=com.recloak.littlealchemy2 (2021).
Jiang, M. et al. Wordcraft: an environment for benchmarking commonsense agents. Preprint at arXiv https://doi.org/10.48550/arXiv.2007.09185 (2020).
Schulz, E., Franklin, N. T. & Gershman, S. J. Finding structure in multi-armed bandits. Cogn. Psychol. 119, 101261 (2020).
doi: 10.1016/j.cogpsych.2019.101261
pubmed: 32059133
Schulz, E. et al. Structured, uncertainty-driven exploration in real-world consumer choice. Proc. Natl Acad. Sci. USA 116, 13903–13908 (2019).
doi: 10.1073/pnas.1821028116
pubmed: 31235598
pmcid: 6628813
Klyubin, A. S., Polani, D. & Nehaniv, C. L. All else being equal be empowered. In Proc. 8th European Conference on Advances in Artificial Life (eds Capcarrère, M. S. et al.) 744–753 (Springer-Verlag, Berlin, 2005).
Colantonio, J. & Bonawitz, E. Awesome play: awe increases preschooler’s exploration and discovery. In Proc. 40th Annual Conference of the Cognitive Science Society (eds Kalish, C. et al.) 1536–1541 (Cognitive Science Society, Seattle, 2018).
Salge, C., Glackin, C. & Polani, D. in Guided Self-Organization: Inception (ed. Prokopenko, M.) 67–114 (Springer, 2014).
Daw, N. D., Gershman, S. J., Seymour, B., Dayan, P. & Dolan, R. J. Model-based influences on humans’ choices and striatal prediction errors. Neuron 69, 1204–1215 (2011).
doi: 10.1016/j.neuron.2011.02.027
pubmed: 21435563
pmcid: 3077926
Joulin, A. et al. FastText.zip: compressing text classification models. Preprint at arXiv https://doi.org/10.48550/arXiv.1612.03651 (2016).
Bhatia, S. Associative judgment and vector space semantics. Psychol. Rev. 124, 1–20 (2017).
Fründ, I., Wichmann, F. A. & Macke, J. H. Quantifying the effect of intertrial dependence on perceptual decisions. J. Vis. https://doi.org/10.1167/14.7.9 (2014).
Schmidhuber, J. Powerplay: training an increasingly general problem solver by continually searching for the simplest still unsolvable problem. Front. Psychol. 4, 313 (2013).
doi: 10.3389/fpsyg.2013.00313
pubmed: 23761771
pmcid: 3675324
Nasiriany, S., Pong, V. H., Lin, S. & Levine, S. Planning with goal-conditioned policies. In Advances in Neural Information Processing Systems 32 (NeurIPS 2019) (eds Wallach, H. et al.) 14843–14854 (Neural Information Processing Systems Foundation, San Diego, 2019).
Campero, A. et al. Learning with AMIGo: adversarially motivated intrinsic goals. Preprint at arXiv https://doi.org/10.48550/arXiv.2006.12122 (2020).
Chitnis, R., Silver, T., Tenenbaum, J., Kaelbling, L. P. & Lozano-Perez, T. GLIB: efficient exploration for relational model-based reinforcement learning via goal-literal babbling. Preprint at arXiv https://doi.org/10.48550/arXiv.2001.08299 (2020).
Pathak, D., Gandhi, D. & Gupta, A. Self-supervised exploration via disagreement. In Proc. 36th International Conference on Machine Learning (eds Chaudhuri, K. & Salakhutdinov, R.) 5062–5071 (PMLR, Cambridge, MA, 2019).
Gottlieb, J. & Oudeyer, P.-Y. Towards a neuroscience of active sampling and curiosity. Nat. Rev. Neurosci. 19, 758–770 (2018).
doi: 10.1038/s41583-018-0078-0
pubmed: 30397322
Chu, J. & Schulz, L. E. Play, curiosity, and cognition. Annu. Rev. Dev. Psychol. 2, 317–343 (2020).
doi: 10.1146/annurev-devpsych-070120-014806
Brändle, F., Stocks, L. J. & Schulz, E. franziskabraendle/alchemy_empowerment. Zenodo https://doi.org/10.5281/zenodo.8010316 (2023).