Optimizing agent behavior over long time scales by transporting value.
Journal
Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555
Publication information
Publication date:
19 11 2019
History:
received: 28 12 2018
accepted: 10 10 2019
entrez: 21 11 2019
pubmed: 21 11 2019
medline: 10 3 2020
Status:
epublish
Abstract
Humans prolifically engage in mental time travel. We dwell on past actions and experience satisfaction or regret. More than storytelling, these recollections change how we act in the future and endow us with a computationally important ability to link actions and consequences across spans of time, which helps address the problem of long-term credit assignment: the question of how to evaluate the utility of actions within a long-duration behavioral sequence. Existing approaches to credit assignment in AI cannot solve tasks with long delays between actions and consequences. Here, we introduce a paradigm where agents use recall of specific memories to credit past actions, allowing them to solve problems that are intractable for existing algorithms. This paradigm broadens the scope of problems that can be investigated in AI and offers a mechanistic account of behaviors that may inspire models in neuroscience, psychology, and behavioral economics.
Identifiers
pubmed: 31745075
doi: 10.1038/s41467-019-13073-w
pii: 10.1038/s41467-019-13073-w
pmc: PMC6864102
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
5223