Reinforcement Learning Model With Dynamic State Space Tested on Target Search Tasks for Monkeys: Extension to Learning Task Events.

Keywords: dynamic state space; episode-dependent learning; history-in-episode architecture; reinforcement learning; target search task

Journal

Frontiers in computational neuroscience

ISSN: 1662-5188

Abbreviated title: Front Comput Neurosci

Country: Switzerland

NLM ID: 101477956

Publication information

Publication date:
2022

History:

received: 2021-09-28

accepted: 2022-04-26

entrez: 2022-06-20

pubmed: 2022-06-21

medline: 2022-06-21

Status: epublish

Abstract

Learning is a crucial basis for biological systems to adapt to environments. Environments include various states or episodes, and episode-dependent learning is essential for adapting to such complex situations. Here, we developed a model for learning a two-target search task used in primate physiological experiments. In the task, the agent is required to gaze at one of four presented light spots. Two neighboring spots serve alternately as the correct target, and the correct target pair is switched after a certain number of consecutive successes. To obtain rewards with high probability, the agent must make decisions based on the actions and results of the previous two trials. Our previous work achieved this by using a dynamic state space. However, to learn a task that includes events such as fixation on the initial central spot, the model framework must be extended. For this purpose, we propose a "history-in-episode architecture." Specifically, we divide states into episodes and histories, and actions are selected based on the histories within each episode. When we compared the proposed model, including the dynamic state space, with the conventional SARSA method in the two-target search task, the former performed close to the theoretical optimum, while the latter never achieved a target-pair switch because it had to re-learn the correct target each time. The reinforcement learning model, including the proposed history-in-episode architecture and dynamic state space, enables episode-dependent learning and provides a basis for learning systems that are highly adaptable to complex environments.
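
The task rules stated in the abstract can be illustrated with a small toy environment. This is only a sketch under stated assumptions, not the authors' simulation: the class name `TwoTargetSearchTask`, the parameter `switch_after` (the abstract says only "a certain number" of consecutive successes), the rule that the correct target stays put after an error, and the way a new neighboring pair is drawn are all hypothetical choices made for illustration.

```python
import random


class TwoTargetSearchTask:
    """Toy two-target search task (hypothetical sketch, not the paper's code)."""

    def __init__(self, n_spots=4, switch_after=7, seed=0):
        self.n_spots = n_spots            # four light spots, as in the abstract
        self.switch_after = switch_after  # ASSUMED value; not given in the abstract
        self.rng = random.Random(seed)
        self.pair = (0, 1)                # current pair of neighboring correct spots
        self.index = 0                    # which member of the pair is correct now
        self.streak = 0                   # consecutive successes so far

    def correct_target(self):
        return self.pair[self.index]

    def step(self, action):
        """Gaze at spot `action`; return 1 for a rewarded trial, else 0."""
        reward = 1 if action == self.correct_target() else 0
        if reward:
            self.streak += 1
            self.index = 1 - self.index   # the two spots alternate as the target
            if self.streak >= self.switch_after:
                self._switch_pair()
        else:
            # ASSUMPTION: an error resets the streak and leaves the target unchanged.
            self.streak = 0
        return reward

    def _switch_pair(self):
        # ASSUMPTION: the new pair starts at any spot other than the old first spot.
        candidates = [s for s in range(self.n_spots) if s != self.pair[0]]
        first = self.rng.choice(candidates)
        self.pair = (first, (first + 1) % self.n_spots)
        self.index = 0
        self.streak = 0
```

In this sketch, an agent that sees only the current trial cannot tell which spot of the pair is correct next, which is consistent with the abstract's point that decisions must depend on the actions and results of the previous two trials.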

Identifiers

DOI: 10.3389/fncom.2022.784604 PMID: 35720772 PMC: PMC9201426

Publication types

Journal Article

Languages

eng

Pagination

784604

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Authors

Kazuhiro Sakamoto

Hinata Yamada

Norihiko Kawaguchi

Yoshito Furusawa

Naohiro Saito

Hajime Mushiake
