Implicit incremental natural actor critic algorithm.

Algorithms; Humans; Models, Theoretical

Implicit update; Incremental learning; Natural actor critic; Natural policy gradient; Reinforcement learning

Journal

Neural Networks: the official journal of the International Neural Network Society

ISSN: 1879-2782

Abbreviated title: Neural Netw

Country: United States

NLM ID: 8805018

Publication information

Publication date: Jan 2019

History:

received: 20 Dec 2017

revised: 22 Aug 2018

accepted: 9 Oct 2018

pubmed: 9 Nov 2018

medline: 10 Jan 2019

entrez: 9 Nov 2018

Status: ppublish

Abstract

Natural policy gradient (NPG) methods are promising approaches to finding locally optimal policy parameters. The NPG approach works well in optimizing complex policies with high-dimensional parameters, and its effectiveness has been demonstrated in many fields. However, incremental estimation of the NPG is computationally unstable owing to its high sensitivity to the step-size values, especially the one used to update the NPG estimate. In this study, we propose a new incremental and stable algorithm for NPG estimation. We call the proposed algorithm the implicit incremental natural actor critic (I2NAC); it is based on the idea of the implicit update. A convergence analysis for I2NAC is provided. The theoretical analysis indicates the stability of I2NAC and the instability of conventional incremental NPG methods. In numerical experiments, I2NAC was less sensitive to the values of the meta-parameters, including the step-size for the NPG update, than the existing incremental NPG method.
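As a rough illustration of why an implicit update can tolerate large step-sizes, the sketch below contrasts an explicit and an implicit stochastic update on a plain linear least-squares problem. This is not the I2NAC algorithm, whose update rules are not given in this record; it is the generic implicit stochastic-gradient step for squared error, and every name in it (explicit_step, implicit_step, run, psi, target, alpha) is hypothetical. The implicit step evaluates the error at the new weights, which for squared error has a closed form that shrinks the effective step-size to alpha / (1 + alpha * ||psi||^2).

```python
import numpy as np


def explicit_step(w, psi, target, alpha):
    """Ordinary incremental update: the error is evaluated at the OLD weights."""
    err = target - psi @ w
    return w + alpha * err * psi


def implicit_step(w, psi, target, alpha):
    """Implicit update: solves w_new = w + alpha * (target - psi @ w_new) * psi.

    For squared error this has a closed form; the effective step-size
    alpha / (1 + alpha * ||psi||^2) stays below 1 / ||psi||^2 for any alpha > 0.
    """
    err = target - psi @ w
    return w + (alpha / (1.0 + alpha * (psi @ psi))) * err * psi


def run(step, alpha, n_steps=100, dim=5, seed=0):
    """Run one update rule on synthetic data (assumed setup, for illustration only)."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=dim)   # hypothetical target weights
    w = np.zeros(dim)
    for _ in range(n_steps):
        psi = rng.normal(size=dim)                      # feature vector
        target = psi @ w_true + 0.1 * rng.normal()      # noisy linear target
        w = step(w, psi, target, alpha)
    return np.linalg.norm(w - w_true)


if __name__ == "__main__":
    alpha = 2.0  # deliberately too large for the explicit rule
    print("explicit error:", run(explicit_step, alpha))
    print("implicit error:", run(implicit_step, alpha))
```

With the same data stream and the same large step-size, the explicit rule is expected to blow up while the implicit rule stays close to the target weights, which mirrors the sensitivity contrast the abstract describes.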

Identifiers

DOI: 10.1016/j.neunet.2018.10.007

PMID: 30408692

PII: S0893-6080(18)30292-2

Publication types

Journal Article

Languages

eng

Pagination

103-112

Authors

Ryo Iwaki

Minoru Asada
