Implicit incremental natural actor critic algorithm.

Implicit update; Incremental learning; Natural actor critic; Natural policy gradient; Reinforcement learning

Journal

Neural networks : the official journal of the International Neural Network Society
ISSN: 1879-2782
Abbreviated title: Neural Netw
Country: United States
NLM ID: 8805018

Publication information

Publication date:
Jan 2019
History:
received: 20 12 2017
revised: 22 08 2018
accepted: 09 10 2018
pubmed: 9 11 2018
medline: 10 1 2019
entrez: 9 11 2018
Status: ppublish

Abstract

Natural policy gradient (NPG) methods are promising approaches to finding locally optimal policy parameters. The NPG approach works well for optimizing complex policies with high-dimensional parameters, and its effectiveness has been demonstrated in many fields. However, incremental estimation of the NPG is computationally unstable because it is highly sensitive to the step-size values, especially the step-size used to update the NPG estimate. In this study, we propose a new incremental and stable algorithm for NPG estimation. We call the proposed algorithm the implicit incremental natural actor critic (I2NAC); it is based on the idea of the implicit update. A convergence analysis for I2NAC is provided. The theoretical analysis indicates the stability of I2NAC and the instability of conventional incremental NPG methods. Numerical experiments show that I2NAC is less sensitive than the existing incremental NPG method to the values of the meta-parameters, including the step-size for the NPG update.
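
The abstract does not spell out the update rules, so the following is only a minimal sketch of the general idea of an implicit update, not the I2NAC algorithm itself. It assumes a toy one-dimensional quadratic objective f(w) = 0.5 * h * (w - target)^2; the names `explicit_step`, `implicit_step`, `run`, `h` and `target` are hypothetical and chosen for illustration. An explicit step evaluates the gradient at the current iterate, whereas an implicit step evaluates it at the new iterate, which for this quadratic can be solved in closed form.

```python
def explicit_step(w, alpha, h=1.0, target=1.0):
    """Explicit update: the gradient is evaluated at the current iterate w."""
    return w - alpha * h * (w - target)


def implicit_step(w, alpha, h=1.0, target=1.0):
    """Implicit update: solves w_new = w - alpha * h * (w_new - target) for w_new."""
    return (w + alpha * h * target) / (1.0 + alpha * h)


def run(step, alpha, n_steps=10, w0=0.0, target=1.0):
    """Apply `step` n_steps times and return the final distance to the target."""
    w = w0
    for _ in range(n_steps):
        w = step(w, alpha, target=target)
    return abs(w - target)


if __name__ == "__main__":
    alpha = 3.0  # deliberately too large for the explicit update (needs alpha * h < 2)
    print("explicit error:", run(explicit_step, alpha))  # 1024.0, error doubles each step
    print("implicit error:", run(implicit_step, alpha))  # about 9.5e-07, error shrinks 4x each step
```

In this toy setting the explicit update multiplies the error by (1 - alpha * h) per step, which diverges once alpha * h exceeds 2, while the implicit update multiplies it by 1 / (1 + alpha * h), which stays below 1 for any positive step-size. This is the kind of step-size insensitivity the abstract attributes to the implicit update, shown here under the stated assumptions only.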

Identifiers

pubmed: 30408692
pii: S0893-6080(18)30292-2
doi: 10.1016/j.neunet.2018.10.007

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

103-112

Copyright information

Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

Authors

Ryo Iwaki (R)

Osaka University, 2-1, Yamadaoka, Suita city, Osaka, Japan. Electronic address: ryo.iwaki@ams.eng.osaka-u.ac.jp.

Minoru Asada (M)

Osaka University, 2-1, Yamadaoka, Suita city, Osaka, Japan. Electronic address: asada@ams.eng.osaka-u.ac.jp.
