Structure-Preserving Imitation Learning With Delayed Reward: An Evaluation Within the RoboCup Soccer 2D Simulation Environment.

deep learning; deep reinforcement learning; end-to-end learning; imitation learning; learning with delayed reward; learning with structure preservation

Journal

Frontiers in robotics and AI
ISSN: 2296-9144
Abbreviated title: Front Robot AI
Country: Switzerland
NLM ID: 101749350

Publication information

Publication date:
2020
History:
received: 2020-05-10
accepted: 2020-08-04
entrez: 2021-01-27
pubmed: 2021-01-28
medline: 2021-01-28
Status: epublish

Abstract

We describe and evaluate a neural network-based architecture that aims to imitate and improve the performance of a fully autonomous soccer team in the RoboCup Soccer 2D Simulation environment. The approach uses a deep Q-network architecture for action determination and a deep neural network for parameter learning. The proposed solution is shown to be feasible for replacing a selected behavioral module in a well-established RoboCup base team,

Identifiers

pubmed: 33501289
doi: 10.3389/frobt.2020.00123
pmc: PMC7805756

Publication types

Journal Article

Languages

eng

Pagination

123

Copyright information

Copyright © 2020 Nguyen and Prokopenko.

References

Nature. 2015 Feb 26;518(7540):529-33
pubmed: 25719670
Adv Neural Inf Process Syst. 2015;28:1954-1962
pubmed: 28066133
Artif Life. 2017 Winter;23(1):34-57
pubmed: 28140630

Authors

Quang Dang Nguyen (QD)

Centre for Complex Systems, Faculty of Engineering, University of Sydney, Sydney, NSW, Australia.

Mikhail Prokopenko (M)

Centre for Complex Systems, Faculty of Engineering, University of Sydney, Sydney, NSW, Australia.

MeSH classifications