Robust Inverse Q-Learning for Continuous-Time Linear Systems in Adversarial Environments.

Neural Networks, Computer; Algorithms; Reward

Journal

IEEE transactions on cybernetics

ISSN: 2168-2275

Abbreviated title: IEEE Trans Cybern

Country: United States

NLM ID: 101609393

Publication information

Publication date:
Dec 2022

History:

pubmed: 18 Aug 2021

medline: 23 Nov 2022

entrez: 17 Aug 2021

Status: ppublish

Abstract

This article proposes robust inverse Q-learning algorithms that let a learner mimic an expert's states and control inputs in the imitation learning problem. The two agents are subject to different adversarial disturbances. To imitate the expert, the learner must reconstruct the unknown expert cost function. The learner observes only the expert's control inputs and uses inverse Q-learning to perform this reconstruction. The algorithms are robust in that they are independent of the system model and allow the cost function parameters and disturbances to differ between the two agents. We first propose an offline inverse Q-learning algorithm that consists of two iterative learning loops: 1) an inner Q-learning iteration loop and 2) an outer iteration loop based on inverse optimal control. Building on this offline algorithm, we then develop an online inverse Q-learning algorithm with which the learner mimics the expert's behavior online using real-time observations of the expert's control inputs. This online computational method has four functional approximators: a critic approximator, two actor approximators, and a state-reward neural network (NN). It simultaneously approximates the parameters of the Q-function and the learner's state reward online. Convergence and stability are rigorously proved to guarantee the algorithm's performance.
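The abstract does not give the update equations, so the following is only a rough illustration of what an iterative evaluate-then-improve loop looks like for a continuous-time linear system. It is not the authors' algorithm: it uses Kleinman's classical model-based policy iteration for the disturbance-free LQR problem, whereas the article's inner loop is described as model-free Q-learning under adversarial disturbances. The function names, the double-integrator example, and the weights `Q` and `R` are all assumptions made for this sketch.

```python
import numpy as np


def solve_lyapunov(A_cl, M):
    """Solve A_cl^T P + P A_cl + M = 0 for P via Kronecker products.

    Uses row-major vectorization, which matches NumPy's default reshape.
    Requires A_cl to be Hurwitz so that the linear system is nonsingular.
    """
    n = A_cl.shape[0]
    I = np.eye(n)
    L = np.kron(A_cl.T, I) + np.kron(I, A_cl.T)
    P = np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off


def policy_iteration_lqr(A, B, Q, R, K0, iters=20):
    """Kleinman iteration (model-based, no disturbance); K0 must stabilize A - B K0."""
    K = K0
    for _ in range(iters):
        A_cl = A - B @ K
        # Policy evaluation: cost matrix of the current gain K.
        P = solve_lyapunov(A_cl, Q + K.T @ R @ K)
        # Policy improvement: gain that is greedy with respect to P.
        K = np.linalg.solve(R, B.T @ P)
    return P, K


# Hypothetical example: double integrator with identity state weight.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 1.0]])  # stabilizing initial gain (assumed known)

P, K = policy_iteration_lqr(A, B, Q, R, K0)
# Residual of the algebraic Riccati equation at the returned P.
residual = A.T @ P + P @ A + Q - P @ B @ np.linalg.solve(R, B.T @ P)
```

In this sketch the loop alternates between evaluating the current gain and improving it, which is the general pattern an inner learning loop follows. An inverse-optimal-control outer loop would additionally adjust the state weight `Q` so that the resulting gain matches the observed expert inputs; that step is not shown because the abstract does not specify it.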

Identifiers

DOI: 10.1109/TCYB.2021.3100749

PMID: 34403352

Publication types

Journal Article

Languages

eng

Pagination

13083-13095

Authors

Bosen Lian (B)

Wenqian Xue (W)

Frank L Lewis (FL)

Tianyou Chai (T)
