Lipschitzness is all you need to tame off-policy generative adversarial imitation learning.
Deep learning
Generative adversarial networks
Imitation learning
Lipschitz-continuity
Reinforcement learning
Journal
Machine learning
ISSN: 0885-6125
Abbreviated title: Mach Learn
Country: United States
NLM ID: 9881780
Publication information
Publication date:
2022
History:
received: 1 August 2020
revised: 18 January 2022
accepted: 27 January 2022
entrez: 23 May 2022
pubmed: 24 May 2022
medline: 24 May 2022
Status:
ppublish
Abstract
Despite the recent success of reinforcement learning in various domains, these approaches remain, for the most part, deterringly sensitive to hyper-parameters and are often riddled with essential engineering feats allowing their success. We consider the case of off-policy generative adversarial imitation learning, and perform an in-depth review, qualitative and quantitative, of the method. We show that forcing the learned reward function to be local Lipschitz-continuous is a
Identifiers
pubmed: 35602587
doi: 10.1007/s10994-022-06144-5
pii: 6144
pmc: PMC9114147
Publication types
Journal Article
Languages
eng
Pagination
1431-1521
Copyright information
© The Author(s) 2022.
Conflict of interest statement
Conflict of interest: The authors declare that they have no competing interests.
References
Nature. 2016 Jan 28;529(7587):484-9
pubmed: 26819042
Ann Oper Res. 2013 Sep 1;208(1):383-416
pubmed: 24049244
Nature. 2015 Feb 26;518(7540):529-33
pubmed: 25719670
IEEE Trans Neural Netw. 1994;5(3):363-71
pubmed: 18267804
Neural Comput. 1997 Jan 1;9(1):1-42
pubmed: 9117894
Nature. 2019 Nov;575(7782):350-354
pubmed: 31666705