Lipschitzness is all you need to tame off-policy generative adversarial imitation learning.

Deep learning; Generative adversarial networks; Imitation learning; Lipschitz-continuity; Reinforcement learning

Journal

Machine learning

ISSN: 0885-6125

Abbreviated title: Mach Learn

Country: United States

NLM ID: 9881780

Publication information

Publication date:
2022

History:

received: 01 08 2020

revised: 18 01 2022

accepted: 27 01 2022

entrez: 23 5 2022

pubmed: 24 5 2022

medline: 24 5 2022

Status: ppublish

Abstract

Despite the recent success of reinforcement learning in various domains, these approaches remain, for the most part, deterringly sensitive to hyper-parameters and are often riddled with essential engineering feats allowing their success. We consider the case of off-policy generative adversarial imitation learning, and perform an in-depth review, qualitative and quantitative, of the method. We show that forcing the learned reward function to be local Lipschitz-continuous is a

Identifiers

DOI: 10.1007/s10994-022-06144-5

PMID: 35602587

PMC: PMC9114147

pii: 6144

Publication types

Journal Article

Languages

eng

Pagination

1431-1521

Conflict of interest statement

The authors declare that they have no competing interests.

Authors

Lionel Blondé

Pablo Strasser

Alexandros Kalousis