Mastering the game of Stratego with model-free multiagent reinforcement learning.

Humans; Artificial Intelligence; Reinforcement, Psychology; Video Games

Journal

Science (New York, N.Y.)

ISSN: 1095-9203

Abbreviated title: Science

Country: United States

NLM ID: 0404511

Publication information

Publication date:
2 December 2022

History:

entrez: 1 December 2022

pubmed: 2 December 2022

medline: 6 December 2022

Status: ppublish

Abstract

We introduce DeepNash, an autonomous agent that plays the imperfect information game Stratego at a human expert level. Stratego is one of the few iconic board games that artificial intelligence (AI) has not yet mastered. It is a game characterized by a twin challenge: It requires long-term strategic thinking as in chess, but it also requires dealing with imperfect information as in poker. The technique underpinning DeepNash uses a game-theoretic, model-free deep reinforcement learning method, without search, that learns to master Stratego through self-play from scratch. DeepNash beat existing state-of-the-art AI methods in Stratego and achieved a year-to-date (2022) and all-time top-three ranking on the Gravon games platform, competing with human expert players.

Identifiers

DOI: 10.1126/science.add4679

PMID: 36454847

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

990-996

Authors

Julien Perolat (J)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0002-8176-1666

Bart De Vylder (B)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0002-7833-4831

Daniel Hennes (D)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0002-3646-5286

Eugene Tarassov (E)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0002-7330-860X

Florian Strub (F)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0001-7271-5345

Vincent de Boer (V)

DeepMind Technologies Ltd., London, UK.

Paul Muller (P)

DeepMind Technologies Ltd., London, UK.

Jerome T Connor (JT)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0002-7141-6260

Neil Burch (N)

DeepMind Technologies Ltd., London, UK.

Thomas Anthony (T)

DeepMind Technologies Ltd., London, UK.

Stephen McAleer (S)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0003-0118-6874

Romuald Elie (R)

DeepMind Technologies Ltd., London, UK.

Sarah H Cen (SH)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0003-3723-8883

Zhe Wang (Z)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0002-0748-5376

Audrunas Gruslys (A)

DeepMind Technologies Ltd., London, UK.

Aleksandra Malysheva (A)

DeepMind Technologies Ltd., London, UK.

Mina Khan (M)

DeepMind Technologies Ltd., London, UK.

Sherjil Ozair (S)

DeepMind Technologies Ltd., London, UK.

Finbarr Timbers (F)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0001-9047-9542

Toby Pohlen (T)

DeepMind Technologies Ltd., London, UK.

Tom Eccles (T)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0001-6706-017X

Mark Rowland (M)

DeepMind Technologies Ltd., London, UK.

Marc Lanctot (M)

DeepMind Technologies Ltd., London, UK.

Jean-Baptiste Lespiau (JB)

DeepMind Technologies Ltd., London, UK.

Bilal Piot (B)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0003-3906-950X

Shayegan Omidshafiei (S)

DeepMind Technologies Ltd., London, UK.

Edward Lockhart (E)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0001-8753-0765

Laurent Sifre (L)

DeepMind Technologies Ltd., London, UK.

Nathalie Beauguerlange (N)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0002-6246-4279

Remi Munos (R)

DeepMind Technologies Ltd., London, UK.

David Silver (D)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0002-5197-2892

Satinder Singh (S)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0001-9360-7060

Demis Hassabis (D)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0003-2812-9917

Karl Tuyls (K)

DeepMind Technologies Ltd., London, UK.

ORCID: 0000-0001-7929-1944
