Selective particle attention: Rapidly and flexibly selecting features for deep reinforcement learning.

Keywords: Neural networks; Particle filter; Reinforcement learning; Selective attention; Visual features

Journal

Neural networks : the official journal of the International Neural Network Society
ISSN: 1879-2782
Abbreviated title: Neural Netw
Country: United States
NLM ID: 8805018

Publication information

Publication date:
Jun 2022
History:
received: 16 06 2021
revised: 02 02 2022
accepted: 10 03 2022
pubmed: 1 4 2022
medline: 14 4 2022
entrez: 31 3 2022
Status: ppublish

Abstract

Deep Reinforcement Learning (RL) is often criticised for being data inefficient and inflexible to changes in task structure. Part of the reason for these issues is that Deep RL typically learns end-to-end using backpropagation, which results in task-specific representations. One approach for circumventing these problems is to apply Deep RL to existing representations that have been learned in a more task-agnostic fashion. However, this only partially solves the problem as the Deep RL algorithm learns a function of all pre-existing representations and is therefore still susceptible to data inefficiency and a lack of flexibility. Biological agents appear to solve this problem by forming internal representations over many tasks and only selecting a subset of these features for decision-making based on the task at hand; a process commonly referred to as selective attention. We take inspiration from selective attention in biological agents and propose a novel algorithm called Selective Particle Attention (SPA), which selects subsets of existing representations for Deep RL. Crucially, these subsets are not learned through backpropagation, which is slow and prone to overfitting, but instead via a particle filter that rapidly and flexibly identifies key subsets of features using only reward feedback. We evaluate SPA on two tasks that involve raw pixel input and dynamic changes to the task structure, and show that it greatly increases the efficiency and flexibility of downstream Deep RL algorithms.
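
The selection idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how particles are represented, weighted, or resampled. Here each particle is assumed to be a binary mask over features, weights are assumed to be a softmax of reward, and `toy_reward`, `flip_prob`, and `temperature` are hypothetical names introduced only for illustration.

```python
import math
import random


def resample(particles, weights, rng):
    """Draw a new particle set with probability proportional to weight."""
    chosen = rng.choices(particles, weights=weights, k=len(particles))
    return [list(p) for p in chosen]


def mutate(mask, flip_prob, rng):
    """Flip each bit independently so the particle set stays diverse."""
    return [bit ^ 1 if rng.random() < flip_prob else bit for bit in mask]


def toy_reward(mask, useful):
    """Hypothetical reward: +1 per useful feature attended, -0.5 per distractor."""
    return sum((1.0 if i in useful else -0.5)
               for i, bit in enumerate(mask) if bit)


def select_features(n_features, useful, n_particles=50, n_steps=30,
                    flip_prob=0.05, temperature=1.0, seed=0):
    """Return, per feature, the fraction of particles that attend to it."""
    rng = random.Random(seed)
    # Each particle is a candidate subset of features (assumed representation).
    particles = [[rng.randint(0, 1) for _ in range(n_features)]
                 for _ in range(n_particles)]
    for _ in range(n_steps):
        rewards = [toy_reward(p, useful) for p in particles]
        best = max(rewards)
        # Softmax-style weights from reward only; no gradients are involved.
        weights = [math.exp((r - best) / temperature) for r in rewards]
        particles = resample(particles, weights, rng)
        particles = [mutate(p, flip_prob, rng) for p in particles]
    return [sum(p[i] for p in particles) / n_particles
            for i in range(n_features)]


if __name__ == "__main__":
    attention = select_features(n_features=8, useful={0, 1, 2})
    print([round(a, 2) for a in attention])
```

In this sketch the returned values act as an attention vector over features that a downstream learner could use to mask its input. Changing the `useful` set between runs mimics a change in task structure, and the particle set adapts using reward alone.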

Identifiers

pubmed: 35358888
pii: S0893-6080(22)00093-4
doi: 10.1016/j.neunet.2022.03.015
pmc: PMC9037388

Chemical substances

Biological Factors 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

408-421

Copyright information

Copyright © 2022 The Authors. Published by Elsevier Ltd. All rights reserved.

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Sam Blakeman (S)

Sony AI, Wiesenstrasse 5, 8952, Schlieren, Switzerland; Centre for Brain and Cognitive Development, Department of Psychological Sciences, Birkbeck, University of London, Malet Street, WC1E 7HX, United Kingdom. Electronic address: samrobertallan.blakeman@sony.com.

Denis Mareschal (D)

Centre for Brain and Cognitive Development, Department of Psychological Sciences, Birkbeck, University of London, Malet Street, WC1E 7HX, United Kingdom. Electronic address: d.mareschal@bbk.ac.uk.
