Selective particle attention: Rapidly and flexibly selecting features for deep reinforcement learning.

Attention; Biological Factors; Learning; Reinforcement, Psychology; Reward

Neural networks; Particle filter; Reinforcement learning; Selective attention; Visual features

Journal

Neural networks : the official journal of the International Neural Network Society

ISSN: 1879-2782

Abbreviated title: Neural Netw

Country: United States

NLM ID: 8805018

Publication information

Publication date:
Jun 2022

History:

received: 2021-06-16

revised: 2022-02-02

accepted: 2022-03-10

pubmed: 2022-04-01

medline: 2022-04-14

entrez: 2022-03-31

Status: ppublish

Abstract

Deep Reinforcement Learning (RL) is often criticised for being data inefficient and inflexible to changes in task structure. Part of the reason for these issues is that Deep RL typically learns end-to-end using backpropagation, which results in task-specific representations. One approach for circumventing these problems is to apply Deep RL to existing representations that have been learned in a more task-agnostic fashion. However, this only partially solves the problem as the Deep RL algorithm learns a function of all pre-existing representations and is therefore still susceptible to data inefficiency and a lack of flexibility. Biological agents appear to solve this problem by forming internal representations over many tasks and only selecting a subset of these features for decision-making based on the task at hand; a process commonly referred to as selective attention. We take inspiration from selective attention in biological agents and propose a novel algorithm called Selective Particle Attention (SPA), which selects subsets of existing representations for Deep RL. Crucially, these subsets are not learned through backpropagation, which is slow and prone to overfitting, but instead via a particle filter that rapidly and flexibly identifies key subsets of features using only reward feedback. We evaluate SPA on two tasks that involve raw pixel input and dynamic changes to the task structure, and show that it greatly increases the efficiency and flexibility of downstream Deep RL algorithms.
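
The idea of a particle filter that selects feature subsets from reward feedback can be illustrated with a small toy sketch. This is not the authors' SPA implementation: the abstract does not specify how particles are encoded, how reward is turned into a likelihood, or how resampling is done. Everything below is an assumption made for illustration, including the binary-mask encoding, the exponential reward likelihood, the bit-flip mutation, the hypothetical scoring function `toy_score`, and all parameter values.

```python
import numpy as np

# Toy setting (assumed): 8 pre-existing features, of which 0 and 3 matter.
N_FEATURES = 8
RELEVANT = (0, 3)


def toy_score(mask):
    """Hypothetical reward proxy: +1 per relevant feature attended,
    -0.5 per irrelevant feature attended."""
    relevant_on = sum(mask[i] for i in RELEVANT)
    irrelevant_on = mask.sum() - relevant_on
    return float(relevant_on) - 0.5 * float(irrelevant_on)


def init_particles(rng, n_particles, n_features, p_on=0.5):
    """Each particle is a random binary mask over the feature set."""
    return rng.random((n_particles, n_features)) < p_on


def reweight(particles, weights, score_fn, temperature=1.0):
    """Scale each particle's weight by exp(score / temperature), then normalise."""
    scores = np.array([score_fn(p) for p in particles], dtype=float)
    log_w = np.log(weights + 1e-12) + scores / temperature
    log_w -= log_w.max()  # subtract the max for numerical stability
    w = np.exp(log_w)
    return w / w.sum()


def resample(rng, particles, weights, flip_prob=0.02):
    """Draw particles in proportion to weight, then flip a few bits at random
    so that the population keeps exploring other subsets."""
    n = len(particles)
    idx = rng.choice(n, size=n, p=weights)
    flips = rng.random((n, particles.shape[1])) < flip_prob
    new_particles = np.logical_xor(particles[idx], flips)
    return new_particles, np.full(n, 1.0 / n)


def attention_vector(particles, weights):
    """Weighted fraction of particles that include each feature."""
    return weights @ particles.astype(float)


def run_demo(seed=0, n_particles=200, n_steps=30):
    rng = np.random.default_rng(seed)
    particles = init_particles(rng, n_particles, N_FEATURES)
    weights = np.full(n_particles, 1.0 / n_particles)
    attention = attention_vector(particles, weights)
    for _ in range(n_steps):
        weights = reweight(particles, weights, toy_score)
        attention = attention_vector(particles, weights)
        particles, weights = resample(rng, particles, weights)
    return attention


if __name__ == "__main__":
    print(np.round(run_demo(), 2))
```

In a setup like this, the resulting attention vector could be multiplied element-wise with the pre-existing feature activations before they are passed to a downstream Deep RL algorithm, so that no gradient step is needed to change which features are used. Whether SPA applies its attention in this way is not stated in the abstract.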

Identifiers

DOI: 10.1016/j.neunet.2022.03.015 PMID: 35358888 PMC: PMC9037388

pubmed: 35358888

pii: S0893-6080(22)00093-4

doi: 10.1016/j.neunet.2022.03.015

pmc: PMC9037388

Chemical substances

Biological Factors 0

Publication types

Journal Article

Languages

eng

Pagination

408-421

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Sam Blakeman

Denis Mareschal
