Distributional Reinforcement Learning in the Brain.
artificial intelligence
deep neural networks
dopamine
machine learning
population coding
reward
Journal
Trends in neurosciences
ISSN: 1878-108X
Abbreviated title: Trends Neurosci
Country: England
NLM ID: 7808616
Publication information
Publication date:
2020-12
History:
received: 2020-06-09
revised: 2020-08-14
accepted: 2020-09-08
pubmed: 2020-10-24
medline: 2021-08-19
entrez: 2020-10-23
Status:
ppublish
Abstract
Learning about rewards and punishments is critical for survival. Classical studies have demonstrated an impressive correspondence between the firing of dopamine neurons in the mammalian midbrain and the reward prediction errors of reinforcement learning algorithms, which express the difference between actual reward and predicted mean reward. However, it may be advantageous to learn not only the mean but also the complete distribution of potential rewards. Recent advances in machine learning have revealed a biologically plausible set of algorithms for reconstructing this reward distribution from experience. Here, we review the mathematical foundations of these algorithms as well as initial evidence for their neurobiological implementation. We conclude by highlighting outstanding questions regarding the circuit computation and behavioral readout of these distributional codes.
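As a rough illustration of the distinction drawn in the abstract, the sketch below contrasts a mean-tracking prediction error update with a small population of value predictors that weight positive and negative prediction errors asymmetrically. The asymmetric update rule, the function name `learn_values`, the parameters `taus` and `lr`, and the bimodal reward example are all assumptions chosen for illustration; the abstract does not specify which algorithm the review covers.

```python
import random


def learn_values(rewards, taus, lr=0.02):
    """Track one value estimate per asymmetry level tau in (0, 1).

    Hypothetical helper for illustration only. With tau = 0.5 positive
    and negative errors are weighted equally, so the estimate tracks
    the mean reward. Other values of tau bias the estimate toward the
    lower or upper part of the reward distribution.
    """
    values = [0.0 for _ in taus]
    for r in rewards:
        for i, tau in enumerate(taus):
            # Reward prediction error: actual reward minus prediction.
            delta = r - values[i]
            # Assumed asymmetric scaling: positive errors are weighted
            # by tau, negative errors by (1 - tau).
            scale = tau if delta > 0 else (1.0 - tau)
            values[i] += lr * scale * delta
    return values


# Illustrative bimodal reward: 1 or 9 with equal probability (mean 5).
rng = random.Random(0)
rewards = [rng.choice([1.0, 9.0]) for _ in range(20000)]

taus = [0.1, 0.5, 0.9]  # pessimistic, balanced, optimistic predictors
values = learn_values(rewards, taus)
print(values)
```

Under these assumptions, the balanced predictor settles near the mean reward, while the pessimistic and optimistic predictors settle below and above it, so the population as a whole carries information about the spread of rewards that a single mean estimate would miss.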
Identifiers
pubmed: 33092893
pii: S0166-2236(20)30198-3
doi: 10.1016/j.tins.2020.09.004
pmc: PMC8073212
mid: NIHMS1692131
Chemical substances
Dopamine
VTD58H1Z2X
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Review
Languages
eng
Citation subsets
IM
Pagination
980-997
Grants
Agency: NINDS NIH HHS
ID: R01 NS108740
Country: United States
Agency: NINDS NIH HHS
ID: R01 NS116753
Country: United States
Copyright information
Copyright © 2020 The Author(s). Published by Elsevier Ltd. All rights reserved.