Trial-by-trial dynamics of reward prediction error-associated signals during extinction learning and renewal.
Electrophysiology
Extinction learning
Renewal
Reward prediction error
Trial-by-trial learning
Journal
Progress in neurobiology
ISSN: 1873-5118
Abbreviated title: Prog Neurobiol
Country: England
NLM ID: 0370121
Publication information
Publication date:
02 2021
History:
received: 15 04 2020
revised: 06 07 2020
accepted: 18 08 2020
pubmed: 28 8 2020
medline: 24 12 2021
entrez: 27 8 2020
Status:
ppublish
Abstract
Reward prediction errors (RPEs) have been suggested to drive associative learning processes, but their precise temporal dynamics at the single-neuron level remain elusive. Here, we studied the neural correlates of RPEs, focusing on their trial-by-trial dynamics during an operant extinction learning paradigm. Within a single behavioral session, pigeons went through acquisition, extinction and renewal - the context-dependent response recovery after extinction. We recorded single units from the avian prefrontal cortex analogue, the nidopallium caudolaterale (NCL) and found that the omission of reward during extinction led to a peak of population activity that moved backwards in time as trials progressed. The chronological order of these signal changes during the progress of learning was indicative of temporal shifts of RPE signals that started during reward omission and then moved backwards to the presentation of the conditioned stimulus. Switches from operant choices to avoidance behavior (and vice versa) coincided with changes in population activity during the animals' decision-making. On the single unit level, we found more diverse patterns where some neurons' activity correlated with RPE signals whereas others correlated with the absolute value during the outcome period. Finally, we demonstrated that mere sensory contextual changes during the renewal test were sufficient to elicit signals likely associated with RPEs. Thus, RPEs are truly expectancy-driven since they can be elicited by changes in reward expectation, without an actual change in the quality or quantity of reward.
Identifiers
pubmed: 32846162
pii: S0301-0082(20)30156-8
doi: 10.1016/j.pneurobio.2020.101901
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
101901
Copyright information
Copyright © 2020 The Author(s). Published by Elsevier Ltd. All rights reserved.