Trial-by-trial dynamics of reward prediction error-associated signals during extinction learning and renewal.

Animals; Columbidae; Conditioning, Operant; Learning; Prefrontal Cortex; Reward

Electrophysiology; Extinction learning; Renewal; Reward prediction error; Trial-by-trial learning

Journal

Progress in neurobiology

ISSN: 1873-5118

Abbreviated title: Prog Neurobiol

Country: England

NLM ID: 0370121

Publication information

Publication date:
02 2021

History:

received: 15 04 2020

revised: 06 07 2020

accepted: 18 08 2020

pubmed: 28 08 2020

medline: 24 12 2021

entrez: 27 08 2020

Status: ppublish

Abstract

Reward prediction errors (RPEs) have been suggested to drive associative learning processes, but their precise temporal dynamics at the single-neuron level remain elusive. Here, we studied the neural correlates of RPEs, focusing on their trial-by-trial dynamics during an operant extinction learning paradigm. Within a single behavioral session, pigeons went through acquisition, extinction and renewal, the context-dependent recovery of responding after extinction. We recorded single units from the avian prefrontal cortex analogue, the nidopallium caudolaterale (NCL), and found that the omission of reward during extinction led to a peak of population activity that moved backwards in time as trials progressed. The chronological order of these signal changes over the course of learning was indicative of temporal shifts of RPE signals that started during reward omission and then moved backwards to the presentation of the conditioned stimulus. Switches from operant choices to avoidance behavior (and vice versa) coincided with changes in population activity during the animals' decision-making. At the single-unit level, we found more diverse patterns: the activity of some neurons correlated with RPE signals, whereas that of others correlated with the absolute value during the outcome period. Finally, we demonstrated that mere sensory contextual changes during the renewal test were sufficient to elicit signals likely associated with RPEs. Thus, RPEs are truly expectancy-driven, since they can be elicited by changes in reward expectation without an actual change in the quality or quantity of reward.
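The backward shift of the omission-related signal described in the abstract is qualitatively similar to what textbook temporal-difference (TD) learning models produce. As an illustration only, the sketch below runs a TD(0) learner with one value per time step of a trial, first with reward (acquisition) and then without (extinction). It is not the authors' analysis or model: the trial length `T`, learning rate `alpha`, discount `gamma`, trial counts and the helper name `run_phase` are all assumptions chosen for the example.

```python
import numpy as np

def run_phase(V, n_trials, reward, alpha=0.2, gamma=1.0):
    """Run TD(0) for n_trials, updating V in place.

    V[t] is the learned value of time step t within a trial
    (step 0 = first step after cue onset, last step = outcome).
    Returns the prediction errors, shape (n_trials, len(V)).
    All parameter values are illustrative assumptions.
    """
    T = len(V)
    deltas = np.zeros((n_trials, T))
    for k in range(n_trials):
        for t in range(T):
            r = reward if t == T - 1 else 0.0        # outcome only at the last step
            v_next = V[t + 1] if t < T - 1 else 0.0  # trial ends after the outcome
            deltas[k, t] = r + gamma * v_next - V[t] # prediction error at step t
            V[t] += alpha * deltas[k, t]             # TD(0) value update
    return deltas

T = 5                                          # hypothetical number of time steps
V = np.zeros(T)
acq = run_phase(V, n_trials=200, reward=1.0)   # acquisition: reward delivered
ext = run_phase(V, n_trials=30, reward=0.0)    # extinction: reward omitted

# Time step carrying the most negative prediction error on each extinction trial
peak_step = np.argmin(ext, axis=1)
print(peak_step)
```

In this toy model the most negative error sits at the outcome step on the first extinction trial and at earlier steps on later trials, which mirrors the direction of the shift reported for the population signal.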

Identifiers

DOI: 10.1016/j.pneurobio.2020.101901 PMID: 32846162

pubmed: 32846162

pii: S0301-0082(20)30156-8

doi: 10.1016/j.pneurobio.2020.101901

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Pagination

101901

Authors

Julian Packheiser (J)

José R Donoso (JR)

Sen Cheng (S)

Onur Güntürkün (O)

Roland Pusch (R)
