Temporal coherence shapes cortical responses to speech mixtures in a ferret cocktail party.
Journal
Communications biology
ISSN: 2399-3642
Abbreviated title: Commun Biol
Country: England
NLM ID: 101719179
Publication information
Publication date:
25 Oct 2024
History:
received: 1 Jul 2024
accepted: 17 Oct 2024
medline: 26 Oct 2024
pubmed: 26 Oct 2024
entrez: 25 Oct 2024
Status:
epublish
Abstract
Perceptual segregation of complex sounds such as speech and music simultaneously emanating from multiple sources is a remarkable ability that is common in humans and other animals alike. Unlike animal physiological experiments with simplified sounds or human investigations with spatially broad imaging techniques, this study combines insights from animal single-unit recordings with segregation of speech-like sound mixtures. Ferrets are trained to attend to a female voice and detect a target word, both in the presence and in the absence of a concurrent, equally salient male voice. Recordings are made in primary and secondary auditory cortical fields, and in frontal cortex. During task performance, representation of the female words becomes enhanced relative to the male words in all regions, but especially in higher cortical regions. Analysis of the temporal and spectral response characteristics during task performance reveals how speech segregation gradually emerges in the auditory cortex. A computational model evaluated on the same voice mixtures replicates and extends these results to different attentional targets (attention to female or male voices). These findings underscore the role of the principle of temporal coherence, whereby attention to a target voice binds together all neural responses coherently modulated with the target, thus ultimately forming and extracting a common auditory stream.
Identifiers
pubmed: 39455846
doi: 10.1038/s42003-024-07096-3
pii: 10.1038/s42003-024-07096-3
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
1392
Grants
Agency: U.S. Department of Health & Human Services | National Institutes of Health (NIH)
ID: R01DC017118
Agency: United States Department of Defense | United States Air Force | AFMC | Air Force Office of Scientific Research (AF Office of Scientific Research)
ID: FA9550-19-1-0408
Copyright information
© 2024. The Author(s).