FMRI speech tracking in primary and non-primary auditory cortex while listening to noisy scenes.
Journal
Communications biology
ISSN: 2399-3642
Abbreviated title: Commun Biol
Country: England
NLM ID: 101719179
Publication information
Publication date:
30 Sep 2024
History:
received: 24 May 2023
accepted: 17 Sep 2024
medline: 1 Oct 2024
pubmed: 1 Oct 2024
entrez: 30 Sep 2024
Status:
epublish
Abstract
Invasive and non-invasive electrophysiological measurements during "cocktail-party"-like listening indicate that neural activity in the human auditory cortex (AC) "tracks" the envelope of relevant speech. However, due to limited coverage and/or spatial resolution, the distinct contribution of primary and non-primary areas remains unclear. Here, using 7-Tesla fMRI, we measured brain responses of participants attending to one speaker, in the presence and absence of another speaker. Through voxel-wise modeling, we observed envelope tracking in bilateral Heschl's gyrus (HG), right middle superior temporal sulcus (mSTS) and left temporo-parietal junction (TPJ), despite the signal's sluggish nature and slow temporal sampling. Neurovascular activity correlated positively (HG) or negatively (mSTS, TPJ) with the envelope. Further analyses comparing the similarity between spatial response patterns in the single speaker and concurrent speakers conditions and envelope decoding indicated that tracking in HG reflected both relevant and (to a lesser extent) non-relevant speech, while mSTS represented the relevant speech signal. Additionally, in mSTS, the similarity strength correlated with the comprehension of relevant speech. These results indicate that the fMRI signal tracks cortical responses and attention effects related to continuous speech and support the notion that primary and non-primary AC process ongoing speech in a push-pull of acoustic and linguistic information.
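The voxel-wise envelope tracking described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a simple rectify-and-smooth envelope, a generic double-gamma haemodynamic response function (HRF), one volume per repetition time (TR), and a plain Pearson correlation per voxel. All function names, parameter values, and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
import numpy as np


def speech_envelope(audio, fs, win_s=0.05):
    """Crude amplitude envelope: rectify, then smooth with a moving average."""
    win = max(1, int(round(win_s * fs)))
    kernel = np.ones(win) / win
    return np.convolve(np.abs(audio), kernel, mode="same")


def double_gamma_hrf(t, a_peak=6, a_under=16, ratio=1.0 / 6.0):
    """Generic double-gamma HRF (assumed shape, not taken from the paper)."""
    peak = t ** (a_peak - 1) * np.exp(-t) / math.gamma(a_peak)
    under = t ** (a_under - 1) * np.exp(-t) / math.gamma(a_under)
    hrf = peak - ratio * under
    return hrf / hrf.sum()


def envelope_regressor(env, fs, tr, n_vols, hrf_len_s=32.0):
    """Convolve the envelope with the HRF and sample it once per volume."""
    t = np.arange(0.0, hrf_len_s, 1.0 / fs)
    predicted = np.convolve(env, double_gamma_hrf(t))[: len(env)]
    idx = (np.arange(n_vols) * tr * fs).astype(int)
    return predicted[idx]


def voxelwise_correlation(bold, regressor):
    """Pearson r between the regressor and each column of bold (n_vols, n_voxels)."""
    b = bold - bold.mean(axis=0)
    x = regressor - regressor.mean()
    return (b.T @ x) / (np.sqrt((b ** 2).sum(axis=0)) * np.sqrt((x ** 2).sum()))


# Synthetic demonstration (all values are illustrative assumptions).
rng = np.random.default_rng(0)
fs, tr, n_vols = 100, 1.0, 300            # audio rate (Hz), TR (s), volumes
t_audio = np.arange(int(n_vols * tr * fs)) / fs
modulator = 1.0 + 0.9 * np.sin(2 * np.pi * 0.03 * t_audio)
audio = rng.standard_normal(t_audio.size) * modulator

reg = envelope_regressor(speech_envelope(audio, fs), fs, tr, n_vols)
z = (reg - reg.mean()) / reg.std()

# Three toy voxels: positive tracking, negative tracking, no tracking.
bold = np.column_stack([
    z + 0.5 * rng.standard_normal(n_vols),
    -z + 0.5 * rng.standard_normal(n_vols),
    rng.standard_normal(n_vols),
])
r = voxelwise_correlation(bold, reg)
print(np.round(r, 2))
```

In this toy setup the first voxel correlates positively with the envelope regressor and the second negatively, mirroring the sign difference the abstract reports between HG and mSTS/TPJ. A real analysis would additionally need cross-validation and correction for multiple comparisons.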
Identifiers
pubmed: 39349723
doi: 10.1038/s42003-024-06913-z
pii: 10.1038/s42003-024-06913-z
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
1217
Grants
Agency: Nederlandse Organisatie voor Wetenschappelijk Onderzoek (Netherlands Organisation for Scientific Research)
ID: 451-17-033
Agency: Nederlandse Organisatie voor Wetenschappelijk Onderzoek (Netherlands Organisation for Scientific Research)
ID: 406.20.GO.030
Copyright information
© 2024. The Author(s).