Segmentation window of speech information processing in the human auditory cortex.
Auditory evoked magnetic fields
Continuous speech
Speech perception
Superior temporal area
Temporal segmentation window
Journal
Scientific Reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date:
24 Oct 2024
History:
received: 05 May 2024
accepted: 10 Oct 2024
medline: 25 Oct 2024
pubmed: 25 Oct 2024
entrez: 25 Oct 2024
Status:
epublish
Abstract
Humans perceive continuous speech signals as discrete sequences. To clarify the temporal segmentation window of speech information processing in the human auditory cortex, the relationship between speech perception and cortical responses was investigated using auditory evoked magnetic fields (AEFs). AEFs were measured while participants heard the synthetic Japanese word /atataka/, presented at eight different speech rates. The durations of the words ranged from 75 to 600 ms. The results revealed a clear correlation between the AEFs and syllables. Specifically, when the word duration was between 375 and 600 ms, the superior temporal area produced four clear M100 responses that corresponded not only to the onset of speech but also to each consonant/vowel syllable unit. The number of evoked M100 responses correlated with the duration of the stimulus as well as with the number of perceived syllables. The limit of the temporal segmentation window for speech perception was estimated to lie between approximately 75 and 94 ms. This finding may contribute to optimizing the temporal performance of high-speed synthesized speech generation systems.
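One way to read the 75 to 94 ms range is as a duration per syllable unit rather than per word. The sketch below is an illustration, not the authors' analysis. It assumes that /atataka/ is divided into four units (a, ta, ta, ka), matching the four M100 responses, and that the eight word durations are evenly spaced between 75 and 600 ms; the abstract states only the endpoints and the number of stimulus types. The names `N_UNITS`, `unit_duration_ms`, and `word_durations_ms` are hypothetical.

```python
# Assumption: /atataka/ is treated as four units (a, ta, ta, ka),
# matching the four M100 responses described in the abstract.
N_UNITS = 4


def unit_duration_ms(word_ms: float, n_units: int = N_UNITS) -> float:
    """Mean duration of one syllable unit for a word of the given length."""
    return word_ms / n_units


# Assumption: eight durations evenly spaced from 75 to 600 ms.
# The abstract gives only the range and the count, not the exact values.
word_durations_ms = [75 * k for k in range(1, 9)]

per_unit_ms = {w: unit_duration_ms(w) for w in word_durations_ms}

for word_ms, unit_ms in per_unit_ms.items():
    print(f"{word_ms:>3} ms word -> {unit_ms:.2f} ms per unit")
```

Under these assumptions, the 375 ms word gives 93.75 ms per unit and the 300 ms word gives 75 ms per unit, which would be consistent with the reported limit lying between 75 and 94 ms.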
Identifiers
pubmed: 39448758
doi: 10.1038/s41598-024-76137-y
pii: 10.1038/s41598-024-76137-y
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
25044
Grants
Agency: JSPS KAKENHI Grant
ID: 18K11379, 24K15685
Agency: Cooperative Study Program of the National Institute for Physiological Sciences
ID: 24NIPS136, 23NIPS147, 22NIPS152, 21-521
Copyright information
© 2024. The Author(s).