Segmentation window of speech information processing in the human auditory cortex.


Journal

Scientific Reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
24 Oct 2024
History:
received: 05 05 2024
accepted: 10 10 2024
medline: 25 10 2024
pubmed: 25 10 2024
entrez: 25 10 2024
Status: epublish

Abstract

Humans perceive continuous speech signals as discrete sequences. To clarify the temporal segmentation window of speech information processing in the human auditory cortex, the relationship between speech perception and cortical responses was investigated using auditory evoked magnetic fields (AEFs). AEFs were measured while participants listened to the synthetic Japanese word /atataka/, presented at eight speech rates with word durations ranging from 75 to 600 ms. The results revealed a clear correlation between the AEFs and syllables. Specifically, when the word duration was between 375 and 600 ms, the AEFs exhibited four clear M100 responses from the superior temporal area, which corresponded not only to the onset of speech but also to each group of consonant/vowel syllable units. The number of evoked M100 responses correlated with the duration of the stimulus as well as with the number of perceived syllables. The limit of the temporal segmentation window of speech perception was estimated to lie approximately between 75 and 94 ms. This finding may contribute to optimizing the temporal performance of high-speed synthesized speech generation systems.
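
One possible reading of the 75 to 94 ms range is simple arithmetic on the stimulus durations. The sketch below is illustrative only and is not the authors' stated derivation. It assumes that /atataka/ is divided into four equal-length units (matching the four M100 responses reported) and that the eight word durations are evenly spaced between 75 and 600 ms. The abstract states only the range and the number of stimulus types, so both the spacing and the equal division are assumptions, and the names `N_UNITS`, `word_durations_ms`, and `unit_duration_ms` are hypothetical.

```python
# Assumption: /atataka/ is treated as four equal-length units (a, ta, ta, ka).
N_UNITS = 4

# Assumption: eight word durations evenly spaced between 75 and 600 ms.
# The abstract gives only the range (75-600 ms) and the number of types (8).
word_durations_ms = [75 * k for k in range(1, 9)]


def unit_duration_ms(word_ms: float, n_units: int = N_UNITS) -> float:
    """Mean duration of one unit when the word is divided evenly."""
    return word_ms / n_units


# Map each assumed word duration to its per-unit duration.
unit_durations = {w: unit_duration_ms(w) for w in word_durations_ms}

for word_ms, unit_ms in unit_durations.items():
    # The abstract reports four clear M100 responses for 375-600 ms words.
    four_responses_reported = 375 <= word_ms <= 600
    print(
        f"{word_ms:4d} ms word -> {unit_ms:6.2f} ms per unit, "
        f"four M100 responses reported: {four_responses_reported}"
    )
```

Under these assumptions, a 300 ms word gives 75 ms per unit and a 375 ms word gives 93.75 ms per unit, which would bracket the reported range.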

Identifiers

pubmed: 39448758
doi: 10.1038/s41598-024-76137-y
pii: 10.1038/s41598-024-76137-y

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

25044

Grants

Agency: JSPS KAKENHI Grant
ID: 18K11379, 24K15685
Agency: Cooperative Study Program of the National Institute for Physiological Sciences
ID: 24NIPS136, 23NIPS147, 22NIPS152, 21-521

Copyright information

© 2024. The Author(s).

Authors

Minoru Hayashi (M)

Department of Interdisciplinary Science and Engineering, School of Science and Engineering, Meisei University, Tokyo, 191-8506, Japan. minoru.hayashi@meisei-u.ac.jp.

Tetsuo Kida (T)

Department of Functioning and Disability, Institute for Developmental Research, Aichi Developmental Disability Center, Kasugai, Japan.
Section of Brain Function Information, National Institute for Physiological Sciences, Okazaki, Japan.

Koji Inui (K)

Department of Functioning and Disability, Institute for Developmental Research, Aichi Developmental Disability Center, Kasugai, Japan.
Section of Brain Function Information, National Institute for Physiological Sciences, Okazaki, Japan.
