Imagined speech can be decoded from low- and cross-frequency intracranial EEG features.
Journal
Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555
Publication information
Publication date: 10 01 2022
History:
received: 12 04 2021
accepted: 03 12 2021
entrez: 11 1 2022
pubmed: 12 1 2022
medline: 28 1 2022
Status:
epublish
Abstract
Reconstructing intended speech from neural activity using brain-computer interfaces holds great promise for people with severe speech production deficits. While decoding overt speech has progressed, decoding imagined speech has met limited success, mainly because the associated neural signals are weak and variable compared to overt speech, and hence difficult for learning algorithms to decode. We obtained three electrocorticography datasets from 13 patients, with electrodes implanted for epilepsy evaluation, who performed overt and imagined speech production tasks. Based on recent theories of speech neural processing, we extracted consistent and specific neural features usable for future brain-computer interfaces, and assessed their performance in discriminating speech items in articulatory, phonetic, and vocalic representation spaces. While high-frequency activity provided the best signal for overt speech, both low- and higher-frequency power and local cross-frequency dynamics contributed to imagined speech decoding, in particular in phonetic and vocalic, i.e. perceptual, spaces. These findings show that low-frequency power and cross-frequency dynamics contain key information for imagined speech decoding.
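The two feature families named in the abstract, band-limited power and local cross-frequency dynamics, can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names (`bandpass_fft`, `analytic_signal`, `band_power`, `modulation_index`), the band limits (4-8 Hz, 20-30 Hz, 70-90 Hz), the FFT-masking filter, and the synthetic test signal are all assumptions chosen for illustration. The coupling measure is a generic Kullback-Leibler-based phase-amplitude modulation index.

```python
import numpy as np


def bandpass_fft(x, fs, lo, hi):
    """Crude zero-phase band-pass: zero every FFT bin outside [lo, hi] Hz."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    spec[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spec, n=x.size)


def analytic_signal(x):
    """Analytic signal built in the frequency domain (Hilbert-transform construction)."""
    n = x.size
    spec = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spec * weights)


def band_power(x, fs, lo, hi):
    """Mean squared amplitude of the band-passed signal."""
    return float(np.mean(bandpass_fft(x, fs, lo, hi) ** 2))


def modulation_index(x, fs, phase_band, amp_band, n_bins=18):
    """Phase-amplitude coupling: normalized KL divergence between the
    phase-binned amplitude distribution and a uniform distribution."""
    phase = np.angle(analytic_signal(bandpass_fft(x, fs, *phase_band)))
    amp = np.abs(analytic_signal(bandpass_fft(x, fs, *amp_band)))
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    idx = np.clip(np.digitize(phase, edges) - 1, 0, n_bins - 1)
    mean_amp = np.array([amp[idx == k].mean() for k in range(n_bins)])
    p = np.clip(mean_amp / mean_amp.sum(), 1e-12, None)  # avoid log(0)
    return float(np.sum(p * np.log(p * n_bins)) / np.log(n_bins))


# Synthetic data (assumption): a 6 Hz rhythm plus an 80 Hz rhythm whose
# amplitude either follows the 6 Hz cycle (coupled) or stays constant.
fs = 1000.0
t = np.arange(4000) / fs
rng = np.random.default_rng(0)
noise = 0.1 * rng.standard_normal(t.size)
theta = np.sin(2 * np.pi * 6 * t)
gamma = np.sin(2 * np.pi * 80 * t)
coupled = theta + 0.5 * (1 + theta) * gamma + noise
uncoupled = theta + 0.5 * gamma + noise

theta_power = band_power(coupled, fs, 4, 8)
beta_power = band_power(coupled, fs, 20, 30)
mi_coupled = modulation_index(coupled, fs, (4, 8), (70, 90))
mi_uncoupled = modulation_index(uncoupled, fs, (4, 8), (70, 90))

print("theta power:", theta_power, "beta power:", beta_power)
print("modulation index, coupled:", mi_coupled, "uncoupled:", mi_uncoupled)
```

On real recordings, a designed filter (for example an FIR or Butterworth band-pass) would normally replace the FFT masking used here. The signal would also need to be split into trials before features are passed to a classifier.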
Identifiers
pubmed: 35013268
doi: 10.1038/s41467-021-27725-3
pii: 10.1038/s41467-021-27725-3
pmc: PMC8748882
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
48
Grants
Agency: NINDS NIH HHS
ID: R01 NS021135
Country: United States
Copyright information
© 2022. The Author(s).