Differentiation of speech in Parkinson's disease and spinocerebellar degeneration using deep neural networks.

Keywords: Deep learning; Dysarthria; Machine learning; Parkinson’s disease; Spinocerebellar degeneration

Journal

Journal of neurology

ISSN: 1432-1459

Abbreviated title: J Neurol

Country: Germany

NLM ID: 0423161

Publication information

Publication date:
21 Nov 2023

History:

received: 8 Oct 2023

revised: 29 Oct 2023

accepted: 30 Oct 2023

medline: 22 Nov 2023

pubmed: 22 Nov 2023

entrez: 22 Nov 2023

Status: ahead of print

Abstract

Assessing dysarthria features in patients with neurodegenerative diseases helps diagnose the underlying pathology. Although deep neural network (DNN) techniques have been widely adopted for audio processing tasks, few studies have tested whether DNNs can help differentiate neurodegenerative diseases from patients' speech. This study evaluated whether a DNN model with a transformer architecture could differentiate patients with Parkinson's disease (PD) from patients with spinocerebellar degeneration (SCD) using speech data.

Speech data were recorded from 251 patients with PD and 101 patients with SCD while they read a passage. We fine-tuned a pre-trained DNN model on log-mel spectrograms generated from the speech data, training it to predict whether an input spectrogram came from a patient with PD or SCD. Predictive performance was evaluated with fivefold cross-validation using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity.

Across the five folds, the mean ± standard deviation of the AUC, accuracy, sensitivity, and specificity of the trained model were 0.93 ± 0.04, 0.87 ± 0.03, 0.83 ± 0.05, and 0.89 ± 0.05, respectively.

The DNN model differentiated the speech of patients with PD from that of patients with SCD with relatively high accuracy and AUC. The proposed method could serve as a non-invasive, easy-to-perform screening method for differentiating PD from SCD using patient speech, and it is expected to be applicable to telemedicine.
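The evaluation scheme described in the abstract (fivefold cross-validation, with AUC, accuracy, sensitivity, and specificity summarized as mean ± standard deviation) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the coding of PD as 1 and SCD as 0, the 0.5 decision threshold, the stratified split, and the use of the sample standard deviation are all assumptions made for illustration, and the scores are random placeholders standing in for model outputs.

```python
import random
import statistics


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(labels, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed threshold (assumed 0.5)."""
    preds = [int(s >= threshold) for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    tn = sum(p == 0 and y == 0 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def stratified_folds(labels, k=5, seed=0):
    """Assign every sample index to one of k folds, keeping class ratios similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def summarize(per_fold):
    """Mean and sample standard deviation of each metric across folds."""
    summary = {}
    for name in per_fold[0]:
        values = [fold[name] for fold in per_fold]
        summary[name] = (statistics.mean(values), statistics.stdev(values))
    return summary


# Hypothetical data: 251 PD (coded 1) and 101 SCD (coded 0), as in the abstract.
rng = random.Random(0)
labels = [1] * 251 + [0] * 101
# Placeholder scores only. In a real run, each fold's scores would come from
# a model trained on the other four folds.
scores = [rng.gauss(0.7 if y == 1 else 0.3, 0.2) for y in labels]

per_fold = []
for fold in stratified_folds(labels, k=5):
    y = [labels[i] for i in fold]
    s = [scores[i] for i in fold]
    per_fold.append({"auc": roc_auc(y, s), **confusion_metrics(y, s)})

for name, (mean, sd) in summarize(per_fold).items():
    print(f"{name}: {mean:.2f} ± {sd:.2f}")
```

The abstract states that the model classifies individual spectrograms. If one patient contributes several spectrograms, the split would need to be made at the patient level so that no patient appears in both training and test folds. The abstract does not say how this was handled, so the sketch treats each sample as independent.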

Identifiers

DOI: 10.1007/s00415-023-12091-5

PMID: 37989963

PII: 10.1007/s00415-023-12091-5

Publication types

Journal Article

Languages

eng

Grants

Agency: Japan Society for the Promotion of Science

ID: JP22K20843

Authors

Katsuki Eguchi

Hiroaki Yaguchi

Ikue Kudo

Ibuki Kimura

Tomoko Nabekura

Ryuto Kumagai

Kenichi Fujita

Yuichi Nakashiro

Yuki Iida

Shinsuke Hamada

Sanae Honma

Asako Takei

Fumio Moriwaka

Ichiro Yabe
