Differentiation of speech in Parkinson's disease and spinocerebellar degeneration using deep neural networks.
Deep learning
Dysarthria
Machine learning
Parkinson’s disease
Spinocerebellar degeneration
Journal
Journal of neurology
ISSN: 1432-1459
Abbreviated title: J Neurol
Country: Germany
NLM ID: 0423161
Publication information
Publication date:
21 Nov 2023
History:
received: 08 Oct 2023
accepted: 30 Oct 2023
revised: 29 Oct 2023
medline: 22 Nov 2023
pubmed: 22 Nov 2023
entrez: 22 Nov 2023
Status:
aheadofprint
Abstract
Assessing dysarthria features in patients with neurodegenerative diseases helps diagnose underlying pathologies. Although deep neural network (DNN) techniques have been widely adopted in various audio processing tasks, few studies have tested whether DNNs can help differentiate neurodegenerative diseases using patients' speech data. This study evaluated whether a DNN model using a transformer architecture could differentiate patients with Parkinson's disease (PD) from patients with spinocerebellar degeneration (SCD) using speech data. Speech data were obtained from 251 and 101 patients with PD and SCD, respectively, while they read a passage. We fine-tuned a pre-trained DNN model using log-mel spectrograms generated from speech data. The DNN model was trained to predict whether the input spectrogram was generated from patients with PD or SCD. We used fivefold cross-validation to evaluate the predictive performance using the area under the receiver operating characteristic curve (AUC) and accuracy, sensitivity, and specificity. Average ± standard deviation of the AUC, accuracy, sensitivity, and specificity of the trained model for the fivefold cross-validation were 0.93 ± 0.04, 0.87 ± 0.03, 0.83 ± 0.05, and 0.89 ± 0.05, respectively. The DNN model can differentiate speech data of patients with PD from that of patients with SCD with relatively high accuracy and AUC. The proposed method can be used as a non-invasive, easy-to-perform screening method to differentiate PD from SCD using patient speech and is expected to be applied to telemedicine.
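The evaluation protocol described in the abstract (fivefold cross-validation, with AUC, accuracy, sensitivity, and specificity reported as mean ± standard deviation across folds) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code. The function names, the choice of PD as the positive class, the 0.5 decision threshold, the stratified split, and the use of the sample standard deviation are all assumptions that the abstract does not specify.

```python
import random
import statistics


def stratified_folds(labels, k=5, seed=0):
    """Split indices into k folds, keeping class proportions roughly equal.

    Stratification is an assumption; the abstract only says "fivefold".
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(y_true, scores, threshold=0.5):
    """Accuracy, sensitivity, and specificity at a fixed score threshold."""
    pred = [int(s >= threshold) for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true-positive rate
        "specificity": tn / (tn + fp),  # true-negative rate
    }


def mean_sd(per_fold_values):
    """Mean and sample standard deviation of one metric across folds."""
    return statistics.mean(per_fold_values), statistics.stdev(per_fold_values)


# Cohort sizes from the abstract: 251 PD and 101 SCD patients.
# Coding PD as 1 (positive) and SCD as 0 is an assumption.
labels = [1] * 251 + [0] * 101
folds = stratified_folds(labels, k=5)
print([len(f) for f in folds])
```

In each round, one fold would serve as the held-out test set while the model is fine-tuned on the other four; `auc` and `confusion_metrics` would then be applied to the held-out predictions and `mean_sd` to the five per-fold values of each metric.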
Identifiers
pubmed: 37989963
doi: 10.1007/s00415-023-12091-5
pii: 10.1007/s00415-023-12091-5
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: Japan Society for the Promotion of Science
ID: JP22K20843
Copyright information
© 2023. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany.
References
Darley FL, Aronson AE, Brown JR (1969) Differential diagnostic patterns of dysarthria. J Speech Hear Res 12:246–269. https://doi.org/10.1044/jshr.1202.246
doi: 10.1044/jshr.1202.246
pubmed: 5808852
Ackermann H (2008) Cerebellar contributions to speech production and speech perception: psycholinguistic and neurobiological perspectives. Trends Neurosci 31:265–272. https://doi.org/10.1016/j.tins.2008.02.011
doi: 10.1016/j.tins.2008.02.011
pubmed: 18471906
Schmitz-Hübsch T, Eckert O, Schlegel U, Klockgether T, Skodda S (2012) Instability of syllable repetition in patients with spinocerebellar ataxia and Parkinson’s disease. Mov Disord 27:316–319. https://doi.org/10.1002/mds.24030
doi: 10.1002/mds.24030
pubmed: 22109901
Rusz J, Tykalová T, Salerno G, Bancone S, Scarpelli J, Pellecchia MT (2019) Distinctive speech signature in cerebellar and parkinsonian subtypes of multiple system atrophy. J Neurol 266:1394–1404. https://doi.org/10.1007/s00415-019-09271-7
doi: 10.1007/s00415-019-09271-7
pubmed: 30859316
Idrisoglu A, Dallora AL, Anderberg P, Berglund JS (2023) Applied machine learning techniques to diagnose voice-affecting conditions and disorders: systematic literature review. J Med Internet Res 25:e46105. https://doi.org/10.2196/46105
doi: 10.2196/46105
Ngo QC, Motin MA, Pah ND, Drotár P, Kempster P, Kumar D (2022) Computerized analysis of speech and voice for Parkinson’s disease: a systematic review. Comput Methods Programs Biomed 226:107133. https://doi.org/10.1016/j.cmpb.2022.107133
doi: 10.1016/j.cmpb.2022.107133
pubmed: 36183641
Purwins H, Li B, Virtanen T, Schlüter J, Chang S, Sainath T (2019) Deep learning for audio signal processing. IEEE J Sel Top Sig Process 13:206–219. https://doi.org/10.1109/JSTSP.2019.2908700
doi: 10.1109/JSTSP.2019.2908700
Ajit A, Acharya K, Samanta A (2020) A Review of Convolutional Neural Networks. International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE). pp 1–5. https://doi.org/10.1109/ic-ETITE47903.2020.049
Piczak KJ (2015) Environmental sound classification with convolutional neural networks. In: Proc. 25th Int. Workshop Mach. Learning Signal Process. pp 1–6. https://doi.org/10.1109/MLSP.2015.7324337
Jianfeng Z, Xia M, Lijiang C (2018) Speech emotion recognition using deep 1D & 2D CNN LSTM networks. Biomed Signal Process Control 47:312–323. https://doi.org/10.1016/j.bspc.2018.08.035
doi: 10.1016/j.bspc.2018.08.035
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. Presented at: Proceedings of the Advances in Neural Information Processing Systems pp 1–11. https://doi.org/10.48550/arXiv.1706.03762
Koutini K, Schlüter J, Eghbal-zadeh H, Widmer G (2022) Efficient training of audio transformers with patchout. Proc Interspeech. https://doi.org/10.48550/arXiv.2110.05069
doi: 10.48550/arXiv.2110.05069
Hireš M, Gazda M, Drotár P, Pah ND, Motin MA, Kumar DK (2022) Convolutional neural network ensemble for Parkinson’s disease detection from voice recordings. Comput Biol Med 141:105021. https://doi.org/10.1016/j.compbiomed.2021.105021
doi: 10.1016/j.compbiomed.2021.105021
pubmed: 34799077
Zhang X, Ma J, Li Y, Wang P, Liu Y (2021) Few-shot learning of Parkinson’s disease speech data with optimal convolution sparse kernel transfer learning. Biomed Signal Process Control 69:102850. https://doi.org/10.1016/j.bspc.2021.102850
doi: 10.1016/j.bspc.2021.102850
Hughes AJ, Daniel SE, Kilford L, Lees AJ (1992) Accuracy of clinical diagnosis of idiopathic Parkinson’s disease: a clinico-pathological study of 100 cases. J Neurol Neurosurg Psychiatry 55:181–184. https://doi.org/10.1136/jnnp.55.3.181
doi: 10.1136/jnnp.55.3.181
pubmed: 1564476
pmcid: 1014720
Gilman S, Wenning GK, Low PA, Brooks DJ, Mathias CJ, Trojanowski JQ, Wood NW, Colosimo C, Dürr A, Fowler CJ, Kaufmann H, Klockgether T, Lees A, Poewe W, Quinn N, Revesz T, Robertson D, Sandroni P, Seppi K, Vidailhet M (2008) Second consensus statement on the diagnosis of multiple system atrophy. Neurology 71:670–676. https://doi.org/10.1212/01.wnl.0000324625.00404.15
doi: 10.1212/01.wnl.0000324625.00404.15
pubmed: 18725592
pmcid: 2676993
van Swieten JC, Koudstaal PJ, Visser MC, Schouten HJ, van Gijn J (1988) Interobserver agreement for the assessment of handicap in stroke patients. Stroke 19:604–607. https://doi.org/10.1161/01.STR.19.5.604
doi: 10.1161/01.STR.19.5.604
pubmed: 3363593
Hoehn MM, Yahr MD (1967) Parkinsonism: onset, progression and mortality. Neurology 17:427–442. https://doi.org/10.1212/WNL.17.5.427
doi: 10.1212/WNL.17.5.427
pubmed: 6067254
Zhou H, Chen Z, Shi H, Wu Y, Yin S (2013) Categories of auditory performance and speech intelligibility ratings of early-implanted children without speech training. PLoS ONE 8:e53852. https://doi.org/10.1371/journal.pone.0053852
doi: 10.1371/journal.pone.0053852
pubmed: 23349752
pmcid: 3549925
Gemmeke JF, Ellis DPW, Freedman D, Jansen A, Lawrence W, Moore RC, Pakal M, Ritter M (2017) Audio set: an ontology and human-labeled dataset for audio events. In: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) pp 776–780. https://doi.org/10.1109/ICASSP.2017.7952261
Hendrycks D, Gimpel K (2016) Gaussian error linear units (Gelus) https://doi.org/10.48550/arXiv.1606.08415
Naseer A, Rani M, Naz S, Razzak MI, Imran M, Xu G (2020) Refining Parkinson’s neurological disorder identification through deep transfer learning. Neural Comput Appl 32:839–854. https://doi.org/10.1007/s00521-019-04069-0
doi: 10.1007/s00521-019-04069-0
Abou Jaoude M, Jing J, Sun H, Jacobs CS, Pellerin KR, Westover MB, Cash SS, Lam AD (2020) Detection of mesial temporal lobe epileptiform discharges on intracranial electrodes using deep learning. Clin Neurophysiol 131:133–141. https://doi.org/10.1016/j.clinph.2019.09.031
doi: 10.1016/j.clinph.2019.09.031
pubmed: 31760212
Fast L, Temuulen U, Villringer K, Kufner A, Ali HF, Siebert E, Huo S, Piper SK, Sperber PS, Liman T, Endres M, Ritter K (2023) Machine learning-based prediction of clinical outcomes after first-ever ischemic stroke. Front Neurol 14:1114360. https://doi.org/10.3389/fneur.2023.1114360
doi: 10.3389/fneur.2023.1114360
pubmed: 36895902
pmcid: 9990416
Nakayama K, Yamamoto T, Oda C, Sato M, Murakami T, Horiguchi S (2020) Effectiveness of Lee Silverman voice treatment® LOUD on Japanese-speaking patients with Parkinson’s disease. Rehabil Res Pract 24:6585264. https://doi.org/10.1155/2020/6585264
doi: 10.1155/2020/6585264
Pattanayak CW, Rubin DB, Zell ER (2011) Propensity score methods for creating covariate balance in observational studies. Rev Esp Cardiol 64:897–903. https://doi.org/10.1016/j.recesp.2011.06.008
doi: 10.1016/j.recesp.2011.06.008
pubmed: 21872981
Ascherio A, Schwarzschild MA (2016) The epidemiology of Parkinson’s disease: risk factors and prevention. Lancet Neurol 15:1257–1272. https://doi.org/10.1016/S1474-4422(16)30230-7
doi: 10.1016/S1474-4422(16)30230-7
pubmed: 27751556