An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks.


Journal

Computational and mathematical methods in medicine
ISSN: 1748-6718
Abbreviated title: Comput Math Methods Med
Country: United States
NLM ID: 101277751

Publication information

Publication date:
2022
History:
received: 21 01 2022
revised: 17 02 2022
accepted: 07 03 2022
entrez: 9 5 2022
pubmed: 10 5 2022
medline: 11 5 2022
Status: epublish

Abstract

Diseases of internal organs other than the vocal folds can also affect a person's voice. As a result, voice problems are on the rise, even though they are frequently overlooked. According to a recent study, voice pathology detection systems can support the assessment of voice abnormalities and enable the early diagnosis of voice pathology. In particular, automatic systems that distinguish healthy from diseased voices have received considerable attention for the early identification and diagnosis of voice problems, and artificial intelligence-assisted voice analysis opens up new possibilities in healthcare. This work aimed to assess the utility of several automatic speech signal analysis methods for diagnosing voice disorders and to propose a strategy for classifying healthy and diseased voices. The proposed framework combines three voice features: chroma, mel spectrogram, and mel frequency cepstral coefficients (MFCC). We also designed a deep neural network (DNN) that learns from the extracted features to produce a highly accurate voice-based disease prediction model. The study describes a series of experiments using the Saarbruecken Voice Database (SVD) to detect abnormal voices. The model was developed and tested using the vowels /a/, /i/, and /u/ pronounced at high, low, and average pitches. We also retained the "continuous sentence" audio files collected from SVD to assess how well the developed model generalizes to completely new data. The highest accuracy achieved was 77.49%, superior to prior attempts in the same domain. Additionally, the model attains an accuracy of 88.01% when speaker gender information is integrated. When trained on selected diseases, the model reaches a maximum accuracy of 96.77% (cordectomy versus healthy). These results suggest that the proposed framework is well suited to healthcare applications.
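To make the MFCC feature named in the abstract concrete, the following is a minimal, dependency-free sketch of how cepstral coefficients can be computed for a single audio frame: windowing, power spectrum, triangular mel filterbank, logarithm, and a DCT-II. It is not the authors' pipeline, which the abstract does not specify. The function names, the 8 kHz sample rate, the 256-sample frame, the 20 filters, and the 13 coefficients are all illustrative assumptions, and the input is a synthetic 440 Hz tone rather than an SVD recording.

```python
import math

def hz_to_mel(f_hz):
    """Convert Hz to mels (common 2595*log10 formula)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Hamming-window the frame, then take a naive one-sided DFT power spectrum."""
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(win))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mel_edges_hz(n_filters, sample_rate):
    """Band edges (Hz) equally spaced on the mel scale from 0 to Nyquist."""
    max_mel = hz_to_mel(sample_rate / 2)
    return [mel_to_hz(max_mel * i / (n_filters + 1)) for i in range(n_filters + 2)]

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters; filter p spans edges[p]..edges[p + 2]."""
    edges = mel_edges_hz(n_filters, sample_rate)
    bin_hz = sample_rate / n_fft
    bank = []
    for p in range(n_filters):
        lo, mid, hi = edges[p], edges[p + 1], edges[p + 2]
        row = []
        for k in range(n_fft // 2 + 1):
            f = k * bin_hz
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    """Return (MFCCs, log mel energies) for one frame. Parameters are assumptions."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
    # Unnormalized DCT-II of the log mel energies gives the cepstral coefficients.
    coeffs = [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                  for j, e in enumerate(log_e))
              for c in range(n_coeffs)]
    return coeffs, log_e

# Synthetic stand-in for a sustained vowel: a 440 Hz tone (assumption, not SVD data).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(256)]
coeffs, log_energies = mfcc_frame(frame, sample_rate)
peak = log_energies.index(max(log_energies))
edges = mel_edges_hz(20, sample_rate)
```

In a classifier such as the DNN described above, per-frame vectors like `coeffs` would typically be summarized over a recording (for example, by averaging) before being fed to the network; how the authors aggregate features is not stated in the abstract.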

Identifiers

pubmed: 35529259
doi: 10.1155/2022/7814952
pmc: PMC9071878

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

7814952

Copyright information

Copyright © 2022 Mohammed Zakariah et al.

Conflict of interest statement

The authors declare that there is no conflict of interest regarding the publication of this paper.

Authors

Mohammed Zakariah (M)

Department of Computer Science, College of Computer and Information Sciences, King Saud University, P.O. Box 57168, Riyadh 21574, Saudi Arabia.

Reshma B (R)

Division of Electronics Engineering, School of Engineering, Cochin University of Science and Technology, India.

Yousef Ajmi Alotaibi (Y)

Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, P.O. Box 57168, Riyadh 21574, Saudi Arabia.

Yanhui Guo (Y)

University of Illinois Springfield, USA.

Kiet Tran-Trung (K)

Faculty of Computer Science, Ho Chi Minh City Open University, 97 Vo Van Tan, Ward Vo Thi Sau, District 3, Ho Chi Minh City 70000, Vietnam.

Mohammad Mamun Elahi (MM)

Department of Computer Science and Engineering, United International University, Dhaka, Bangladesh.
