An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks.

Artificial Intelligence; Humans; Neural Networks, Computer; Speech-Language Pathology; Voice; Voice Disorders / diagnosis

Journal

Computational and mathematical methods in medicine

ISSN: 1748-6718

Abbreviated title: Comput Math Methods Med

Country: United States

NLM ID: 101277751

Publication information

Publication date:
2022

History:

received: 2022-01-21

revised: 2022-02-17

accepted: 2022-03-07

entrez: 2022-05-09

pubmed: 2022-05-10

medline: 2022-05-11

Status: epublish

Abstract

Diseases of internal organs other than the vocal folds can also affect a person's voice. Voice problems are therefore on the rise, yet they are frequently overlooked. According to a recent study, voice pathology detection systems can support the assessment of voice abnormalities and enable the early diagnosis of voice pathology. In particular, automatic systems that distinguish healthy from diseased voices have attracted considerable attention for the early identification and diagnosis of voice problems, and artificial intelligence-assisted voice analysis opens up new possibilities in healthcare. This work aimed to assess the utility of several automatic speech signal analysis methods for diagnosing voice disorders and to propose a strategy for classifying healthy and diseased voices. The proposed framework combines three voice features: chroma, mel spectrogram, and mel frequency cepstral coefficients (MFCC). We also designed a deep neural network (DNN) that learns from the extracted features to produce a voice-based disease prediction model. The study describes a series of experiments on the Saarbruecken Voice Database (SVD) to detect abnormal voices. The model was developed and tested using the vowels /a/, /i/, and /u/ pronounced at high, low, and average pitch. We also retained the "continuous sentence" audio files from SVD to assess how well the developed model generalizes to completely new data. The highest accuracy achieved was 77.49%, higher than prior attempts in the same domain. With speaker gender information included, the model attains an accuracy of 88.01%. When trained on selected diseases, the model reaches a maximum accuracy of 96.77% (cordectomy × healthy). These results suggest that the framework is well suited to healthcare applications.
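The abstract names three features (chroma, mel spectrogram, MFCC) but does not say how they are computed or combined. The following is a minimal NumPy-only sketch of one common way to derive them from a power spectrogram and join them into a single vector that a classifier could consume. It is not the authors' implementation: the function names, the frame size, the hop length, the number of mel bands and coefficients, and the choice to average each feature over time before concatenating are all assumptions made for illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrogram(signal, n_fft=512, hop=256):
    """Frame the signal, apply a Hann window, and return |FFT|^2 per frame."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_fft // 2 + 1)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, len(freqs)))
    for m in range(n_mels):
        lo, mid, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def chroma_filterbank(sr, n_fft):
    """Assign each FFT bin to one of 12 pitch classes (0 = C, A4 = 440 Hz)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((12, len(freqs)))
    valid = freqs > 0  # skip the DC bin, whose pitch is undefined
    midi = 69 + 12 * np.log2(freqs[valid] / 440.0)
    pitch_class = np.round(midi).astype(int) % 12
    fb[pitch_class, np.nonzero(valid)[0]] = 1.0
    return fb

def dct2(x, n_out):
    """Unnormalized DCT-II along the last axis, keeping the first n_out terms."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return x @ basis.T

def extract_features(signal, sr, n_fft=512, hop=256, n_mels=40, n_mfcc=13):
    """Return one vector: mean chroma (12) + mean log-mel (n_mels) + mean MFCC (n_mfcc)."""
    spec = power_spectrogram(signal, n_fft, hop)
    log_mel = np.log(spec @ mel_filterbank(sr, n_fft, n_mels).T + 1e-10)
    mfcc = dct2(log_mel, n_mfcc)
    chroma = spec @ chroma_filterbank(sr, n_fft).T
    chroma = chroma / (chroma.sum(axis=1, keepdims=True) + 1e-10)
    # Assumption: summarize each feature by its time average, then concatenate.
    return np.concatenate([chroma.mean(axis=0), log_mel.mean(axis=0), mfcc.mean(axis=0)])

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 220.0 * t)  # synthetic stand-in for a sustained vowel
    print(extract_features(tone, sr).shape)
```

With the default parameters the vector has 12 + 40 + 13 entries. In practice an audio library would normally supply these transforms, and per-frame features could be kept instead of time averages if the downstream network is meant to model temporal structure.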

Identifiers

DOI: 10.1155/2022/7814952 PMID: 35529259 PMC: PMC9071878

Publication types

Journal Article

Languages

eng

Pagination

7814952

Conflict of interest statement

The authors declare that there is no conflict of interest regarding the publication of this paper.

Authors

Mohammed Zakariah (M)

Reshma B (R)

Yousef Ajmi Alotaibi (Y)

Yanhui Guo (Y)

Kiet Tran-Trung (K)

Mohammad Mamun Elahi (MM)
