Mixed Script Identification Using Automated DNN Hyperparameter Optimization.


Journal

Computational intelligence and neuroscience
ISSN: 1687-5273
Abbreviated title: Comput Intell Neurosci
Country: United States
NLM ID: 101279357

Publication information

Publication date:
2021
History:
received: 03 10 2021
revised: 30 10 2021
accepted: 05 11 2021
entrez: 20 12 2021
pubmed: 21 12 2021
medline: 22 12 2021
Status: epublish

Abstract

Mixed script identification is a hindrance for automated natural language processing (NLP) systems. Mixing cursive scripts of different languages is a challenge because NLP methods such as part-of-speech (POS) tagging and word sense disambiguation suffer from noisy text. This study tackles the challenge of mixed script identification for a mixed-code dataset consisting of Roman Urdu, Hindi, Saraiki, Bengali, and English. The language identification model is trained using word vectorization and RNN variants. Through experimental investigation, different architectures are optimized for the task: Long Short-Term Memory (LSTM), Bidirectional LSTM, Gated Recurrent Unit (GRU), and Bidirectional Gated Recurrent Unit (Bi-GRU). The highest accuracy, 90.17, was achieved with Bi-GRU, using learned word class features together with GloVe embeddings. The study also addresses issues related to multilingual environments, such as Roman words merged with English characters, generative spellings, and phonetic typing.
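
As an illustration of the kind of model the abstract describes, the following is a minimal sketch of a bidirectional GRU that assigns a language label distribution to each token. It is not the authors' implementation: the weights are random and untrained, the layer sizes are arbitrary, the `embed` lookup is a hypothetical stand-in for pretrained GloVe vectors, and the sample sentence is invented for illustration. Only the five language labels come from the abstract.

```python
import math
import random

# Labels taken from the abstract; everything else below is an assumption.
LANGS = ["Roman Urdu", "Hindi", "Saraiki", "Bengali", "English"]
EMB, HID = 8, 6  # toy embedding and hidden sizes, chosen arbitrarily

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(v):
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]


class GRUCell:
    """One GRU cell (biases omitted for brevity)."""

    def __init__(self, n_in, n_hid):
        # One input and one recurrent matrix per gate:
        # z = update gate, r = reset gate, n = candidate state.
        self.W = {g: rand_matrix(n_hid, n_in) for g in "zrn"}
        self.U = {g: rand_matrix(n_hid, n_hid) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # Blend the previous state and the candidate state via the update gate.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def run(cell, xs):
    """Run a cell over a sequence and return the hidden state at every step."""
    h, out = [0.0] * HID, []
    for x in xs:
        h = cell.step(x, h)
        out.append(h)
    return out


embeddings = {}  # hypothetical stand-in for a pretrained GloVe table


def embed(token):
    if token not in embeddings:
        embeddings[token] = [rng.uniform(-1, 1) for _ in range(EMB)]
    return embeddings[token]


fwd, bwd = GRUCell(EMB, HID), GRUCell(EMB, HID)
W_out = rand_matrix(len(LANGS), 2 * HID)  # output layer over concatenated states


def tag_tokens(tokens):
    """Return one probability distribution over LANGS per input token."""
    xs = [embed(t.lower()) for t in tokens]
    hf = run(fwd, xs)              # left-to-right pass
    hb = run(bwd, xs[::-1])[::-1]  # right-to-left pass, realigned to token order
    return [softmax(matvec(W_out, f + b)) for f, b in zip(hf, hb)]


# Invented code-mixed sample sentence, for illustration only.
probs = tag_tokens("yeh movie bohat achi thi".split())
for token, p in zip("yeh movie bohat achi thi".split(), probs):
    print(token, LANGS[p.index(max(p))])
```

Because the weights are untrained, the printed labels are arbitrary; the sketch only shows how forward and backward hidden states are concatenated per token before classification. In practice a framework layer with trained parameters would replace the hand-written cell.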

Identifiers

pubmed: 34925496
doi: 10.1155/2021/8415333
pmc: PMC8683192

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

8415333

Copyright information

Copyright © 2021 Muhammad Yasir et al.

Conflict of interest statement

There are no conflicts of interest for the publication of this research among all the scholars who participated in this study.

Authors

Muhammad Yasir (M)

School of Information Science and Technology, Northwest University, Xi'an, Shaanxi, China.

Li Chen (L)

School of Information Science and Technology, Northwest University, Xi'an, Shaanxi, China.

Amna Khatoon (A)

Department of Information Engineering, Chang'an University, Xi'an, Shaanxi, China.

Muhammad Amir Malik (MA)

Department of Computer Science, Islamic International University, Islamabad, Pakistan.

Fazeel Abid (F)

Department of Information System, University of Management and Technology, Lahore, Pakistan.
