Transfer Learning for Classifying Spanish and English Text by Clinical Specialties.
Classification
Natural Language Processing
Spanish
Transfer learning
Journal
Studies in health technology and informatics
ISSN: 1879-8365
Abbreviated title: Stud Health Technol Inform
Country: Netherlands
NLM ID: 9214582
Publication information
Publication date:
27 May 2021
History:
entrez: 27 May 2021
pubmed: 28 May 2021
medline: 1 June 2021
Status:
ppublish
Abstract
Transfer learning has demonstrated its potential in natural language processing tasks, where models are pre-trained on large corpora and then fine-tuned for specific tasks. We applied pre-trained transfer models to a Spanish biomedical document classification task. The main goal was to analyze the performance of text classification by clinical specialty using state-of-the-art language models for Spanish, and to compare the results with those of the corresponding models in English and of the most important pre-trained model for the biomedical domain. The outcomes present interesting perspectives on the performance of language models that are pre-trained for a particular domain. In particular, we found that BioBERT achieved better results on Spanish texts translated into English than the general-domain model in Spanish and the state-of-the-art multilingual model.
Identifiers
pubmed: 34042769
pii: SHTI210184
doi: 10.3233/SHTI210184
Publication types
Journal Article
Languages
eng