Performance of Machine Learning Methods to Classify French Medical Publications.
Document classification
French
Revue Médicale Suisse
deep learning
machine learning
natural language processing
unstructured medical data
Journal
Studies in health technology and informatics
ISSN: 1879-8365
Abbreviated title: Stud Health Technol Inform
Country: Netherlands
NLM ID: 9214582
Publication information
Publication date:
25 May 2022
History:
entrez: 25 May 2022
pubmed: 26 May 2022
medline: 27 May 2022
Status:
ppublish
Abstract
Many care professionals read medical narratives in their preferred language. These documents may be produced by organizations, authorities, or national publishers. However, they are often hard to find with the usual search engines, such as PubMed, which are based on English. This work explores the possibility of automatically categorizing medical documents in French with an automatic Natural Language Processing pipeline. The pipeline is used to compare the performance of 6 different machine learning and deep neural network approaches on a large dataset drawn from a peer-reviewed Swiss medical journal, published weekly in French, that covers major topics in medicine over the last 15 years. An accuracy of 96% was achieved for 5-topic classification and 81% for 20-topic classification.
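As a rough illustration of what topic classification and its accuracy metric involve, the sketch below trains a bag-of-words multinomial Naive Bayes classifier and scores it on held-out texts. The abstract does not name the six approaches compared, so this baseline is an assumption and not necessarily one of them. The class `NaiveBayes`, the helper functions, the topic labels, and the toy French sentences are all hypothetical and are not taken from the study's dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; a real pipeline would also handle
    # punctuation, accents, and stop words.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            tokens = tokenize(doc)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokenize(doc):
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(model, docs, labels):
    # Fraction of documents whose predicted topic matches the true topic.
    correct = sum(model.predict(d) == y for d, y in zip(docs, labels))
    return correct / len(labels)


# Toy, invented examples (assumption: NOT from the study's dataset).
train_docs = [
    "infarctus du myocarde et insuffisance cardiaque",
    "hypertension artérielle et risque cardiaque",
    "diabète de type 2 et insuline",
    "glycémie et traitement du diabète",
]
train_labels = ["cardiologie", "cardiologie", "diabétologie", "diabétologie"]
test_docs = ["insuffisance cardiaque chronique", "insuline et glycémie"]
test_labels = ["cardiologie", "diabétologie"]

model = NaiveBayes().fit(train_docs, train_labels)
test_accuracy = accuracy(model, test_docs, test_labels)
```

Comparing several approaches, as the abstract describes, would amount to fitting each model on the same training split and reporting `accuracy` on the same held-out split, once for the 5-topic labels and once for the 20-topic labels.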
Identifiers
pubmed: 35612232
pii: SHTI220613
doi: 10.3233/SHTI220613
Publication types
Journal Article
Languages
eng