Information Extraction from Medical Texts with BERT Using Human-in-the-Loop Labeling.
BERT
information extraction
medical texts
named entity recognition
natural language processing
Journal
Studies in health technology and informatics
ISSN: 1879-8365
Abbreviated title: Stud Health Technol Inform
Country: Netherlands
NLM ID: 9214582
Publication information
Publication date:
18 May 2023
History:
medline: 22 May 2023
pubmed: 19 May 2023
entrez: 19 May 2023
Status:
ppublish
Abstract
Neural network language models, such as BERT, can be used for information extraction from medical texts with unstructured free text. These models can be pre-trained on a large corpus to learn the language and characteristics of the relevant domain and then fine-tuned with labeled data for a specific task. We propose a pipeline using human-in-the-loop labeling to create annotated data for Estonian healthcare information extraction. This method is particularly useful for low-resource languages and is more accessible to those in the medical field than rule-based methods like regular expressions.
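The human-in-the-loop idea in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the record gives no implementation details, so every name here (`Span`, `pre_annotate`, `human_in_the_loop`, `simulated_reviewer`) is hypothetical, and a simple term lexicon stands in for the fine-tuned BERT model so that the example runs with the standard library alone. It shows only the loop structure: the model suggests labels, a reviewer corrects them, and the corrections feed the next round.

```python
from dataclasses import dataclass


@dataclass
class Span:
    start: int
    end: int
    label: str


def pre_annotate(text, lexicon):
    """Stand-in for model predictions: tag every lexicon term found in the text."""
    spans = []
    lowered = text.lower()
    for term, label in lexicon.items():
        pos = lowered.find(term)
        while pos != -1:
            spans.append(Span(pos, pos + len(term), label))
            pos = lowered.find(term, pos + len(term))
    return sorted(spans, key=lambda s: s.start)


def human_in_the_loop(texts, lexicon, review):
    """Pre-annotate each text, let a reviewer correct it, and feed corrections back."""
    labeled = []
    for text in texts:
        suggested = pre_annotate(text, lexicon)
        corrected = review(text, suggested)  # the human step
        labeled.append((text, corrected))
        # Stand-in for retraining: remember every term the reviewer kept or added.
        # A real pipeline would fine-tune the model on `labeled` instead.
        for span in corrected:
            lexicon[text[span.start:span.end].lower()] = span.label
    return labeled


def simulated_reviewer(text, suggested):
    """Hypothetical reviewer: accepts all suggestions and adds one missed drug name."""
    corrected = list(suggested)
    pos = text.lower().find("ibuprofen")
    if pos != -1 and not any(s.start == pos for s in corrected):
        corrected.append(Span(pos, pos + len("ibuprofen"), "DRUG"))
    return corrected
```

Calling `human_in_the_loop(texts, {"aspirin": "DRUG"}, simulated_reviewer)` on a few sentences returns the corrected annotations and leaves the lexicon extended with the reviewer's additions, so later texts are pre-annotated more completely. In practice the `review` callback would be an annotation interface used by a clinician rather than a function.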
Identifiers
pubmed: 37203510
pii: SHTI230281
doi: 10.3233/SHTI230281
Publication types
Journal Article
Languages
eng