Extracting tumour prognostic factors from a diverse electronic record dataset in genito-urinary oncology.
Electronic medical record
Genitourinary cancers
Natural language processing
Text mining
Tumor staging
Journal
International journal of medical informatics
ISSN: 1872-8243
Abbreviated title: Int J Med Inform
Country: Ireland
NLM ID: 9711057
Publication information
Publication date:
01 2019
History:
received: 27 04 2017
revised: 17 09 2018
accepted: 21 10 2018
entrez: 15 12 2018
pubmed: 14 12 2018
medline: 6 7 2019
Status:
ppublish
Abstract
To implement a system for unsupervised extraction of tumor stage and prognostic data in patients with genitourinary cancers using clinicopathological and radiology text. A corpus of 1054 electronic notes (clinician notes, radiology reports and pathology reports) was annotated for tumor stage, prostate-specific antigen (PSA) and Gleason grade. Annotations from five clinicians were reconciled to form a gold standard dataset. A training dataset of 386 documents was sequestered, and the Medtex algorithm was adapted using it. Adapted Medtex equaled or exceeded human performance in most annotations, except for implicit M stage (F-measure of 0.69 vs 0.84) and PSA (0.92 vs 0.96). Overall, Medtex achieved an F-measure of 0.86, compared with 0.92 for human annotators. There was significant inter-observer variability when comparing human annotators to the gold standard. The Medtex algorithm performed similarly to human annotators for extracting stage and prognostic data from varied clinical texts.
Identifiers
pubmed: 30545489
pii: S1386-5056(18)30330-7
doi: 10.1016/j.ijmedinf.2018.10.008
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
53-57
Copyright information
Copyright © 2018 Elsevier B.V. All rights reserved.