Annotating German Clinical Documents for De-Identification.


Journal

Studies in health technology and informatics
ISSN: 1879-8365
Abbreviated title: Stud Health Technol Inform
Country: Netherlands
NLM ID: 9214582

Publication information

Publication date:
21 Aug 2019
History:
entrez: 24 Aug 2019
pubmed: 24 Aug 2019
medline: 11 Sep 2019
Status: ppublish

Abstract

We devised annotation guidelines for the de-identification of German clinical documents and assembled a corpus of 1,106 discharge summaries and transfer letters with 44K annotated protected health information (PHI) items. After three iteration rounds, our annotation team finally reached an inter-annotator agreement of 0.96 on the instance level and 0.97 on the token level of annotation (averaged pair-wise F1 score). To establish a baseline for automatic de-identification on our corpus, we trained a recurrent neural network (RNN) and achieved F1 scores greater than 0.9 on most major PHI categories.
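The averaged pair-wise F1 score mentioned above can be illustrated with a minimal sketch. It assumes that two annotations agree only when span boundaries and PHI category match exactly. The abstract does not state the matching criterion, so this is an assumption rather than the authors' procedure. The annotator names, spans, and categories are hypothetical toy data.

```python
from itertools import combinations


def f1(reference, other):
    """F1 between two annotation sets, treating one as the reference.

    F1 is symmetric, so the choice of reference does not change the result.
    """
    if not reference and not other:
        return 1.0  # two empty annotation sets agree trivially
    true_positives = len(reference & other)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(other)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def averaged_pairwise_f1(annotations):
    """Mean F1 over all unordered pairs of annotators."""
    pairs = list(combinations(annotations.values(), 2))
    return sum(f1(a, b) for a, b in pairs) / len(pairs)


# Hypothetical instance-level annotations: (start, end, category).
# Assumption: exact match on boundaries and category counts as agreement.
annotators = {
    "A": {(0, 5, "NAME"), (10, 20, "DATE"), (30, 35, "CITY")},
    "B": {(0, 5, "NAME"), (10, 20, "DATE")},
    "C": {(0, 5, "NAME"), (10, 18, "DATE"), (30, 35, "CITY")},
}

score = averaged_pairwise_f1(annotators)
print(f"averaged pair-wise F1: {score:.3f}")
```

A token-level variant would use the same functions with one set element per labeled token, for example (token_index, category), instead of one element per annotated span.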

Identifiers

pubmed: 31437914
pii: SHTI190212
doi: 10.3233/SHTI190212

Publication types

Journal Article

Languages

eng

Pagination

203-207

Authors

Tobias Kolditz (T)

Jena University Language & Information Engineering (JULIE) Lab, Friedrich Schiller University Jena, Jena, Germany.

Christina Lohr (C)

Jena University Language & Information Engineering (JULIE) Lab, Friedrich Schiller University Jena, Jena, Germany.

Johannes Hellrich (J)

Jena University Language & Information Engineering (JULIE) Lab, Friedrich Schiller University Jena, Jena, Germany.

Luise Modersohn (L)

Jena University Language & Information Engineering (JULIE) Lab, Friedrich Schiller University Jena, Jena, Germany.

Boris Betz (B)

Institute of Clinical Chemistry and Laboratory Diagnostics, Jena University Hospital, Jena, Germany.

Michael Kiehntopf (M)

Institute of Clinical Chemistry and Laboratory Diagnostics, Jena University Hospital, Jena, Germany.

Udo Hahn (U)

Jena University Language & Information Engineering (JULIE) Lab, Friedrich Schiller University Jena, Jena, Germany.
