Temporal information extraction from mental health records to identify duration of untreated psychosis.
Electronic health records
Mental health
Natural language processing
Schizophrenia
Temporal information extraction
Journal
Journal of biomedical semantics
ISSN: 2041-1480
Abbreviated title: J Biomed Semantics
Country: England
NLM ID: 101531992
Publication information
Publication date:
10 03 2020
History:
received: 29 05 2019
accepted: 03 03 2020
entrez: 12 3 2020
pubmed: 12 3 2020
medline: 3 8 2021
Status:
epublish
Abstract
BACKGROUND
Duration of untreated psychosis (DUP) is an important clinical construct in the field of mental health, as longer DUP can be associated with worse intervention outcomes. DUP estimation requires knowledge about when psychosis symptoms first started (symptom onset), and when psychosis treatment was initiated. Electronic health records (EHRs) represent a useful resource for retrospective clinical studies on DUP, but the core information underlying this construct is most likely to lie in free text, meaning it is not readily available for clinical research. Natural Language Processing (NLP) is a means of addressing this problem by automatically extracting relevant information in a structured form. As a first step, it is important to identify appropriate documents, i.e., those that are likely to include the information of interest. Next, temporal information extraction methods are needed to identify time references for early psychosis symptoms. This NLP challenge requires solving three different tasks: time expression extraction, symptom extraction, and temporal "linking". In this study, we focus on the first step, using two relevant EHR datasets.
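As an illustration of the construct described above, DUP can be read as the interval between symptom onset and treatment initiation. The following is a minimal sketch under that assumption; the function name `dup_in_days` and the example dates are hypothetical and do not come from the study.

```python
from datetime import date


def dup_in_days(symptom_onset: date, treatment_start: date) -> int:
    """Return the number of days between symptom onset and treatment start.

    Assumption: both dates are already known and fully specified, which is
    rarely the case in free-text records.
    """
    if treatment_start < symptom_onset:
        raise ValueError("treatment start precedes symptom onset")
    return (treatment_start - symptom_onset).days


# Hypothetical dates, for illustration only.
example_dup = dup_in_days(date(2011, 5, 1), date(2011, 11, 15))
```

In practice the onset date is often only known to the month or year, so a real calculation would have to handle partial dates.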
RESULTS
We applied a rule-based NLP system for time expression extraction that we had previously adapted to a corpus of mental health EHRs from patients with a diagnosis of schizophrenia (first referrals). We extended this work by applying this NLP system to a larger set of documents and patients, to identify additional texts that would be relevant for our long-term goal, and developed a new corpus from a subset of these new texts (early intervention services). Furthermore, we added normalized value annotations ("2011-05") to the annotated time expressions ("May 2011") in both corpora. The finalized corpora were used for further NLP development and evaluation, with promising results (normalization accuracy 71-86%). To highlight the specificities of our annotation task, we also applied the final adapted NLP system to a different temporally annotated clinical corpus.
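The normalization step mentioned above maps a time expression such as "May 2011" to a value such as "2011-05". The sketch below shows one simple rule of that kind; it is not the system used in the study, and the names `MONTHS`, `MONTH_YEAR` and `normalize_month_year` are hypothetical.

```python
import re
from typing import Optional

# Lookup table from lowercase English month names to month numbers.
MONTHS = {
    name: number
    for number, name in enumerate(
        [
            "january", "february", "march", "april", "may", "june",
            "july", "august", "september", "october", "november", "december",
        ],
        start=1,
    )
}

# A word followed by a four-digit year, e.g. "May 2011".
MONTH_YEAR = re.compile(r"\b([A-Za-z]+)\s+(\d{4})\b")


def normalize_month_year(text: str) -> Optional[str]:
    """Return the first 'Month YYYY' expression in text as 'YYYY-MM', or None."""
    for match in MONTH_YEAR.finditer(text):
        month = MONTHS.get(match.group(1).lower())
        if month is not None:
            return f"{match.group(2)}-{month:02d}"
    return None
```

This rule covers only full month names followed by a year; abbreviations, relative expressions ("two years ago") and vague expressions would each need their own rules.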
CONCLUSIONS
Developing domain-specific methods is crucial to address complex NLP tasks such as symptom onset extraction and retrospective calculation of duration of a preclinical syndrome. To the best of our knowledge, this is the first clinical text resource annotated for temporal entities in the mental health domain.
Identifiers
pubmed: 32156302
doi: 10.1186/s13326-020-00220-2
pii: 10.1186/s13326-020-00220-2
pmc: PMC7063705
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
2
Grants
Agency: Medical Research Council
ID: MC_PC_17214
Country: United Kingdom
Agency: Medical Research Council
ID: MR/T045302/1
Country: United Kingdom
Agency: Department of Health
Country: United Kingdom