DR.BENCH: Diagnostic Reasoning Benchmark for Clinical Natural Language Processing.
Clinical diagnostic decision support
Clinical diagnostic reasoning
Clinical natural language processing benchmark
Natural language processing
Journal
Journal of biomedical informatics
ISSN: 1532-0480
Abbreviated title: J Biomed Inform
Country: United States
NLM ID: 100970413
Publication information
Publication date:
02 2023
History:
received: 2022-10-19
revised: 2022-12-13
accepted: 2023-01-09
pmc-release: 2024-02-01
pubmed: 2023-01-28
medline: 2023-03-03
entrez: 2023-01-27
Status:
ppublish
Abstract
The meaningful use of electronic health records (EHR) continues to progress in the digital era with clinical decision support systems augmented by artificial intelligence. A priority in improving provider experience is to overcome information overload and reduce the cognitive burden so that fewer medical errors and cognitive biases are introduced during patient care. One major type of medical error is diagnostic error due to systematic or predictable errors in judgement that rely on heuristics. The potential for clinical natural language processing (cNLP) to model diagnostic reasoning in humans with forward reasoning from data to diagnosis, and potentially to reduce cognitive burden and medical error, has not been investigated. Existing tasks to advance the science in cNLP have largely focused on information extraction and named entity recognition through classification tasks. We introduce a novel suite of tasks, the Diagnostic Reasoning Benchmarks (DR.BENCH), as a new benchmark for developing and evaluating cNLP models with clinical diagnostic reasoning ability. The suite includes six tasks from ten publicly available datasets addressing clinical text understanding, medical knowledge reasoning, and diagnosis generation. DR.BENCH is the first clinical suite of tasks designed to be a natural language generation framework to evaluate pre-trained language models for diagnostic reasoning. The goal of DR.BENCH is to advance the science in cNLP to support downstream applications in computerized diagnostic decision support and to improve the efficiency and accuracy of healthcare providers during patient care. We fine-tune and evaluate state-of-the-art generative models on DR.BENCH. Experiments show that with domain adaptation pre-training on medical knowledge, the model demonstrated opportunities for improvement when evaluated on DR.BENCH. We share DR.BENCH as a publicly available GitLab repository with a systematic approach to load and evaluate models for the cNLP community. We also discuss the carbon footprint produced during the experiments and encourage future work on DR.BENCH to report the carbon footprint.
Identifiers
pubmed: 36706848
pii: S1532-0464(23)00007-2
doi: 10.1016/j.jbi.2023.104286
pmc: PMC9993808
mid: NIHMS1868157
Publication types
Journal Article
Research Support, N.I.H., Extramural
Languages
eng
Citation subsets
IM
Pagination
104286
Grants
Agency: NIDA NIH HHS
ID: R01 DA051464
Country: United States
Agency: NHLBI NIH HHS
ID: R01 HL157262
Country: United States
Agency: NLM NIH HHS
ID: R01 LM010090
Country: United States
Agency: NLM NIH HHS
ID: R01 LM012973
Country: United States
Copyright information
Copyright © 2023 Elsevier Inc. All rights reserved.
Conflict of interest statement
Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.