Multi-label Few-shot ICD Coding as Autoregressive Generation with Prompt.
Journal
Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence
ISSN: 2374-3468
Abbreviated title: Proc AAAI Conf Artif Intell
Country: United States
NLM ID: 101524790
Publication information
Publication date:
26 Jun 2023
History:
medline: 28 Aug 2023
pubmed: 28 Aug 2023
entrez: 28 Aug 2023
Status:
ppublish
Abstract
Automatic International Classification of Diseases (ICD) coding aims to assign multiple ICD codes to a medical note with an average of 3,000+ tokens. This task is challenging because of the high-dimensional space of multi-label assignment (155,000+ ICD code candidates) and the long-tail challenge: many ICD codes are infrequently assigned, yet infrequent ICD codes are clinically important. This study addresses the long-tail challenge by transforming this multi-label classification task into an autoregressive generation task. Specifically, we first introduce a novel pretraining objective to generate free-text diagnoses and procedures using the SOAP structure, the medical logic physicians use for note documentation. Second, instead of directly predicting in the high-dimensional space of ICD codes, our model generates lower-dimensional text descriptions, from which ICD codes are then inferred. Third, we designed a novel prompt template for multi-label classification. We evaluate our Generation with Prompt (GP
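The second step in the abstract, generating text descriptions and then inferring ICD codes from them, can be illustrated with a minimal sketch. The abstract does not say how the inference is done, so everything below is an assumption for illustration only: the normalized exact-match lookup, the semicolon separator, the function names, and the three-entry ICD-10-CM table are hypothetical and are not the authors' method.

```python
import re

# Tiny illustrative lookup table (assumption: NOT the paper's code set or mapping).
DESCRIPTION_TO_CODE = {
    "essential (primary) hypertension": "I10",
    "heart failure, unspecified": "I50.9",
    "acute kidney failure, unspecified": "N17.9",
}


def normalize(text):
    """Lowercase, trim, and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.strip().lower())


def descriptions_to_codes(generated, separator=";"):
    """Map a generated string of descriptions to a list of ICD codes.

    Hypothetical helper: splits on `separator`, looks each piece up by
    normalized exact match, skips unknown pieces, and drops duplicates
    while keeping first-seen order.
    """
    codes = []
    for piece in generated.split(separator):
        code = DESCRIPTION_TO_CODE.get(normalize(piece))
        if code is not None and code not in codes:
            codes.append(code)
    return codes


if __name__ == "__main__":
    output = "Heart failure, unspecified;  essential (primary)  HYPERTENSION"
    print(descriptions_to_codes(output))
```

A semicolon is used as the separator here because ICD descriptions themselves often contain commas. A real system would likely need fuzzy or learned matching, since generated text will not always reproduce a description verbatim.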
Identifiers
pubmed: 37635946
doi: 10.1609/aaai.v37i4.25668
pmc: PMC10457101
mid: NIHMS1875188
Publication types
Journal Article
Languages
eng
Pagination
5366-5374
Grants
Agency: NIDA NIH HHS
ID: R01 DA045816
Country: United States
Agency: NIMH NIH HHS
ID: R01 MH125027
Country: United States