Prediction of tumor board procedural recommendations using large language models.
Artificial Intelligence
Head and Neck Oncology
Large Language Models
Tumor Board
Journal
European archives of oto-rhino-laryngology : official journal of the European Federation of Oto-Rhino-Laryngological Societies (EUFOS) : affiliated with the German Society for Oto-Rhino-Laryngology - Head and Neck Surgery
ISSN: 1434-4726
Abbreviated title: Eur Arch Otorhinolaryngol
Country: Germany
NLM ID: 9002937
Publication information
Publication date:
13 Sep 2024
History:
received: 15 Jul 2024
accepted: 22 Aug 2024
medline: 13 Sep 2024
pubmed: 13 Sep 2024
entrez: 12 Sep 2024
Status:
aheadofprint
Abstract
Multidisciplinary tumor boards are meetings where a team of medical specialists, including medical oncologists, radiation oncologists, radiologists, surgeons, and pathologists, collaborate to determine the best treatment plan for cancer patients. While decision-making in this context is logistically and cost-intensive, it has a significant positive effect on overall cancer survival.
METHODS: We evaluated the quality and accuracy of procedural recommendations predicted by several large language models for a Head and Neck Oncology tumor board. The models were adapted to the task using parameter-efficient fine-tuning or in-context learning. Records were divided into two sets: 229 records for training and 100 records for validation of our approaches. Randomized, blinded, manual classification by human experts was used to evaluate the different models.
RESULTS: Treatment line congruence varied depending on the model, reaching up to 86%, with medically justifiable recommendations up to 98%. Parameter-efficient fine-tuning yielded better outcomes than in-context learning, and larger/commercial models tended to perform better.
Providing precise, medically justifiable procedural recommendations for complex oncology patients is feasible. Extending the data corpus to a larger patient cohort and incorporating the latest guidelines, assuming the model can handle sufficient context length, could result in more factual and guideline-aligned responses and is anticipated to enhance model performance. We therefore encourage further research in this direction to improve the efficacy and reliability of large language models as support in medical decision-making processes.
Identifiers
pubmed: 39266750
doi: 10.1007/s00405-024-08947-9
pii: 10.1007/s00405-024-08947-9
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: Bayerisches Forschungsinstitut für Digitale Transformation
ID: ReGInA
Agency: Bayerisches Staatsministerium für Bildung und Kultus, Wissenschaft und Kunst
ID: LFP-Projekt FOKUS-TML
Copyright information
© 2024. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.