From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain.

Keywords: BERT; Biomedical domain; Domain adaptation; GPT; Large Language Models; Probing tasks

Journal

Artificial intelligence in medicine
ISSN: 1873-2860
Abbreviated title: Artif Intell Med
Country: Netherlands
NLM ID: 8915031

Publication information

Publication date:
23 Oct 2024
History:
received: 30 Mar 2024
revised: 17 Sep 2024
accepted: 16 Oct 2024
medline: 30 Oct 2024
pubmed: 30 Oct 2024
entrez: 29 Oct 2024
Status: aheadofprint

Abstract

In this study, we delve into the adaptation and effectiveness of Transformer-based, pre-trained Large Language Models (LLMs) within the biomedical domain, a field that poses unique challenges due to its complexity and the specialized nature of its data. Building on the foundation laid by the transformative architecture of Transformers, we investigate the nuanced dynamics of LLMs through a multifaceted lens, focusing on two domain-specific tasks, i.e., Natural Language Inference (NLI) and Named Entity Recognition (NER). Our objective is to bridge the knowledge gap regarding how these models' downstream performances correlate with their capacity to encapsulate task-relevant information. To achieve this goal, we probed and analyzed the inner encoding and attention mechanisms in LLMs, both encoder- and decoder-based, tailored for either general or biomedical-specific applications. This examination occurs before and after the models are fine-tuned across various data volumes. Our findings reveal that the models' downstream effectiveness is intricately linked to specific patterns within their internal mechanisms, shedding light on the nuanced ways in which LLMs process and apply knowledge in the biomedical context. The source code for this paper is available at https://github.com/agnesebonfigli99/LLMs-in-the-Biomedical-Domain.

Identifiers

pubmed: 39471773
pii: S0933-3657(24)00245-8
doi: 10.1016/j.artmed.2024.103003

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

103003

Copyright information

Copyright © 2024. Published by Elsevier B.V.

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Agnese Bonfigli (A)

Research Unit of Intelligent Technology for Health and Wellbeing, Department of Engineering, Università Campus Bio-Medico di Roma, Via Alvaro del Portillo, 21, Rome, 00128, Italy; ItaliaNLP Lab, Institute of Computational Linguistics "Antonio Zampolli", National Research Council, Via Giuseppe Moruzzi, 1, Pisa, 56124, Italy.

Luca Bacco (L)

ItaliaNLP Lab, Institute of Computational Linguistics "Antonio Zampolli", National Research Council, Via Giuseppe Moruzzi, 1, Pisa, 56124, Italy; Research Unit of Computer Systems and Bioinformatics, Department of Engineering, Università Campus Bio-Medico di Roma, Via Alvaro del Portillo, 21, Rome, 00128, Italy.

Mario Merone (M)

Research Unit of Intelligent Technology for Health and Wellbeing, Department of Engineering, Università Campus Bio-Medico di Roma, Via Alvaro del Portillo, 21, Rome, 00128, Italy. Electronic address: m.merone@unicampus.it.

Felice Dell'Orletta (F)

ItaliaNLP Lab, Institute of Computational Linguistics "Antonio Zampolli", National Research Council, Via Giuseppe Moruzzi, 1, Pisa, 56124, Italy.

MeSH classifications