From pre-training to fine-tuning: An in-depth analysis of Large Language Models in the biomedical domain.

Keywords: BERT; Biomedical domain; Domain adaptation; GPT; Large Language Models; Probing tasks

Journal

Artificial intelligence in medicine

ISSN: 1873-2860

Abbreviated title: Artif Intell Med

Country: Netherlands

NLM ID: 8915031

Publication information

Publication date:
23 Oct 2024

History:

received: 30 Mar 2024

revised: 17 Sep 2024

accepted: 16 Oct 2024

medline: 30 Oct 2024

pubmed: 30 Oct 2024

entrez: 29 Oct 2024

Status: ahead of print

Abstract

In this study, we delve into the adaptation and effectiveness of Transformer-based, pre-trained Large Language Models (LLMs) within the biomedical domain, a field that poses unique challenges due to its complexity and the specialized nature of its data. Building on the foundation laid by the transformative architecture of Transformers, we investigate the nuanced dynamics of LLMs through a multifaceted lens, focusing on two domain-specific tasks, i.e., Natural Language Inference (NLI) and Named Entity Recognition (NER). Our objective is to bridge the knowledge gap regarding how these models' downstream performances correlate with their capacity to encapsulate task-relevant information. To achieve this goal, we probed and analyzed the inner encoding and attention mechanisms in LLMs, both encoder- and decoder-based, tailored for either general or biomedical-specific applications. This examination occurs before and after the models are fine-tuned across various data volumes. Our findings reveal that the models' downstream effectiveness is intricately linked to specific patterns within their internal mechanisms, shedding light on the nuanced ways in which LLMs process and apply knowledge in the biomedical context. The source code for this paper is available at https://github.com/agnesebonfigli99/LLMs-in-the-Biomedical-Domain.

Identifiers

DOI: 10.1016/j.artmed.2024.103003

PMID: 39471773

PII: S0933-3657(24)00245-8

Publication types

Journal Article

Languages

eng

Pagination

103003

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Agnese Bonfigli

Luca Bacco

Mario Merone

Felice Dell'Orletta

MeSH classifications