GPT-4 as a biomedical simulator.
Artificial intelligence
Biomedical simulation
Computational biology
GPT-4
Large language models
Journal
Computers in biology and medicine
ISSN: 1879-0534
Abbreviated title: Comput Biol Med
Country: United States
NLM ID: 1250250
Publication information
Publication date: 21 Jun 2024
History:
received: 15 Jan 2024
revised: 18 Jun 2024
accepted: 19 Jun 2024
medline: 24 Jun 2024
pubmed: 24 Jun 2024
entrez: 23 Jun 2024
Status: aheadofprint
Abstract
BACKGROUND
Computational simulation of biological processes can be a valuable tool for accelerating biomedical research, but usually requires extensive domain knowledge and manual adaptation. Large language models (LLMs) such as GPT-4 have proven surprisingly successful for a wide range of tasks. This study provides proof-of-concept for the use of GPT-4 as a versatile simulator of biological systems.
METHODS
We introduce SimulateGPT, a proof-of-concept for knowledge-driven simulation across levels of biological organization through structured prompting of GPT-4. We benchmarked our approach against direct GPT-4 inference in blinded qualitative evaluations by domain experts in four scenarios and in two quantitative scenarios with experimental ground truth. The qualitative scenarios included mouse experiments with known outcomes and treatment decision support in sepsis. The quantitative scenarios included prediction of gene essentiality in cancer cells and progression-free survival in cancer patients.
RESULTS
In qualitative experiments, biomedical scientists rated SimulateGPT's predictions favorably over direct GPT-4 inference. In quantitative experiments, SimulateGPT substantially improved classification accuracy for predicting the essentiality of individual genes and increased correlation coefficients and precision in the regression task of predicting progression-free survival.
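The two quantitative metrics named above, classification accuracy and a correlation coefficient, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `accuracy` and `pearson_r` are hypothetical, the correlation is assumed to be Pearson's r (the record does not say which coefficient was used), and the toy values are invented purely for illustration.

```python
from math import sqrt


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the ground-truth labels."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Invented toy data, not taken from the study.
essential_true = [True, False, True, True, False]   # gene essentiality labels
essential_pred = [True, False, False, True, False]  # hypothetical predictions
pfs_true = [6.0, 12.0, 3.0, 9.0]    # progression-free survival, months
pfs_pred = [5.0, 11.0, 4.0, 10.0]   # hypothetical predicted values

print(accuracy(essential_true, essential_pred))
print(pearson_r(pfs_true, pfs_pred))
```

Comparing two prediction methods would then amount to computing both metrics for each method on the same ground truth.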
CONCLUSION
This proof-of-concept study suggests that LLMs may enable a new class of biomedical simulators. Such text-based simulations appear well suited for modeling and understanding complex living systems that are difficult to describe with physics-based first-principles simulations, but for which extensive knowledge is available as written text. Finally, we propose several directions for further development of LLM-based biomedical simulators, including augmentation through web search retrieval, integrated mathematical modeling, and fine-tuning on experimental data.
Identifiers
pubmed: 38909448
pii: S0010-4825(24)00881-3
doi: 10.1016/j.compbiomed.2024.108796
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
108796
Copyright information
Copyright © 2024 The Authors. Published by Elsevier Ltd. All rights reserved.
Conflict of interest statement
Declaration of competing interest: The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Christoph Bock reports a relationship with Myllia Biotechnology and Neurolentech that includes board membership.