Scientific figures interpreted by ChatGPT: strengths in plot recognition and limits in color perception.


Journal

NPJ precision oncology
ISSN: 2397-768X
Abbreviated title: NPJ Precis Oncol
Country: England
NLM ID: 101708166

Publication information

Publication date:
05 Apr 2024
History:
received: 26 Oct 2023
accepted: 27 Feb 2024
medline: 6 Apr 2024
pubmed: 6 Apr 2024
entrez: 5 Apr 2024
Status: epublish

Abstract

Emerging studies underscore the promising capabilities of large language model-based chatbots in conducting basic bioinformatics data analyses. The recent feature of accepting image inputs by ChatGPT, also known as GPT-4V(ision), motivated us to explore its efficacy in deciphering bioinformatics scientific figures. Our evaluation with examples in cancer research, including sequencing data analysis, multimodal network-based drug repositioning, and tumor clonal evolution, revealed that ChatGPT can proficiently explain different plot types and apply biological knowledge to enrich interpretations. However, it struggled to provide accurate interpretations when color perception and quantitative analysis of visual elements were involved. Furthermore, while the chatbot can draft figure legends and summarize findings from the figures, stringent proofreading is imperative to ensure the accuracy and reliability of the content.

Identifiers

pubmed: 38580746
doi: 10.1038/s41698-024-00576-z
pii: 10.1038/s41698-024-00576-z

Publication types

Journal Article

Languages

eng

Pagination

84

Grants

Agency: NIGMS NIH HHS
ID: P20 GM103434
Country: United States

Copyright information

© 2024. The Author(s).

Authors

Jinge Wang (J)

Department of Microbiology, Immunology & Cell Biology, West Virginia University, Morgantown, WV, 26506, USA.

Qing Ye (Q)

West Virginia University Cancer Institute, West Virginia University, Morgantown, WV, 26506, USA.

Li Liu (L)

College of Health Solutions, Arizona State University, Phoenix, AZ, 85004, USA.
Biodesign Institute, Arizona State University, Tempe, AZ, 85281, USA.

Nancy Lan Guo (NL)

West Virginia University Cancer Institute, West Virginia University, Morgantown, WV, 26506, USA.
Department of Occupational and Environmental Health Sciences, West Virginia University, Morgantown, WV, 26506, USA.

Gangqing Hu (G)

Department of Microbiology, Immunology & Cell Biology, West Virginia University, Morgantown, WV, 26506, USA. michael.hu@hsc.wvu.edu.
West Virginia University Cancer Institute, West Virginia University, Morgantown, WV, 26506, USA. michael.hu@hsc.wvu.edu.

MeSH classifications