[ChatGPT and the German board examination for ophthalmology: an evaluation].

ChatGPT und die deutsche Facharztprüfung für Augenheilkunde: eine Evaluierung.

Keywords: Artificial intelligence; Large Language Model; Medicine; Open questions; Subspeciality

Journal

Die Ophthalmologie

ISSN: 2731-7218

Abbreviated title: Ophthalmologie

Country: Germany

NLM ID: 9918402288106676

Publication information

Publication date: 27 May 2024

History:

received: 23 December 2023

accepted: 18 April 2024

revised: 18 April 2024

medline: 27 May 2024

pubmed: 27 May 2024

entrez: 27 May 2024

Status: ahead of print

Abstract

In recent years, artificial intelligence (AI), a relatively new branch of computer science, has become increasingly important in medicine. The aim of this project was to investigate whether the current version of ChatGPT (ChatGPT 4.0) can answer open questions that could be asked in a German board examination in ophthalmology.

After excluding image-based questions, 10 questions from each of 15 chapters/topics were selected from the textbook 1000 Fragen Augenheilkunde (1000 questions in ophthalmology, 2nd edition, 2014). ChatGPT was instructed by means of a prompt to assume the role of a board-certified ophthalmologist and to concentrate on the essentials when answering. For each topic, an ophthalmologist with many years of experience in the respective subspecialty evaluated the answers for correctness, relevance and internal coherence. In addition, the overall performance was rated with a school grade, and the expert assessed whether the answers would have been sufficient to pass the ophthalmology board examination.

ChatGPT would have passed the board examination in 12 of 15 topics. The overall performance, however, was limited, with only 53.3% of answers completely correct. While correctness varied widely between topics (uveitis and lens/cataract 100%; optics and refraction 20%), the answers always had a high thematic fit (70%) and internal coherence (71%).

That ChatGPT 4.0 would have passed the specialist examination in 12 of 15 topics is remarkable given that this AI was not specifically trained for medical questions. However, performance varies considerably between topics, with some serious shortcomings that currently rule out its safe use in clinical practice.
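The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the total of 150 questions follows from the stated design (10 questions in each of 15 topics), while the count of completely correct answers is an assumption inferred by rounding the reported 53.3%, on the further assumption that this percentage refers to all questions pooled. The variable and function names are hypothetical.

```python
# Figures stated in the abstract.
TOPICS = 15
QUESTIONS_PER_TOPIC = 10
PASSED_TOPICS = 12
PCT_FULLY_CORRECT = 53.3  # percent of answers rated completely correct

# Total number of questions implied by the study design.
total_questions = TOPICS * QUESTIONS_PER_TOPIC

# Assumption: the percentage is pooled over all questions, so the
# underlying count is inferred by rounding; the abstract does not state it.
fully_correct = round(total_questions * PCT_FULLY_CORRECT / 100)


def pct(numerator: int, denominator: int) -> float:
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


print(f"questions in total:        {total_questions}")
print(f"inferred fully correct:    {fully_correct}")
print(f"share fully correct (%):   {pct(fully_correct, total_questions)}")
print(f"share of topics passed (%): {pct(PASSED_TOPICS, TOPICS)}")
```

Under these assumptions the inferred count reproduces the reported 53.3%, and the contrast between the share of topics passed and the share of completely correct answers matches the abstract's point that passing a topic did not require every answer to be completely correct.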


Identifiers

DOI: 10.1007/s00347-024-02046-0

PMID: 38801461

PII: 10.1007/s00347-024-02046-0

Publication types

English Abstract; Journal Article

Languages

German

Authors

Rémi Yaïci

M Cieplucha

R Bock

F Moayed

N E Bechrakis

P Berens

N Feltgen

D Friedburg

M Gräf

R Guthoff

E M Hoffmann

H Hoerauf

C Hintschich

T Kohnen

E M Messmer

M M Nentwich

U Pleyer

U Schaudig

B Seitz

G Geerling

M Roth
