Are ChatGPT's Free-Text Responses on Periprosthetic Joint Infections of the Hip and Knee Reliable and Useful?
artificial intelligence
hip prosthesis
knee prosthesis
large language model
periprosthetic joint infection
Journal
Journal of clinical medicine
ISSN: 2077-0383
Abbreviated title: J Clin Med
Country: Switzerland
NLM ID: 101606588
Publication information
Publication date:
20 Oct 2023
History:
received: 28 Aug 2023
revised: 12 Oct 2023
accepted: 17 Oct 2023
medline: 28 Oct 2023
pubmed: 28 Oct 2023
entrez: 28 Oct 2023
Status:
epublish
Abstract
Abstract sections
BACKGROUND
This study aimed to evaluate ChatGPT's performance on questions about periprosthetic joint infections (PJI) of the hip and knee.
METHODS
Twenty-seven questions from the 2018 International Consensus Meeting on Musculoskeletal Infection were selected for response generation. The free-text responses were evaluated by three orthopedic surgeons using a five-point Likert scale. Inter-rater reliability (IRR) was assessed via Fleiss' kappa (FK).
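For readers unfamiliar with Fleiss' kappa, the following is a minimal sketch of the standard textbook formula for a fixed number of raters per item. It is not the authors' analysis code; the function names and the example ratings are hypothetical and chosen only for illustration.

```python
from collections import Counter


def ratings_to_counts(ratings, categories):
    """Convert per-item rater scores into an items x categories count table."""
    table = []
    for item_scores in ratings:
        tally = Counter(item_scores)
        table.append([tally.get(c, 0) for c in categories])
    return table


def fleiss_kappa(counts):
    """Fleiss' kappa for a count table where each row sums to the number of raters."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    n_cats = len(counts[0])
    total = n_items * n_raters

    # Share of all ratings that fell into each category.
    p_j = [sum(row[j] for row in counts) / total for j in range(n_cats)]

    # Observed agreement per item: fraction of rater pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_bar = sum(p_i) / n_items          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Hypothetical scores: five responses, three raters, five-point Likert scale.
    hypothetical_ratings = [[5, 5, 5], [4, 4, 5], [2, 2, 2], [4, 4, 4], [3, 4, 3]]
    table = ratings_to_counts(hypothetical_ratings, categories=[1, 2, 3, 4, 5])
    print(f"Fleiss' kappa: {fleiss_kappa(table):.3f}")
```

The sketch treats the Likert levels as unordered categories, which is how Fleiss' kappa is defined; it does not compute a confidence interval.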
RESULTS
Overall, near-perfect IRR was found for disagreement on the presence of factual errors (FK: 0.880, 95% CI [0.724, 1.035],
CONCLUSIONS
ChatGPT's free-text responses to complex orthopedic questions were predominantly reliable and useful for orthopedic surgeons and patients. Given variations in performance by question and subtopic, consulting additional sources and exercising careful interpretation should be emphasized for reliable medical decision-making.
Identifiers
pubmed: 37892793
pii: jcm12206655
doi: 10.3390/jcm12206655
pmc: PMC10607052
Publication types
Journal Article
Languages
eng