Reliability of a generative artificial intelligence tool for pediatric familial Mediterranean fever: insights from a multicentre expert survey.
AI
Artificial intelligence
FMF
Familial Mediterranean fever
Generative artificial intelligence
Pediatric rheumatology
Journal
Pediatric rheumatology online journal
ISSN: 1546-0096
Abbreviated title: Pediatr Rheumatol Online J
Country: England
NLM ID: 101248897
Publication information
Publication date: 23 Aug 2024
History:
received: 03 Jun 2024
accepted: 29 Jul 2024
medline: 24 Aug 2024
pubmed: 24 Aug 2024
entrez: 23 Aug 2024
Status: epublish
Abstract
BACKGROUND
Artificial intelligence (AI) has become a popular tool for clinical and research use in the medical field. The aim of this study was to evaluate the accuracy and reliability of a generative AI tool on pediatric familial Mediterranean fever (FMF).
METHODS
Fifteen questions on pediatric FMF, each repeated three times, were submitted to the popular generative AI tool Microsoft Copilot with Chat-GPT 4.0. Nine pediatric rheumatology experts, working in a blinded fashion, rated the accuracy of each response on a Likert-like scale from 1 to 5.
RESULTS
Median ratings for overall responses ranged from 2.00 to 5.00 at the first assessment, from 2.00 to 4.00 at the second, and from 3.00 to 4.00 at the third. Intra-rater agreement was poor to moderate (intraclass correlation coefficient range: -0.151 to 0.534). Agreement among experts diminished across the repeated responses: Krippendorff's alpha fell from 0.136 (first response) to 0.132 (second response) to 0.089 (third response). Lastly, experts displayed varying levels of trust in AI pre- and post-survey.
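For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Krippendorff's alpha can be computed from a table of expert ratings. It is not the authors' analysis: the abstract does not state which distance metric was used, so the interval (squared-difference) metric is an assumption here, the function name `krippendorff_alpha_interval` is hypothetical, and the ratings in the usage example are invented for illustration.

```python
from itertools import permutations


def krippendorff_alpha_interval(units):
    """Krippendorff's alpha with the interval (squared-difference) metric.

    `units` is a list of lists; each inner list holds the ratings that
    different raters gave to the same item (for example, one AI response).
    Items rated by fewer than two raters cannot be paired and are skipped.
    """
    units = [u for u in units if len(u) >= 2]
    n = sum(len(u) for u in units)  # total number of pairable ratings
    if n < 2:
        raise ValueError("need at least one item rated by two raters")

    # Observed disagreement: squared differences within each item,
    # each item weighted by 1 / (number of its ratings - 1).
    d_obs = sum(
        sum((a - b) ** 2 for a, b in permutations(u, 2)) / (len(u) - 1)
        for u in units
    ) / n

    # Expected disagreement: squared differences over all pairs of
    # ratings, ignoring which item they belong to.
    values = [v for u in units for v in u]
    d_exp = sum((a - b) ** 2 for a, b in permutations(values, 2)) / (n * (n - 1))
    if d_exp == 0:
        raise ValueError("all ratings are identical; alpha is undefined")

    return 1 - d_obs / d_exp


# Hypothetical example: three responses, each rated by three experts (1 to 5).
ratings = [[4, 3, 5], [2, 4, 3], [3, 3, 4]]
print(round(krippendorff_alpha_interval(ratings), 3))
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate systematic disagreement, which is why the coefficients near 0.1 reported above are read as poor agreement among raters.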
CONCLUSIONS
AI has promising implications in pediatric rheumatology, including early diagnosis and management optimization, but challenges persist due to uncertain information reliability and the lack of expert validation. Our survey revealed considerable inaccuracies and incompleteness in AI-generated responses regarding FMF, with poor intra- and inter-rater reliability. Human validation remains crucial in managing AI-generated medical information.
Identifiers
pubmed: 39180115
doi: 10.1186/s12969-024-01011-0
pii: 10.1186/s12969-024-01011-0
Publication types
Journal Article
Multicenter Study
Languages
eng
Citation subsets
IM
Pagination
78
Copyright information
© 2024. The Author(s).