Artificial intelligence in reproductive endocrinology: an in-depth longitudinal analysis of ChatGPTv4's month-by-month interpretation and adherence to clinical guidelines for diminished ovarian reserve.

Keywords: Artificial intelligence; ChatGPTv4; Diminished ovarian reserve; Reproductive endocrinology

Journal

Endocrine

ISSN: 1559-0100

Abbreviated title: Endocrine

Country: United States

NLM ID: 9434444

Publication information

Publication date:
28 Sep 2024

History:

received: 30 07 2024

accepted: 03 09 2024

medline: 29 9 2024

pubmed: 29 9 2024

entrez: 28 9 2024

Status: ahead of print

Abstract

This study quantitatively assessed the performance of ChatGPTv4, an artificial intelligence language model, in adhering to clinical guidelines for diminished ovarian reserve (DOR) over two months, evaluating the model's consistency in providing guideline-based responses. A longitudinal study design was employed to evaluate ChatGPTv4's response accuracy and completeness using a structured questionnaire at baseline and at a two-month follow-up. ChatGPTv4 was tasked with interpreting DOR questionnaires based on standardized clinical guidelines. The study did not involve human participants; the questionnaire was administered exclusively to the ChatGPT model to generate responses about DOR. A guideline-based questionnaire with 176 open-ended, 166 multiple-choice, and 153 true/false questions was deployed to assess ChatGPTv4's ability to provide accurate medical advice aligned with current DOR clinical guidelines. AI-generated responses were rated on a 6-point Likert scale for accuracy and a 3-point scale for completeness. The two-phase design assessed the stability and consistency of AI-generated answers over two months. ChatGPTv4 achieved near-perfect scores across all question types, with true/false questions consistently answered with 100% accuracy. In multiple-choice queries, accuracy improved from 98.2% to 100% at the two-month follow-up. Responses to open-ended questions improved significantly, with accuracy scores increasing from an average of 5.38 ± 0.71 to 5.74 ± 0.51 (max: 6.0) and completeness scores from 2.57 ± 0.52 to 2.85 ± 0.36 (max: 3.0). These improvements were statistically significant (p < 0.001), with positive correlations between initial and follow-up accuracy (r = 0.597) and completeness (r = 0.381) scores. The study was limited by its reliance on a controlled, albeit simulated, setting that may not perfectly mirror real-world clinical interactions.
ChatGPTv4 demonstrated exceptional and improving accuracy and completeness in handling DOR-related guideline queries over the studied period. These findings highlight ChatGPTv4's potential as a reliable, adaptable AI tool in reproductive endocrinology, capable of augmenting clinical decision-making and guideline development.
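
The summary statistics quoted in the abstract (mean ± SD per phase, Pearson r between baseline and follow-up, percentage accuracy) can be illustrated with a minimal Python sketch. The rating lists below are hypothetical placeholders, not study data, and the function names are invented for illustration; the abstract does not state which statistical software or significance test the authors used. The count of 163 correct multiple-choice answers is an inference from the reported 98.2% of 166 questions, not a figure given in the abstract.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def summarize(scores):
    """Return (mean, sample standard deviation) for a list of ratings."""
    return mean(scores), stdev(scores)


# Hypothetical 6-point accuracy ratings for the same questions at two time points.
# These values are illustrative only and do not come from the study.
baseline = [5, 6, 4, 6, 5, 6, 5, 4]
follow_up = [6, 6, 5, 6, 5, 6, 6, 5]

base_mean, base_sd = summarize(baseline)
follow_mean, follow_sd = summarize(follow_up)
r = pearson_r(baseline, follow_up)

# Assumption: 163 of 166 correct answers is the count consistent with 98.2%.
mc_accuracy = round(100 * 163 / 166, 1)

print(f"baseline:  {base_mean:.2f} ± {base_sd:.2f}")
print(f"follow-up: {follow_mean:.2f} ± {follow_sd:.2f}")
print(f"Pearson r: {r:.3f}")
print(f"multiple-choice accuracy: {mc_accuracy}%")
```

A paired significance test (for example a Wilcoxon signed-rank test, which suits ordinal Likert ratings) would be the natural next step, but the choice of test here is a suggestion rather than the authors' reported method.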

Identifiers

DOI: 10.1007/s12020-024-04031-8 PMID: 39341951

pii: 10.1007/s12020-024-04031-8

Publication types

Journal Article

Languages

English

Authors

Tugba Gurbuz (T)

Oya Gokmen (O)

Belgin Devranoglu (B)

Arzu Yurci (A)

Asena Ayar Madenli (AA)