A Machine Learning Approach for the Prediction of Testicular Sperm Extraction in Nonobstructive Azoospermia: Algorithm Development and Validation Study.
azoospermia
biomedical informatics
infertile
infertility
machine learning
men's health
model
predict
prediction model
sperm
Journal
Journal of Medical Internet Research
ISSN: 1438-8871
Abbreviated title: J Med Internet Res
Country: Canada
NLM ID: 100959882
Publication information
Publication date:
21 06 2023
History:
received: 04 11 2022
accepted: 07 04 2023
revised: 19 02 2023
medline: 23 6 2023
pubmed: 21 6 2023
entrez: 21 6 2023
Status:
epublish
Abstract
Abstract sections
BACKGROUND
Testicular sperm extraction (TESE) is an essential therapeutic tool for the management of male infertility. However, it is an invasive procedure with a success rate of up to 50%. To date, no model based on clinical and laboratory parameters is sufficiently powerful to accurately predict the success of sperm retrieval in TESE.
OBJECTIVE
This study aims to compare a wide range of predictive models for TESE outcomes in patients with nonobstructive azoospermia (NOA) under similar conditions, in order to identify the correct mathematical approach to apply, the most appropriate study size, and the relevance of the input biomarkers.
METHODS
We analyzed 201 patients who underwent TESE at Tenon Hospital (Assistance Publique-Hôpitaux de Paris, Sorbonne University, Paris), divided into a retrospective training cohort of 175 patients (January 2012 to April 2021) and a prospective testing cohort of 26 patients (May 2021 to December 2021). Preoperative data (16 variables, according to the French standard exploration of male infertility), including urogenital history, hormonal data, and genetic data, were collected together with TESE outcomes (the target variable). A TESE was considered positive if we obtained sufficient spermatozoa for intracytoplasmic sperm injection. After preprocessing the raw data, 8 machine learning (ML) models were trained and optimized on the retrospective training cohort data set, with hyperparameter tuning performed by random search. Finally, the prospective testing cohort data set was used for model evaluation. The metrics used to evaluate and compare the models were sensitivity, specificity, area under the receiver operating characteristic curve (AUC-ROC), and accuracy. The importance of each variable in the model was assessed using the permutation feature importance technique, and the optimal number of patients to include in the study was assessed using the learning curve.
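The evaluation metrics and the permutation feature importance technique named above can be illustrated with a minimal standard-library sketch. This is not the authors' pipeline: the labels, scores, feature rows, and the `toy_score` function are hypothetical values chosen only to make the definitions concrete, and importance is measured here as the mean drop in AUC-ROC, which is one common choice rather than a detail confirmed by the abstract.

```python
import random

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive TESE)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def auc_roc(y_true, scores):
    """AUC-ROC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_importance(score_fn, X, y, n_repeats=20, seed=0):
    """Mean drop in AUC-ROC when one feature column is shuffled, for each feature."""
    rng = random.Random(seed)
    baseline = auc_roc(y, [score_fn(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the outcome
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - auc_roc(y, [score_fn(r) for r in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical toy data (not from the study): 3 positive and 3 negative cases.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)             # share of positive cases correctly predicted
specificity = tn / (tn + fp)             # share of negative cases correctly predicted
accuracy = (tp + tn) / len(y_true)       # share of all cases correctly predicted
auc = auc_roc(y_true, scores)

# Hypothetical two-feature rows; the toy model reads only feature 0.
X = [[5.0, 1.0], [4.0, 0.0], [3.5, 1.0], [1.0, 0.0], [0.5, 1.0], [0.2, 0.0]]

def toy_score(row):
    return row[0]

importances = permutation_importance(toy_score, X, y_true)
print(sensitivity, specificity, accuracy, auc, importances)
```

Because the toy model ignores its second feature, shuffling that column leaves the scores unchanged and its importance is exactly zero, which is the behavior the technique relies on to separate informative from uninformative variables.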
RESULTS
The ensemble models, based on decision trees, showed the best performance, especially the random forest model, which yielded the following results: AUC=0.90, sensitivity=100%, and specificity=69.2%. Furthermore, a study size of 120 patients seemed sufficient to properly exploit the preoperative data in the modeling process, since increasing the number of patients beyond 120 during model training did not bring any performance improvement. Finally, inhibin B and a history of varicoceles exhibited the highest predictive capacity.
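The reported specificity can be related back to patient counts with a short arithmetic check. The sketch below assumes that the metrics were computed on all 26 patients of the prospective testing cohort and that the percentage was rounded to one decimal place; neither assumption is stated in the abstract, and the helper name `consistent_counts` is hypothetical.

```python
def consistent_counts(reported_pct, max_total):
    """Return (correct, total) pairs whose percentage rounds to reported_pct at one decimal."""
    target = f"{reported_pct:.1f}"
    return [
        (k, n)
        for n in range(1, max_total + 1)
        for k in range(n + 1)
        if f"{100 * k / n:.1f}" == target
    ]

COHORT_SIZE = 26  # size of the prospective testing cohort given in the abstract

# Sensitivity is only defined if at least one patient had a positive TESE,
# so the number of negative cases can be at most COHORT_SIZE - 1.
specificity_counts = consistent_counts(69.2, COHORT_SIZE - 1)
print(specificity_counts)  # [(9, 13)]
```

Under these assumptions, the only match is 9 of 13 negative cases correctly predicted, which would leave 13 positive cases, all identified given the 100% sensitivity. This is an arithmetic inference, not a figure reported in the abstract.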
CONCLUSIONS
An ML algorithm based on an appropriate approach can predict successful sperm retrieval in men with NOA undergoing TESE, with promising performance. However, this study represents only a first step, and a formal prospective multicenter validation study should be undertaken before any clinical application. As future work, we will consider using recent and clinically relevant data sets (including seminal plasma biomarkers, especially noncoding RNAs, as markers of residual spermatogenesis in NOA patients) to further improve our results.
Identifiers
pubmed: 37342078
pii: v25i1e44047
doi: 10.2196/44047
pmc: PMC10337455
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
e44047
Copyright information
©Guillaume Bachelot, Ferdinand Dhombres, Nathalie Sermondade, Rahaf Haj Hamid, Isabelle Berthaut, Valentine Frydman, Marie Prades, Kamila Kolanska, Lise Selleret, Emmanuelle Mathieu-D’Argent, Diane Rivet-Danon, Rachel Levy, Antonin Lamazière, Charlotte Dupont. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 21.06.2023.