Precision medicine and machine learning towards the prediction of the outcome of potential celiac disease.
Journal
Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date:
11 03 2021
History:
received: 07 10 2020
accepted: 23 02 2021
entrez: 12 3 2021
pubmed: 13 3 2021
medline: 15 12 2021
Status:
epublish
Abstract
Patients with Potential Celiac Disease (PCD) bear the Celiac Disease (CD) genetic predisposition and show a significant production of anti-human transglutaminase antibodies, but no morphological changes in the small bowel mucosa. A minority of patients (17%) show clinical symptoms and need a gluten-free diet at the time of diagnosis, while the majority go on for several years (up to a decade) without any clinical problem or progression of small intestinal mucosal damage, even when they continue to consume gluten. We recently developed a traditional multivariate approach that predicts the natural history from the information available at enrolment (time 0) using a discriminant analysis model. However, the traditional multivariate model requires stringent assumptions that may not be met in the clinical setting. Starting from a follow-up dataset available for PCD, we propose applying Machine Learning (ML) methodologies to extend the analysis of the available clinical data and to detect the most influential features predicting the outcome. These features, collected at the time of diagnosis, should be able to distinguish patients who will develop duodenal atrophy from those who will remain potential. Four ML methods were adopted to select features predictive of the outcome; the feature selection procedure reduced the overall number of features from 85 to 19. The ML methodologies adopted (Random Forests, Extremely Randomized Trees, Boosted Trees, and Logistic Regression) all achieved an accuracy above 75%. Specificity was also above 75% for every method, exceeding 98% for two of them, while the best sensitivity was 60%. The best model, optimized Boosted Trees, was able to classify PCD starting from the selected 19 features with an accuracy of 0.80, a sensitivity of 0.58, and a specificity of 0.84.
With this work, we are able to use ML to identify the PCD patients who are more likely to develop overt CD. ML techniques appear to be an innovative approach to predicting the outcome of PCD: they are a step towards precision medicine, which aims to customize healthcare, medical therapies, decisions, and practices, tailoring the clinical management of children with PCD.
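The abstract reports accuracy, sensitivity, and specificity side by side. As a reading aid, the minimal sketch below shows how these three metrics are computed from a binary confusion matrix and how accuracy is a prevalence-weighted mix of the other two. It assumes that "positive" means progression to duodenal atrophy; the class `ConfusionCounts`, the function names, and the counts are hypothetical illustrations, not the study's code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion matrix; 'positive' is assumed to mean progression."""
    tp: int  # progressed, predicted to progress
    fn: int  # progressed, predicted to remain potential
    tn: int  # remained potential, predicted to remain potential
    fp: int  # remained potential, predicted to progress

    @property
    def total(self) -> int:
        return self.tp + self.fn + self.tn + self.fp


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all patients classified correctly."""
    return (c.tp + c.tn) / c.total


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of progressing patients that the model flags."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-progressing patients that the model clears."""
    return c.tn / (c.tn + c.fp)


def prevalence(c: ConfusionCounts) -> float:
    """Fraction of patients who actually progressed."""
    return (c.tp + c.fn) / c.total


# Hypothetical counts, chosen only to illustrate an imbalanced outcome.
example = ConfusionCounts(tp=6, fn=4, tn=45, fp=5)

p = prevalence(example)
# Accuracy equals the prevalence-weighted average of the two class-wise rates.
weighted = p * sensitivity(example) + (1 - p) * specificity(example)

print(f"accuracy={accuracy(example):.2f}")
print(f"sensitivity={sensitivity(example):.2f}")
print(f"specificity={specificity(example):.2f}")
print(f"prevalence-weighted mix={weighted:.2f}")
```

When progression is the rarer outcome, specificity dominates this weighted average, so a model can report high accuracy while its sensitivity stays modest.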
Identifiers
pubmed: 33707543
doi: 10.1038/s41598-021-84951-x
pii: 10.1038/s41598-021-84951-x
pmc: PMC7952550
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
5683