Machine learning-based health environmental-clinical risk scores in European children.
Journal
Communications medicine
ISSN: 2730-664X
Abbreviated title: Commun Med (Lond)
Country: England
NLM ID: 9918250414506676
Publication information
Publication date:
23 May 2024
History:
received: 31 Mar 2023
accepted: 26 Apr 2024
medline: 24 May 2024
pubmed: 24 May 2024
entrez: 23 May 2024
Status:
epublish
Abstract
BACKGROUND
Early life environmental stressors play an important role in the development of multiple chronic disorders. Previous studies that used environmental risk scores (ERS) to assess the cumulative impact of environmental exposures on health are limited in the diversity of exposures they include, especially for early life determinants. We used machine learning methods to build early life exposome risk scores for three health outcomes using environmental, molecular, and clinical data.
METHODS
In this study, we analyzed data from 1622 mother-child pairs from the HELIX European birth cohorts, using over 300 environmental, 100 child peripheral, and 18 mother-child clinical markers to compute environmental-clinical risk scores (ECRS) for child behavioral difficulties, metabolic syndrome, and lung function. ECRS were computed using LASSO, Random Forest and XGBoost. XGBoost ECRS were selected to extract local feature contributions using Shapley values and derive feature importance and interactions.
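As a rough illustration of the "local feature contributions" mentioned above, the sketch below computes exact Shapley values for one prediction of a toy model by enumerating every feature coalition. Everything in it is hypothetical: the three feature names, the `risk_model` function with its interaction term, and the two-row background sample are invented for illustration and are not the study's data or model. Brute-force enumeration is only practical for a handful of features; for tree ensembles such as XGBoost, faster tree-specific algorithms exist.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, invented for illustration only.
FEATURES = ["maternal_stress", "noise", "child_bmi"]


def risk_model(x):
    """Toy risk score with one interaction term (not the study's model)."""
    return (2.0 * x["maternal_stress"]
            + 1.0 * x["noise"]
            + 0.5 * x["noise"] * x["child_bmi"])


def coalition_value(model, x, background, coalition):
    """Mean prediction when features in `coalition` take the values of x
    and every other feature takes its value from each background row."""
    total = 0.0
    for row in background:
        mixed = {f: (x[f] if f in coalition else row[f]) for f in FEATURES}
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values: weighted average of each feature's marginal
    contribution over all coalitions of the remaining features."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        contrib = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_f = coalition_value(model, x, background, set(subset) | {f})
                without_f = coalition_value(model, x, background, set(subset))
                contrib += weight * (with_f - without_f)
        phi[f] = contrib
    return phi


# Hypothetical background sample and one hypothetical child to explain.
background = [
    {"maternal_stress": 0.0, "noise": 0.0, "child_bmi": 0.0},
    {"maternal_stress": 1.0, "noise": 2.0, "child_bmi": 1.0},
]
child = {"maternal_stress": 3.0, "noise": 1.0, "child_bmi": 2.0}

phi = shapley_values(risk_model, child, background)
baseline = coalition_value(risk_model, child, background, set())
prediction = risk_model(child)

for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
print(f"baseline + contributions = {baseline + sum(phi.values()):.3f}")
print(f"model prediction         = {prediction:.3f}")
```

The contributions add up to the difference between this child's prediction and the average prediction over the background sample, which is the property that lets a risk score be decomposed feature by feature for one individual.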
RESULTS
ECRS captured 13%, 50% and 4% of the variance in mental, cardiometabolic, and respiratory health, respectively. We observed no significant differences in predictive performance between the above-mentioned methods. The most important predictive features were maternal stress, noise, and lifestyle exposures for mental health; proteome (mainly IL1B) and metabolome features for cardiometabolic health; child BMI and urine metabolites for respiratory health.
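A statement such as "captured 13% of the variance" is commonly read as a coefficient of determination (R²) between the risk score and the observed outcome. The abstract does not say which estimator or validation scheme was used, so the sketch below is only an assumption about the general idea, with made-up numbers.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up outcome values and risk scores, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
risk_score = [2.0, 2.5, 3.0, 3.5, 4.0]

r2 = r_squared(observed, risk_score)
print(f"variance captured: {r2:.0%}")

# A score that always predicts the mean outcome captures none of the variance.
r2_mean_only = r_squared(observed, [3.0] * len(observed))
print(f"mean-only score:   {r2_mean_only:.0%}")
```

Under this reading, a value of 0.50 would mean the score accounts for half of the outcome's variability across children, while 0.04 would mean it barely improves on predicting the average.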
CONCLUSIONS
Besides their usefulness for epidemiological research, our risk scores show great potential to capture holistic, individual-level, non-hereditary risk associations that can inform practitioners about actionable factors in high-risk children. As personalized preventive medicine in the post-genetic era will focus more and more on modifiable factors, we believe that such integrative approaches will be instrumental in shaping future healthcare paradigms.
Other abstracts
Type: plain-language-summary
(eng)
Growing up in different environments can greatly affect children’s health later in life. This research looked at how living in cities, being exposed to chemicals, and other experiences before birth and during childhood work together to influence children’s mental, cardiovascular and respiratory health. We used advanced computer programs to help us understand these effects and estimate health risk scores. These scores are simple numerical measures that help us quantify the likelihood of children developing health issues based on their environmental exposures. Using those scores, the study identified key factors impacting children’s health, in particular psycho-social, perceived environmental and prenatal pollutant exposures for mental health. It also revealed complex patterns and interactions between environmental factors. The results highlighted the potential of such risk scores to support the identification of actionable factors in high-risk children, informing tailored prevention measures in healthcare.
Identifiers
pubmed: 38783062
doi: 10.1038/s43856-024-00513-y
pii: 10.1038/s43856-024-00513-y
Publication types
Journal Article
Languages
eng
Pagination
98
Copyright information
© 2024. The Author(s).