A review of geospatial exposure models and approaches for health data integration.
Environmental public health
Exposome
Exposure modeling
Linkage
Spatiotemporal
Toxicology
Journal
Journal of exposure science & environmental epidemiology
ISSN: 1559-064X
Abbreviated title: J Expo Sci Environ Epidemiol
Country: United States
NLM ID: 101262796
Publication information
Publication date:
06 Sep 2024
History:
received: 02 Feb 2024
accepted: 05 Aug 2024
revised: 01 Aug 2024
medline: 10 Sep 2024
pubmed: 10 Sep 2024
entrez: 09 Sep 2024
Status:
ahead of print
Abstract
Geospatial methods are common in environmental exposure assessments and increasingly integrated with health data to generate comprehensive models of environmental impacts on public health. Our objective is to review geospatial exposure models and approaches for health data integration in environmental health applications. We conduct a literature review and synthesis. First, we discuss key concepts and terminology for geospatial exposure data and models. Second, we provide an overview of workflows in geospatial exposure model development and health data integration. Third, we review modeling approaches, including proximity-based, statistical, and mechanistic approaches, across diverse exposure types, such as air quality, water quality, climate, and socioeconomic factors. For each model type, we provide descriptions, general equations, and example applications for environmental exposure assessment. Fourth, we discuss the approaches used to integrate geospatial exposure data and health data, such as methods to link data sources with disparate spatial and temporal scales. Fifth, we describe the landscape of open-source tools supporting these workflows.
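As an illustration of the simplest model class named in the abstract, the following is a minimal sketch of a proximity-based exposure metric. It is not taken from the review itself. The function names (`haversine_km`, `proximity_metrics`), the 1 km default buffer, and the use of great-circle distance are all assumptions chosen for the example. A real analysis would more likely use a GIS library and a projected coordinate system.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (assumed spherical Earth)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def proximity_metrics(home, sources, buffer_km=1.0):
    """Hypothetical helper: two common proximity metrics for one residence.

    home      -- (lat, lon) of the residence, in degrees
    sources   -- non-empty list of (lat, lon) emission sources, in degrees
    buffer_km -- buffer radius for the source count (illustrative default)
    """
    if not sources:
        raise ValueError("at least one source location is required")
    distances = [haversine_km(home[0], home[1], s[0], s[1]) for s in sources]
    return {
        "nearest_km": min(distances),                              # distance to nearest source
        "count_in_buffer": sum(d <= buffer_km for d in distances),  # sources inside the buffer
    }


if __name__ == "__main__":
    # Illustrative coordinates only, not real monitoring or residence data.
    print(proximity_metrics((0.0, 0.0), [(0.0, 0.005), (0.0, 1.0)]))
```

Metrics like these take no account of emission strength, meteorology, or time, which is one reason the statistical and mechanistic approaches listed in the abstract are treated as separate model classes.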
Identifiers
pubmed: 39251872
doi: 10.1038/s41370-024-00712-8
pii: 10.1038/s41370-024-00712-8
Publication types
Journal Article
Review
Languages
eng
Citation subsets
IM
Copyright information
© 2024. This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply.