Use of genetic correlations to examine selection bias.

Keywords: correlation; covariance; selection bias

Journal

Genetic Epidemiology

ISSN: 1098-2272

Abbreviated title: Genet Epidemiol

Country: United States

NLM ID: 8411723

Publication information

Publication date:
30 Jul 2024

History:

received: 04 04 2023

revised: 13 07 2024

accepted: 17 07 2024

medline: 31 7 2024

pubmed: 31 7 2024

entrez: 31 7 2024

Status: ahead of print

Abstract

Observational studies are rarely representative of their target population because there are known and unknown factors that affect an individual's choice to participate (the selection mechanism). Selection can cause bias in a given analysis if the outcome is related to selection (conditional on the other variables in the model). Detecting and adjusting for selection bias in practice typically requires access to data on nonselected individuals. Here, we propose methods to detect selection bias in genetic studies by comparing correlations among genetic variants in the selected sample to those expected under no selection. We examine the use of four hypothesis tests to identify induced associations between genetic variants in the selected sample. We evaluate these approaches in Monte Carlo simulations. Finally, we use these approaches in an applied example using data from the UK Biobank (UKBB). The proposed tests suggested an association between alcohol consumption and selection into UKBB. Hence, UKBB analyses with alcohol consumption as the exposure or outcome may be biased by this selection.
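The core idea in the abstract, that selection on a trait can induce correlation between genetic variants that are independent in the target population, can be illustrated with a toy simulation. The sketch below is not one of the four tests evaluated in the paper: it uses a simple Fisher z-statistic for a single pair of variants, and every name and parameter value (allele frequencies, effect sizes, the logistic selection model, the sample size) is an assumption chosen only for illustration.

```python
import numpy as np


def fisher_z(x, y):
    """Approximate z-statistic for H0: corr(x, y) = 0 (Fisher transformation)."""
    r = np.corrcoef(x, y)[0, 1]
    return float(np.arctanh(r) * np.sqrt(len(x) - 3))


def simulate(n=100_000, seed=1):
    """Simulate two independent variants and trait-dependent selection.

    All values here are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    g1 = rng.binomial(2, 0.3, size=n)  # genotype 0/1/2, allele frequency 0.3
    g2 = rng.binomial(2, 0.4, size=n)  # independent of g1 in the population
    trait = g1 + g2 + rng.normal(size=n)  # both variants raise the trait
    p_select = 1.0 / (1.0 + np.exp(-(trait - 1.5)))  # selection depends on trait
    selected = rng.random(n) < p_select
    return g1, g2, selected


g1, g2, selected = simulate()
z_full = fisher_z(g1, g2)                      # whole population
z_selected = fisher_z(g1[selected], g2[selected])  # selected sample only
print(f"proportion selected: {selected.mean():.2f}")
print(f"z (full population): {z_full:.2f}")
print(f"z (selected sample): {z_selected:.2f}")
```

In this setup the two variants are uncorrelated in the full population, while conditioning on selection induces a negative correlation between them, so a z-statistic far from zero in the selected sample flags the trait as related to selection. The Fisher transformation is only approximate for discrete genotypes, which is acceptable for a demonstration at this sample size.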

Identifiers

DOI: 10.1002/gepi.22584

PMID: 39080969

Publication types

Journal Article

Languages

eng

Grants

Agency: Medical Research Council

ID: MC_UU_00011/1, MC_UU_00011/3

Country: United Kingdom

Authors

Chin Yang Shapland (CY)

Apostolos Gkatzionis (A)

Gibran Hemani (G)

Kate Tilling (K)
