A flexible quasi-likelihood model for microbiome abundance count data.
heteroscedasticity
skewness
spline
zero-inflation
Journal
Statistics in medicine
ISSN: 1097-0258
Abbreviated title: Stat Med
Country: England
NLM ID: 8215016
Publication information
Publication date:
10 Nov 2023
History:
revised: 28 Jul 2023
received: 4 Jan 2023
accepted: 1 Aug 2023
medline: 7 Dec 2023
pubmed: 23 Aug 2023
entrez: 22 Aug 2023
Status:
ppublish
Abstract
In this article, we present a flexible model for microbiome count data. We consider a quasi-likelihood framework, in which we do not make any assumptions on the distribution of the microbiome count except that its variance is an unknown but smooth function of the mean. By comparing our model to the negative binomial generalized linear model (GLM) and Poisson GLM in simulation studies, we show that our flexible quasi-likelihood method yields valid inferential results. Using a real microbiome study, we demonstrate the utility of our method by examining the relationship between adenomas and microbiota. We also provide an R package "fql" for the application of our method.
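The idea of a variance that is an unknown but smooth function of the mean can be illustrated with a minimal sketch. It is not the authors' estimator and does not reproduce the interface of the "fql" package. It assumes a log link, an intercept in the first column of the design matrix, and a crude binned smoother standing in for a spline; the function name `fit_flexible_ql` and all of its parameters are hypothetical.

```python
import numpy as np


def fit_flexible_ql(X, y, n_outer=5, n_inner=50, n_bins=10, tol=1e-8):
    """Illustrative quasi-likelihood fit with a data-driven variance function.

    Assumptions (not taken from the article): log link, intercept in
    column 0 of X, and a binned estimate of Var(Y) as a function of the mean.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean() + 1e-8)  # start at the intercept-only fit
    centers = values = None            # variance function not yet estimated

    for _ in range(n_outer):
        # Step 1: solve the quasi-score equations for beta, variance held fixed.
        for _ in range(n_inner):
            mu = np.exp(X @ beta)
            # First pass uses the Poisson-type variance V(mu) = mu.
            V = mu if centers is None else np.interp(mu, centers, values)
            D = X * mu[:, None]                    # d mu / d beta under a log link
            score = D.T @ ((y - mu) / V)           # quasi-score vector
            info = D.T @ (D / V[:, None])          # quasi-information matrix
            step = np.linalg.solve(info, score)
            beta = beta + step
            if np.max(np.abs(step)) < tol:
                break

        # Step 2: re-estimate the variance function from squared residuals,
        # averaged within bins of observations ordered by fitted mean.
        mu = np.exp(X @ beta)
        groups = np.array_split(np.argsort(mu), n_bins)
        centers = np.array([mu[g].mean() for g in groups])
        values = np.array([np.mean((y[g] - mu[g]) ** 2) for g in groups])
        values = np.maximum(values, 1e-8)          # keep the variance positive

    return beta, centers, values


if __name__ == "__main__":
    # Simulated overdispersed counts (negative binomial), for illustration only.
    rng = np.random.default_rng(0)
    n = 2000
    x = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])
    mu_true = np.exp(1.0 + 0.5 * x)
    y = rng.negative_binomial(2.0, 2.0 / (2.0 + mu_true))
    beta_hat, centers, values = fit_flexible_ql(X, y)
    print("estimated coefficients:", beta_hat)
```

The sketch alternates between solving the quasi-score equations for the regression coefficients and re-estimating the variance function, so no count distribution is ever specified. The binned smoother expects a continuous covariate, so that the fitted means are distinct across bins.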
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
4632-4643
Grants
Agency: NCATS NIH HHS
ID: NIH UL1 TR002345
Country: United States
Copyright information
© 2023 John Wiley & Sons Ltd.