Functional Principal Component Analysis as an Alternative to Mixed-Effect Models for Describing Sparse Repeated Measures in Presence of Missing Data.
functional principal component analysis
missing data
mixed models
sparse functional data
Journal
Statistics in medicine
ISSN: 1097-0258
Abbreviated title: Stat Med
Country: England
NLM ID: 8215016
Publication information
Publication date:
09 Sep 2024
History:
revised: 21 Aug 2024
received: 15 Mar 2024
accepted: 22 Aug 2024
medline: 09 Sep 2024
pubmed: 09 Sep 2024
entrez: 09 Sep 2024
Status:
aheadofprint
Abstract
Analyzing longitudinal data in health studies is challenging because of sparse and error-prone measurements, strong within-individual correlation, missing data, and varied trajectory shapes. While mixed-effect models (MM) effectively address these challenges, they remain parametric and can be computationally costly. In contrast, functional principal component analysis (FPCA) is a non-parametric approach, developed for regular and dense functional data, that flexibly describes temporal trajectories at a potentially lower computational cost. This article presents an empirical simulation study that evaluates the behavior of FPCA with sparse and error-prone repeated measures, and its robustness under different missing data schemes, in comparison with MM. The results show that FPCA is well suited to data missing at random because of dropout, except in scenarios with the most frequent and systematic dropout. Like MM, FPCA fails under a missing-not-at-random mechanism. FPCA was then applied to describe the trajectories of four cognitive functions before clinical dementia and to contrast them with those of matched controls in a case-control study nested in a population-based aging cohort. The average cognitive declines of future dementia cases diverged suddenly from those of their matched controls, with a sharp acceleration 5 to 2.5 years before diagnosis.
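To make the setting in the abstract concrete, the following Python sketch simulates error-prone repeated measures with monotone missing-at-random dropout and then runs a crude FPCA on the visit grid. It is an illustration only, not the simulation design or the estimator used in the article: the function names (`simulate_with_mar_dropout`, `grid_fpca`), the linear trajectory model, the dropout rule, and all numerical values are assumptions, and the covariance step uses a simple pairwise-complete estimate with no smoothing and no measurement-error correction.

```python
import numpy as np


def simulate_with_mar_dropout(n=300, times=None, noise_sd=0.5, seed=0):
    """Simulate noisy linear trajectories, then apply monotone MAR dropout.

    All numerical settings are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    if times is None:
        times = np.linspace(0.0, 10.0, 6)
    # Individual-specific intercept and slope (assumed values).
    intercept = rng.normal(10.0, 1.0, size=(n, 1))
    slope = rng.normal(-0.3, 0.2, size=(n, 1))
    noise = rng.normal(0.0, noise_sd, size=(n, times.size))
    y = intercept + slope * times + noise

    observed = np.ones(y.shape, dtype=bool)
    for j in range(1, times.size):
        # MAR: for individuals still in the study, the dropout probability
        # at visit j depends only on the value observed at visit j - 1
        # (a lower previous value means a higher chance of dropping out).
        p_drop = 1.0 / (1.0 + np.exp(y[:, j - 1] - 8.0))
        drops = rng.random(n) < p_drop
        # Monotone dropout: once absent, an individual stays absent.
        observed[:, j] = observed[:, j - 1] & ~drops
    return times, np.where(observed, y, np.nan), observed


def grid_fpca(y_obs, n_components=2):
    """Crude FPCA on a common grid from pairwise-complete observations."""
    mean = np.nanmean(y_obs, axis=0)
    centered = y_obs - mean
    present = (~np.isnan(centered)).astype(float)
    filled = np.nan_to_num(centered, nan=0.0)
    # Each covariance entry averages products over the individuals
    # observed at both visits.
    counts = present.T @ present
    cov = (filled.T @ filled) / np.maximum(counts, 1.0)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvals[order], eigvecs[:, order]


times, y_obs, observed = simulate_with_mar_dropout()
mean, eigvals, eigvecs = grid_fpca(y_obs)
print("share still observed per visit:", observed.mean(axis=0).round(2))
print("leading eigenvalues:", eigvals.round(3))
```

Comparing the estimated mean and components against those obtained from the complete data `y` (before masking) is one simple way to see how a given dropout rule affects the estimates in this toy setting.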
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: France 2030 Program / RRI PHDS
Copyright information
© 2024 The Author(s). Statistics in Medicine published by John Wiley & Sons Ltd.