PCovR2: A flexible principal covariates regression approach to parsimoniously handle multiple criterion variables.

Dimension reduction; Multiple criteria; PLS2; Principal covariates regression; Regression models

Journal

Behavior research methods
ISSN: 1554-3528
Abbreviated title: Behav Res Methods
Country: United States
NLM ID: 101244316

Publication information

Publication date:
August 2021
History:
accepted: 2 November 2020
pubmed: 10 January 2021
medline: 7 September 2021
entrez: 9 January 2021
Status: ppublish

Abstract

Principal covariates regression (PCovR) allows one to deal with the interpretational and technical problems associated with running ordinary regression using many predictor variables. In PCovR, the predictor variables are reduced to a limited number of components, and simultaneously, criterion variables are regressed on these components. By means of a weighting parameter, users can flexibly choose how much emphasis to place on reconstructing the predictors versus predicting the criteria. However, when datasets contain many criterion variables, PCovR users face new interpretational problems, because many regression weights will be obtained and because some criteria might be unrelated to the predictors. We therefore propose PCovR2, which extends PCovR by also reducing the criteria to a few components. These criterion components are predicted based on the predictor components. The PCovR2 weighting parameter can again be flexibly used to focus on the reconstruction of the predictors and criteria, or on filtering out relevant predictor components and predictable criterion components. We compare PCovR2 to two other approaches, based on partial least squares (PLS) and principal components regression (PCR), that also reduce the criteria and are therefore called PLS2 and PCR2. By means of a simulated example, we show that PCovR2 outperforms PLS2 and PCR2 when one aims to recover all relevant predictor components and predictable criterion components. Moreover, we conduct a simulation study to evaluate how well PCovR2, PLS2 and PCR2 succeed in finding (1) all underlying components and (2) the subset of relevant predictor and predictable criterion components. Finally, we illustrate the use of PCovR2 by means of empirical data.
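
As an illustration of the weighting idea described in the abstract, the following is a minimal sketch of the original PCovR (not PCovR2, whose loss function is not stated in this record). It assumes column-centered data and uses the commonly described closed-form solution, in which the component scores are the leading eigenvectors of a weighted sum of the predictor cross-product matrix and the cross-product matrix of the criteria projected onto the predictor space. The function name `pcovr`, its arguments, and the return values are illustrative choices, not the interface of any published package.

```python
import numpy as np

def pcovr(X, Y, n_components, alpha):
    """Sketch of closed-form PCovR (assumption: X and Y are column-centered).

    alpha = 1 puts all weight on reconstructing X (PCA-like solution);
    alpha near 0 puts almost all weight on predicting Y.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    X_pinv = np.linalg.pinv(X)
    # Part of Y that lies in the column space of X.
    Y_hat = X @ (X_pinv @ Y)
    # Weighted sum of the two cross-product matrices, each scaled by total variance.
    G = (alpha * (X @ X.T) / np.sum(X ** 2)
         + (1.0 - alpha) * (Y_hat @ Y_hat.T) / np.sum(Y ** 2))
    eigvals, eigvecs = np.linalg.eigh(G)
    order = np.argsort(eigvals)[::-1][:n_components]
    T = eigvecs[:, order]   # orthonormal component scores
    W = X_pinv @ T          # component weights, so that X @ W approximates T
    P_x = T.T @ X           # loadings used to reconstruct X
    P_y = T.T @ Y           # regression weights of the criteria on the components
    return T, W, P_x, P_y

# Usage on simulated data (purely illustrative).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
X -= X.mean(axis=0)
Y = X[:, :2] @ rng.normal(size=(2, 3)) + 0.1 * rng.normal(size=(50, 3))
Y -= Y.mean(axis=0)
T, W, P_x, P_y = pcovr(X, Y, n_components=2, alpha=0.5)
```

Varying `alpha` between 0 and 1 in this sketch shifts the components from those that best predict the criteria toward those that best reconstruct the predictors.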

Identifiers

pubmed: 33420716
doi: 10.3758/s13428-020-01508-y
pii: 10.3758/s13428-020-01508-y

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

1648-1668

Copyright information

© 2021. The Psychonomic Society, Inc.

Authors

Sopiko Gvaladze (S)

KU Leuven, Tiensestraat 102, 3000, Leuven, Belgium.

Marlies Vervloet (M)

KU Leuven, Tiensestraat 102, 3000, Leuven, Belgium.

Katrijn Van Deun (K)

Department of Methodology and Statistics, Tilburg University, Tilburg, The Netherlands.

Henk A L Kiers (HAL)

Department of Psychology, University of Groningen, Groningen, The Netherlands.

Eva Ceulemans (E)

KU Leuven, Tiensestraat 102, 3000, Leuven, Belgium. eva.ceulemans@kuleuven.be.
