On Support Recovery with Sparse CCA: Information Theoretic and Computational Limits.
Canonical Correlation Analysis
High Dimension
Low Degree Polynomials
Support Recovery
Variable Selection
Journal
IEEE transactions on information theory
ISSN: 0018-9448
Titre abrégé: IEEE Trans Inf Theory
Pays: United States
ID NLM: 9880925
Informations de publication
Date de publication:
Mar 2023
Mar 2023
Historique:
pmc-release:
01
03
2024
medline:
16
10
2023
pubmed:
16
10
2023
entrez:
16
10
2023
Statut:
ppublish
Résumé
In this paper, we consider asymptotically exact support recovery in the context of high dimensional and sparse Canonical Correlation Analysis (CCA). Our main results describe four regimes of interest based on information theoretic and computational considerations. In regimes of "low" sparsity we describe a simple, general, and computationally easy method for support recovery, whereas in a regime of "high" sparsity, it turns out that support recovery is information theoretically impossible. For the sake of information theoretic lower bounds, our results also demonstrate a non-trivial requirement on the "minimal" size of the nonzero elements of the canonical vectors that is required for asymptotically consistent support recovery. Subsequently, the regime of "moderate" sparsity is further divided into two subregimes. In the lower of the two sparsity regimes, we show that polynomial time support recovery is possible by using a sharp analysis of a co-ordinate thresholding [1] type method. In contrast, in the higher end of the moderate sparsity regime, appealing to the "Low Degree Polynomial" Conjecture [2], we provide evidence that polynomial time support recovery methods are inconsistent. Finally, we carry out numerical experiments to compare the efficacy of various methods discussed.
Identifiants
pubmed: 37842015
doi: 10.1109/tit.2022.3214201
pmc: PMC10569110
mid: NIHMS1875481
doi:
Types de publication
Journal Article
Langues
eng
Pagination
1695-1738Subventions
Organisme : NIEHS NIH HHS
ID : P42 ES030990
Pays : United States
Références
J Am Stat Assoc. 2009 Jun 1;104(486):682-693
pubmed: 20617121
BMC Bioinformatics. 2009 Jan 26;10:34
pubmed: 19171069
Stat Sin. 2021 Jan;31(1):333-360
pubmed: 35046630
Biometrics. 2019 Sep;75(3):734-744
pubmed: 30714093
Ann Appl Stat. 2013 Mar 1;7(1):523-542
pubmed: 23745156
Neuroimage. 2003 Jul;19(3):837-45
pubmed: 12880812
PLoS One. 2022 Dec 30;17(12):e0276886
pubmed: 36584096
J Neural Eng. 2009 Aug;6(4):046002
pubmed: 19494422
Biostatistics. 2009 Jul;10(3):515-34
pubmed: 19377034
Neural Comput. 2004 Dec;16(12):2639-64
pubmed: 15516276
Neuroimage. 2010 Apr 15;50(3):1004-16
pubmed: 20083207
Stat Appl Genet Mol Biol. 2008;7(1):Article3
pubmed: 18241193