Partition: a surjective mapping approach for dimensionality reduction.
Journal
Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944
Publication information
Publication date: 1 February 2020
History:
received: 14 March 2019
revised: 22 May 2019
accepted: 20 August 2019
pubmed: 11 September 2019
medline: 18 September 2020
entrez: 11 September 2019
Status: ppublish
Abstract
Large amounts of information generated by genomic technologies are accompanied by statistical and computational challenges due to redundancy, badly behaved data and noise. Dimensionality reduction (DR) methods have been developed to mitigate these challenges. However, many approaches are not scalable to large dimensions or result in excessive information loss. The proposed approach partitions data into subsets of related features and summarizes each subset into exactly one new feature, thus defining a surjective mapping from original to reduced features. A constraint on information loss determines the size of the reduced dataset. Simulation studies demonstrate that when multiple related features are associated with a response, this approach can substantially increase the number of true associations detected as compared to principal components analysis, non-negative matrix factorization or no DR. This increase in true discoveries is explained both by a reduced multiple-testing challenge and a reduction in extraneous noise. In an application to real data collected from metastatic colorectal cancer tumors, more associations between gene expression features and progression-free survival and response to treatment were detected in the reduced dataset than in the full untransformed dataset. The method is available as a free R package from CRAN, https://cran.r-project.org/package=partition. Supplementary data are available at Bioinformatics online.
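The partition-and-summarize idea described in the abstract can be illustrated with a small sketch. The code below is a toy illustration in Python, not the algorithm or the API of the `partition` R package: the greedy merging strategy, the row-mean summary, the squared-correlation measure of retained information, the example data and all function names are assumptions chosen for illustration only.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def summarize(columns):
    """Summarize a group of features by their row-wise mean (an assumed summary)."""
    return [mean(values) for values in zip(*columns)]


def retained(columns):
    """Assumed information measure: the smallest squared correlation
    between the group summary and any member of the group."""
    summary = summarize(columns)
    return min(pearson(summary, col) ** 2 for col in columns)


def partition_features(features, threshold):
    """Greedily merge groups of features while every merged group still
    retains at least `threshold` information. Returns (reduced, mapping)."""
    clusters = [[name] for name in features]
    while True:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                merged = clusters[i] + clusters[j]
                score = retained([features[n] for n in merged])
                if score >= threshold and (best is None or score > best[0]):
                    best = (score, i, j)
        if best is None:
            break  # no remaining merge satisfies the constraint
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    reduced, mapping = {}, {}
    for members in clusters:
        new_name = "+".join(members)
        reduced[new_name] = summarize([features[n] for n in members])
        for n in members:
            mapping[n] = new_name  # each original feature maps to exactly one new feature
    return reduced, mapping


# Hypothetical data: g1 and g2 are nearly redundant, g3 is unrelated.
data = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [1.1, 2.1, 2.9, 4.2, 4.8],
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
reduced, mapping = partition_features(data, threshold=0.9)
print(mapping)
```

In this sketch every original feature is assigned to exactly one reduced feature and every reduced feature has at least one original feature mapped to it, which is what makes the mapping surjective. Raising `threshold` permits fewer merges and therefore yields a larger reduced dataset.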
Identifiers
pubmed: 31504178
pii: 5554652
doi: 10.1093/bioinformatics/btz661
pmc: PMC8215926
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
676-681
Grants
Agency: NIA NIH HHS
ID: P01 AG055367
Country: United States
Agency: NCI NIH HHS
ID: P01 CA196569
Country: United States
Agency: NCI NIH HHS
ID: P30 CA014089
Country: United States
Agency: NHLBI NIH HHS
ID: R01 HL118455
Country: United States
Copyright information
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.