Partition: a surjective mapping approach for dimensionality reduction.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
2020-02-01
History:
received: 2019-03-14
revised: 2019-05-22
accepted: 2019-08-20
pubmed: 2019-09-11
medline: 2020-09-18
entrez: 2019-09-11
Status: ppublish

Abstract

Large amounts of information generated by genomic technologies are accompanied by statistical and computational challenges due to redundancy, badly behaved data and noise. Dimensionality reduction (DR) methods have been developed to mitigate these challenges. However, many approaches do not scale to large dimensions or result in excessive information loss. The proposed approach partitions data into subsets of related features and summarizes each subset into exactly one new feature, thus defining a surjective mapping. A constraint on information loss determines the size of the reduced dataset. Simulation studies demonstrate that when multiple related features are associated with a response, this approach can substantially increase the number of true associations detected compared with principal components analysis, non-negative matrix factorization or no DR. This increase in true discoveries is explained both by a reduced multiple-testing burden and by a reduction in extraneous noise. In an application to real data collected from metastatic colorectal cancer tumors, more associations between gene expression features and progression-free survival and response to treatment were detected in the reduced dataset than in the full untransformed dataset. The method is implemented in a freely available R package on CRAN: https://cran.r-project.org/package=partition. Supplementary data are available at Bioinformatics online.
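
The partition-and-summarize idea described in the abstract can be illustrated with a minimal sketch. This is not the algorithm of the R package. It assumes a greedy merge of the most correlated subsets, a row-wise mean of z-scores as the summary, and the smallest squared Pearson correlation between a member and its summary as a stand-in for the information-loss constraint. All function names, the toy data and the threshold values are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def standardize(values):
    """Center to mean 0 and scale to unit population standard deviation."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def summarize(columns):
    """Summarize a subset of features as the row-wise mean of z-scores."""
    z = [standardize(c) for c in columns]
    return [mean(row) for row in zip(*z)]


def retained_information(columns):
    """Assumed metric: smallest squared correlation of a member with the summary."""
    if len(columns) == 1:
        return 1.0
    summary = summarize(columns)
    return min(pearson(c, summary) ** 2 for c in columns)


def partition_features(data, threshold):
    """Greedily merge positively correlated subsets while the merged subset
    keeps at least `threshold` retained information."""
    subsets = [[name] for name in data]
    rejected = set()  # pairs of subsets whose merge violated the constraint
    while True:
        candidates = []
        for a, b in combinations(range(len(subsets)), 2):
            key = frozenset((tuple(subsets[a]), tuple(subsets[b])))
            if key in rejected:
                continue
            r = pearson(summarize([data[n] for n in subsets[a]]),
                        summarize([data[n] for n in subsets[b]]))
            if r > 0:  # only positively related subsets are merge candidates
                candidates.append((r, a, b, key))
        if not candidates:
            break
        _, a, b, key = max(candidates, key=lambda c: c[0])
        merged = subsets[a] + subsets[b]
        if retained_information([data[n] for n in merged]) >= threshold:
            subsets = [s for i, s in enumerate(subsets) if i not in (a, b)]
            subsets.append(merged)
        else:
            rejected.add(key)
    reduced = {f"reduced_{i}": summarize([data[n] for n in s])
               for i, s in enumerate(subsets)}
    # Each original feature maps to exactly one reduced feature, and every
    # reduced feature has at least one original feature: a surjective mapping.
    mapping = {n: f"reduced_{i}" for i, s in enumerate(subsets) for n in s}
    return reduced, mapping


# Toy data: x1 and x2 are nearly collinear, x3 is only weakly related to them.
data = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [1.1, 1.9, 3.2, 3.8, 5.1, 6.0],
    "x3": [0, 2, -1, 3, 0, 2],
}
reduced, mapping = partition_features(data, threshold=0.8)
print(mapping)
```

In this sketch, raising the threshold tolerates less information loss and therefore keeps more reduced features, while lowering it allows more aggressive merging. That mirrors the abstract's statement that the constraint on information loss determines the size of the reduced dataset.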

Identifiers

pubmed: 31504178
pii: 5554652
doi: 10.1093/bioinformatics/btz661
pmc: PMC8215926

Publication types

Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

676-681

Grants

Agency: NIA NIH HHS
ID: P01 AG055367
Country: United States
Agency: NCI NIH HHS
ID: P01 CA196569
Country: United States
Agency: NCI NIH HHS
ID: P30 CA014089
Country: United States
Agency: NHLBI NIH HHS
ID: R01 HL118455
Country: United States

Copyright information

© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Authors

Joshua Millstein (J)

Department of Preventive Medicine, CA 90033, USA.

Francesca Battaglin (F)

Department of Medicine, Division of Medical Oncology, Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA.
Clinical and Experimental Oncology Department, Medical Oncology Unit 1, Veneto Institute of Oncology IOV-IRCCS, Padua 35128, Italy.

Malcolm Barrett (M)

Department of Preventive Medicine, CA 90033, USA.

Shu Cao (S)

Department of Preventive Medicine, CA 90033, USA.

Wu Zhang (W)

Department of Medicine, Division of Medical Oncology, Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA.

Sebastian Stintzing (S)

Medical Department, Division of Oncology and Hematology, Charité Universitaetsmedizin Berlin, Berlin 10117, Germany.

Volker Heinemann (V)

Department of Medicine III, University Hospital Munich, Munich 80336, Germany.

Heinz-Josef Lenz (HJ)

Department of Medicine, Division of Medical Oncology, Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA 90033, USA.
