A network approach for low dimensional signatures from high throughput data.
Journal
Scientific Reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date:
23 12 2022
History:
received: 14 06 2022
accepted: 30 11 2022
entrez: 23 12 2022
pubmed: 24 12 2022
medline: 28 12 2022
Status:
epublish
Abstract
One of the main objectives of high-throughput genomics studies is to obtain a low-dimensional set of observables (a signature) for sample classification purposes (diagnosis, prognosis, stratification). Biological data, such as gene or protein expression, are commonly characterized by up/down regulation behavior, for which discriminant-based methods can perform with high accuracy and easy interpretability. To get the most out of these methods, feature selection is even more critical, but it is known to be an NP-hard problem, and thus most feature selection approaches focus on one feature at a time (k-best, Sequential Feature Selection, recursive feature elimination). We propose DNetPRO, Discriminant Analysis with Network PROcessing, a supervised network-based signature identification method. This method implements a network-based heuristic to generate one or more signatures out of the best-performing feature pairs. The algorithm is easily scalable, allowing efficient computation for a high number of observables ([Formula: see text]-[Formula: see text]). We show applications on real high-throughput genomic datasets in which our method outperforms existing results, or is compatible with them but with a smaller number of selected features. Moreover, the geometrical simplicity of the resulting class-separation surfaces allows a clearer interpretation of the obtained signatures in comparison to nonlinear classification models.
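The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible reading of "signatures out of the best performing feature pairs". It is not the DNetPRO implementation. The assumptions are: a nearest-centroid rule stands in for the discriminant classifier, leave-one-out accuracy is the pair score, the `top_k` best pairs become edges of a feature network, and each connected component is taken as a candidate signature. The function names, the `top_k` parameter, and the toy data are all hypothetical.

```python
from itertools import combinations


def centroid(rows):
    """Mean point of a list of equal-length numeric tuples."""
    n = len(rows)
    return tuple(sum(col) / n for col in zip(*rows))


def loo_accuracy(points, labels):
    """Leave-one-out accuracy of a nearest-centroid rule.

    Assumption: nearest-centroid is a simple stand-in for the
    discriminant classifier mentioned in the abstract.
    """
    hits = 0
    for i, (p, y) in enumerate(zip(points, labels)):
        # Group every sample except the held-out one by class.
        by_class = {}
        for j, (q, c) in enumerate(zip(points, labels)):
            if j != i:
                by_class.setdefault(c, []).append(q)
        cents = {c: centroid(rows) for c, rows in by_class.items()}
        # Predict the class whose centroid is closest (squared distance).
        pred = min(
            cents,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, cents[c])),
        )
        hits += pred == y
    return hits / len(points)


def pair_signatures(X, y, top_k):
    """Score all feature pairs, keep the top_k, return connected components."""
    scored = []
    for f, g in combinations(range(len(X[0])), 2):
        pts = [(row[f], row[g]) for row in X]
        scored.append((loo_accuracy(pts, y), f, g))
    # Stable sort: ties keep their enumeration order.
    scored.sort(key=lambda t: -t[0])

    # Build an undirected feature network from the best pairs.
    adj = {}
    for _, f, g in scored[:top_k]:
        adj.setdefault(f, set()).add(g)
        adj.setdefault(g, set()).add(f)

    # Each connected component is one candidate signature.
    seen, components = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        components.append(sorted(comp))
    return components


# Hypothetical toy data: features 0 and 1 separate the classes, 2 and 3 are noise.
X = [
    [0.0, 0.1, 5.0, 1.0],
    [0.2, 0.0, 1.0, 4.0],
    [0.1, 0.2, 3.0, 2.0],
    [1.0, 0.9, 4.0, 1.5],
    [0.9, 1.1, 2.0, 3.5],
    [1.1, 1.0, 3.5, 2.5],
]
y = [0, 0, 0, 1, 1, 1]

print(pair_signatures(X, y, top_k=1))  # [[0, 1]]
```

Scoring pairs costs one classifier evaluation per pair, so the work grows quadratically with the number of features; the pair scores are independent of each other, which is what makes this kind of approach straightforward to parallelize.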
Identifiers
pubmed: 36564421
doi: 10.1038/s41598-022-25549-9
pii: 10.1038/s41598-022-25549-9
pmc: PMC9789141
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
22253
Copyright information
© 2022. The Author(s).