Multiblock variable influence on orthogonal projections (MB-VIOP) for enhanced interpretation of total, global, local and unique variations in OnPLS models.

Data Analysis; Data Mining; Multivariate Analysis; Reproducibility of Results

Feature selection; Latent variable interpretation; MB-VIOP; Multiblock variable selection; OnPLS; VIP; Variable importance in multiblock regression; Variable influence on projection

Journal

BMC bioinformatics

ISSN: 1471-2105

Abbreviated title: BMC Bioinformatics

Country: England

NLM ID: 100965194

Publication information

Publication date:
03 Apr 2021

History:

received: 16 Jul 2020

accepted: 10 Feb 2021

entrez: 04 Apr 2021

pubmed: 05 Apr 2021

medline: 10 Apr 2021

Status: epublish

Abstract

BACKGROUND

For multivariate data analysis involving only two input matrices (e.g., X and Y), the previously published methods for variable influence on projection (e.g., VIP

RESULTS

This paper presents multiblock variable influence on orthogonal projections (MB-VIOP), a method for variable selection in multiblock analysis. MB-VIOP is a model-based variable selection method that uses the data matrices, the scores and the normalized loadings of an OnPLS model to rank the input variables of more than two data matrices according to their importance, both for simplification and interpretation of the total multiblock model and for the unique, local and global model components separately. MB-VIOP has been tested on three datasets: a synthetic four-block dataset, a real three-block omics dataset related to plant sciences, and a real six-block dataset related to the food industry.

CONCLUSIONS

We provide evidence for the usefulness and reliability of MB-VIOP by means of three examples (one synthetic and two real-world cases). MB-VIOP assesses, reliably and efficiently, the importance of both isolated variables and ranges of variables in any type of data. MB-VIOP connects the input variables of different data matrices according to their relevance for the interpretation of each latent variable, yielding enhanced interpretability for each OnPLS model component. In addition, MB-VIOP can deal with strong overlap between types of variation, as well as with many data blocks of very different dimensionality. The ability of MB-VIOP to generate dimensionality-reduced models with high interpretability makes this method ideal for big data mining, multi-omics data integration and any study that requires exploration and interpretation of large streams of data.
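As a point of reference for the variable-ranking idea described in the abstract, the sketch below computes the classic two-block PLS VIP score. It is not MB-VIOP itself, whose formulas are not given in this record. The function name `vip_scores`, the argument layout and the toy numbers are assumptions made for illustration only.

```python
import numpy as np


def vip_scores(weights, scores, y_loadings):
    """Classic PLS VIP for a two-block model (illustrative, not MB-VIOP).

    weights    : (n_variables, n_components) X-weights W
    scores     : (n_samples, n_components)   X-scores T
    y_loadings : (n_responses, n_components) Y-loadings Q
    """
    W = np.asarray(weights, dtype=float)
    T = np.asarray(scores, dtype=float)
    Q = np.asarray(y_loadings, dtype=float)
    n_vars = W.shape[0]

    # Sum of squares of Y explained by each component: (t_a' t_a) * (q_a' q_a)
    ss = np.sum(T ** 2, axis=0) * np.sum(Q ** 2, axis=0)

    # Scale every weight vector to unit length before squaring
    W_norm = W / np.linalg.norm(W, axis=0, keepdims=True)

    # Weighted average of squared normalized weights, rescaled by n_vars
    return np.sqrt(n_vars * (W_norm ** 2 @ ss) / ss.sum())


# Hypothetical toy model: 3 variables, 2 components, 1 response
W = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
T = np.array([[1.0, 1.0], [1.0, 0.0], [1.0, 0.0]])
Q = np.array([[1.0, 1.0]])

vip = vip_scores(W, T, Q)
ranking = np.argsort(vip)[::-1]  # variable indices, most influential first
```

With this normalization the squared VIP values sum to the number of variables, so values above 1 are commonly read as above-average influence. According to the abstract, MB-VIOP extends this kind of ranking to more than two blocks and to the unique, local and global components of an OnPLS model.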

Identifiers

DOI: 10.1186/s12859-021-04015-9

PMID: 33812384

PMC: PMC8019512

PII: 10.1186/s12859-021-04015-9

Publication types

Journal Article

Languages

eng

Pagination

176

Authors

Beatriz Galindo-Prieto (B)

Paul Geladi (P)

Johan Trygg (J)
