OptiMissP: A dashboard to assess missingness in proteomic data-independent acquisition mass spectrometry.
Journal
PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081
Publication information
Publication date: 2021
History:
received: 2020-10-09
accepted: 2021-03-24
entrez: 2021-04-15
pubmed: 2021-04-16
medline: 2021-10-05
Status: epublish
Abstract
Missing values are a key issue in the statistical analysis of proteomic data. Defining the strategy to address missing values is a complex task in each study, potentially affecting the quality of statistical analyses. We have developed OptiMissP, a dashboard to visually and qualitatively evaluate missingness and guide decision-making in the handling of missing values in proteomics studies that use data-independent acquisition mass spectrometry. It provides a set of visual tools to retrieve information about missingness through protein densities and topology-based approaches, and facilitates exploration of different imputation methods and missingness thresholds. OptiMissP supports researchers' and clinicians' qualitative assessment of missingness in proteomic datasets in order to define study-specific strategies for the handling of missing values. OptiMissP considers biases in protein distributions related to the choice of imputation method and helps analysts balance the information loss caused by low missingness thresholds against the noise introduced by high missingness thresholds. This is complemented by topological data analysis, which provides additional insight into the structure of the data and their missingness. We use an example in chronic kidney disease to illustrate the main functionalities of OptiMissP.
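The trade-off between missingness thresholds and imputation described in the abstract can be illustrated with a minimal sketch. This is not OptiMissP's implementation: the function names, the toy protein table, and the reading of "missingness threshold" as the maximum allowed fraction of missing values per protein are all assumptions made for illustration, and median imputation stands in for whichever imputation method an analyst would actually compare.

```python
import math
from statistics import median


def is_missing(value):
    """Treat None and NaN as missing (an assumed encoding of missing values)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_fraction(values):
    """Fraction of missing entries for one protein; expects a non-empty list."""
    return sum(1 for v in values if is_missing(v)) / len(values)


def filter_by_threshold(table, threshold):
    """Keep proteins whose missing fraction does not exceed the threshold."""
    return {
        protein: values
        for protein, values in table.items()
        if missing_fraction(values) <= threshold
    }


def impute_median(values):
    """Replace missing entries with the median of the observed ones."""
    observed = [v for v in values if not is_missing(v)]
    if not observed:
        # Nothing to estimate from; return the values unchanged.
        return list(values)
    fill = median(observed)
    return [fill if is_missing(v) else v for v in values]


# Hypothetical toy data: protein -> intensities across four samples.
table = {
    "P1": [10.0, 11.0, 12.0, 13.0],  # no missing values
    "P2": [9.0, None, 9.5, 10.0],    # 25% missing
    "P3": [None, None, 8.0, None],   # 75% missing
}

for threshold in (0.0, 0.3, 0.8):
    kept = filter_by_threshold(table, threshold)
    imputed = {protein: impute_median(values) for protein, values in kept.items()}
    print(threshold, sorted(imputed))
```

A low threshold discards proteins such as the hypothetical P2 and P3 (information loss), whereas a high threshold retains P3, whose values would then be mostly imputed (added noise).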
Identifiers
pubmed: 33857200
doi: 10.1371/journal.pone.0249771
pii: PONE-D-20-31782
pmc: PMC8049317
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e0249771
Grants
Agency: Medical Research Council
ID: MR/R013942/1
Country: United Kingdom
Agency: Medical Research Council
ID: MR/N00583X/1
Country: United Kingdom
Agency: Medical Research Council
ID: MR/M008959/1
Country: United Kingdom
Agency: Department of Health
Country: United Kingdom
Agency: Cancer Research UK
ID: C5759/A25254
Country: United Kingdom
Conflict of interest statement
The authors have declared that no competing interests exist.