Clinical biomarker discovery by SWATH-MS based label-free quantitative proteomics: impact of criteria for identification of differentiators and data normalization method.


Journal

Journal of translational medicine
ISSN: 1479-5876
Abbreviated title: J Transl Med
Country: England
NLM ID: 101190741

Publication information

Publication date:
2019-05-31
History:
received: 2019-03-28
accepted: 2019-05-24
entrez: 2019-06-02
pubmed: 2019-06-04
medline: 2020-07-15
Status: epublish

Abstract

BACKGROUND
SWATH-MS has emerged as the strategy of choice for biomarker discovery because of the proteome coverage it achieves during acquisition and the ability to re-interrogate the data afterwards. However, in quantitative analysis using SWATH, each sample from the comparison groups is run individually in the mass spectrometer, and the resulting inter-run variation may influence relative quantification and the identification of biomarkers. Normalizing the data to diminish this variation is therefore an essential step in SWATH data processing. In most reported studies, the normalization methods used are those provided in instrument-based data analysis software or those used for microarray data. This study provides, for the first time, experimental evidence for selecting the normalization method that is optimal for biomarker identification.

METHODS
The efficiency of 12 normalization methods in normalizing SWATH-MS data was evaluated using the statistical criteria in 'Normalyzer', a tool that provides a comparative evaluation of normalization by different methods. Further, the suitability of the normalized data for biomarker discovery was assessed by evaluating the clustering efficiency of differentiators identified from the normalized data on the basis of p-value, fold change, or both, using hierarchical clustering in Genesis software v.1.8.1.

RESULTS
Conventional statistical criteria identified VSN-G as the optimal method for normalizing SWATH data. However, differentiators identified from VSN-G-normalized data failed to segregate the test and control groups. We therefore assessed data normalized by the 11 other methods for their ability to yield differentiators that segregate the study groups. In our datasets, differentiators identified on the basis of p-value from data normalized with Loess-R stratified the study groups optimally.

CONCLUSION
This is the first report of an experimentally tested strategy for SWATH-MS data processing with an emphasis on identifying clinically relevant biomarkers. Normalization of SWATH-MS data by the Loess-R method and identification of differentiators based on p-value were found to be optimal for biomarker discovery in this study. The study also demonstrates the need to base the choice of normalization method on the intended application of the data.
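
The three selection criteria compared in the abstract (p-value, fold change, or both) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not state the statistical test or the cut-offs used, so an exact permutation test and the thresholds `p_cut=0.05` and `lfc_cut=1.0` are stand-ins. Median centring replaces Loess-R and VSN-G, which are not reimplemented here. The intensities are made-up toy numbers, not data from the study, and the hierarchical clustering step is not covered.

```python
from itertools import combinations
from statistics import mean, median


def median_center(runs):
    """Shift each run (list of log2 intensities) so all runs share one median.
    A simple stand-in for the normalization methods named in the abstract."""
    medians = [median(run) for run in runs]
    target = mean(medians)
    return [[x - m + target for x in run] for run, m in zip(runs, medians)]


def permutation_p(test, control):
    """Exact two-sided permutation p-value for a difference in group means
    (an assumed test; the abstract does not name the one used)."""
    pooled = test + control
    observed = abs(mean(test) - mean(control))
    n = len(pooled)
    hits = total = 0
    for idx in combinations(range(n), len(test)):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(n) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            hits += 1
    return hits / total


def differentiators(runs, proteins, n_test, criterion="p",
                    p_cut=0.05, lfc_cut=1.0):
    """Select proteins by p-value ('p'), fold change ('fc'), or both ('both').
    The first n_test runs form the test group; the rest are controls.
    Both cut-offs are hypothetical defaults."""
    selected = []
    for j, name in enumerate(proteins):
        values = [run[j] for run in runs]
        test, control = values[:n_test], values[n_test:]
        by_p = permutation_p(test, control) <= p_cut
        by_fc = abs(mean(test) - mean(control)) >= lfc_cut  # log2 fold change
        keep = {"p": by_p, "fc": by_fc, "both": by_p and by_fc}[criterion]
        if keep:
            selected.append(name)
    return selected


# Toy log2 intensities (rows = runs, columns = proteins); 4 test + 4 control.
proteins = ["P1", "P2", "P3", "P4", "P5"]
base = [
    [12.1, 12.0, 14.0, 16.5, 18.0],
    [11.9, 12.2, 14.0, 16.4, 17.8],
    [12.0, 11.8, 14.0, 16.6, 18.2],
    [12.2, 12.1, 14.0, 16.5, 18.1],
    [10.0, 12.1, 14.0, 16.0, 18.1],
    [10.1, 11.9, 14.0, 16.1, 17.9],
    [9.9, 12.2, 14.0, 15.9, 18.2],
    [10.0, 11.8, 14.0, 16.0, 17.8],
]
# Per-run offsets mimic inter-run variation in the mass spectrometer.
offsets = [0.5, -0.3, 0.2, 0.0, 0.4, -0.2, 0.1, -0.4]
raw = [[x + off for x in row] for row, off in zip(base, offsets)]

normalized = median_center(raw)
by_p = differentiators(normalized, proteins, n_test=4, criterion="p")
by_fc = differentiators(normalized, proteins, n_test=4, criterion="fc")
by_both = differentiators(normalized, proteins, n_test=4, criterion="both")
print(by_p, by_fc, by_both)
```

On this toy data, the p-value criterion keeps both the large change (P1) and the small but consistent change (P4), while the fold-change and combined criteria keep only P1. This shows how the choice of criterion changes the differentiator list that is passed on to clustering.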


Identifiers

pubmed: 31151397
doi: 10.1186/s12967-019-1937-9
pii: 10.1186/s12967-019-1937-9
pmc: PMC6545036

Chemical substances

Biomarkers 0
Peptide Fragments 0
Proteome 0

Publication types

Evaluation Study
Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

184

Authors

Mythreyi Narasimhan (M)

Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, 410210, India.
BARC Training School Complex, Homi Bhabha National Institute, Anushakti Nagar, Mumbai, 400094, India.

Sadhana Kannan (S)

Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, 410210, India.

Aakash Chawade (A)

Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden.

Atanu Bhattacharjee (A)

Section of Biostatistics, Centre for Cancer Epidemiology, Tata Memorial Centre, Kharghar, Navi Mumbai, 410210, India.

Rukmini Govekar (R)

Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, 410210, India. rgovekar@actrec.gov.in.
BARC Training School Complex, Homi Bhabha National Institute, Anushakti Nagar, Mumbai, 400094, India. rgovekar@actrec.gov.in.
