Simplifying MS1 and MS2 spectra to achieve lower mass error, more dynamic range, and higher peptide identification confidence on the Bruker timsTOF Pro.
Journal
PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081
Publication information
Publication date: 2022
History:
received: 2022-06-10
accepted: 2022-06-19
entrez: 2022-07-07
pubmed: 2022-07-08
medline: 2022-07-12
Status: epublish
Abstract
For bottom-up proteomic analysis, the goal of analytical pipelines that process the raw output of mass spectrometers is to detect, characterise, identify, and quantify peptides. The initial steps of detecting and characterising features in raw data must overcome considerable challenges. The data present as a sparse array, sometimes containing billions of intensity readings over time. These points represent both signal and chemical or electrical noise. Depending on the biological sample's complexity, tens to hundreds of thousands of peptides may be present in this vast data landscape. For ion mobility-based LC-MS analysis, each peptide comprises a group of hundreds of individual intensity readings in three dimensions: mass-over-charge (m/z), mobility, and retention time. There is no inherent information about associations between individual points; whether they represent a peptide or noise must be inferred from their structure. Each peptide has multiple isotopes and can occur in different charge states, and peptide intensities span a dynamic range of more than six orders of magnitude. Because most biological samples are highly complex, peptides often overlap in time and mobility. This makes it very difficult to tease apart isotopic peaks, to apportion intensity to each, and to determine how much each isotope contributes to the estimate of the peptide's monoisotopic mass, which is critical for the peptide's identification. Here we describe four algorithms for the Bruker timsTOF Pro that each play an important role in finding peptide features and determining their characteristics. Each algorithm addresses a separate aspect of how candidate features are detected in the raw data. The first two algorithms deal with the complexity of the raw data, rapidly clustering raw data into spectra that allow isotopic peaks to be resolved.
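As an illustration of what clustering raw points into a spectrum can look like, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical table of raw points with columns `mz`, `scan`, `rt`, and `intensity`, and a hypothetical fixed bin width `MZ_BIN_WIDTH`. Points are grouped into m/z bins, and each bin becomes one peak at the intensity-weighted mean m/z. Two helper functions then use well-known relations: isotopic peaks of a peptide with charge z are spaced roughly 1.003355/z apart in m/z, and the neutral monoisotopic mass is (m/z minus the proton mass) times z.

```python
import numpy as np
import pandas as pd

# Hypothetical bin width in m/z units; a real pipeline would tune this
# to the instrument's resolution and handle peaks split across bin edges.
MZ_BIN_WIDTH = 0.01
PROTON_MASS = 1.007276    # Da
C13_C12_DELTA = 1.003355  # Da, approximate spacing between isotopic peaks


def collapse_to_spectrum(points: pd.DataFrame,
                         bin_width: float = MZ_BIN_WIDTH) -> pd.DataFrame:
    """Collapse raw points into one centroided spectrum.

    Each fixed-width m/z bin yields one peak: its m/z is the
    intensity-weighted mean of the points in the bin, and its
    intensity is their sum. Intensities are assumed to be positive.
    """
    df = points.copy()
    df["mz_bin"] = np.floor(df["mz"] / bin_width).astype(np.int64)
    df["weighted_mz"] = df["mz"] * df["intensity"]
    grouped = df.groupby("mz_bin", sort=True).agg(
        weighted_mz=("weighted_mz", "sum"),
        intensity=("intensity", "sum"),
    )
    grouped["mz"] = grouped["weighted_mz"] / grouped["intensity"]
    return grouped[["mz", "intensity"]].reset_index(drop=True)


def estimate_charge(mz_spacing: float) -> int:
    """Estimate the charge state from the m/z spacing of adjacent isotopic peaks."""
    return int(round(C13_C12_DELTA / mz_spacing))


def neutral_monoisotopic_mass(mono_mz: float, charge: int) -> float:
    """Convert a monoisotopic m/z and charge into a neutral mass in Da."""
    return (mono_mz - PROTON_MASS) * charge
```

Fixed-width binning is chosen here only for brevity. It ignores the mobility and retention-time dimensions, which a real feature detector would also use to separate overlapping peptides.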
The third algorithm compensates for saturation of the instrument's detector, thereby recovering lost dynamic range, and the fourth algorithm increases confidence in peptide identifications by simplifying the fragment spectra. These algorithms are effective in processing raw data to detect features and extract the attributes required for peptide identification. They make an important contribution to an analytical pipeline by detecting features that are of higher quality and better segmented from other peptides in close proximity. The software has been developed in Python using NumPy and pandas and is freely available under an open-source MIT license to facilitate experimentation and further improvement (DOI 10.5281/zenodo.6513126). Data are available via ProteomeXchange with identifier PXD030706.
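The abstract does not say how saturation is compensated, so the following is only a sketch of one plausible approach under stated assumptions, not the authors' algorithm. It assumes the caller supplies the observed isotope intensities (monoisotopic peak first), a set of expected relative abundances for the same isotopes (for example from a theoretical isotope model), and a hypothetical `saturation_threshold`. Saturated peaks are re-estimated from the most intense unsaturated peak scaled by the expected ratio. The function name and parameters are illustrative.

```python
import numpy as np


def correct_saturated_isotopes(intensities, expected_ratios, saturation_threshold):
    """Re-estimate saturated isotope peaks from an unsaturated reference peak.

    intensities: observed isotope intensities, monoisotopic peak first.
    expected_ratios: expected relative abundances of the same isotopes
        (any common scale, all assumed to be positive).
    saturation_threshold: hypothetical intensity at or above which a
        peak is treated as saturated.
    """
    observed = np.asarray(intensities, dtype=float)
    expected = np.asarray(expected_ratios, dtype=float)
    if observed.shape != expected.shape:
        raise ValueError("intensities and expected_ratios must have the same length")

    saturated = observed >= saturation_threshold
    # Nothing to fix, or no unsaturated peak available to use as a reference.
    if not saturated.any() or saturated.all():
        return observed.copy()

    # Use the most intense unsaturated peak as the reference.
    reference = int(np.argmax(np.where(saturated, -np.inf, observed)))
    scale = observed[reference] / expected[reference]

    corrected = observed.copy()
    # Never lower a reading: keep the observed value if it is already larger.
    corrected[saturated] = np.maximum(observed[saturated], scale * expected[saturated])
    return corrected
```

For example, with observed intensities `[3000, 2400, 1000]`, expected ratios `[1.0, 0.6, 0.25]`, and a threshold of 3000, the monoisotopic peak is re-estimated as 2400 / 0.6 × 1.0, roughly 4000, while the unsaturated peaks are left unchanged.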
Identifiers
pubmed: 35797390
doi: 10.1371/journal.pone.0271025
pii: PONE-D-22-16781
pmc: PMC9262215
Chemical substances
Isotopes
0
Peptides
0
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
e0271025
Conflict of interest statement
The authors have declared that no competing interests exist.