Intensity and retention time prediction improves the rescoring of protein-nucleic acid cross-links.
fragment peak intensities
protein‐RNA cross‐linking mass spectrometry
rescoring
retention time
transfer learning
Journal
Proteomics
ISSN: 1615-9861
Abbreviated title: Proteomics
Country: Germany
NLM ID: 101092707
Publication information
Publication date:
Apr 2024
History:
revised: 29 Dec 2023
received: 27 May 2023
accepted: 5 Jan 2024
medline: 17 Apr 2024
pubmed: 17 Apr 2024
entrez: 17 Apr 2024
Status:
ppublish
Abstract
In protein-RNA cross-linking mass spectrometry, UV or chemical cross-linking introduces stable bonds between amino acids and nucleic acids in protein-RNA complexes, which are then analyzed and detected in mass spectra. This analytical tool delivers valuable information about RNA-protein interactions and RNA docking sites in proteins, both in vitro and in vivo. Identifying peptides cross-linked to oligonucleotides of different lengths leads to a combinatorial increase in search space. We demonstrate that peptide retention time prediction can be transferred to the task of cross-linked peptide retention time prediction using a simple amino acid composition encoding, yielding improved identification rates when the prediction error is included in rescoring. For the more challenging task of including fragment intensity prediction of cross-linked peptides in the rescoring, we obtain, on average, a similar improvement. Further improvements in the encoding and in the fine-tuning of retention time and intensity prediction models might lead to further gains and merit further research.
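As a rough illustration of the two ideas named in the abstract, the sketch below encodes a peptide by its amino acid composition and turns a retention time prediction error into a rescoring feature. It is not the authors' implementation: the function names, the fixed residue order, and the use of the absolute error are assumptions made for this example, and the cross-linked oligonucleotide is left out because the abstract does not say how it is encoded.

```python
from collections import Counter

# Assumption: the 20 standard residues in a fixed alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_encoding(peptide: str) -> list[int]:
    """Count each standard amino acid in the peptide.

    The encoding ignores residue order; characters outside the
    20 standard residues are not counted.
    """
    counts = Counter(peptide.upper())
    return [counts[aa] for aa in AMINO_ACIDS]


def rt_error_feature(observed_rt: float, predicted_rt: float) -> float:
    """Absolute retention time prediction error (hypothetical rescoring feature)."""
    return abs(observed_rt - predicted_rt)


if __name__ == "__main__":
    # Hypothetical peptide and retention times, for illustration only.
    print(composition_encoding("PEPTIDEK"))
    print(rt_error_feature(observed_rt=42.5, predicted_rt=40.0))
```

In a rescoring setting, a feature such as `rt_error_feature` would be added to the other features of each candidate identification before a semi-supervised rescoring step; the details of that step are outside this sketch.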
Identifiers
pubmed: 38629965
doi: 10.1002/pmic.202300144
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
e2300144
Grants
Agency: European Union's Horizon 2020 research and innovation program
Agency: Marie Skłodowska-Curie
ID: 956148
Agency: Ministry of Science, Research and Arts Baden-Württemberg
Agency: Deutsche Forschungsgemeinschaft
ID: SFB 860
Agency: Deutsche Forschungsgemeinschaft
ID: SFB 1565
Agency: Deutsche Forschungsgemeinschaft
ID: SPP 1935
Agency: Research Foundation Flanders (FWO)
ID: 12B7123N
Agency: Research Foundation Flanders (FWO)
ID: G010023N
Agency: Research Foundation Flanders (FWO)
ID: G028821N
Agency: Research Foundation Flanders (FWO)
ID: 1SE3722
Agency: Vlaams Agentschap Innoveren en Ondernemen
ID: HBC.2020.2205
Agency: European Union's Horizon 2020 Programme
ID: H2020-INFRAIA-2018-1
Agency: European Union's Horizon 2020 Programme
ID: 823839
Agency: Ghent University Concerted Research Action
ID: BOF21/GOA/033
Copyright information
© 2024 The Authors. PROTEOMICS published by Wiley‐VCH GmbH.