PredNTS: Improved and Robust Prediction of Nitrotyrosine Sites by Integrating Multiple Sequence Features.
RFE feature selection
feature encoding
machine learning
nitrotyrosine
post-translational modification
Journal
International journal of molecular sciences
ISSN: 1422-0067
Abbreviated title: Int J Mol Sci
Country: Switzerland
NLM ID: 101092791
Publication information
Publication date:
08 Mar 2021
History:
received: 21 Jan 2021
revised: 02 Mar 2021
accepted: 03 Mar 2021
entrez: 03 Apr 2021
pubmed: 04 Apr 2021
medline: 23 Apr 2021
Status:
epublish
Abstract
Nitrotyrosine, which is generated by numerous reactive nitrogen species, is a type of protein post-translational modification. Identifying site-specific nitration of tyrosine residues is a prerequisite to understanding the molecular function of nitrated proteins. Thanks to progress in machine learning, computational prediction can play a vital role ahead of biological experimentation. Herein, we developed a computational predictor, PredNTS, by integrating multiple sequence features, including K-mer, composition of k-spaced amino acid pairs (CKSAAP), AAindex, and binary encoding schemes. The important features were selected by recursive feature elimination using a random forest (RF) classifier. Finally, we linearly combined the RF probability scores generated by the separate RF models, each trained on a single encoding. The resultant PredNTS predictor achieved an area under the curve (AUC) of 0.910 in five-fold cross-validation. It outperformed the existing predictors on a comprehensive and independent dataset. Furthermore, we investigated several machine learning algorithms to demonstrate the superiority of the employed RF algorithm. PredNTS is a useful computational resource for the prediction of nitrotyrosine sites. The web application, together with the curated datasets of PredNTS, is publicly available.
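As an illustration of two of the encodings named in the abstract and of the final score combination, the following is a minimal sketch, not the authors' implementation. The function names (`cksaap`, `binary_encoding`, `combine_scores`), the `k_max` and `weights` parameters, and the normalization by the weight sum are all assumptions made for this example; the record does not state the window length, the gap range, or the weights actually used. The K-mer and AAindex encodings, the recursive feature elimination step, and RF training are omitted.

```python
from itertools import product

# The 20 standard amino acids, in a fixed (alphabetical) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cksaap(window, k_max=2):
    """Composition of k-spaced amino acid pairs for gaps k = 0..k_max.

    Returns 400 * (k_max + 1) pair frequencies. The k_max default is an
    illustrative assumption, not a value taken from the paper.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        # Pair residue i with the residue k positions further along.
        for i in range(len(window) - k - 1):
            pair = window[i] + window[i + k + 1]
            if pair in counts:  # skip pairs containing non-standard residues
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features


def binary_encoding(window):
    """One-hot encode each residue; non-standard residues become all zeros."""
    features = []
    for residue in window:
        features.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return features


def combine_scores(scores, weights):
    """Weighted linear combination of per-encoding probability scores.

    Dividing by the weight sum keeps the result in [0, 1]; this
    normalization is an assumption of the sketch.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)
```

In use, each encoding of a tyrosine-centered sequence window would feed its own classifier, and the per-model probabilities would then be passed to `combine_scores` with weights chosen on the training data.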
Identifiers
pubmed: 33800121
pii: ijms22052704
doi: 10.3390/ijms22052704
pmc: PMC7962192
Chemical substances
Proteins
0
3-nitrotyrosine
3604-79-3
Tyrosine
42HK56048U
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: Grant-in-Aid for Scientific Research (B)
ID: 19H04208
Agency: Japan Society for the Promotion of Science (JSPS)
ID: 19F19377