PredPSP: a novel computational tool to discover pathway-specific photosynthetic proteins in plants.
Bioinformatics
Computational biology
Deep learning
Machine learning
Photosynthesis
Prediction server
Journal
Plant molecular biology
ISSN: 1573-5028
Abbreviated title: Plant Mol Biol
Country: Netherlands
NLM ID: 9106343
Publication information
Publication date:
24 Sep 2024
History:
received: 16 Feb 2024
accepted: 04 Sep 2024
medline: 24 Sep 2024
pubmed: 24 Sep 2024
entrez: 24 Sep 2024
Status:
epublish
Abstract
Photosynthetic proteins play a crucial role in agricultural productivity by harnessing light energy for plant growth. Understanding these proteins, especially within C
Identifiers
pubmed: 39316155
doi: 10.1007/s11103-024-01500-6
pii: 10.1007/s11103-024-01500-6
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
106
Copyright information
© 2024. The Author(s), under exclusive licence to Springer Nature B.V.