The conservation of gene models can support genome annotation.
Journal
The plant genome
ISSN: 1940-3372
Abbreviated title: Plant Genome
Country: United States
NLM ID: 101273919
Publication information
Publication date: September 2023
History:
revised: 19 July 2023
received: 16 March 2023
accepted: 24 July 2023
medline: 13 September 2023
pubmed: 21 August 2023
entrez: 21 August 2023
Status: ppublish
Abstract
Many genome annotations include false-positive gene models, leading to errors in phylogenetic and comparative studies. Here, we propose a method to support gene model prediction based on evolutionary conservation and use it to identify potentially erroneous annotations. Using this method, we developed a set of 15,345 representative gene models from 12 legume assemblies that can be used to support genome annotations for other legumes.
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e20377
Copyright information
© 2023 The Authors. The Plant Genome published by Wiley Periodicals LLC on behalf of Crop Science Society of America.