Near-gapless genome assemblies of Williams 82 and Lee cultivars for accelerating global soybean research.
Journal
The Plant Genome
ISSN: 1940-3372
Abbreviated title: Plant Genome
Country: United States
NLM ID: 101273919
Publication information
Publication date:
25 Sep 2023
History:
received: 17 May 2023
revised: 1 Aug 2023
accepted: 3 Aug 2023
medline: 26 Sep 2023
pubmed: 26 Sep 2023
entrez: 26 Sep 2023
Status:
ahead of print
Abstract
Complete, gapless telomere-to-telomere chromosome assemblies are a prerequisite for comprehensively investigating the architecture of complex regions, such as centromeres and telomeres, and for removing uncertainties in the order, spacing, and orientation of genes. Using complementary genomics technologies and assembly algorithms, we developed highly contiguous, nearly gapless genome assemblies for two economically important soybean [Glycine max (L.) Merr.] cultivars, Williams 82 and Lee. The centromeres were distinctly annotated on all chromosomes of both assemblies. We further found that the canonical telomeric repeats were present at the telomeres of all chromosomes in both the Williams 82 and Lee genomes. Ten chromosomes in Williams 82 and eight in Lee were reconstructed entirely as single contigs without any gaps. Using a combination of ab initio prediction, protein homology, and transcriptome evidence, we identified 58,287 and 56,725 protein-coding genes in Williams 82 and Lee, respectively. The genome assemblies and annotations will serve as a valuable resource for studying soybean genomics and genetics and for accelerating soybean improvement.
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
e20382
Grants
Organization: Missouri Agricultural Experimental Station at University of Missouri
Copyright information
© 2023 The Authors. The Plant Genome published by Wiley Periodicals LLC on behalf of Crop Science Society of America.