Construction and comparison of three reference-quality genome assemblies for soybean.
Alleles
Centromere / genetics
Disease Resistance / genetics
Fabaceae / genetics
Genetic Variation
Genetics, Population
Genome, Plant
Genotype
Haplotypes
Hardness
Multigene Family
Phylogeny
Polymorphism, Single Nucleotide
Quantitative Trait Loci
Repetitive Sequences, Nucleic Acid
Seed Bank
/ classification
Sequence Inversion
Telomere / genetics
Glycine max
Glycine soja
comparative genomics
domestication
genome assembly
soybean
Journal
The Plant journal : for cell and molecular biology
ISSN: 1365-313X
Abbreviated title: Plant J
Country: England
NLM ID: 9207397
Publication information
Publication date: 12 2019
History:
received: 27 12 2018
revised: 10 07 2019
accepted: 17 07 2019
pubmed: 23 8 2019
medline: 31 7 2020
entrez: 22 8 2019
Status:
ppublish
Abstract
We report reference-quality genome assemblies and annotations for two accessions of soybean (Glycine max) and for one accession of Glycine soja, the closest wild relative of G. max. The G. max assemblies are for widely used US cultivars: the northern line Williams 82 (Wm82) and the southern line Lee. The Wm82 assembly improves on the previously published assembly, and the Lee and G. soja assemblies are new for these accessions. Comparisons among the three accessions show generally high structural conservation, but nucleotide differences of 1.7 single-nucleotide polymorphisms (SNPs) per kb between Wm82 and Lee, and 4.7 SNPs per kb between these lines and G. soja. SNP distributions and comparisons with genotypes of the Lee and Wm82 parents highlight patterns of introgression and haplotype structure. Comparisons against the US germplasm collection show the placement of the sequenced accessions relative to global soybean diversity. Analysis of a pan-gene collection shows generally high conservation, with variation occurring primarily in genomically clustered gene families. We found approximately 40-42 inversions per chromosome between either Lee or Wm82v4 and G. soja, and approximately 32 inversions per chromosome between Wm82 and Lee. We also investigated five domestication loci. For each locus, we found two different alleles with functional differences between G. soja and the two domesticated accessions. The genome assemblies for multiple cultivated accessions and for the closest wild ancestor of soybean provide a valuable set of resources for identifying causal variants that underlie traits for the domestication and improvement of soybean, and serve as a basis for future research and crop improvement efforts.
Databases
GENBANK
GCA_002905335
Publication types
Comparative Study
Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
1066-1082
Copyright information
© 2019 The Authors The Plant Journal © 2019 John Wiley & Sons Ltd.