Chromosome genome assembly and annotation of Adzuki Bean (Vigna angularis).

Vigna / genetics; Genome, Plant; Molecular Sequence Annotation; Chromosomes, Plant

Journal

Scientific Data

ISSN: 2052-4463

Abbreviated title: Sci Data

Country: England

NLM ID: 101640192

Publication information

Publication date:
02 Oct 2024

History:

received: 19 06 2023

accepted: 23 09 2024

medline: 3 10 2024

pubmed: 3 10 2024

entrez: 2 10 2024

Status: epublish

Abstract

Adzuki bean (Vigna angularis) is an important dietary legume crop grown widely in East Asia, and it also has a place in traditional Chinese medicine. In this study, we report a high-quality, chromosome-level genome assembly of adzuki bean obtained with Illumina short-read sequencing, PacBio long-read sequencing, and Hi-C technology. The assembly spans 447.8 Mb, covering 96.32% of the estimated genome, with contig and scaffold N50 values of 16.5 and 41.0 Mb, respectively. More than 98.2% of the 1,614 BUSCO genes were fully identified, and 25,939 genes were annotated, 98.23% of which could be assigned a function. Vigna angularis was estimated to have diverged successively from Vigna unguiculata and Vigna radiata about 15.3 and 8.7 million years ago (Ma), respectively. This chromosome-level reference genome provides a robust foundation for exploring the functional genomics and genome evolution of adzuki bean and for advancing its molecular breeding.
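The assembly statistics quoted in the abstract can be made concrete with a short sketch. The Python below is illustrative only and is not taken from the authors' pipeline: `n50` is a hypothetical helper, the contig lengths are invented toy values, and the genome-size figure is simply back-calculated from the two numbers given in the abstract (447.8 Mb and 96.32%).

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (not data from the paper).
toy_contigs = [50, 40, 30, 20, 10]
print(n50(toy_contigs))

# Estimated genome size implied by the abstract's figures:
# the 447.8 Mb assembly is stated to cover 96.32% of the estimate.
assembly_mb = 447.8
coverage_fraction = 0.9632
estimated_genome_mb = assembly_mb / coverage_fraction
print(round(estimated_genome_mb, 1))
```

In the toy example the total is 150, so the running sum first reaches half (75) at the second-longest contig. The same function applied to the real contig or scaffold lengths would be expected to reproduce the reported N50 values, although the authors' exact tooling is not stated in this record.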

Identifiers

DOI: 10.1038/s41597-024-03911-y

PMID: 39358398

pii: 10.1038/s41597-024-03911-y

Publication types

Journal Article; Dataset

Languages

eng

Pagination

1074

References

Xie, Y., Xu, J. H., Lu, W. Y. & Lin, G. Q. Adzuki bean: a new resource of biocatalyst for asymmetric reduction of aromatic ketones with high stereoselectivity and substrate tolerance. Bioresour Technol. 100, 2463–8 (2009).

doi: 10.1016/j.biortech.2008.11.054 pubmed: 19153040

Yook, J. S. et al. Black Adzuki bean (Vigna angularis) attenuates high-fat diet-induced colon inflammation in mice. J Med Food. 20, 367–375 (2017).

doi: 10.1089/jmf.2016.3821 pubmed: 28406732

Chu, L. et al. Genetic analysis of seed coat colour in adzuki bean (Vigna angularis L.). Plant Genet Resour. 19, 67–73 (2021).

doi: 10.1017/S1479262121000101

Xiang, H. et al. Uniconazole foliar spray treatment alleviates cold stress in adzuki bean (Vigna angularis) seedlings. Intl J Agric Biol. 23, 235–240 (2020).

Kramer, C. et al. Control of volunteer adzuki bean in soybean. Agri Sci. 3, 501–509 (2012).

Al-Khayri, J. M., Jain, S. M. & Johnson, D. V. Advances in plant breeding strategies: Legumes. Springer Nature Switzerland AG. Chapter 1 (2019).

Kang, Y. J. et al. Draft genome sequence of adzuki bean, Vigna angularis. Sci Rep. 5, 8069 (2015).

doi: 10.1038/srep08069 pubmed: 25626881 pmcid: 5389050

Yamaguchi, H. Wild and weed azuki beans in Japan. Econ Bot. 46, 384–394 (1992).

doi: 10.1007/BF02866509

Sakai, H. et al. The power of single molecule real-time sequencing technology in the de novo assembly of a eukaryotic genome. Sci. Rep. 5, 1–13 (2015).

doi: 10.1038/srep16780

Yang, K. et al. Genome sequencing of adzuki bean (Vigna angularis) provides insight into high starch and low fat accumulation and domestication. Proc. Natl. Acad. Sci. USA 112, 13213–13218 (2015).

doi: 10.1073/pnas.1420949112 pubmed: 26460024 pmcid: 4629392

Chu, L. et al. Chromosome-level reference genome and resequencing of 322 accessions reveal evolution, genomic imprint and key agronomic traits in adzuki bean. Plant Biotechnol. J. https://doi.org/10.1111/pbi.14337 (2024).

doi: 10.1111/pbi.14337 pubmed: 38715243 pmcid: 11332220

Liu, Y. et al. Pan-Genome of Wild and Cultivated Soybeans. Cell 182, 162–176 (2020).

doi: 10.1016/j.cell.2020.05.023 pubmed: 32553274

Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 27, 764 (2011).

doi: 10.1093/bioinformatics/btr011 pubmed: 21217122 pmcid: 3051319

Koren, S. et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27, 722–736 (2017).

doi: 10.1101/gr.215087.116

Vaser, R. et al. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 27, 737–746 (2017).

doi: 10.1101/gr.214270.116

Walker, B. J. et al. Pilon: An Integrated Tool for Comprehensive Microbial Variant Detection and Genome Assembly Improvement. PLoS One. 9, e112963 (2014).

Roach, M. J. et al. Purge Haplotigs: Synteny Reduction for Third-gen Diploid Genome Assemblies. BMC Bioinformatics. 19, 460 (2018).

doi: 10.1186/s12859-018-2485-7 pubmed: 30497373 pmcid: 6267036

Simao, F. A. et al. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 31, 3210–3212 (2015).

doi: 10.1093/bioinformatics/btv351 pubmed: 26059717

Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268 (2007).

doi: 10.1093/nar/gkm286

Ou, S. & Jiang, N. LTR_retriever: a highly accurate and sensitive program for identification of long terminal-repeat retrotransposons. Plant Physiol. 176, 1410–1422 (2017).

doi: 10.1104/pp.17.01310 pubmed: 29233850 pmcid: 5813529

Servant, N. et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 16, 259 (2015).

doi: 10.1186/s13059-015-0831-x

Jung, Y. & Han, D. BWA-MEME: BWA-MEM emulated with a machine learning approach. Bioinformatics. 38, 2404–2413 (2022).

doi: 10.1093/bioinformatics/btac137 pubmed: 35253835

Durand, N. C. et al. Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments. Cell Syst. 3, 95–98 (2016).

doi: 10.1016/j.cels.2016.07.002 pubmed: 27467249 pmcid: 5846465

Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science. 356, 92–95 (2017).

doi: 10.1126/science.aal3327 pubmed: 28336562 pmcid: 5635820

Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in large genomes. Bioinformatics. 21, i351–i358 (2005).

doi: 10.1093/bioinformatics/bti1018 pubmed: 15961478

Tempel, S. Using and Understanding RepeatMasker. Methods Mol Biol. 859, 29–51 (2012).

doi: 10.1007/978-1-61779-603-6_2 pubmed: 22367864

Bao, W., Kojima, K. K. & Kohany, O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob DNA. 6, 11 (2015).

doi: 10.1186/s13100-015-0041-9 pubmed: 26045719 pmcid: 4455052

Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).

doi: 10.1093/nar/27.2.573 pubmed: 9862982 pmcid: 148217

Kim, D., Langmead, B. & Salzberg, S. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 12, 357–360 (2015).

doi: 10.1038/nmeth.3317 pubmed: 25751142 pmcid: 4655817

Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat Biotechnol. 33, 290–295 (2015).

doi: 10.1038/nbt.3122 pubmed: 25690850 pmcid: 4643835

Pertea, M. et al. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat Protoc. 11, 1650–1667 (2016).

doi: 10.1038/nprot.2016.095 pubmed: 27560171 pmcid: 5032908

Gertz, E. M., Yu, Y. K., Agarwala, R., Schäffer, A. A. & Altschul, S. F. Composition-based statistics and translated nucleotide searches: improving the TBLASTN module of BLAST. BMC Biol. 4, 41 (2006).

doi: 10.1186/1741-7007-4-41 pubmed: 17156431 pmcid: 1779365

Stanke, M. & Morgenstern, B. AUGUSTUS: a web server for gene prediction in eukaryotes that allows user-defined constraints. Nucleic Acids Res. 33, W465–W467 (2005).

doi: 10.1093/nar/gki458 pubmed: 15980513 pmcid: 1160219

Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J Mol Biol. 268, 78–94 (1997).

doi: 10.1006/jmbi.1997.0951 pubmed: 9149143

Holt, C. & Yandell, M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics. 12, 491 (2011).

The UniProt Consortium. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 45, D158–D169 (2016).

doi: 10.1093/nar/gkw1099

Finn, R. D. et al. InterPro in 2017—beyond protein family and domain annotations. Nucleic Acids Res. 45, D190–D199 (2016).

doi: 10.1093/nar/gkw1107 pubmed: 27899635 pmcid: 5210578

Tatusov, R. L. et al. The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 4, 41 (2003).

doi: 10.1186/1471-2105-4-41

The Gene Ontology Consortium. The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Res. 47, D330–D338 (2019).

doi: 10.1093/nar/gky1055

Kanehisa, M. et al. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res. 42, D199–205 (2014).

doi: 10.1093/nar/gkt1076 pubmed: 24214961

Xiang, H. whole genome shotgun sequencing project. GenBank https://identifiers.org/ncbi/insdc:JABFOF000000000 (2020).

NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR11787767 (2020).

NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR11787766 (2020).

NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR11787768 (2020).

NCBI Sequence Read Archive https://identifiers.org/ncbi/insdc.sra:SRR11787765 (2020).

Authors

Wan Li

Fanglei He

Xueyang Wang

Qi Liu

Xiaoqing Zhang

Zhiquan Yang

Chao Fang

Hongtao Xiang
