The phased telomere-to-telomere reference genome of Musa acuminata, a main contributor to banana cultivars.


Journal

Scientific Data
ISSN: 2052-4463
Abbreviated title: Sci Data
Country: England
NLM ID: 101640192

Publication information

Publication date:
2023-09-16
History:
received: 2023-06-06
accepted: 2023-09-05
medline: 2023-09-18
pubmed: 2023-09-17
entrez: 2023-09-16
Status: epublish

Abstract

Musa acuminata is a major wild contributor to banana cultivars. Here, we report a haplotype-resolved, telomere-to-telomere reference genome of M. acuminata assembled from PacBio HiFi reads, Nanopore ultra-long reads, and Hi-C data. The sizes of the two haploid assemblies were estimated to be 469.83 Mb and 470.21 Mb, respectively. Multiple assessments confirmed the contiguity (contig N50: 16.53 Mb and 18.58 Mb; LAI: 20.18 and 19.48), completeness (BUSCOs: 98.57% and 98.57%), and correctness (QV: 45.97 and 46.12) of the genome. Repetitive sequences accounted for about half of the genome. In total, 40,889 and 38,269 protein-coding genes were annotated in the two haploid assemblies, respectively, of which 9.56% and 3.37% were newly predicted. Genome comparison identified a large reciprocal translocation within M. acuminata involving segments of 3 Mb and 10 Mb from chromosomes 01 and 04. This reference genome provides a valuable resource for further understanding subgenome evolution in Musa species and for precise genetic improvement of banana.
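For readers unfamiliar with the assembly metrics quoted in the abstract, the following is a minimal Python sketch of how two of them are commonly defined: contig N50 and the Phred-scaled consensus quality value (QV). It is not the authors' pipeline. The function names and the toy contig lengths are hypothetical, and the QV conversion assumes the standard Phred definition, QV = -10 log10(error probability).

```python
import math


def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be a non-empty list of positive numbers")


def qv_to_error_rate(qv):
    """Convert a Phred-scaled QV into a per-base error probability."""
    return 10 ** (-qv / 10)


if __name__ == "__main__":
    # Toy contig lengths (hypothetical, not taken from the paper).
    toy_contigs = [6, 5, 4, 3, 2]
    print("toy N50:", n50(toy_contigs))

    # QV values reported in the abstract for the two haploid assemblies.
    for qv in (45.97, 46.12):
        rate = qv_to_error_rate(qv)
        print(f"QV {qv}: per-base error probability {rate:.2e}")
```

Under that Phred assumption, the reported QVs of 45.97 and 46.12 correspond to roughly one expected consensus error per 40,000 bases.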

Identifiers

pubmed: 37716992
doi: 10.1038/s41597-023-02546-9
pii: 10.1038/s41597-023-02546-9
pmc: PMC10505225

Publication types

Dataset
Journal Article

Languages

eng

Citation subsets

IM

Pagination

631

Grants

Agency: National Natural Science Foundation of China (National Science Foundation of China)
ID: 32070237, 31261140366

Copyright information

© 2023. Springer Nature Limited.

Authors

Xin Liu (X)

Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.
South China National Botanical Garden, Guangzhou, 510650, China.
University of Chinese Academy of Sciences, Beijing, 100049, China.

Rida Arshad (R)

National Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China.

Xu Wang (X)

National Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China.

Wei-Ming Li (WM)

School of Marine Sciences and Biotechnology, Guangxi University for Nationalities, Nanning, 530008, China.

Yongfeng Zhou (Y)

National Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Key Laboratory of Synthetic Biology, Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, China.
National Key Laboratory of Tropical Crop Breeding, Tropical Crops Genetic Resources Institute, Chinese Academy of Tropical Agricultural Sciences, Haikou, 571101, China.

Xue-Jun Ge (XJ)

Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.
South China National Botanical Garden, Guangzhou, 510650, China.

Hui-Run Huang (HR)

Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China. huirun.huang@scbg.ac.cn.
South China National Botanical Garden, Guangzhou, 510650, China. huirun.huang@scbg.ac.cn.
