Telomere-to-telomere gapless chromosomes of banana using nanopore sequencing.
Journal
Communications biology
ISSN: 2399-3642
Abbreviated title: Commun Biol
Country: England
NLM ID: 101719179
Publication information
Publication date:
07 09 2021
History:
received: 11 06 2021
accepted: 13 08 2021
entrez: 8 9 2021
pubmed: 9 9 2021
medline: 15 12 2021
Status:
epublish
Abstract
Long-read technologies promise more complete genome assemblies that are also easier to produce. Coupled with long-range technologies, they can reveal the architecture of complex regions such as centromeres or rDNA clusters. These technologies also make it possible to determine the complete organization of chromosomes, which previously remained difficult even with genetic maps. However, generating a gapless, telomere-to-telomere assembly is still not trivial and requires a combination of several technologies and a choice of suitable software. Here, we report a chromosome-scale assembly of a banana genome (Musa acuminata) generated using Oxford Nanopore long reads. We generated 177X genome coverage from a single PromethION flowcell, including nearly 17X from reads longer than 75 kbp. Of the 11 chromosomes, 5 were entirely reconstructed in a single contig from telomere to telomere, revealing for the first time the content of complex regions such as centromeres or clusters of paralogous genes.
Identifiers
pubmed: 34493830
doi: 10.1038/s42003-021-02559-3
pii: 10.1038/s42003-021-02559-3
pmc: PMC8423783
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
1047
Grants
Agency: Agence Nationale de la Recherche (French National Research Agency)
ID: ANR-10-LABX-0001-01
Agency: Agence Nationale de la Recherche (French National Research Agency)
ID: ANR-10-INBS-09-08
Copyright information
© 2021. The Author(s).