Telomere-to-telomere Genome Assembly of two representative Asian and European pear cultivars.
Journal
Scientific data
ISSN: 2052-4463
Abbreviated title: Sci Data
Country: England
NLM ID: 101640192
Publication information
Publication date:
26 Oct 2024
History:
received: 20 Feb 2024
accepted: 18 Oct 2024
medline: 27 Oct 2024
pubmed: 27 Oct 2024
entrez: 27 Oct 2024
Status:
epublish
Abstract
As the third most important temperate fruit, pear (Pyrus spp.) exhibits remarkable genetic diversity and is classified into two main categories, Asian pear and European pear. Although several pear genomes are available, most released versions are fragmented and fall short of high-quality chromosome-level assemblies. In this study, we report two high-quality, nearly telomere-to-telomere (T2T), gap-free genomes for Pyrus bretschneideri Rehd. cv. 'Danshansuli' (DS) and Pyrus communis L. cv. 'Conference' (KFL), which represent the predominant Asian and European cultivars, respectively. The final assembled genome sizes for DS and KFL were 510.98 Mb and 510.71 Mb, respectively, with contig N50 values of 29.47 Mb and 30.47 Mb, and each chromosome was represented by a single contig. The DS and KFL genomes yielded 46,394 and 44,702 protein-coding genes, respectively, of which 96.47% and 96.46% were functionally annotated. These two novel, nearly T2T genomes offer an invaluable resource for comparative genomics, genetic diversity analysis, molecular breeding strategies, and functional exploration.
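For readers unfamiliar with the contig N50 statistic quoted in the abstract, the following is a minimal sketch of how it is conventionally computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions; they are not the authors' code and not the DS or KFL contig lengths.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (any consistent unit)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    # Sort from longest to shortest so the longest contigs are counted first.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length


# Hypothetical toy lengths in Mb, chosen only for illustration.
toy_contigs = [40, 30, 20, 10]
print(n50(toy_contigs))
```

With one contig per chromosome, as reported here, the contig N50 reflects chromosome lengths rather than fragmentation.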
Identifiers
pubmed: 39461942
doi: 10.1038/s41597-024-04015-3
pii: 10.1038/s41597-024-04015-3
Publication types
Dataset
Journal Article
Languages
eng
Citation subsets
IM
Pagination
1170
Grants
Agency: Earmarked Fund for China Agriculture Research System
ID: CARS-28
Copyright information
© 2024. The Author(s).
References
Ou, S. et al. A de novo genome assembly of the dwarfing pear rootstock Zhongai 1. Scientific Data. 6, 281 (2019).
doi: 10.1038/s41597-019-0291-3
pubmed: 31767847
pmcid: 6877535
Li, J. et al. Pear genetics: recent advances, new prospects, and a road map for the future. HorticRes. 9 (2022).
Wu, J. et al. Diversification and independent domestication of Asian and European pears. Genome Biol. 19, 77 (2018).
doi: 10.1186/s13059-018-1452-y
pubmed: 29890997
pmcid: 5996476
Wu, J. et al. The genome of the pear (Pyrus bretschneideri Rehd.). Genome Res. 23, 396–408 (2013).
doi: 10.1101/gr.144311.112
pubmed: 23149293
pmcid: 3561880
Linsmith, G. et al. Pseudo-chromosome–length genome assembly of a double haploid “Bartlett” pear (Pyrus communis L.). GigaScience 8 (2019).
Dong, X. et al. De novo assembly of a wild pear (Pyrus betuleafolia) genome. Plant Biotechnology Journal 18, 581–595 (2020).
doi: 10.1111/pbi.13226
pubmed: 31368610
Gao, Y. et al. High-quality genome assembly of ‘Cuiguan’ pear (Pyrus pyrifolia) as a reference genome for identifying regulatory genes and epigenetic modifications responsible for bud dormancy. HorticRes. 8, 197 (2021).
Shirasawa, K. et al. Chromosome-scale genome assembly of Japanese pear (Pyrus pyrifolia) variety ‘Nijisseiki’. DNA Res. 28, dsab001 (2021).
doi: 10.1093/dnares/dsab001
pubmed: 33638981
pmcid: 8092371
Wang, B. et al. High-quality Arabidopsis thaliana Genome Assembly with Nanopore and HiFi Long Reads. Genomics, proteomics & bioinformatics https://doi.org/10.1016/j.gpb.2021.08.003 (2021).
Li, K. et al. Gapless indica rice genome reveals synergistic contributions of active transposable elements and segmental duplications to rice genome evolution. Molecular plant 14, 1745–1756, https://doi.org/10.1016/j.molp.2021.06.017 (2021).
doi: 10.1016/j.molp.2021.06.017
pubmed: 34171481
Song, J. M. et al. Two gap-free reference genomes and a global view of the centromere architecture in rice. Molecular plant 14, 1757–1767, https://doi.org/10.1016/j.molp.2021.06.018 (2021).
doi: 10.1016/j.molp.2021.06.018
pubmed: 34171480
Navratilova, P. et al. Prospects of telomere-to-telomere assembly in barley: Analysis of sequence gaps in the MorexV3 reference genome. Plant biotechnology journal https://doi.org/10.1111/pbi.13816 (2022).
Belser, C. et al. Telomere-to-telomere gapless chromosomes of banana using nanopore sequencing. Communications biology 4, 1047, https://doi.org/10.1038/s42003-021-02559-3 (2021).
doi: 10.1038/s42003-021-02559-3
pubmed: 34493830
pmcid: 8423783
Huang, H. et al. Telomere-to-telomere haplotype-resolved reference genome reveals subgenome divergence and disease resistance in triploid Cavendish banana. Horticulture research 10, https://doi.org/10.1093/hr/uhad153 (2023).
Liu, X. et al. The phased telomere-to-telomere reference genome of Musa acuminata, a main contributor to banana cultivars. Scientific Data 10, 631 https://doi.org/10.1038/s41597-023-02546-9 (2023).
Liu, J. et al. Gapless assembly of maize chromosomes using long-read technologies. Genome biology 21, 121, https://doi.org/10.1186/s13059-020-02029-9 (2020).
doi: 10.1186/s13059-020-02029-9
Zhang, W. et al. Genome assembly of wild tea tree DASZ reveals pedigree and selection history of tea varieties. Nature communications 11, 3719, https://doi.org/10.1038/s41467-020-17498-6 (2020).
doi: 10.1038/s41467-020-17498-6
pubmed: 32709943
pmcid: 7381669
van Rengs, W. et al. A chromosome scale tomato genome built from complementary PacBio and Nanopore sequences alone reveals extensive linkage drag during breeding. The Plant journal: for cell and molecular biology 110, 572–588, https://doi.org/10.1111/tpj.15690 (2022).
doi: 10.1111/tpj.15690
pubmed: 35106855
Deng, Y. et al. A telomere-to-telomere gap-free reference genome of watermelon and its mutation library provide important resources for gene discovery and breeding. Molecular plant https://doi.org/10.1016/j.molp.2022.06.010 (2022).
Fu, A. et al. Telomere-to-telomere genome assembly of bitter melon (Momordica charantia L. var. abbreviata Ser.) reveals fruit development, composition and ripening genetic characteristics. Horticulture research 10, uhac228, https://doi.org/10.1093/hr/uhac228 (2023).
doi: 10.1093/hr/uhac228
pubmed: 36643758
Yue, J. et al. Telomere-to-telomere and gap-free reference genome assembly of the kiwifruit Actinidia chinensis. Horticulture research 10, uhac264, https://doi.org/10.1093/hr/uhac264 (2023).
doi: 10.1093/hr/uhac264
pubmed: 36778189
Zhang, L. et al. A near-complete genome assembly of Brassica rapa provides new insights into the evolution of centromeres. Plant biotechnology journal 21, 1022–1032, https://doi.org/10.1111/pbi.14015 (2023).
doi: 10.1111/pbi.14015
pubmed: 36688739
pmcid: 10106856
Bao, Y. et al. A gap-free and haplotype-resolved lemon genome provides insights into flavor synthesis and huanglongbing (HLB) tolerance. Horticulture research 10, uhad020, https://doi.org/10.1093/hr/uhad020 (2023).
doi: 10.1093/hr/uhad020
pubmed: 37035858
pmcid: 10076211
Zhou, Y. et al. The Telomere to Telomere genome of Fragaria vesca reveals the genomic evolution of Fragaria and the origin of cultivated octoploid strawberry. Horticulture research https://doi.org/10.1093/hr/uhad027 (2023).
Yang, M. et al. Insights into the evolution and spatial chromosome architecture of jujube from an updated gapless genome assembly. Plant Communications, https://doi.org/10.1016/j.xplc.2023.100662 (2023).
Li, W. et al. Near-gapless and haplotype-resolved apple genomes provide insights into the genetic basis of rootstock-induced dwarfing. Nat Genet 56, 505–516 (2024).
doi: 10.1038/s41588-024-01657-2
pubmed: 38347217
Sun, M. et al. Telomere-to-telomere pear (Pyrus pyrifolia) reference genome reveals segmental and whole genome duplication driving genome evolution. Horticulture Research 10, uhad201, https://doi.org/10.1093/hr/uhad201 (2023).
doi: 10.1093/hr/uhad201
pubmed: 38023478
pmcid: 10681005
Chen, Y. et al. SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data. Gigascience 7, 1–6 (2018).
doi: 10.1093/gigascience/gix120
pubmed: 29659813
pmcid: 5827348
Cheng, H. et al. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat Methods 18, 170–175 (2021).
doi: 10.1038/s41592-020-01056-5
pubmed: 33526886
pmcid: 7961889
Roach, M. Purge Haplotigs: allelic contig reassignment for third-gen diploid genome assemblies. BMC Bioinformatics 19, 460 (2018).
doi: 10.1186/s12859-018-2485-7
pubmed: 30497373
pmcid: 6267036
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
doi: 10.1093/bioinformatics/btp324
pubmed: 19451168
pmcid: 2705234
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).
doi: 10.1126/science.aal3327
pubmed: 28336562
pmcid: 5635820
Durand, N. et al. Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments. Cell systems 3, 95–98 (2016).
doi: 10.1016/j.cels.2016.07.002
pubmed: 27467249
pmcid: 5846465
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
doi: 10.1093/bioinformatics/bty191
pubmed: 29750242
pmcid: 6137996
Xu, G. et al. LR_Gapcloser: a tiling path-based gap closer that uses long reads to complete genome assembly. Gigascience 8 (2019).
Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Research 27(2), 573–580, https://doi.org/10.1093/nar/27.2.573 (1999).
doi: 10.1093/nar/27.2.573
pubmed: 9862982
pmcid: 148217
Price, A. et al. De novo identification of repeat families in large genomes. Bioinformatics 21(Suppl 1), i351–358 (2005).
doi: 10.1093/bioinformatics/bti1018
pubmed: 15961478
Bao, W. et al. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mobile DNA 6, 11 (2015).
doi: 10.1186/s13100-015-0041-9
pubmed: 26045719
pmcid: 4455052
Jurka, J. et al. Repbase Update, a database of eukaryotic repetitive elements. Cytogenetic and genome research 110, 462–467 (2005).
doi: 10.1159/000084979
pubmed: 16093699
Tarailo-Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Current protocols in bioinformatics Chapter 4, 4.10.1–4.10.14 (2009).
Ou, S. & Jiang, N. LTR_FINDER_parallel: parallelization of LTR_FINDER enabling rapid identification of long terminal repeat retrotransposons. Mobile DNA 10, 48 (2019).
doi: 10.1186/s13100-019-0193-0
pubmed: 31857828
pmcid: 6909508
Majoros, W. et al. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879 (2004).
doi: 10.1093/bioinformatics/bth315
pubmed: 15145805
Verde, I. et al. The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution. Nat Genet 45, 487–494 (2013).
doi: 10.1038/ng.2586
pubmed: 23525075
Daccord, N. et al. High-quality de novo assembly of the apple genome and methylome dynamics of early fruit development. Nat Genet 49, 1099–1106 (2017).
doi: 10.1038/ng.3886
pubmed: 28581499
Tabata, S. et al. Sequence and analysis of chromosome 5 of the plant Arabidopsis thaliana. Nature 408, 823–826 (2000).
doi: 10.1038/35048507
pubmed: 11130714
Holt, C. & Yandell, M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12, 491 (2011).
doi: 10.1186/1471-2105-12-491
pubmed: 22192575
pmcid: 3280279
Korf, I. Gene finding in novel genomes. BMC Bioinformatics 5, 59 (2004).
doi: 10.1186/1471-2105-5-59
pubmed: 15144565
pmcid: 421630
NCBI Sequence Read Archive https://identifiers.org/insdc.sra:SRR27896858 (2024).
NCBI Sequence Read Archive https://identifiers.org/insdc.sra:SRR27896877 (2024).
NCBI Sequence Read Archive https://identifiers.org/insdc.sra:SRR27896876 (2024).
NCBI Sequence Read Archive https://identifiers.org/insdc.sra:SRR27896873 (2024).
NCBI Sequence Read Archive https://identifiers.org/insdc.sra:SRR27896875 (2024).
NCBI Sequence Read Archive https://identifiers.org/insdc.sra:SRR27896874 (2024).
NCBI GenBank https://identifiers.org/ncbi/insdc:JBFSJW010000000 (2024).
NCBI GenBank https://identifiers.org/ncbi/insdc:JBFSJV010000000 (2024).
Simão, F. et al. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
doi: 10.1093/bioinformatics/btv351
pubmed: 26059717