Telomere-to-telomere Genome Assembly of two representative Asian and European pear cultivars.


Journal

Scientific data
ISSN: 2052-4463
Abbreviated title: Sci Data
Country: England
NLM ID: 101640192

Publication information

Publication date:
26 Oct 2024
History:
received: 20 02 2024
accepted: 18 10 2024
medline: 27 10 2024
pubmed: 27 10 2024
entrez: 27 10 2024
Status: epublish

Abstract

As the third most important temperate fruit, pear (Pyrus spp.) exhibits remarkable genetic diversity and is classified into two main categories, Asian pear and European pear. Although several pear genomes are available, most of the released assemblies are fragmented and fall short of high-quality, chromosome-level contiguity. In this study, we report two high-quality, nearly telomere-to-telomere (T2T), gap-free genomes for Pyrus bretschneideri Rehd. cv. 'Danshansuli' (DS) and Pyrus communis L. cv. 'Conference' (KFL), which represent the predominant Asian and European cultivars, respectively. The final assembled genome sizes for DS and KFL were 510.98 Mb and 510.71 Mb, respectively, with contig N50 values of 29.47 Mb and 30.47 Mb, and each chromosome was represented by a single contig. The DS and KFL genomes yielded a total of 46,394 and 44,702 protein-coding genes, respectively, of which 96.47% and 96.46% were functionally annotated. These two novel, nearly T2T genomes offer an invaluable resource for comparative genomics, genetic diversity analysis, molecular breeding strategies, and functional exploration.
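The contig N50 quoted in the abstract is a standard contiguity statistic: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the assembly size. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the DS or KFL assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig to the shortest.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point halves.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in Mb), not real pear data.
toy_contigs = [40, 30, 20, 10]
print(n50(toy_contigs))  # prints 30
```

When every chromosome is a single contig, as reported here, the contig N50 is simply the length of one of the chromosomes.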

Identifiers

pubmed: 39461942
doi: 10.1038/s41597-024-04015-3
pii: 10.1038/s41597-024-04015-3

Publication types

Dataset Journal Article

Languages

eng

Citation subsets

IM

Pagination

1170

Grants

Agency: Earmarked Fund for China Agriculture Research System
ID: CARS-28

Copyright information

© 2024. The Author(s).

Authors

Yongjie Qi (Y)

Key Laboratory of Horticultural Crop Germplasm Innovation and Utilization (Co-construction by Ministry and Province), Institute of Horticulture, Anhui Academy of Agricultural Sciences, Hefei, 230031, China. anhuiqyj@163.com.

Dai Shan (D)

BGI Genomics, Shenzhen, 518083, China.

Yufen Cao (Y)

Chinese Academy of Agricultural Sciences (CAAS), Xingcheng, 125100, China.

Na Ma (N)

Key Laboratory of Horticultural Crop Germplasm Innovation and Utilization (Co-construction by Ministry and Province), Institute of Horticulture, Anhui Academy of Agricultural Sciences, Hefei, 230031, China.

Liqing Lu (L)

Key Laboratory of Horticultural Crop Germplasm Innovation and Utilization (Co-construction by Ministry and Province), Institute of Horticulture, Anhui Academy of Agricultural Sciences, Hefei, 230031, China.

Luming Tian (L)

Chinese Academy of Agricultural Sciences (CAAS), Xingcheng, 125100, China.

Zhan Feng (Z)

BGI Genomics, Shenzhen, 518083, China.

Fanjun Ke (F)

Anhui University of Chinese Medicine, Hefei, 230012, China.

Jianbo Jian (J)

BGI Genomics, Shenzhen, 518083, China. jianjianbo@bgi.com.
Marine Biology Institute, Shantou University, Shantou, 515063, China. jianjianbo@bgi.com.

Zhenghui Gao (Z)

Key Laboratory of Horticultural Crop Germplasm Innovation and Utilization (Co-construction by Ministry and Province), Institute of Horticulture, Anhui Academy of Agricultural Sciences, Hefei, 230031, China. gzh96gao@163.com.

Yiliu Xu (Y)

Key Laboratory of Horticultural Crop Germplasm Innovation and Utilization (Co-construction by Ministry and Province), Institute of Horticulture, Anhui Academy of Agricultural Sciences, Hefei, 230031, China. yiliuxu@163.com.
