Genome sequence of Kobresia littledalei, the first chromosome-level genome in the family Cyperaceae.

Chromosomes, Plant; Cyperaceae / classification; Genome, Plant; Phylogeny; Tibet

Journal

Scientific data

ISSN: 2052-4463

Abbreviated title: Sci Data

Country: England

NLM ID: 101640192

Publication information

Publication date:
11 06 2020

History:

received: 22 11 2019

accepted: 07 05 2020

entrez: 13 6 2020

pubmed: 13 6 2020

medline: 5 11 2020

Status: epublish

Abstract

Kobresia plants are important forage resources on the Qinghai-Tibet Plateau and are essential for maintaining the ecological balance of its grasslands. It is therefore valuable to obtain Kobresia genome resources and to study the adaptive characteristics of Kobresia plants on the plateau. We assembled the genome of Kobresia littledalei C. B. Clarke, which is about 373.85 Mb in size, by combining PacBio, Illumina and Hi-C sequencing data; 96.82% of the bases were anchored to 29 pseudo-chromosomes. Annotation identified 23,136 protein-coding genes, of which 98.95% were functionally annotated. According to phylogenetic analysis, K. littledalei (Cyperaceae) diverged from Poaceae about 97.6 million years ago (Mya), after separating from Ananas comosus (Bromeliaceae) about 114.3 Mya. We thus obtained a high-quality, chromosome-level genome for K. littledalei, the first reference genome established for a species of Cyperaceae. This genome will support further studies of how plants adapt to high-altitude, cold environments.
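
The percentages in the abstract can be converted into absolute quantities with simple arithmetic. The following is a minimal sketch, not part of the published analysis: the constant and function names are illustrative assumptions, and the derived values are estimates computed from the rounded figures quoted above, so they may differ slightly from exact counts reported in the article.

```python
# Figures quoted in the abstract (rounded as published).
GENOME_SIZE_MB = 373.85        # total assembly size in Mb
ANCHORED_FRACTION = 0.9682     # share of bases placed on pseudo-chromosomes
N_PSEUDO_CHROMOSOMES = 29
N_GENES = 23_136               # predicted protein-coding genes
ANNOTATED_FRACTION = 0.9895    # share of genes with a functional annotation


def derived_stats():
    """Return estimates derived from the abstract's rounded figures.

    The names used here are hypothetical; the results are approximations,
    not values taken from the article itself.
    """
    anchored_mb = GENOME_SIZE_MB * ANCHORED_FRACTION
    return {
        "anchored_mb": anchored_mb,
        "unanchored_mb": GENOME_SIZE_MB - anchored_mb,
        "mean_pseudo_chromosome_mb": anchored_mb / N_PSEUDO_CHROMOSOMES,
        "annotated_genes": round(N_GENES * ANNOTATED_FRACTION),
        "genes_per_mb": N_GENES / GENOME_SIZE_MB,
    }


if __name__ == "__main__":
    for name, value in derived_stats().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, roughly 362 Mb of sequence is anchored, each pseudo-chromosome averages about 12.5 Mb, and about 22,900 genes carry a functional annotation.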

Identifiers

DOI: 10.1038/s41597-020-0518-3 PMID: 32528014 PMC: PMC7289886

pubmed: 32528014

doi: 10.1038/s41597-020-0518-3

pii: 10.1038/s41597-020-0518-3

pmc: PMC7289886

Publication types

Dataset; Journal Article

Languages

eng

Pagination

175



Authors

Muyou Can

Wei Wei

Hailing Zi

Magaweng Bai

Yunfei Liu

Dan Gao

Dengqunpei Tu

Yuhong Bao

Li Wang

Shaofeng Chen

Xing Zhao

Guangpeng Qu
