Genome sequence of Kobresia littledalei, the first chromosome-level genome in the family Cyperaceae.


Journal

Scientific data
ISSN: 2052-4463
Abbreviated title: Sci Data
Country: England
NLM ID: 101640192

Publication information

Publication date:
11 June 2020
History:
received: 22 November 2019
accepted: 7 May 2020
entrez: 13 June 2020
pubmed: 13 June 2020
medline: 5 November 2020
Status: epublish

Abstract

Kobresia plants are important forage resources on the Qinghai-Tibet Plateau and are essential for maintaining the ecological balance of its grasslands. Genome resources for Kobresia are therefore valuable for studying how these plants have adapted to the plateau. By combining PacBio, Illumina and Hi-C sequencing data, we assembled the genome of Kobresia littledalei C. B. Clarke, which was about 373.85 Mb in size, and anchored 96.82% of the bases to 29 pseudo-chromosomes. Genome annotation identified 23,136 protein-coding genes, 98.95% of which were functionally annotated. According to phylogenetic analysis, K. littledalei in Cyperaceae separated from Poaceae about 97.6 million years ago (mya), after separating from Ananas comosus in Bromeliaceae about 114.3 mya. We thus obtained a high-quality, chromosome-level genome for K. littledalei. This is the first time a reference genome has been established for a species of Cyperaceae. The genome will support further studies of how plants adapt to high-altitude, cold environments.
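
The percentages in the abstract can be turned into absolute quantities with simple arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, the inputs are the figures quoted in the abstract, and the derived values are approximations that the record itself does not report.

```python
# Figures quoted in the abstract (inputs); everything derived from them
# below is an approximation, not a value reported in the record.
ASSEMBLY_MB = 373.85            # assembled genome size, in Mb
ANCHORED_FRACTION = 0.9682      # share of bases on the 29 pseudo-chromosomes
GENES = 23136                   # predicted protein-coding genes
ANNOTATED_FRACTION = 0.9895     # share of genes with a functional annotation
SPLIT_BROMELIACEAE_MYA = 114.3  # split from Ananas comosus (Bromeliaceae)
SPLIT_POACEAE_MYA = 97.6        # split from Poaceae


def derived_figures():
    """Return approximate absolute quantities implied by the abstract."""
    anchored_mb = ASSEMBLY_MB * ANCHORED_FRACTION
    annotated_genes = round(GENES * ANNOTATED_FRACTION)
    gap_myr = SPLIT_BROMELIACEAE_MYA - SPLIT_POACEAE_MYA
    return {
        "anchored_mb": round(anchored_mb, 2),    # about 361.96 Mb
        "unanchored_mb": round(ASSEMBLY_MB - anchored_mb, 2),
        "annotated_genes": annotated_genes,      # about 22,893 genes
        "unannotated_genes": GENES - annotated_genes,
        "gap_between_splits_myr": round(gap_myr, 1),  # about 16.7 Myr
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value}")
```

Because the published percentages are rounded to two decimal places, the derived counts should be read as estimates rather than exact values.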

Identifiers

pubmed: 32528014
doi: 10.1038/s41597-020-0518-3
pii: 10.1038/s41597-020-0518-3
pmc: PMC7289886

Publication types

Dataset
Journal Article

Languages

eng

Citation subsets

IM

Pagination

175

Authors

Muyou Can (M)

State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, 850000, China.
Institute of Grassland Science, Tibet Academy of Agriculture and Animal Husbandry Science, Lhasa, 850000, China.

Wei Wei (W)

State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, 850000, China.
Institute of Grassland Science, Tibet Academy of Agriculture and Animal Husbandry Science, Lhasa, 850000, China.

Hailing Zi (H)

Novogene Bioinformatics Institute, Beijing, 100083, China.

Magaweng Bai (M)

State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, 850000, China.
Institute of Grassland Science, Tibet Academy of Agriculture and Animal Husbandry Science, Lhasa, 850000, China.

Yunfei Liu (Y)

State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, 850000, China.
Institute of Grassland Science, Tibet Academy of Agriculture and Animal Husbandry Science, Lhasa, 850000, China.

Dan Gao (D)

Novogene Bioinformatics Institute, Beijing, 100083, China.

Dengqunpei Tu (D)

State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, 850000, China.
Institute of Grassland Science, Tibet Academy of Agriculture and Animal Husbandry Science, Lhasa, 850000, China.

Yuhong Bao (Y)

State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, 850000, China.
Institute of Grassland Science, Tibet Academy of Agriculture and Animal Husbandry Science, Lhasa, 850000, China.

Li Wang (L)

State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, 850000, China.
Institute of Grassland Science, Tibet Academy of Agriculture and Animal Husbandry Science, Lhasa, 850000, China.

Shaofeng Chen (S)

State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, 850000, China.
Institute of Grassland Science, Tibet Academy of Agriculture and Animal Husbandry Science, Lhasa, 850000, China.

Xing Zhao (X)

Novogene Bioinformatics Institute, Beijing, 100083, China. zhaoxing@novogene.com.

Guangpeng Qu (G)

State Key Laboratory of Hulless Barley and Yak Germplasm Resources and Genetic Improvement, Lhasa, 850000, China. qgp0707@163.com.
Institute of Grassland Science, Tibet Academy of Agriculture and Animal Husbandry Science, Lhasa, 850000, China. qgp0707@163.com.
