A chromosome-level genome assembly of Plantago ovata.
Journal
Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date:
27 01 2023
History:
received: 14 12 2021
accepted: 24 11 2022
pubmed: 28 1 2023
medline: 1 2 2023
entrez: 27 1 2023
Status:
epublish
Abstract
Plantago ovata is cultivated for production of its seed husk (psyllium). When wet, the husk transforms into a mucilage with properties suited to the pharmaceutical industry, where it is used in supplements for controlling blood cholesterol levels, and to the food industry, where it is used to make gluten-free products. There has been limited success in improving husk quantity and quality through breeding approaches, partly due to the lack of a reference genome. Here we constructed the first chromosome-scale reference assembly of P. ovata using a combination of 5.98 million PacBio and 636.5 million Hi-C reads. We also used corrected PacBio reads to estimate genome size and transcripts to generate gene models. The final assembly covers ~500 Mb with 99.3% gene set completeness. A total of 97% of the sequences are anchored to four chromosomes with an N50 of ~128.87 Mb. The P. ovata genome contains 61.90% repeats, of which 40.04% are long terminal repeats. We identified 41,820 protein-coding genes, 411 non-coding RNAs, 108 ribosomal RNAs, and 1295 transfer RNAs. This genome will provide a resource for plant breeding programs to, for example, reduce agronomic constraints such as seed shattering, increase psyllium yield and quality, and overcome crop disease susceptibility.
Identifiers
pubmed: 36707685
doi: 10.1038/s41598-022-25078-5
pii: 10.1038/s41598-022-25078-5
pmc: PMC9883528
Chemical substances
Psyllium
8063-16-9
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
1528
Copyright information
© 2022. The Author(s).