A chromosome-level genome assembly of Plantago ovata.


Journal

Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
2023-01-27
History:
received: 2021-12-14
accepted: 2022-11-24
pubmed: 2023-01-28
medline: 2023-02-01
entrez: 2023-01-27
Status: epublish

Abstract

Plantago ovata is cultivated for production of its seed husk (psyllium). When wet, the husk transforms into a mucilage with properties suited to the pharmaceutical industry, where it is used in supplements for controlling blood cholesterol levels, and to the food industry, where it is used to make gluten-free products. There has been limited success in improving husk quantity and quality through breeding approaches, partly due to the lack of a reference genome. Here we constructed the first chromosome-scale reference assembly of P. ovata using a combination of 5.98 million PacBio and 636.5 million Hi-C reads. We also used corrected PacBio reads to estimate genome size, and transcripts to generate gene models. The final assembly covers ~500 Mb with 99.3% gene set completeness. A total of 97% of the sequences are anchored to four chromosomes with an N50 of ~128.87 Mb. The P. ovata genome contains 61.90% repeats, where 40.04% are long terminal repeats. We identified 41,820 protein-coding genes, 411 non-coding RNAs, 108 ribosomal RNAs, and 1295 transfer RNAs. This genome will provide a resource for plant breeding programs to, for example, reduce agronomic constraints such as seed shattering, increase psyllium yield and quality, and overcome crop disease susceptibility.
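
Note on the N50 statistic quoted in the abstract: N50 is conventionally the sequence length L such that sequences of length L or longer together contain at least half of all assembled bases. The following is a minimal sketch of that definition in Python, not the authors' pipeline. The function name `n50` and the toy lengths are hypothetical and chosen only for illustration; they are not the P. ovata scaffold lengths.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration; not taken from the article.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Walk from the longest sequence down, accumulating length,
    # and stop once at least half of the total has been covered.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths in Mb (assumed values, NOT the published assembly).
toy_lengths_mb = [150, 130, 120, 100, 10, 5]
print(n50(toy_lengths_mb))  # prints 130
```

With a few very long scaffolds and a small remainder of short ones, as in the toy list, the N50 falls on one of the chromosome-sized sequences, which is why the statistic is commonly used as a contiguity summary for chromosome-scale assemblies.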

Identifiers

pubmed: 36707685
doi: 10.1038/s41598-022-25078-5
pii: 10.1038/s41598-022-25078-5
pmc: PMC9883528

Chemical substances

Psyllium 8063-16-9

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

1528

Copyright information

© 2022. The Author(s).

Authors

Lina Herliana (L)

School of Agriculture, Food and Wine, University of Adelaide, Waite Campus, Urrbrae, SA, Australia.
Research Center for Genetic Engineering, Research Organization for Life Sciences and Environment, National Research and Innovation Agency (BRIN), Bogor, 16911, Indonesia.

Julian G Schwerdt (JG)

School of Agriculture, Food and Wine, University of Adelaide, Waite Campus, Urrbrae, SA, Australia.

Tycho R Neumann (TR)

School of Agriculture, Food and Wine, University of Adelaide, Waite Campus, Urrbrae, SA, Australia.
IP Australia, PO Box 200, Woden, ACT, 2606, Australia.

Anita Severn-Ellis (A)

School of Biological Sciences, University of Western Australia, Crawley, WA, 6009, Australia.

Jana L Phan (JL)

School of Agriculture, Food and Wine, University of Adelaide, Waite Campus, Urrbrae, SA, Australia.

James M Cowley (JM)

School of Agriculture, Food and Wine, University of Adelaide, Waite Campus, Urrbrae, SA, Australia.

Neil J Shirley (NJ)

School of Agriculture, Food and Wine, University of Adelaide, Waite Campus, Urrbrae, SA, Australia.

Matthew R Tucker (MR)

School of Agriculture, Food and Wine, University of Adelaide, Waite Campus, Urrbrae, SA, Australia.

Tina Bianco-Miotto (T)

School of Agriculture, Food and Wine, University of Adelaide, Waite Campus, Urrbrae, SA, Australia.

Jacqueline Batley (J)

School of Biological Sciences, University of Western Australia, Crawley, WA, 6009, Australia.

Nathan S Watson-Haigh (NS)

South Australian Genomics Centre (SAGC), Adelaide, SA, Australia. nathan.watson-haigh@sahmri.com.
Australian Genome Research Facility, Victorian Comprehensive Cancer Centre, Melbourne, VIC, 3000, Australia. nathan.watson-haigh@sahmri.com.

Rachel A Burton (RA)

School of Agriculture, Food and Wine, University of Adelaide, Waite Campus, Urrbrae, SA, Australia. rachel.burton@adelaide.edu.au.
