Chromosome-level genome assembly and functional annotation of Citrullus colocynthis: unlocking genetic resources for drought-resilient crop development.


Journal

Planta
ISSN: 1432-2048
Abbreviated title: Planta
Country: Germany
NLM ID: 1250576

Publication information

Publication date:
23 Oct 2024
History:
received: 01 Apr 2024
accepted: 11 Oct 2024
medline: 24 Oct 2024
pubmed: 24 Oct 2024
entrez: 23 Oct 2024
Status: epublish

Abstract

The chromosome-level genome assembly of Citrullus colocynthis reveals its genetic potential for enhancing drought tolerance, paving the way for innovative crop improvement strategies. This study presents the first comprehensive genome assembly and annotation of Citrullus colocynthis, a drought-tolerant wild close relative of cultivated watermelon, highlighting its potential for enhancing agricultural resilience to climate change. The study achieved a chromosome-level assembly using advanced sequencing technologies, including PacBio HiFi and Hi-C, revealing a genome size of approximately 366 Mb with low heterozygosity and substantial repetitive content. Our analysis identified 23,327 gene models that could encode stress-response mechanisms underlying the species' adaptation to arid environments. Comparative genomics with closely related species illuminated the evolutionary dynamics within the Cucurbitaceae family. In addition, resequencing of 27 accessions from the United Arab Emirates (UAE) identified genetic diversity, suggesting a foundation for future breeding programs. This genomic resource opens new avenues for the de novo domestication of C. colocynthis, offering a blueprint for developing crops with enhanced drought tolerance, disease resistance, and nutritional profiles, which are crucial for sustaining future food security in the face of escalating climate challenges.

Identifiers

pubmed: 39443340
doi: 10.1007/s00425-024-04551-7
pii: 10.1007/s00425-024-04551-7

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

124

Copyright information

© 2024. The Author(s).

Authors

Anestis Gkanogiannis (A)

International Center for Biosaline Agriculture, ICBA, P.O. Box 14660, Dubai, United Arab Emirates. a.gkanogiannis@biosaline.org.ae.

Hifzur Rahman (H)

International Center for Biosaline Agriculture, ICBA, P.O. Box 14660, Dubai, United Arab Emirates.

Rakesh Kumar Singh (RK)

International Center for Biosaline Agriculture, ICBA, P.O. Box 14660, Dubai, United Arab Emirates.

Augusto Becerra Lopez-Lavalle (AB)

International Center for Biosaline Agriculture, ICBA, P.O. Box 14660, Dubai, United Arab Emirates.
