Improved chromosome-level genome assembly of Indian sandalwood (Santalum album).


Journal

Scientific Data
ISSN: 2052-4463
Abbreviated title: Sci Data
Country: England
NLM ID: 101640192

Publication information

Publication date:
21 Dec 2023
History:
received: 01 Aug 2023
accepted: 12 Dec 2023
medline: 22 Dec 2023
pubmed: 22 Dec 2023
entrez: 21 Dec 2023
Status: epublish

Abstract

Santalum album is a well-known aromatic and medicinal plant that is highly valued for the essential oil (EO) extracted from its heartwood. In this study, we present a high-quality chromosome-level genome assembly of S. album, produced by integrating PacBio Sequel, Illumina HiSeq paired-end, and high-throughput chromosome conformation capture (Hi-C) sequencing technologies. The assembled genome is 207.39 Mb in size, with a contig N50 of 7.33 Mb and a scaffold N50 of 18.31 Mb. The N50 length of this assembly is longer than that of three previously published sandalwood genomes. In total, 94.26% of the assembly was assigned to 10 pseudo-chromosomes, an anchor rate that far exceeds a recently released value. BUSCO analysis yielded a completeness score of 94.91%. In addition, we predicted 23,283 protein-coding genes, 89.68% of which were functionally annotated. This high-quality genome will provide a foundation for sandalwood functional genomics studies and for elucidating the genetic basis of EO biosynthesis in S. album.
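
The contig and scaffold N50 values quoted in the abstract are length-weighted medians: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. A minimal sketch of that calculation follows, assuming only a plain list of sequence lengths. The function name `n50` and the toy lengths are hypothetical illustrations and are not taken from the S. album assembly or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together account for at least half of the total length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating length
    # until at least half of the total assembly size is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy lengths (arbitrary units), not real assembly data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Applied to the contig lengths of an assembly this gives the contig N50, and applied to scaffold lengths it gives the scaffold N50.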

Identifiers

pubmed: 38129455
doi: 10.1038/s41597-023-02849-x
pii: 10.1038/s41597-023-02849-x

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

921

Grants

Agency: National Natural Science Foundation of China (National Science Foundation of China)
ID: 32171841, 32371925, 31870666

Copyright information

© 2023. The Author(s).

References

Harbaugh, D. T. & Baldwin, B. G. Phylogeny and biogeography of the sandalwoods (Santalum, Santalaceae): repeated dispersals throughout the Pacific. Am. J. Bot. 94, 1028–1040 (2007).
pubmed: 21636472 doi: 10.3732/ajb.94.6.1028
Moniodis, J. et al. The transcriptome of sesquiterpenoid biosynthesis in heartwood xylem of Western Australian sandalwood (Santalum spicatum). Phytochemistry 113, 79–86 (2015).
pubmed: 25624157 doi: 10.1016/j.phytochem.2014.12.009
Zhang, X. H., Teixeira da Silva, J. A., Yan, J. & Ma, G. H. Essential oils composition from roots of Santalum album L. J. Essent. Oil Bear. Pl. 15, 1–6 (2012).
doi: 10.1080/0972060X.2012.10644011
Teixeira da Silva, J. A. et al. Sandalwood: basic biology, tissue culture, and genetic transformation. Planta 243, 847–887 (2016).
pubmed: 26745967 doi: 10.1007/s00425-015-2452-8
Mahesh, H. B. & Gowda, M. In The Sandalwood Genome: Compendium of Plant Genomes (Gowda, M. et al., eds.), 1–5 (Springer Nature Switzerland press, 2022).
Burdock, G. A. & Carabin, I. G. Safety assessment of sandalwood oil (Santalum album L.). Food Chem. Toxicol. 46, 421–432 (2008).
pubmed: 17980948 doi: 10.1016/j.fct.2007.09.092
Kim, T. H. et al. Antifungal and ichthyotoxic sesquiterpenoids from Santalum album heartwood. Molecules 22, 1139 (2017).
pubmed: 28698478 pmcid: 6152050 doi: 10.3390/molecules22071139
Bommareddy, A. et al. Medicinal properties of alpha-santalol, a naturally occurring constituent of sandalwood oil: review. Nat. Prod. Res. 33, 527–543 (2019).
pubmed: 29130352 doi: 10.1080/14786419.2017.1399387
Kumar, A. N. A., Joshi, G. & Ram, H. Y. M. Sandalwood: history, uses, present status and the future. Curr. Sci. 103, 1408–1416 (2012).
Tropical Forestry Services (TFS). TFS Sandalwood Project 2015, Indian Sandalwood. Product Disclosure Statement. Tropical Forestry Services Ltd., 169 Broadway, Nedlands WA 6009, Australia (2015).
Baldovini, N., Delasalle, C. & Joulain, D. Phytochemistry of the heartwood from fragrant Santalum species: a review. Flavour Frag. J. 26, 7–26 (2011).
doi: 10.1002/ffj.2025
Jones, C. G. et al. Sandalwood fragrance biosynthesis involves sesquiterpene synthases of both the terpene synthase (TPS)-a and TPS-b subfamilies, including santalene synthases. J. Biol. Chem. 286, 17445–17454 (2011).
pubmed: 21454632 pmcid: 3093818 doi: 10.1074/jbc.M111.231787
Diaz-Chavez, M. L. et al. Biosynthesis of sandalwood oil: Santalum album CYP76F cytochromes P450 produce santalols and bergamotol. PLoS One 8, e75053 (2013).
pubmed: 24324844 pmcid: 3854609 doi: 10.1371/journal.pone.0075053
Celedon, J. M. et al. Heartwood-specific transcriptome and metabolite signatures of tropical sandalwood (Santalum album) reveal the final step of (Z)-santalol fragrance biosynthesis. Plant J. 86, 289–299 (2016).
pubmed: 26991058 doi: 10.1111/tpj.13162
Niu, M. Y. et al. Cloning and expression analysis of mevalonate kinase and phosphomevalonate kinase genes associated with the MVA pathway in Santalum album. Sci. Rep. 11, 16913 (2021).
pubmed: 34413433 pmcid: 8376994 doi: 10.1038/s41598-021-96511-4
Niu, M. Y. et al. Cloning, characterization, and functional analysis of acetyl-CoA C-acetyltransferase and 3-hydroxy-3-methylglutaryl-CoA synthase genes in Santalum album. Sci. Rep. 11, 1082 (2021).
pubmed: 33441887 pmcid: 7807033 doi: 10.1038/s41598-020-80268-3
Mahesh, H. B. et al. Multi-omics driven assembly and annotation of the sandalwood (Santalum album) genome. Plant Physiol. 176, 2772–2788 (2018).
pubmed: 29440596 pmcid: 5884603 doi: 10.1104/pp.17.01764
Dasgupta, M. G., Ulaganathan, K., Dev, S. A. & Balakrishnan, S. Draft genome of Santalum album L. provides genomic resources for accelerated trait improvement. Tree Genet. Genomes 15, 34 (2019).
doi: 10.1007/s11295-019-1334-9
Hong, Z. et al. Chromosome-level genome assemblies from two sandalwood species provide insights into the evolution of the Santalales. Commun Biol 6, 587 (2023).
pubmed: 37264116 pmcid: 10235099 doi: 10.1038/s42003-023-04980-2
Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
pubmed: 26059717 doi: 10.1093/bioinformatics/btv351
Bennetzen, J. L. & Wang, H. The contributions of transposable elements to the structure, function, and evolution of plant genomes. Annu. Rev. Plant Biol. 65, 505–530 (2014).
pubmed: 24579996 doi: 10.1146/annurev-arplant-050213-035811
Porebski, S., Bailey, L. G. & Baum, B. R. Modification of a CTAB DNA extraction protocol for plants containing high polysaccharide and polyphenol components. Plant Mol. Biol. Rep. 15, 8–15 (1997).
doi: 10.1007/BF02772108
Koren, S. et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27, 722–736 (2017).
pubmed: 28298431 pmcid: 5411767 doi: 10.1101/gr.215087.116
Marcais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011).
pubmed: 21217122 pmcid: 3051319 doi: 10.1093/bioinformatics/btr011
Chen, Y. X. et al. SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data. Gigascience 7, 1–6 (2018).
pubmed: 29659813 pmcid: 5827348 doi: 10.1093/gigascience/gix120
Vurture, G. W. et al. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics 33, 2202–2204 (2017).
pubmed: 28369201 pmcid: 5870704 doi: 10.1093/bioinformatics/btx153
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
pubmed: 22388286 pmcid: 3322381 doi: 10.1038/nmeth.1923
Servant, N. et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 16, 259 (2015).
pubmed: 26619908 pmcid: 4665391 doi: 10.1186/s13059-015-0831-x
Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).
pubmed: 27467249 pmcid: 5846465 doi: 10.1016/j.cels.2016.07.002
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).
pubmed: 28336562 pmcid: 5635820 doi: 10.1126/science.aal3327
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
pubmed: 19451168 pmcid: 2705234 doi: 10.1093/bioinformatics/btp324
Zhang, X. H. et al. Identification and functional characterization of three new terpene synthase genes involved in chemical defense and abiotic stresses in Santalum album. BMC Plant Biol. 19, 115 (2019).
pubmed: 30922222 pmcid: 6437863 doi: 10.1186/s12870-019-1720-3
Kolosova, N. et al. Isolation of high-quality RNA from gymnosperm and angiosperm trees. Biotechniques 36, 821–824 (2004).
pubmed: 15152602 doi: 10.2144/04365ST06
Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268 (2007).
pubmed: 17485477 pmcid: 1933203 doi: 10.1093/nar/gkm286
Edgar, R. C. & Myers, E. W. PILER: identification and classification of genomic repeats. Bioinformatics 21(Suppl 1), i152–i158 (2005).
pubmed: 15961452 doi: 10.1093/bioinformatics/bti1003
Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in large genomes. Bioinformatics 21(Suppl 1), i351–i358 (2005).
pubmed: 15961478 doi: 10.1093/bioinformatics/bti1018
Smit, A. F. A., Hubley, R. & Green, P. RepeatMasker Open-3.0. 1996–2010. (2010).
Birney, E., Clamp, M. & Durbin, R. GeneWise and Genomewise. Genome Res. 14, 988–995 (2004).
pubmed: 15123596 pmcid: 479130 doi: 10.1101/gr.1865504
Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439 (2006).
pubmed: 16845043 pmcid: 1538822 doi: 10.1093/nar/gkl200
Johnson, A. D. et al. SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap. Bioinformatics 24, 2938–2939 (2008).
pubmed: 18974171 pmcid: 2720775 doi: 10.1093/bioinformatics/btn564
Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997).
doi: 10.1006/jmbi.1997.0951
Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).
pubmed: 25751142 pmcid: 4655817 doi: 10.1038/nmeth.3317
Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).
pubmed: 25690850 pmcid: 4643835 doi: 10.1038/nbt.3122
Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the program to assemble spliced alignments. Genome Biol. 9, R7 (2008).
pubmed: 18190707 pmcid: 2395244 doi: 10.1186/gb-2008-9-1-r7
Lowe, T. M. & Eddy, S. R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964 (1997).
pubmed: 9023104 pmcid: 146525 doi: 10.1093/nar/25.5.955
Griffiths-Jones, S. et al. Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res. 33, D121–D124 (2005).
pubmed: 15608160 doi: 10.1093/nar/gki081
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).
pubmed: 31727128 pmcid: 6857279 doi: 10.1186/s13059-019-1832-y
NGDC Genome Sequence Archive https://ngdc.cncb.ac.cn/gsa/browse/CRA009778/CRX582846 (2023).
NGDC Genome Sequence Archive https://ngdc.cncb.ac.cn/gsa/browse/CRA009778/CRX582847 (2023).
NGDC Genome Sequence Archive https://ngdc.cncb.ac.cn/gsa/browse/CRA009778/CRX582848 (2023).
NGDC Genome Sequence Archive https://ngdc.cncb.ac.cn/gsa/browse/CRA009778/CRX582849 (2023).
Zhang, X. H. et al. Santalum album TX1, whole genome shotgun sequencing project. GenBank https://identifiers.org/ncbi/insdc.gca:GCA_034195605.1 (2023).
Zhang, X. H. et al. Improved chromosome-level genome assembly of Indian sandalwood (Santalum album). figshare https://doi.org/10.6084/m9.figshare.23694729.v1 (2023).

Authors

Xinhua Zhang (X)

Key Laboratory of South China Agricultural Plant Molecular Analysis and Genetic Improvement & Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China. xhzhang@scib.ac.cn.

MingZhi Li (M)

Bio&Data Biotechnologies Co. Ltd., Guangzhou, 510700, China.

Zhan Bian (Z)

Key Laboratory of South China Agricultural Plant Molecular Analysis and Genetic Improvement & Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.

Xiaohong Chen (X)

Key Laboratory of South China Agricultural Plant Molecular Analysis and Genetic Improvement & Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.

Yuan Li (Y)

Key Laboratory of South China Agricultural Plant Molecular Analysis and Genetic Improvement & Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.

Yuping Xiong (Y)

Key Laboratory of South China Agricultural Plant Molecular Analysis and Genetic Improvement & Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.

Lin Fang (L)

Key Laboratory of South China Agricultural Plant Molecular Analysis and Genetic Improvement & Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.

Kunlin Wu (K)

Key Laboratory of South China Agricultural Plant Molecular Analysis and Genetic Improvement & Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.

Songjun Zeng (S)

Key Laboratory of South China Agricultural Plant Molecular Analysis and Genetic Improvement & Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.

Shuguang Jian (S)

Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.

Rujiang Wang (R)

Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.

Hai Ren (H)

Key Laboratory of Vegetation Restoration and Management of Degraded Ecosystems, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China.

Jaime A Teixeira da Silva (JA)

Independent researcher, Ikenobe 3011-2, Kagawa-Ken, 761-0799, Japan.

Guohua Ma (G)

Key Laboratory of South China Agricultural Plant Molecular Analysis and Genetic Improvement & Guangdong Provincial Key Laboratory of Applied Botany, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China. magh@scib.ac.cn.

MeSH classifications