Single-molecule real-time sequencing of the full-length transcriptome of purple garlic (Allium sativum L. cv. Leduzipi) and identification of serine O-acetyltransferase family proteins involved in cysteine biosynthesis.
cysteine
full-length transcriptome
garlic
organosulfur compounds
serine O-acetyltransferase
Journal
Journal of the science of food and agriculture
ISSN: 1097-0010
Abbreviated title: J Sci Food Agric
Country: England
NLM ID: 0376334
Publication information
Publication date:
May 2022
History:
revised: 2021-10-25
received: 2021-04-06
accepted: 2021-11-05
pubmed: 2021-11-07
medline: 2022-04-19
entrez: 2021-11-06
Status:
ppublish
Abstract
Garlic (Allium sativum L.), whose bioactive components are mainly organosulfur compounds (OSCs), is a herbaceous perennial widely consumed as a green vegetable and a condiment. Yet the metabolic enzymes involved in the biosynthesis of OSCs have not been identified in garlic. Here, a full-length transcriptome of purple garlic was generated via PacBio and Illumina sequencing to characterize the garlic transcriptome and identify key proteins mediating the biosynthesis of OSCs. Overall, 22.56 Gb of clean data were generated, resulting in 454 698 circular consensus sequence (CCS) reads, of which 83.4% (379 206) were identified as full-length non-chimeric reads. Clustering of these transcripts yielded 36 571 high-quality consensus reads. After correction, genome-wide mapping of these reads revealed that 6140 reads were novel isoforms of known genes and 2186 reads were novel isoforms from novel genes. We detected 1677 alternative splicing events and found 2902 genes possessing two or more poly(A) sites. Given the importance of serine O-acetyltransferase (SERAT) in cysteine biosynthesis, we investigated the five SERAT homologs in garlic. Phylogenetic analysis revealed a three-tier classification of SERAT proteins, each featuring an N-terminal serine acetyltransferase domain and one or two hexapeptide transferase motifs. Template-based modeling showed that garlic SERATs shared a common homo-trimeric structure with homologs from bacteria and other plants. The residues responsible for substrate recognition and catalysis were highly conserved, implying a similar reaction mechanism. Transcript profiling showed that the expression patterns of the five SERAT genes varied significantly among tissues. This study's findings deepen our knowledge of SERAT proteins and provide timely genetic resources that could advance future exploration into garlic's genetic improvement and breeding. © 2021 Society of Chemical Industry.
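As a quick consistency check on the read statistics quoted in the abstract, the short Python sketch below recomputes the full-length non-chimeric (FLNC) share of CCS reads and sums the two novel-isoform counts. The counts are taken from the abstract; the variable names, the helper function `percent`, and the derived total are illustrative assumptions and are not part of the original record.

```python
# Read counts as reported in the abstract.
ccs_reads = 454_698            # circular consensus sequence (CCS) reads
flnc_reads = 379_206           # full-length non-chimeric reads
novel_known_genes = 6_140      # novel isoforms of known genes
novel_novel_genes = 2_186      # novel isoforms from novel genes


def percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of CCS reads classified as FLNC (the abstract reports 83.4%).
flnc_percent = percent(flnc_reads, ccs_reads)

# Derived total, not stated in the abstract: all reads flagged as novel isoforms.
novel_isoform_reads = novel_known_genes + novel_novel_genes

print(f"FLNC share of CCS reads: {flnc_percent}%")
print(f"Reads flagged as novel isoforms: {novel_isoform_reads}")
```

The recomputed share agrees with the 83.4% given in the abstract.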
Chemical substances
Protein Isoforms: 0
Serine O-Acetyltransferase: EC 2.3.1.30
Cysteine: K848JZ4886
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
2864-2873
Copyright information
© 2021 Society of Chemical Industry.