Comparative chloroplast genome analysis of Impatiens species (Balsaminaceae) in the karst area of China: insights into genome evolution and phylogenomic implications.
Balsaminaceae
Chloroplast genome
Comparative analysis
Impatiens
Phylogenetic relationship
Journal
BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258
Publication information
Publication date:
24 Jul 2021
History:
received: 21 Dec 2020
accepted: 14 Jun 2021
entrez: 25 Jul 2021
pubmed: 26 Jul 2021
medline: 28 Jul 2021
Status:
epublish
Abstract
BACKGROUND
Impatiens L. is a taxonomically complex genus that belongs to the family Balsaminaceae (Ericales) and contains approximately 1000 species. The genus is well known for its economic, medicinal, ornamental, and horticultural value. However, knowledge about its germplasm identification, molecular phylogeny, and chloroplast genomics is limited, and taxonomic uncertainties persist because of overlapping morphological features and insufficient genomic resources.
RESULTS
We sequenced the chloroplast genomes of six species (Impatiens chlorosepala, Impatiens fanjingshanica, Impatiens guizhouensis, Impatiens linearisepala, Impatiens loulanensis, and Impatiens stenosepala) from the karst area of China and compared them with those of six previously published Balsaminaceae species. We contrasted genomic features and repeat sequences, assessed sequence divergence, and reconstructed phylogenetic relationships. The complete chloroplast genomes ranged in size from 151,366 bp (I. alpicola) to 154,189 bp (Hydrocera triflora). Except for those of I. alpicola, I. pritzelii, and I. glandulifera, they encoded 115 distinct genes [81 protein-coding, 30 transfer RNA (tRNA), and 4 ribosomal RNA (rRNA) genes]. Moreover, the characteristics of the long repeat sequences and simple sequence repeats (SSRs) were determined. psbK-psbI, trnT-GGU-psbD, rpl36-rps8, rpoB-trnC-GCA, trnK-UUU-rps16, trnQ-UUG, trnP-UGG-psaJ, trnT-UGU-trnL-UAA, and ycf4-cemA were identified as divergence hotspot regions and thus might be suitable for species identification and phylogenetic studies. Additionally, the phylogenetic relationships based on maximum likelihood (ML) and Bayesian inference (BI) analyses of the whole chloroplast genomes showed that the chloroplast genome structure of I. guizhouensis represents the ancestral state of the Balsaminaceae family.
CONCLUSIONS
Our study provided detailed information about nucleotide diversity hotspots and the types of repeats, which can be used to develop molecular markers applicable to Balsaminaceae species. We also reconstructed and analyzed the relationships of some Impatiens species and assessed their taxonomic statuses based on the complete chloroplast genomes. Together, the findings of the current study might provide valuable genomic resources for systematic and evolutionary studies of Balsaminaceae species.
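The abstract refers to nucleotide diversity hotspots without stating how they were computed. As an illustration only, the following minimal Python sketch shows one common way such hotspots are located: averaging pairwise differences per site (nucleotide diversity, often written pi) in sliding windows over an alignment. The function names, the window and step sizes, and the handling of gaps are assumptions made for this example; they are not taken from the record and may differ from the authors' actual pipeline.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def pairwise_difference(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap or ambiguous base in either sequence are skipped
    (an assumption of this sketch, not a rule stated in the source).
    """
    compared = 0
    differing = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in VALID_BASES and base_b in VALID_BASES:
            compared += 1
            if base_a != base_b:
                differing += 1
    return differing / compared if compared else 0.0


def nucleotide_diversity(seqs):
    """Mean pairwise difference per site over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)


def sliding_window_pi(seqs, window=600, step=200):
    """Return (window_start, pi) tuples along an alignment.

    The window and step defaults are hypothetical illustration values.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    results = []
    for start in range(0, max(length - window, 0) + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, nucleotide_diversity(chunk)))
    return results


if __name__ == "__main__":
    # Toy alignment; real input would be aligned chloroplast genomes.
    toy = ["AAAA", "AAAT", "AATT"]
    for start, pi in sliding_window_pi(toy, window=2, step=2):
        print(start, round(pi, 4))
```

Windows whose value stands well above the rest of the genome would be the candidate hotspots; the cutoff for "well above" is a judgment call that this sketch leaves to the user.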
Identifiers
pubmed: 34303345
doi: 10.1186/s12864-021-07807-8
pii: 10.1186/s12864-021-07807-8
pmc: PMC8310579
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
571
Copyright information
© 2021. The Author(s).