Gene duplications in the E. coli genome: common themes among pathotypes.
Escherichia coli 042
Gene duplication
H-NS
Hha
Pathotypes
Journal
BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258
Publication information
Publication date:
24 Apr 2019
History:
received: 29 Oct 2018
accepted: 10 Apr 2019
entrez: 25 Apr 2019
pubmed: 25 Apr 2019
medline: 3 Aug 2019
Status:
epublish
Abstract
BACKGROUND
Gene duplication underlies a significant proportion of gene functional diversity and genome complexity in both eukaryotes and prokaryotes. Although several reports in the literature have described the duplication of specific genes in E. coli, a detailed analysis of the extent of gene duplication in this microorganism is still needed.
RESULTS
The genomes of the E. coli enteroaggregative strain 042 and other pathogenic strains contain duplications of the gene that encodes the global regulator Hha. To determine whether the presence of additional copies of the hha gene correlates with the presence of other genes, we performed a comparative genomic analysis between E. coli strains with and without hha duplications. The results showed that strains harboring additional copies of the hha gene also encode the yeeR irmA (aec69) gene cluster, which, in turn, is also duplicated in strain 042 and several other strains. The identification of these duplications prompted us to obtain a global map of gene duplications, first in strain 042 and later in other E. coli genomes. Duplications in the genomes of the enteroaggregative strain 042, the uropathogenic strain CFT073 and the enterohemorrhagic strain O145:H28 were identified with a BLASTp protein similarity search. The same search was also used to evaluate the distribution of the identified duplicates among the genomes of a set of 28 representative E. coli strains. Despite the high genomic diversity of E. coli strains, we identified several duplicates in the genomes of almost all studied pathogenic strains. Most duplicated genes have no known function. Transcriptomic analysis also showed that most of these duplications are regulated by the H-NS/Hha proteins.
CONCLUSIONS
Several duplicated genes are widely distributed among pathogenic E. coli strains. In addition, some duplicated genes are present only in specific pathotypes, and others are strain-specific. This gene duplication analysis shows novel relationships between E. coli pathotypes and suggests that newly identified genes that are duplicated in a high percentage of pathogenic E. coli isolates may play a role in virulence. Our study also shows a relationship between the duplication of genes encoding regulators and genes encoding their targets.
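The abstract states only that duplicates were identified by a BLASTp protein similarity search; it gives no thresholds or pipeline details. As a rough illustration of how such a search can be post-processed, the sketch below reads self-versus-self BLASTp hits in the standard 12-column tabular format (`-outfmt 6`) and reports pairs of distinct proteins that pass identity and alignment-length cutoffs. The function name, the cutoff values (90% identity, 50 residues) and the protein identifiers are assumptions made for this example, not the authors' method.

```python
from typing import Iterable, Set, Tuple


def find_duplicate_pairs(
    rows: Iterable[str],
    min_identity: float = 90.0,  # assumed cutoff, not taken from the paper
    min_length: int = 50,        # assumed cutoff, not taken from the paper
) -> Set[Tuple[str, str]]:
    """Return unordered pairs of distinct proteins with a strong BLASTp hit.

    Each row is one line of BLAST tabular output (-outfmt 6):
    qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore
    """
    pairs: Set[Tuple[str, str]] = set()
    for row in rows:
        row = row.strip()
        if not row or row.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = row.split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2])
        length = int(fields[3])
        if query == subject:
            continue  # a protein always matches itself
        if identity >= min_identity and length >= min_length:
            # Sort the pair so A-B and B-A are counted once.
            pairs.add(tuple(sorted((query, subject))))
    return pairs


# Hypothetical hits for three made-up proteins (protA, protB, protC).
example_rows = [
    "protA\tprotA\t100.0\t72\t0\t0\t1\t72\t1\t72\t1e-50\t150",
    "protA\tprotB\t97.2\t72\t2\t0\t1\t72\t1\t72\t1e-45\t140",
    "protB\tprotA\t97.2\t72\t2\t0\t1\t72\t1\t72\t1e-45\t140",
    "protA\tprotC\t35.0\t60\t39\t0\t1\t60\t5\t64\t1e-3\t30",
]

duplicates = find_duplicate_pairs(example_rows)
```

Here only the protA/protB pair passes both cutoffs. The self-hit is skipped and the weak protA/protC hit is filtered out. A real analysis would also need to consider alignment coverage and E-values, which this sketch ignores.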
Identifiers
pubmed: 31014240
doi: 10.1186/s12864-019-5683-4
pii: 10.1186/s12864-019-5683-4
pmc: PMC6480617
Publication types
Journal Article
Languages
eng
Pagination
313
Grants
Agency: Ministerio de Economía, Industria y Competitividad, Gobierno de España
ID: CGL2016-75255
Agency: Ministerio de Economía, Industria y Competitividad, Gobierno de España
ID: BIO2016-76412-C2-1-R