Calculating and comparing codon usage values in rare disease genes highlights codon clustering with disease- and tissue-specific hierarchy.
Journal
PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081
Publication information
Publication date:
2022
History:
received: 05 Oct 2021
accepted: 02 Mar 2022
entrez: 31 Mar 2022
pubmed: 01 Apr 2022
medline: 15 Apr 2022
Status:
epublish
Abstract
We designed a novel strategy to define codon usage bias (CUB) in 6 small, specific cohorts of human genes. We calculated codon usage (CU) values in 29 non-disease-causing (NDC) and 31 disease-causing (DC) human genes that are highly expressed in 3 distinct tissues: kidney, muscle, and skin. We applied our strategy to the same selected genes annotated in 15 mammalian species. We obtained CUB hierarchical clusters for each gene cohort, which showed tissue-specific and disease-specific CUB fingerprints. We showed that DC genes (especially those expressed in muscle) display a low CUB, readily recognizable in codon hierarchical clustering. We defined the extremely biased codons as "zero codons" and found that their number is significantly higher in all DC genes, in all tissues, and that this trend is conserved across mammals. Based on this calculation in different gene cohorts, we identified 5 codons that are more differentially used across genes and mammals, underlining that some genes have preferred synonymous codons. Because the muscle genes form clear clusters and, among these, the dystrophin gene surprisingly does not show any "zero codon", we adopted a novel approach to study CUB, which we called "mapping-on-codons". We positioned 2828 dystrophin missense and nonsense pathogenic variations on their respective codons, highlighting that their frequency and occurrence do not depend on the CU values. We conclude that our strategy allows the identification of hierarchical clustering of CU values as gene cohort-specific fingerprints, with a recognizable trend across mammals. In DC muscle genes a disease-related fingerprint can also be observed, allowing discrimination between DC and NDC genes. We propose that our strategy, which studies CU in specific gene cohorts such as rare disease genes and tissue-specific genes, may provide novel information about the role of CUB in human and medical genetics, with implications for the interpretation of synonymous variations and for codon optimization algorithms.
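As an illustration of the kind of per-gene calculation the abstract refers to, the following is a minimal sketch that counts codons in a coding sequence and derives two common usage measures. It is not the authors' pipeline: the function names (`codon_counts`, `per_thousand`, `rscu`, `unused_codons`) are hypothetical, relative synonymous codon usage (RSCU) is a standard measure swapped in here because the abstract does not state how CU values were computed, and treating a "zero codon" as a sense codon that a gene never uses is only one possible reading of the term.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_counts(cds):
    """Count codons in an in-frame coding sequence (DNA or RNA letters)."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    # Codons containing ambiguous bases (e.g. N) are skipped.
    return Counter(c for c in codons if c in CODON_TABLE)


def per_thousand(counts):
    """Frequency of each of the 64 codons per 1000 codons of the gene."""
    total = sum(counts.values())
    return {c: 1000 * counts[c] / total for c in CODON_TABLE}


def rscu(counts):
    """Relative synonymous codon usage: observed count divided by the count
    expected if all synonymous codons of that amino acid were used equally."""
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)
    values = {}
    for codons in families.values():
        family_total = sum(counts[c] for c in codons)
        for c in codons:
            values[c] = len(codons) * counts[c] / family_total if family_total else 0.0
    return values


def unused_codons(counts):
    """Sense codons never used in the gene (assumed reading of 'zero codon')."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa != "*" and counts[c] == 0)


if __name__ == "__main__":
    toy_cds = "ATGGCTGCTGCCAAATAA"  # toy sequence, not a real gene
    counts = codon_counts(toy_cds)
    print(round(rscu(counts)["GCT"], 3))
    print(len(unused_codons(counts)))
```

Per-gene vectors of such values could then be passed to a hierarchical clustering routine to compare gene cohorts, though the distance metric and linkage used in the study are not given in the abstract.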
Identifiers
pubmed: 35358230
doi: 10.1371/journal.pone.0265469
pii: PONE-D-21-32048
pmc: PMC8970475
Chemical substances
Codon
0
Dystrophin
0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e0265469
Conflict of interest statement
The authors have declared that no competing interests exist.