CpG-island-based annotation and analysis of human housekeeping genes.
CpG island density
genome analysis
genome annotation
housekeeping genes
statistical genetics
Journal
Briefings in bioinformatics
ISSN: 1477-4054
Titre abrégé: Brief Bioinform
Pays: England
ID NLM: 100912837
Informations de publication
Date de publication:
18 01 2021
18 01 2021
Historique:
received:
03
07
2019
revised:
27
08
2019
accepted:
03
10
2019
pubmed:
27
1
2020
medline:
16
11
2021
entrez:
27
1
2020
Statut:
ppublish
Résumé
By reviewing previous CpG-related studies, we consider that the transcription regulation of about half of the human genes, mostly housekeeping (HK) genes, involves CpG islands (CGIs), their methylation states, CpG spacing and other chromosomal parameters. However, the precise CGI definition and positioning of CGIs within gene structures, as well as specific CGI-associated regulatory mechanisms, all remain to be explained at individual gene and gene-family levels, together with consideration of species and lineage specificity. Although previous studies have already classified CGIs into high-CpG (HCGI), intermediate-CpG (ICGI) and low-CpG (LCGI) densities based on CpG density variation, the correlation between CGI density and gene expression regulation, such as co-regulation of CGIs and TATA box on HK genes, remains to be elucidated. First, this study introduces such a problem-solving protocol for human-genome annotation, which is based on a combination of GTEx, JBLA and Gene Ontology (GO) analysis. Next, we discuss why CGI-associated genes are most likely regulated by HCGI and tend to be HK genes; the HCGI/TATA± and LCGI/TATA± combinations show different GO enrichment, whereas the ICGI/TATA± combination is less characteristic based on GO enrichment analysis. Finally, we demonstrate that Hadoop MapReduce-based MR-JBLA algorithm is more efficient than the original JBLA in k-mer counting and CGI-associated gene analysis.
Identifiants
pubmed: 31982909
pii: 5715934
doi: 10.1093/bib/bbz134
doi:
Types de publication
Journal Article
Research Support, Non-U.S. Gov't
Langues
eng
Sous-ensembles de citation
IM
Pagination
515-525Informations de copyright
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.