Automated annotation of human centromeres with HORmon.


Journal

Genome Research
ISSN: 1549-5469
Abbreviated title: Genome Res
Country: United States
NLM ID: 9518021

Publication information

Publication date:
06 2022
History:
received: 03 11 2021
accepted: 06 05 2022
pubmed: 12 5 2022
medline: 29 6 2022
entrez: 11 5 2022
Status: ppublish

Abstract

Recent advances in long-read sequencing have made it possible to address long-standing questions about the architecture and evolution of human centromeres. They have also emphasized the need for centromere annotation (partitioning human centromeres into monomers and higher-order repeats [HORs]). Although there has been a half-century-long series of semi-manual studies of centromere architecture, a rigorous centromere annotation algorithm is still lacking. Moreover, automated centromere annotation is a prerequisite for studies of genetic diseases associated with centromeres and for evolutionary studies of centromeres across multiple species. Although the monomer decomposition (transforming a centromere into a monocentromere written in the monomer alphabet) and the HOR decomposition (representing a monocentromere in the alphabet of HORs) are currently viewed as two separate problems, we show that they should be integrated into a single framework in which HOR inference affects monomer inference and vice versa. We thus developed the HORmon algorithm, which integrates monomer and HOR inference and automatically generates human monomers and HORs that are largely consistent with the previous semi-manual inference.
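
To make the two decompositions named in the abstract concrete, the following is a minimal toy sketch in Python. It is not the HORmon algorithm and does not reflect its implementation. It assumes, purely for illustration, that every monomer has the same length, that monomer labels are single characters, that blocks are matched by Hamming distance, and that HOR copies are found greedily. The function names `monomer_decomposition` and `hor_decomposition`, the 4-bp "monomers", and the example sequence are all hypothetical (real alpha satellite monomers are far longer).

```python
from itertools import groupby


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def monomer_decomposition(centromere, monomers):
    """Toy monomer decomposition (illustrative assumption, not HORmon).

    Cuts the sequence into consecutive blocks of the monomer length and
    labels each block with the closest monomer by Hamming distance.
    Returns the monocentromere as a string of single-character labels.
    """
    k = len(next(iter(monomers.values())))
    labels = []
    for i in range(0, len(centromere) - k + 1, k):
        block = centromere[i:i + k]
        labels.append(min(monomers, key=lambda name: hamming(block, monomers[name])))
    return "".join(labels)


def hor_decomposition(monocentromere, hor):
    """Toy HOR decomposition (illustrative assumption, not HORmon).

    Greedily consumes full copies of `hor`; any monomer that does not
    start a full copy is emitted on its own. Consecutive identical
    units are then collapsed into (unit, count) pairs.
    """
    units = []
    i = 0
    while i < len(monocentromere):
        if monocentromere.startswith(hor, i):
            units.append(hor)
            i += len(hor)
        else:
            units.append(monocentromere[i])
            i += 1
    return [(unit, len(list(group))) for unit, group in groupby(units)]


# Hypothetical 4-bp "monomers" and a sequence with one mismatch (ACGA vs ACGT).
monomers = {"A": "ACGT", "B": "TTGA", "C": "GGCC"}
centromere = "ACGT" "TTGA" "GGCC" "ACGA" "TTGA" "GGCC" "ACGT" "TTGA"

mono = monomer_decomposition(centromere, monomers)
units = hor_decomposition(mono, "ABC")
print(mono)
print(units)
```

In this sketch the two steps run strictly one after the other, which is exactly the separation the abstract argues against: in the integrated framework it describes, the inferred HORs would feed back into how monomers are defined.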

Identifiers

pubmed: 35545449
pii: gr.276362.121
doi: 10.1101/gr.276362.121
pmc: PMC9248890

Publication types

Journal Article; Research Support, U.S. Gov't, Non-P.H.S.; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

1137-1151

Copyright information

© 2022 Kunyavskaya et al.; Published by Cold Spring Harbor Laboratory Press.

Authors

Olga Kunyavskaya (O)

Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia, 199034.

Tatiana Dvorkina (T)

Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia, 199034.

Andrey V Bzikadze (AV)

Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, California 92093, USA.

Ivan A Alexandrov (IA)

Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University, Saint Petersburg, Russia, 199034.

Pavel A Pevzner (PA)

Department of Computer Science and Engineering, University of California, San Diego, California 92093, USA.
