The variation and evolution of complete human centromeres.
Journal
Nature
ISSN: 1476-4687
Abbreviated title: Nature
Country: England
NLM ID: 0410462
Publication information
Publication date:
03 Apr 2024
History:
received: 29 May 2023
accepted: 07 Mar 2024
pubmed: 04 Apr 2024
medline: 04 Apr 2024
entrez: 03 Apr 2024
Status:
aheadofprint
Abstract
Human centromeres have traditionally been very difficult to sequence and assemble owing to their repetitive nature and large size.
Identifiers
pubmed: 38570684
doi: 10.1038/s41586-024-07278-3
pii: 10.1038/s41586-024-07278-3
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: NIGMS NIH HHS
ID: K99 GM147352
Country: United States
Agency: NCI NIH HHS
ID: R01 CA266339
Country: United States
Agency: NHGRI NIH HHS
ID: R01 HG010169
Country: United States
Comments and corrections
Type: UpdateOf
Copyright information
© 2024. The Author(s).
References
Willard, H. F. Chromosome-specific organization of human alpha satellite DNA. Am. J. Hum. Genet. 37, 524–532 (1985).
pubmed: 2988334
pmcid: 1684601
Alexandrov, I., Kazakov, A., Tumeneva, I., Shepelev, V. & Yurov, Y. Alpha-satellite DNA of primates: old and new families. Chromosoma 110, 253–266 (2001).
pubmed: 11534817
doi: 10.1007/s004120100146
Henikoff, S., Ahmad, K. & Malik, H. S. The centromere paradox: stable inheritance with rapidly evolving DNA. Science 293, 1098–1102 (2001).
pubmed: 11498581
doi: 10.1126/science.1062939
Nurk, S. et al. The complete sequence of a human genome. Science 376, 44–53 (2022).
pubmed: 35357919
pmcid: 9186530
doi: 10.1126/science.abj6987
Altemose, N. et al. Complete genomic and epigenetic maps of human centromeres. Science 376, eabl4178 (2022).
pubmed: 35357911
pmcid: 9233505
doi: 10.1126/science.abl4178
Chaisson, M. J. P. et al. Resolving the complexity of the human genome using single-molecule sequencing. Nature 517, 608–611 (2015).
pubmed: 25383537
doi: 10.1038/nature13907
Nurk, S. et al. HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads. Genome Res. https://doi.org/10.1101/gr.263566.120 (2020).
Vollger, M. R. et al. Segmental duplications and their variation in a complete human genome. Science 376, eabj6965 (2022).
pubmed: 35357917
pmcid: 8979283
doi: 10.1126/science.abj6965
Steinberg, K. M. et al. Single haplotype assembly of the human genome from a hydatidiform mole. Genome Res. 24, 2066–2076 (2014).
pubmed: 25373144
pmcid: 4248323
doi: 10.1101/gr.180893.114
Liao, W.-W. et al. A draft human pangenome reference. Nature 617, 312–324 (2023).
pubmed: 37165242
pmcid: 10172123
doi: 10.1038/s41586-023-05896-x
Porubsky, D. et al. Inversion polymorphism in a complete human genome assembly. Genome Biol. 24, 100 (2023).
pubmed: 37122002
pmcid: 10150506
doi: 10.1186/s13059-023-02919-8
Logsdon, G. A. & Eichler, E. E. The dynamic structure and rapid evolution of human centromeric satellite DNA. Genes 14, 92 (2023).
doi: 10.3390/genes14010092
Archidiacono, N. et al. Comparative mapping of human alphoid sequences in great apes using fluorescence in situ hybridization. Genomics 25, 477–484 (1995).
pubmed: 7789981
doi: 10.1016/0888-7543(95)80048-Q
Cechova, M. et al. High satellite repeat turnover in great apes studied with short- and long-read technologies. Mol. Biol. Evol. 36, 2415–2431 (2019).
pubmed: 31273383
pmcid: 6805231
doi: 10.1093/molbev/msz156
Miga, K. H. & Alexandrov, I. A. Variation and evolution of human centromeres: a field guide and perspective. Annu. Rev. Genet. 55, 583–602 (2021).
pubmed: 34813350
pmcid: 9549924
doi: 10.1146/annurev-genet-071719-020519
Willard, H. F., Wevrick, R. & Warburton, P. E. Human centromere structure: organization and potential role of alpha satellite DNA. Prog. Clin. Biol. Res. 318, 9–18 (1989).
pubmed: 2696978
Wu, J. C. & Manuelidis, L. Sequence definition and organization of a human repeated DNA. J. Mol. Biol. 142, 363–386 (1980).
pubmed: 6257909
doi: 10.1016/0022-2836(80)90277-6
Alkan, C. et al. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data. PLoS Comput. Biol. 3, 1807–1818 (2007).
pubmed: 17907796
doi: 10.1371/journal.pcbi.0030181
Alkan, C. et al. Genome-wide characterization of centromeric satellites from multiple mammalian genomes. Genome Res. 21, 137–145 (2011).
pubmed: 21081712
pmcid: 3012921
doi: 10.1101/gr.111278.110
Miga, K. H. et al. Telomere-to-telomere assembly of a complete human X chromosome. Nature 585, 79–84 (2020).
pubmed: 32663838
pmcid: 7484160
doi: 10.1038/s41586-020-2547-7
Logsdon, G. A. et al. The structure, function and evolution of a complete human chromosome 8. Nature 593, 101–107 (2021).
pubmed: 33828295
pmcid: 8099727
doi: 10.1038/s41586-021-03420-7
Vollger, M. R. et al. Long-read sequence and assembly of segmental duplications. Nat. Methods 16, 88–94 (2019).
pubmed: 30559433
doi: 10.1038/s41592-018-0236-3
Ebert, P. et al. Haplotype-resolved diverse human genomes and integrated analysis of structural variation. Science 372, eabf7117 (2021).
pubmed: 33632895
pmcid: 8026704
doi: 10.1126/science.abf7117
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
pubmed: 33526886
pmcid: 7961889
doi: 10.1038/s41592-020-01056-5
Aldrup-MacDonald, M. E., Kuo, M. E., Sullivan, L. L., Chew, K. & Sullivan, B. A. Genomic variation within alpha satellite DNA influences centromere location on human chromosomes with metastable epialleles. Genome Res. 26, 1301–1311 (2016).
pubmed: 27510565
pmcid: 5052062
doi: 10.1101/gr.206706.116
Mahtani, M. M. & Willard, H. F. A primary genetic map of the pericentromeric region of the human X chromosome. Genomics 2, 294–301 (1988).
pubmed: 2906040
doi: 10.1016/0888-7543(88)90017-1
Bzikadze, A. V., Mikheenko, A. & Pevzner, P. A. Fast and accurate mapping of long reads to complete genome assemblies with VerityMap. Genome Res. https://doi.org/10.1101/gr.276871.122 (2022).
Dishuck, P. C., Rozanski, A. N., Logsdon, G. A., Porubsky, D. & Eichler, E. E. GAVISUNK: genome assembly validation via inter-SUNK distances in Oxford Nanopore reads. Bioinformatics https://doi.org/10.1093/bioinformatics/btac714 (2022).
Rautiainen, M. et al. Telomere-to-telomere assembly of diploid chromosomes with Verkko. Nat. Biotechnol. https://doi.org/10.1038/s41587-023-01662-6 (2023).
Bzikadze, A. V. & Pevzner, P. A. TandemAligner: a new parameter-free framework for fast sequence alignment. Preprint at bioRxiv https://doi.org/10.1101/2022.09.15.507041 (2022).
Gershman, A. et al. Epigenetic patterns in a complete human genome. Science 376, eabj5089 (2022).
pubmed: 35357915
pmcid: 9170183
doi: 10.1126/science.abj5089
Stimpson, K. M., Matheny, J. E. & Sullivan, B. A. Dicentric chromosomes: unique models to study centromere function and inactivation. Chromosome Res. 20, 595–605 (2012).
pubmed: 22801777
pmcid: 3557915
doi: 10.1007/s10577-012-9302-3
Sullivan, B. A. & Willard, H. F. Stable dicentric X chromosomes with two functional centromeres. Nat. Genet. 20, 227–228 (1998).
pubmed: 9806536
doi: 10.1038/3024
Shepelev, V. A., Alexandrov, A. A., Yurov, Y. B. & Alexandrov, I. A. The evolutionary origin of man can be traced in the layers of defunct ancestral alpha satellites flanking the active centromeres of human chromosomes. PLoS Genet. 5, e1000641 (2009).
pubmed: 19749981
pmcid: 2729386
doi: 10.1371/journal.pgen.1000641
Pike, L. M., Carlisle, A., Newell, C., Hong, S.-B. & Musich, P. R. Sequence and evolution of rhesus monkey alphoid DNA. J. Mol. Evol. 23, 127–137 (1986).
pubmed: 3018269
doi: 10.1007/BF02099907
Alexandrov, I. A., Mitkevich, S. P. & Yurov, Y. B. The phylogeny of human chromosome specific alpha satellites. Chromosoma 96, 443–453 (1988).
pubmed: 3219915
doi: 10.1007/BF00303039
Hughes, J. F., Skaletsky, H. & Page, D. C. ALRY-MAJOR:PT: Major Repeat Unit of Chimpanzee Alpha Repetitive DNA from the Y Chromosome Centromere—A Consensus (Repbase, accessed 28 May 2023); http://www.girinst.org/.
Plohl, M., Luchetti, A., Meštrović, N. & Mantovani, B. Satellite DNAs between selfishness and functionality: structure, genomics and evolution of tandem repeats in centromeric (hetero)chromatin. Gene 409, 72–82 (2008).
pubmed: 18182173
doi: 10.1016/j.gene.2007.11.013
Amor, D. J. et al. Human centromere repositioning ‘in progress’. Proc. Natl Acad. Sci. USA 101, 6542–6547 (2004).
pubmed: 15084747
pmcid: 404081
doi: 10.1073/pnas.0308637101
Wlodzimierz, P. et al. Cycles of satellite and transposon evolution in Arabidopsis centromeres. Nature 618, 557–565 (2023).
pubmed: 37198485
doi: 10.1038/s41586-023-06062-z
Iwata-Otsubo, A. et al. Expanded satellite repeats amplify a discrete CENP-A nucleosome assembly site on chromosomes that drive in female meiosis. Curr. Biol. 27, 2365–2373 (2017).
pubmed: 28756949
pmcid: 5567862
doi: 10.1016/j.cub.2017.06.069
Akera, T. et al. Spindle asymmetry drives non-Mendelian chromosome segregation. Science 358, 668–672 (2017).
pubmed: 29097549
pmcid: 5906099
doi: 10.1126/science.aan0092
Akera, T., Trimm, E. & Lampson, M. A. Molecular strategies of meiotic cheating by selfish centromeres. Cell 178, 1132–1144 (2019).
pubmed: 31402175
pmcid: 6731994
doi: 10.1016/j.cell.2019.07.001
Vollger, M. R., Kerpedjiev, P., Phillippy, A. M. & Eichler, E. E. StainedGlass: interactive visualization of massive tandem repeat structures with identity heatmaps. Bioinformatics 38, 2049–2051 (2022).
pubmed: 35020798
pmcid: 8963321
doi: 10.1093/bioinformatics/btac018
Richard, F. & Dutrillaux, B. Origin of human chromosome 21 and its consequences: a 50-million-year-old story. Chromosome Res. 6, 263–268 (1998).
pubmed: 9688515
doi: 10.1023/A:1009262622325
McConkey, E. H. Orthologous numbering of great ape and human chromosomes is essential for comparative genomics. Cytogenet. Genome Res. 105, 157–158 (2004).
pubmed: 15218271
doi: 10.1159/000078022
Huddleston, J. et al. Reconstructing complex regions of genomes using long-read sequencing technology. Genome Res. 24, 688–696 (2014).
pubmed: 24418700
pmcid: 3975067
doi: 10.1101/gr.168450.113
Baid, G. et al. DeepConsensus improves the accuracy of sequences with a gap-aware sequence transformer. Nat. Biotechnol. 41, 232–238 (2023).
pubmed: 36050551
Logsdon, G. A. HMW gDNA purification and ONT ultra-long-read data generation. Protocols.io https://doi.org/10.17504/protocols.io.bchhit36 (2020).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
pubmed: 29750242
pmcid: 6137996
doi: 10.1093/bioinformatics/bty191
Jain, C. et al. Weighted minimizer sampling improves long read mapping. Bioinformatics 36, i111–i118 (2020).
pubmed: 32657365
pmcid: 7355284
doi: 10.1093/bioinformatics/btaa435
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
pubmed: 21221095
pmcid: 3346182
doi: 10.1038/nbt.1754
Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245 (2020).
pubmed: 32928274
pmcid: 7488777
doi: 10.1186/s13059-020-02134-9
Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods 9, 676–682 (2012).
pubmed: 22743772
doi: 10.1038/nmeth.2019
Potapova, T. A. et al. Karyotyping human and mouse cells using probes from single-sorted chromosomes and open source software. BioTechniques 59, 335–346 (2015).
pubmed: 26651513
doi: 10.2144/000114362
Falconer, E. et al. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution. Nat. Methods 9, 1107–1112 (2012).
pubmed: 23042453
pmcid: 3580294
doi: 10.1038/nmeth.2206
Sanders, A. D., Falconer, E., Hills, M., Spierings, D. C. J. & Lansdorp, P. M. Single-cell template strand sequencing by Strand-seq enables the characterization of individual homologs. Nat. Protoc. 12, 1151–1176 (2017).
pubmed: 28492527
doi: 10.1038/nprot.2017.029
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics 26, 589–595 (2010).
pubmed: 20080505
pmcid: 2828108
doi: 10.1093/bioinformatics/btp698
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
pubmed: 19505943
pmcid: 2723002
doi: 10.1093/bioinformatics/btp352
Tarasov, A., Vilella, A. J., Cuppen, E., Nijman, I. J. & Prins, P. Sambamba: fast processing of NGS alignment formats. Bioinformatics 31, 2032–2034 (2015).
pubmed: 25697820
pmcid: 4765878
doi: 10.1093/bioinformatics/btv098
Porubsky, D. et al. breakpointR: an R/Bioconductor package to localize strand state changes in strand-seq data. Bioinformatics 36, 1260–1261 (2020).
pubmed: 31504176
doi: 10.1093/bioinformatics/btz681
Porubsky, D. et al. Direct chromosome-length haplotyping by single-cell sequencing. Genome Res. 26, 1565–1574 (2016).
pubmed: 27646535
pmcid: 5088598
doi: 10.1101/gr.209841.116
Bakker, B. et al. Single-cell sequencing reveals karyotype heterogeneity in murine and human malignancies. Genome Biol. 17, 115 (2016).
pubmed: 27246460
pmcid: 4888588
doi: 10.1186/s13059-016-0971-7
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
pubmed: 20110278
pmcid: 2832824
doi: 10.1093/bioinformatics/btq033
Smit, A. F. A., Hubley, R. & Green, P. RepeatMasker Open-4.0 (2013).
Wickham, H. Ggplot2: Elegant Graphics for Data Analysis (Springer, 2009).
McNulty, S. M. & Sullivan, B. A. Alpha satellite DNA biology: finding function in the recesses of the genome. Chromosome Res. 26, 115–138 (2018).
pubmed: 29974361
pmcid: 6121732
doi: 10.1007/s10577-018-9582-3
R Core Team. R: a Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2020).
Loman, N. J., Quick, J. & Simpson, J. T. A complete bacterial genome assembled de novo using only nanopore sequencing data. Nat. Methods 12, 733–735 (2015).
pubmed: 26076426
doi: 10.1038/nmeth.3444
Lee, I. et al. Simultaneous profiling of chromatin accessibility and methylation on human cell lines with nanopore sequencing. Nat. Methods 17, 1191–1199 (2020).
pubmed: 33230324
pmcid: 7704922
doi: 10.1038/s41592-020-01000-7
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.Journal 17, 10–12 (2011).
doi: 10.14806/ej.17.1.200
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at arxiv.org/abs/1303.3997 (2013).
Ramírez, F., Dündar, F., Diehl, S., Grüning, B. A. & Manke, T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 42, W187–W191 (2014).
pubmed: 24799436
pmcid: 4086134
doi: 10.1093/nar/gku365
Ventura, M. et al. The evolution of African great ape subtelomeric heterochromatin and the fusion of human chromosome 2. Genome Res. 22, 1036–1049 (2012).
pubmed: 22419167
pmcid: 3371704
doi: 10.1101/gr.136556.111
Earnshaw, W. C. & Tomkiel, J. E. Centromere and kinetochore structure. Curr. Opin. Cell Biol. 4, 86–93 (1992).
pubmed: 1558757
doi: 10.1016/0955-0674(92)90063-I
Lichter, P. et al. High-resolution mapping of human chromosome 11 by in situ hybridization with cosmid clones. Science 247, 64–69 (1990).
pubmed: 2294592
doi: 10.1126/science.2294592
Dvorkina, T., Bzikadze, A. V. & Pevzner, P. A. The string decomposition problem and its applications to centromere analysis and assembly. Bioinformatics 36, i93–i101 (2020).
pubmed: 32657390
pmcid: 7428072
doi: 10.1093/bioinformatics/btaa454
Glazko, G. V. & Nei, M. Estimation of divergence times for major lineages of primate species. Mol. Biol. Evol. 20, 424–434 (2003).
pubmed: 12644563
doi: 10.1093/molbev/msg050
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
pubmed: 23329690
pmcid: 3603318
doi: 10.1093/molbev/mst010
Nakamura, T., Yamada, K. D., Tomii, K. & Katoh, K. Parallelization of MAFFT for large-scale multiple sequence alignments. Bioinformatics 34, 2490–2492 (2018).
pubmed: 29506019
pmcid: 6041967
doi: 10.1093/bioinformatics/bty121
Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
pubmed: 25371430
doi: 10.1093/molbev/msu300
Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 23, 127–128 (2007).
pubmed: 17050570
doi: 10.1093/bioinformatics/btl529
Tamura, K. & Nei, M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10, 512–526 (1993).
pubmed: 8336541
Kimura, M. The Neutral Theory of Molecular Evolution (Cambridge Univ. Press, 1983).
Porubsky, D. & Lansdorp, P. The variation and evolution of complete human centromeres. Zenodo https://doi.org/10.5281/zenodo.7959305 (2022).
Logsdon, G. A., Rozanski, A. N., Harvey, W. H. & Eichler, E. E. SUNK-based contig scaffolding pipeline. GitHub github.com/arozanski97/SUNK-based-contig-scaffolding (2023).
Logsdon, G. A., Rozanski, A. N., Harvey, W. H., Mastrorosa, F. K. & Eichler, E. E. CDR-Finder. GitHub github.com/arozanski97/CDR-Finder (2023).
Guarracino, A. et al. Recombination between heterologous human acrocentric chromosomes. Nature 617, 335–343 (2023).
pubmed: 37165241
pmcid: 10172130
doi: 10.1038/s41586-023-05976-y