The variation and evolution of complete human centromeres.


Journal

Nature
ISSN: 1476-4687
Abbreviated title: Nature
Country: England
NLM ID: 0410462

Publication information

Publication date:
03 Apr 2024
History:
received: 29 05 2023
accepted: 07 03 2024
pubmed: 4 4 2024
medline: 4 4 2024
entrez: 3 4 2024
Status: aheadofprint

Abstract

Human centromeres have been traditionally very difficult to sequence and assemble owing to their repetitive nature and large size.

Identifiers

pubmed: 38570684
doi: 10.1038/s41586-024-07278-3
pii: 10.1038/s41586-024-07278-3

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Grants

Agency: NIGMS NIH HHS
ID: K99 GM147352
Country: United States
Agency: NCI NIH HHS
ID: R01 CA266339
Country: United States
Agency: NHGRI NIH HHS
ID: R01 HG010169
Country: United States

Comments and corrections

Type: UpdateOf

Copyright information

© 2024. The Author(s).

References

Willard, H. F. Chromosome-specific organization of human alpha satellite DNA. Am. J. Hum. Genet. 37, 524–532 (1985).
pubmed: 2988334 pmcid: 1684601
Alexandrov, I., Kazakov, A., Tumeneva, I., Shepelev, V. & Yurov, Y. Alpha-satellite DNA of primates: old and new families. Chromosoma 110, 253–266 (2001).
pubmed: 11534817 doi: 10.1007/s004120100146
Henikoff, S., Ahmad, K. & Malik, H. S. The centromere paradox: stable inheritance with rapidly evolving DNA. Science 293, 1098–1102 (2001).
pubmed: 11498581 doi: 10.1126/science.1062939
Nurk, S. et al. The complete sequence of a human genome. Science 376, 44–53 (2022).
pubmed: 35357919 pmcid: 9186530 doi: 10.1126/science.abj6987
Altemose, N. et al. Complete genomic and epigenetic maps of human centromeres. Science 376, eabl4178 (2022).
pubmed: 35357911 pmcid: 9233505 doi: 10.1126/science.abl4178
Chaisson, M. J. P. et al. Resolving the complexity of the human genome using single-molecule sequencing. Nature 517, 608–611 (2015).
pubmed: 25383537 doi: 10.1038/nature13907
Nurk, S. et al. HiCanu: accurate assembly of segmental duplications, satellites, and allelic variants from high-fidelity long reads. Genome Res. https://doi.org/10.1101/gr.263566.120 (2020).
Vollger, M. R. et al. Segmental duplications and their variation in a complete human genome. Science 376, eabj6965 (2022).
pubmed: 35357917 pmcid: 8979283 doi: 10.1126/science.abj6965
Steinberg, K. M. et al. Single haplotype assembly of the human genome from a hydatidiform mole. Genome Res. 24, 2066–2076 (2014).
pubmed: 25373144 pmcid: 4248323 doi: 10.1101/gr.180893.114
Liao, W.-W. et al. A draft human pangenome reference. Nature 617, 312–324 (2023).
pubmed: 37165242 pmcid: 10172123 doi: 10.1038/s41586-023-05896-x
Porubsky, D. et al. Inversion polymorphism in a complete human genome assembly. Genome Biol. 24, 100 (2023).
pubmed: 37122002 pmcid: 10150506 doi: 10.1186/s13059-023-02919-8
Logsdon, G. A. & Eichler, E. E. The dynamic structure and rapid evolution of human centromeric satellite DNA. Genes 14, 92 (2023).
doi: 10.3390/genes14010092
Archidiacono, N. et al. Comparative mapping of human alphoid sequences in great apes using fluorescence in situ hybridization. Genomics 25, 477–484 (1995).
pubmed: 7789981 doi: 10.1016/0888-7543(95)80048-Q
Cechova, M. et al. High satellite repeat turnover in great apes studied with short- and long-read technologies. Mol. Biol. Evol. 36, 2415–2431 (2019).
pubmed: 31273383 pmcid: 6805231 doi: 10.1093/molbev/msz156
Miga, K. H. & Alexandrov, I. A. Variation and evolution of human centromeres: a field guide and perspective. Annu. Rev. Genet. 55, 583–602 (2021).
pubmed: 34813350 pmcid: 9549924 doi: 10.1146/annurev-genet-071719-020519
Willard, H. F., Wevrick, R. & Warburton, P. E. Human centromere structure: organization and potential role of alpha satellite DNA. Prog. Clin. Biol. Res. 318, 9–18 (1989).
pubmed: 2696978
Wu, J. C. & Manuelidis, L. Sequence definition and organization of a human repeated DNA. J. Mol. Biol. 142, 363–386 (1980).
pubmed: 6257909 doi: 10.1016/0022-2836(80)90277-6
Alkan, C. et al. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data. PLoS Comput. Biol. 3, 1807–1818 (2007).
pubmed: 17907796 doi: 10.1371/journal.pcbi.0030181
Alkan, C. et al. Genome-wide characterization of centromeric satellites from multiple mammalian genomes. Genome Res. 21, 137–145 (2011).
pubmed: 21081712 pmcid: 3012921 doi: 10.1101/gr.111278.110
Miga, K. H. et al. Telomere-to-telomere assembly of a complete human X chromosome. Nature 585, 79–84 (2020).
pubmed: 32663838 pmcid: 7484160 doi: 10.1038/s41586-020-2547-7
Logsdon, G. A. et al. The structure, function and evolution of a complete human chromosome 8. Nature 593, 101–107 (2021).
pubmed: 33828295 pmcid: 8099727 doi: 10.1038/s41586-021-03420-7
Vollger, M. R. et al. Long-read sequence and assembly of segmental duplications. Nat. Methods 16, 88–94 (2019).
pubmed: 30559433 doi: 10.1038/s41592-018-0236-3
Ebert, P. et al. Haplotype-resolved diverse human genomes and integrated analysis of structural variation. Science 372, eabf7117 (2021).
pubmed: 33632895 pmcid: 8026704 doi: 10.1126/science.abf7117
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
pubmed: 33526886 pmcid: 7961889 doi: 10.1038/s41592-020-01056-5
Aldrup-MacDonald, M. E., Kuo, M. E., Sullivan, L. L., Chew, K. & Sullivan, B. A. Genomic variation within alpha satellite DNA influences centromere location on human chromosomes with metastable epialleles. Genome Res. 26, 1301–1311 (2016).
pubmed: 27510565 pmcid: 5052062 doi: 10.1101/gr.206706.116
Mahtani, M. M. & Willard, H. F. A primary genetic map of the pericentromeric region of the human X chromosome. Genomics 2, 294–301 (1988).
pubmed: 2906040 doi: 10.1016/0888-7543(88)90017-1
Bzikadze, A. V., Mikheenko, A. & Pevzner, P. A. Fast and accurate mapping of long reads to complete genome assemblies with VerityMap. Genome Res. https://doi.org/10.1101/gr.276871.122 (2022).
Dishuck, P. C., Rozanski, A. N., Logsdon, G. A., Porubsky, D. & Eichler, E. E. GAVISUNK: genome assembly validation via inter-SUNK distances in Oxford Nanopore reads. Bioinformatics https://doi.org/10.1093/bioinformatics/btac714 (2022).
Rautiainen, M. et al. Telomere-to-telomere assembly of diploid chromosomes with Verkko. Nat. Biotechnol. https://doi.org/10.1038/s41587-023-01662-6 (2023).
Bzikadze, A. V. & Pevzner, P. A. TandemAligner: a new parameter-free framework for fast sequence alignment. Preprint at bioRxiv https://doi.org/10.1101/2022.09.15.507041 (2022).
Gershman, A. et al. Epigenetic patterns in a complete human genome. Science 376, eabj5089 (2022).
pubmed: 35357915 pmcid: 9170183 doi: 10.1126/science.abj5089
Stimpson, K. M., Matheny, J. E. & Sullivan, B. A. Dicentric chromosomes: unique models to study centromere function and inactivation. Chromosome Res. 20, 595–605 (2012).
pubmed: 22801777 pmcid: 3557915 doi: 10.1007/s10577-012-9302-3
Sullivan, B. A. & Willard, H. F. Stable dicentric X chromosomes with two functional centromeres. Nat. Genet. 20, 227–228 (1998).
pubmed: 9806536 doi: 10.1038/3024
Shepelev, V. A., Alexandrov, A. A., Yurov, Y. B. & Alexandrov, I. A. The evolutionary origin of man can be traced in the layers of defunct ancestral alpha satellites flanking the active centromeres of human chromosomes. PLoS Genet. 5, e1000641 (2009).
pubmed: 19749981 pmcid: 2729386 doi: 10.1371/journal.pgen.1000641
Pike, L. M., Carlisle, A., Newell, C., Hong, S.-B. & Musich, P. R. Sequence and evolution of rhesus monkey alphoid DNA. J. Mol. Evol. 23, 127–137 (1986).
pubmed: 3018269 doi: 10.1007/BF02099907
Alexandrov, I. A., Mitkevich, S. P. & Yurov, Y. B. The phylogeny of human chromosome specific alpha satellites. Chromosoma 96, 443–453 (1988).
pubmed: 3219915 doi: 10.1007/BF00303039
Hughes, J. F., Skaletsky, H. & Page, D. C. ALRY-MAJOR:PT: Major Repeat Unit of Chimpanzee Alpha Repetitive DNA from the Y Chromosome Centromere—A Consensus (Repbase, accessed 28 May 2023); http://www.girinst.org/ .
Plohl, M., Luchetti, A., Meštrović, N. & Mantovani, B. Satellite DNAs between selfishness and functionality: structure, genomics and evolution of tandem repeats in centromeric (hetero)chromatin. Gene 409, 72–82 (2008).
pubmed: 18182173 doi: 10.1016/j.gene.2007.11.013
Amor, D. J. et al. Human centromere repositioning ‘in progress’. Proc. Natl Acad. Sci. USA 101, 6542–6547 (2004).
pubmed: 15084747 pmcid: 404081 doi: 10.1073/pnas.0308637101
Wlodzimierz, P. et al. Cycles of satellite and transposon evolution in Arabidopsis centromeres. Nature 618, 557–565 (2023).
pubmed: 37198485 doi: 10.1038/s41586-023-06062-z
Iwata-Otsubo, A. et al. Expanded satellite repeats amplify a discrete CENP-A nucleosome assembly site on chromosomes that drive in female meiosis. Curr. Biol. 27, 2365–2373 (2017).
pubmed: 28756949 pmcid: 5567862 doi: 10.1016/j.cub.2017.06.069
Akera, T. et al. Spindle asymmetry drives non-Mendelian chromosome segregation. Science 358, 668–672 (2017).
pubmed: 29097549 pmcid: 5906099 doi: 10.1126/science.aan0092
Akera, T., Trimm, E. & Lampson, M. A. Molecular strategies of meiotic cheating by selfish centromeres. Cell 178, 1132–1144 (2019).
pubmed: 31402175 pmcid: 6731994 doi: 10.1016/j.cell.2019.07.001
Vollger, M. R., Kerpedjiev, P., Phillippy, A. M. & Eichler, E. E. StainedGlass: interactive visualization of massive tandem repeat structures with identity heatmaps. Bioinformatics 38, 2049–2051 (2022).
pubmed: 35020798 pmcid: 8963321 doi: 10.1093/bioinformatics/btac018
Richard, F. & Dutrillaux, B. Origin of human chromosome 21 and its consequences: a 50-million-year-old story. Chromosome Res. 6, 263–268 (1998).
pubmed: 9688515 doi: 10.1023/A:1009262622325
McConkey, E. H. Orthologous numbering of great ape and human chromosomes is essential for comparative genomics. Cytogenet. Genome Res. 105, 157–158 (2004).
pubmed: 15218271 doi: 10.1159/000078022
Huddleston, J. et al. Reconstructing complex regions of genomes using long-read sequencing technology. Genome Res. 24, 688–696 (2014).
pubmed: 24418700 pmcid: 3975067 doi: 10.1101/gr.168450.113
Baid, G. et al. DeepConsensus improves the accuracy of sequences with a gap-aware sequence transformer. Nat. Biotechnol. 41, 232–238 (2023).
pubmed: 36050551
Logsdon, G. A. HMW gDNA purification and ONT ultra-long-read data generation. Protocols.io https://doi.org/10.17504/protocols.io.bchhit36 (2020).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
pubmed: 29750242 pmcid: 6137996 doi: 10.1093/bioinformatics/bty191
Jain, C. et al. Weighted minimizer sampling improves long read mapping. Bioinformatics 36, i111–i118 (2020).
pubmed: 32657365 pmcid: 7355284 doi: 10.1093/bioinformatics/btaa435
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
pubmed: 21221095 pmcid: 3346182 doi: 10.1038/nbt.1754
Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 245 (2020).
pubmed: 32928274 pmcid: 7488777 doi: 10.1186/s13059-020-02134-9
Schindelin, J. et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods 9, 676–682 (2012).
pubmed: 22743772 doi: 10.1038/nmeth.2019
Potapova, T. A. et al. Karyotyping human and mouse cells using probes from single-sorted chromosomes and open source software. BioTechniques 59, 335–346 (2015).
pubmed: 26651513 doi: 10.2144/000114362
Falconer, E. et al. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution. Nat. Methods 9, 1107–1112 (2012).
pubmed: 23042453 pmcid: 3580294 doi: 10.1038/nmeth.2206
Sanders, A. D., Falconer, E., Hills, M., Spierings, D. C. J. & Lansdorp, P. M. Single-cell template strand sequencing by Strand-seq enables the characterization of individual homologs. Nat. Protoc. 12, 1151–1176 (2017).
pubmed: 28492527 doi: 10.1038/nprot.2017.029
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics 26, 589–595 (2010).
pubmed: 20080505 pmcid: 2828108 doi: 10.1093/bioinformatics/btp698
Li, H. et al. The sequence alignment/map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
pubmed: 19505943 pmcid: 2723002 doi: 10.1093/bioinformatics/btp352
Tarasov, A., Vilella, A. J., Cuppen, E., Nijman, I. J. & Prins, P. Sambamba: fast processing of NGS alignment formats. Bioinformatics 31, 2032–2034 (2015).
pubmed: 25697820 pmcid: 4765878 doi: 10.1093/bioinformatics/btv098
Porubsky, D. et al. breakpointR: an R/Bioconductor package to localize strand state changes in strand-seq data. Bioinformatics 36, 1260–1261 (2020).
pubmed: 31504176 doi: 10.1093/bioinformatics/btz681
Porubsky, D. et al. Direct chromosome-length haplotyping by single-cell sequencing. Genome Res. 26, 1565–1574 (2016).
pubmed: 27646535 pmcid: 5088598 doi: 10.1101/gr.209841.116
Bakker, B. et al. Single-cell sequencing reveals karyotype heterogeneity in murine and human malignancies. Genome Biol. 17, 115 (2016).
pubmed: 27246460 pmcid: 4888588 doi: 10.1186/s13059-016-0971-7
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
pubmed: 20110278 pmcid: 2832824 doi: 10.1093/bioinformatics/btq033
Smit, A. F. A., Hubley, R. & Green, P. RepeatMasker Open-4.0 (2013).
Wickham, H. Ggplot2: Elegant Graphics for Data Analysis (Springer, 2009).
McNulty, S. M. & Sullivan, B. A. Alpha satellite DNA biology: finding function in the recesses of the genome. Chromosome Res. 26, 115–138 (2018).
pubmed: 29974361 pmcid: 6121732 doi: 10.1007/s10577-018-9582-3
R Core Team. R: a Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2020).
Loman, N. J., Quick, J. & Simpson, J. T. A complete bacterial genome assembled de novo using only nanopore sequencing data. Nat. Methods 12, 733–735 (2015).
pubmed: 26076426 doi: 10.1038/nmeth.3444
Lee, I. et al. Simultaneous profiling of chromatin accessibility and methylation on human cell lines with nanopore sequencing. Nat. Methods 17, 1191–1199 (2020).
pubmed: 33230324 pmcid: 7704922 doi: 10.1038/s41592-020-01000-7
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.Journal 17, 10–12 (2011).
doi: 10.14806/ej.17.1.200
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at arxiv.org/abs/1303.3997 (2013).
Ramírez, F., Dündar, F., Diehl, S., Grüning, B. A. & Manke, T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 42, W187–W191 (2014).
pubmed: 24799436 pmcid: 4086134 doi: 10.1093/nar/gku365
Ventura, M. et al. The evolution of African great ape subtelomeric heterochromatin and the fusion of human chromosome 2. Genome Res. 22, 1036–1049 (2012).
pubmed: 22419167 pmcid: 3371704 doi: 10.1101/gr.136556.111
Earnshaw, W. C. & Tomkiel, J. E. Centromere and kinetochore structure. Curr. Opin. Cell Biol. 4, 86–93 (1992).
pubmed: 1558757 doi: 10.1016/0955-0674(92)90063-I
Lichter, P. et al. High-resolution mapping of human chromosome 11 by in situ hybridization with cosmid clones. Science 247, 64–69 (1990).
pubmed: 2294592 doi: 10.1126/science.2294592
Dvorkina, T., Bzikadze, A. V. & Pevzner, P. A. The string decomposition problem and its applications to centromere analysis and assembly. Bioinformatics 36, i93–i101 (2020).
pubmed: 32657390 pmcid: 7428072 doi: 10.1093/bioinformatics/btaa454
Glazko, G. V. & Nei, M. Estimation of divergence times for major lineages of primate species. Mol. Biol. Evol. 20, 424–434 (2003).
pubmed: 12644563 doi: 10.1093/molbev/msg050
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
pubmed: 23329690 pmcid: 3603318 doi: 10.1093/molbev/mst010
Nakamura, T., Yamada, K. D., Tomii, K. & Katoh, K. Parallelization of MAFFT for large-scale multiple sequence alignments. Bioinformatics 34, 2490–2492 (2018).
pubmed: 29506019 pmcid: 6041967 doi: 10.1093/bioinformatics/bty121
Nguyen, L.-T., Schmidt, H. A., von Haeseler, A. & Minh, B. Q. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274 (2015).
pubmed: 25371430 doi: 10.1093/molbev/msu300
Letunic, I. & Bork, P. Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 23, 127–128 (2007).
pubmed: 17050570 doi: 10.1093/bioinformatics/btl529
Tamura, K. & Nei, M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol. Biol. Evol. 10, 512–526 (1993).
pubmed: 8336541
Kimura, M. The Neutral Theory of Molecular Evolution (Cambridge Univ. Press, 1983).
Porubsky, D. & Lansdorp, P. The variation and evolution of complete human centromeres. Zenodo https://doi.org/10.5281/zenodo.7959305 (2022).
Logsdon, G. A., Rozanski, A. N., Harvey, W. H. & Eichler, E. E. SUNK-based contig scaffolding pipeline. GitHub github.com/arozanski97/SUNK-based-contig-scaffolding (2023).
Logsdon, G. A., Rozanski, A. N., Harvey, W. H., Mastrorosa, F. K. & Eichler, E. E. CDR-Finder. GitHub github.com/arozanski97/CDR-Finder (2023).
Guarracino, A. et al. Recombination between heterologous human acrocentric chromosomes. Nature 617, 335–343 (2023).
pubmed: 37165241 pmcid: 10172130 doi: 10.1038/s41586-023-05976-y

Authors

Glennis A Logsdon (GA)

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
Department of Genetics, Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Allison N Rozanski (AN)

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.

Fedor Ryabov (F)

Masters Program in National Research University Higher School of Economics, Moscow, Russia.

Tamara Potapova (T)

Stowers Institute for Medical Research, Kansas City, MO, USA.

Valery A Shepelev (VA)

Institute of Molecular Genetics, Moscow, Russia.

Claudia R Catacchio (CR)

Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy.

David Porubsky (D)

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.

Yafei Mao (Y)

Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China.

DongAhn Yoo (D)

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.

Mikko Rautiainen (M)

Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
Institute for Molecular Medicine Finland (FIMM), Helsinki Institute of Life Science (HiLIFE), University of Helsinki, Helsinki, Finland.

Sergey Koren (S)

Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.

Sergey Nurk (S)

Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
Oxford Nanopore Technologies, Oxford, United Kingdom.

Julian K Lucas (JK)

Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA, USA.
UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA.

Kendra Hoekzema (K)

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.

Katherine M Munson (KM)

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.

Jennifer L Gerton (JL)

Stowers Institute for Medical Research, Kansas City, MO, USA.

Adam M Phillippy (AM)

Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.

Mario Ventura (M)

Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, Bari, Italy.

Ivan A Alexandrov (IA)

Department of Human Molecular Genetics and Biochemistry, Tel Aviv University, Tel Aviv, Israel.
Department of Anatomy and Anthropology, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel.
Dan David Center for Human Evolution and Biohistory Research, Tel Aviv University, Tel Aviv, Israel.

Evan E Eichler (EE)

Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA. eee@gs.washington.edu.
Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA. eee@gs.washington.edu.

MeSH classifications