Identification of constrained sequence elements across 239 primate genomes.
Journal
Nature
ISSN: 1476-4687
Abbreviated title: Nature
Country: England
NLM ID: 0410462
Publication information
Publication date: 29 Nov 2023
History:
received: 09 Feb 2023
accepted: 30 Oct 2023
pubmed: 30 Nov 2023
medline: 30 Nov 2023
entrez: 29 Nov 2023
Status:
aheadofprint
Abstract
Noncoding DNA is central to our understanding of human gene regulation and complex diseases.
Identifiers
pubmed: 38030727
doi: 10.1038/s41586-023-06798-8
pii: 10.1038/s41586-023-06798-8
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Copyright information
© 2023. The Author(s).
References
Albert, F. W. & Kruglyak, L. The role of regulatory variation in complex traits and disease. Nat. Rev. Genet. 16, 197–212 (2015).
pubmed: 25707927
doi: 10.1038/nrg3891
Lappalainen, T. & MacArthur, D. G. From variant to function in human disease genetics. Science 373, 1464–1468 (2021).
pubmed: 34554789
doi: 10.1126/science.abi8207
Dermitzakis, E. T. & Clark, A. G. Evolution of transcription factor binding sites in mammalian gene regulatory regions: conservation and turnover. Mol. Biol. Evol. 19, 1114–1121 (2002).
pubmed: 12082130
doi: 10.1093/oxfordjournals.molbev.a004169
Thomas, J. W. et al. Comparative analyses of multi-species sequences from targeted genomic regions. Nature 424, 788–793 (2003).
pubmed: 12917688
doi: 10.1038/nature01858
Boffelli, D., Nobrega, M. A. & Rubin, E. M. Comparative genomics at the vertebrate extremes. Nat. Rev. Genet. 5, 456–465 (2004).
pubmed: 15153998
doi: 10.1038/nrg1350
Margulies, E. H. et al. Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome. Genome Res. 17, 760–774 (2007).
pubmed: 17567995
pmcid: 1891336
doi: 10.1101/gr.6034307
Sullivan, P. F. et al. Leveraging base-pair mammalian constraint to understand genetic variation and human disease. Science 380, eabn2937 (2023).
pubmed: 37104612
pmcid: 10259825
doi: 10.1126/science.abn2937
Roadmap Epigenomics Consortium. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
pmcid: 4530010
doi: 10.1038/nature14248
ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
doi: 10.1038/nature11247
King, M. C. & Wilson, A. C. Evolution at two levels in humans and chimpanzees. Science 188, 107–116 (1975).
pubmed: 1090005
doi: 10.1126/science.1090005
Kuderna, L. F. K. et al. A global catalog of whole-genome diversity from 233 primate species. Science 380, 906–913 (2023).
pubmed: 37262161
doi: 10.1126/science.abn7829
Juan, D., Santpere, G., Kelley, J. L., Cornejo, O. E. & Marques-Bonet, T. Current advances in primate genomics: novel approaches for understanding evolution and disease. Nat. Rev. Genet. 24, 314–331 (2023).
pubmed: 36599936
doi: 10.1038/s41576-022-00554-w
Boffelli, D. et al. Phylogenetic shadowing of primate sequences to find functional regions of the human genome. Science 299, 1391–1394 (2003).
doi: 10.1126/science.1081331
Gilad, Y., Oshlack, A., Smyth, G. K., Speed, T. P. & White, K. P. Expression profiling in primates reveals a rapid evolution of human transcription factors. Nature 440, 242–245 (2006).
pubmed: 16525476
doi: 10.1038/nature04559
Orkin, J. D., Kuderna, L. F. K. & Marques-Bonet, T. The diversity of primates: from biomedicine to conservation genomics. Annu. Rev. Anim. Biosci. 9, 103–124 (2021).
pubmed: 33197208
doi: 10.1146/annurev-animal-061220-023138
Sousa, A. M. M., Meyer, K. A., Santpere, G., Gulden, F. O. & Sestan, N. Evolution of the human nervous system function, structure, and development. Cell 170, 226–247 (2017).
pubmed: 28708995
pmcid: 5647789
doi: 10.1016/j.cell.2017.06.036
Lindblad-Toh, K. et al. A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478, 476–482 (2011).
pubmed: 21993624
pmcid: 3207357
doi: 10.1038/nature10530
Christmas, M. J. et al. Evolutionary constraint and innovation across hundreds of placental mammals. Science 380, eabn3943 (2023).
pubmed: 37104599
pmcid: 10250106
doi: 10.1126/science.abn3943
Wilson, D. E. & Reeder, D. M. Mammal Species of the World: A Taxonomic and Geographic Reference (JHU Press, 2005).
Zoonomia Consortium. A comparative genomics multitool for scientific discovery and conservation. Nature 587, 240–245 (2020).
doi: 10.1038/s41586-020-2876-6
Armstrong, J. et al. Progressive Cactus is a multiple-genome aligner for the thousand-genome era. Nature 587, 246–251 (2020).
pubmed: 33177663
pmcid: 7673649
doi: 10.1038/s41586-020-2871-y
Sørensen, E. F. et al. Genome-wide coancestry reveals details of ancient and recent male-driven reticulation in baboons. Science 380, eabn8153 (2023).
pubmed: 37262153
doi: 10.1126/science.abn8153
Gao, H. et al. The landscape of tolerated genetic variation in humans and primates. Science 380, eabn8197 (2023).
pubmed: 37262156
doi: 10.1126/science.abn8197
Fiziev, P. P. et al. Rare penetrant mutations confer severe risk of common diseases. Science 380, eabo1131 (2023).
pubmed: 37262146
doi: 10.1126/science.abo1131
Li, D., Liu, C.-M., Luo, R., Sadakane, K. & Lam, T.-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674–1676 (2015).
pubmed: 25609793
doi: 10.1093/bioinformatics/btv033
Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R. & Siepel, A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 20, 110–121 (2010).
pubmed: 19858363
pmcid: 2798823
doi: 10.1101/gr.097857.109
Frankish, A., Diekhans, M., Jungreis, I. & Lagarde, J. GENCODE 2021. Nucleic Acids Res. 49, D916–D923 (2021).
pubmed: 33270111
doi: 10.1093/nar/gkaa1087
Pan, Q. et al. Alternative splicing of conserved exons is frequently species-specific in human and mouse. Trends Genet. 21, 73–77 (2005).
pubmed: 15661351
doi: 10.1016/j.tig.2004.12.004
Merkin, J., Russell, C., Chen, P. & Burge, C. B. Evolutionary dynamics of gene and isoform regulation in mammalian tissues. Science 338, 1593–1599 (2012).
pubmed: 23258891
pmcid: 3568499
doi: 10.1126/science.1228186
Xiong, J. et al. Predominant patterns of splicing evolution on human, chimpanzee and macaque evolutionary lineages. Hum. Mol. Genet. 27, 1474–1485 (2018).
pubmed: 29452398
doi: 10.1093/hmg/ddy058
Suntsova, M. V. & Buzdin, A. A. Differences between human and chimpanzee genomes and their implications in gene expression, protein functions and biochemical properties of the two species. BMC Genomics 21, 535 (2020).
pubmed: 32912141
pmcid: 7488140
doi: 10.1186/s12864-020-06962-8
Kondrashov, F. A. & Koonin, E. V. Origin of alternative splicing by tandem exon duplication. Hum. Mol. Genet. 10, 2661–2669 (2001).
pubmed: 11726553
doi: 10.1093/hmg/10.23.2661
Mikkelsen, T. S. et al. Comparative epigenomic analysis of murine and human adipogenesis. Cell 143, 156–169 (2010).
pubmed: 20887899
pmcid: 2950833
doi: 10.1016/j.cell.2010.09.006
Odom, D. T. et al. Tissue-specific transcriptional regulation has diverged significantly between human and mouse. Nat. Genet. 39, 730–732 (2007).
pubmed: 17529977
pmcid: 3797512
doi: 10.1038/ng2047
Ward, L. D. & Kellis, M. Evidence of abundant purifying selection in humans for recently acquired regulatory functions. Science 337, 1675–1678 (2012).
pubmed: 22956687
pmcid: 4104271
doi: 10.1126/science.1225057
Necsulea, A. & Kaessmann, H. Evolutionary dynamics of coding and non-coding transcriptomes. Nat. Rev. Genet. 15, 734–748 (2014).
pubmed: 25297727
doi: 10.1038/nrg3802
Villar, D. et al. Enhancer evolution across 20 mammalian species. Cell 160, 554–566 (2015).
pubmed: 25635462
pmcid: 4313353
doi: 10.1016/j.cell.2015.01.006
Vierstra, J. et al. Global reference mapping of human transcription factor footprints. Nature 583, 729–736 (2020).
pubmed: 32728250
pmcid: 7410829
doi: 10.1038/s41586-020-2528-x
Fong, S. L. & Capra, J. A. Modeling the evolutionary architectures of transcribed human enhancer sequences reveals distinct origins, functions, and associations with human trait variation. Mol. Biol. Evol. 38, 3681–3696 (2021).
pubmed: 33973014
pmcid: 8382917
doi: 10.1093/molbev/msab138
Kircher, M. et al. Saturation mutagenesis of twenty disease-associated regulatory elements at single base-pair resolution. Nat. Commun. 10, 3583 (2019).
pubmed: 31395865
pmcid: 6687891
doi: 10.1038/s41467-019-11526-w
Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).
pubmed: 34608324
pmcid: 8490152
doi: 10.1038/s41592-021-01252-x
Edsall, L. E. et al. Evaluating chromatin accessibility differences across multiple primate species using a joint modeling approach. Genome Biol. Evol. 11, 3035–3053 (2019).
pubmed: 31599933
pmcid: 6821351
doi: 10.1093/gbe/evz218
Reilly, S. K. et al. Evolutionary changes in promoter and enhancer activity during human corticogenesis. Science 347, 1155–1159 (2015).
pubmed: 25745175
pmcid: 4426903
doi: 10.1126/science.1260943
Drake, J. A. et al. Conserved noncoding sequences are selectively constrained and not mutation cold spots. Nat. Genet. 38, 223–227 (2006).
pubmed: 16380714
doi: 10.1038/ng1710
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
pubmed: 32461654
pmcid: 7334197
doi: 10.1038/s41586-020-2308-7
Chen, S. et al. A genome-wide mutational constraint map quantified from variation in 76,156 human genomes. Preprint at bioRxiv https://doi.org/10.1101/2022.03.20.485034 (2022).
Meuleman, W. et al. Index and biological spectrum of human DNase I hypersensitive sites. Nature 584, 244–251 (2020).
pmcid: 7422677
doi: 10.1038/s41586-020-2559-3
Cardoso-Moreira, M. et al. Gene expression across mammalian organ development. Nature 571, 505–509 (2019).
pubmed: 31243369
pmcid: 6658352
doi: 10.1038/s41586-019-1338-5
Pontis, J. et al. Primate-specific transposable elements shape transcriptional networks during human development. Nat. Commun. 13, 7178 (2022).
pubmed: 36418324
pmcid: 9684439
doi: 10.1038/s41467-022-34800-w
Nowick, K. et al. Gain, loss and divergence in primate zinc-finger genes: a rich resource for evolution of gene regulatory differences between species. PLoS ONE 6, e21553 (2011).
pubmed: 21738707
pmcid: 3126818
doi: 10.1371/journal.pone.0021553
Vierstra, J. et al. Mouse regulatory DNA landscapes reveal global principles of cis-regulatory evolution. Science 346, 1007–1012 (2014).
pubmed: 25411453
pmcid: 4337786
doi: 10.1126/science.1246426
Teslovich, T. M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713 (2010).
pubmed: 20686565
pmcid: 3039276
doi: 10.1038/nature09270
Cui, R. et al. Improving fine-mapping by modeling infinitesimal effects. Preprint at bioRxiv https://doi.org/10.1101/2022.10.21.513123 (2022).
Hardison, R. C. et al. Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res. 13, 13–26 (2003).
pubmed: 12529302
pmcid: 430971
doi: 10.1101/gr.844103
Kryukov, G. V., Pennacchio, L. A. & Sunyaev, S. R. Most rare missense alleles are deleterious in humans: implications for complex disease and association studies. Am. J. Hum. Genet. 80, 727–739 (2007).
pubmed: 17357078
pmcid: 1852724
doi: 10.1086/513473
Kuderna, L. F., Esteller-Cucala, P. & Marques-Bonet, T. Branching out: what omics can tell us about primate evolution. Curr. Opin. Genet. Dev. 62, 65–71 (2020).
pubmed: 32634683
doi: 10.1016/j.gde.2020.06.006
Shao, Y. et al. Phylogenomic analyses provide insights into primate evolution. Science 380, 913–924 (2023).
pubmed: 37262173
doi: 10.1126/science.abn6919
Li, H. New strategies to improve minimap2 alignment accuracy. Bioinformatics 37, 4572–4574 (2021).
pubmed: 34623391
pmcid: 8652018
doi: 10.1093/bioinformatics/btab705
Hubisz, M. J., Pollard, K. S. & Siepel, A. PHAST and RPHAST: phylogenetic analysis with space/time models. Brief. Bioinform. 12, 41–51 (2011).
pubmed: 21278375
doi: 10.1093/bib/bbq072
Storey, J. D. A direct approach to false discovery rates. J. R. Stat. Soc. B 64, 479–498 (2002).
doi: 10.1111/1467-9868.00346
The GTEx Consortium. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).
pmcid: 7737656
doi: 10.1126/science.aaz1776
Thomas, P. D. et al. PANTHER: Making genome-scale phylogenetics accessible to all. Protein Sci. 31, 8–22 (2022).
pubmed: 34717010
doi: 10.1002/pro.4218
Bycroft, C. et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 562, 203–209 (2018).
pubmed: 30305743
doi: 10.1038/s41586-018-0579-z
Kanai, M. et al. Insights from complex trait fine-mapping across diverse populations. Preprint at medRxiv https://doi.org/10.1101/2021.09.03.21262975 (2021).
Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
pubmed: 26773131
pmcid: 4866522
doi: 10.1093/bioinformatics/btw018
Benner, C., Havulinna, A. S., Salomaa, V., Ripatti, S. & Pirinen, M. Refining fine-mapping: effect sizes and regional heritability. Preprint at bioRxiv https://doi.org/10.1101/318618 (2018).
Wang, G., Sarkar, A., Carbonetto, P. & Stephens, M. A simple new approach to variable selection in regression, with application to genetic fine mapping. J. R. Stat. Soc. B 82, 1273–1300 (2020).
doi: 10.1111/rssb.12388
Benner, C. et al. Prospects of fine-mapping trait-associated genomic regions by using summary statistics from genome-wide association studies. Am. J. Hum. Genet. 101, 539–551 (2017).
pubmed: 28942963
pmcid: 5630179
doi: 10.1016/j.ajhg.2017.08.012
Stegle, O., Parts, L., Durbin, R. & Winn, J. A Bayesian framework to account for complex non-genetic factors in gene expression levels greatly increases power in eQTL studies. PLoS Comput. Biol. 6, e1000770 (2010).
pubmed: 20463871
pmcid: 2865505
doi: 10.1371/journal.pcbi.1000770
ENCODE Project Consortium. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).
doi: 10.1038/s41586-020-2493-4
McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol. 17, 122 (2016).
pubmed: 27268795
pmcid: 4893825
doi: 10.1186/s13059-016-0974-4
Finucane, H. K. et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nat. Genet. 47, 1228–1235 (2015).
pubmed: 26414678
pmcid: 4626285
doi: 10.1038/ng.3404
Coetzee, S. G., Coetzee, G. A. & Hazelett, D. J. motifbreakR: an R/Bioconductor package for predicting variant effects at transcription factor binding sites. Bioinformatics 31, 3847–3849 (2015).
pubmed: 26272984
pmcid: 4653394
doi: 10.1093/bioinformatics/btv470
Kulakovskiy, I. V. et al. HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-seq analysis. Nucleic Acids Res. 46, D252–D259 (2018).
pubmed: 29140464
doi: 10.1093/nar/gkx1106
García-Pérez, R. et al. Epigenomic profiling of primate lymphoblastoid cell lines reveals the evolutionary patterns of epigenetic activities in gene regulatory architectures. Nat. Commun. 12, 3116 (2021).
pmcid: 8149829
doi: 10.1038/s41467-021-23397-1