Identification of rare and common regulatory variants in pluripotent cells using population-scale transcriptomics.
Bardet-Biedl Syndrome / genetics
Calcium Channels / genetics
Cell Line
Cerebellar Ataxia / genetics
DNA Methylation
Gene Expression
Genetic Variation
Humans
Induced Pluripotent Stem Cells / cytology
Polymorphism, Single Nucleotide
Proteins / genetics
Quantitative Trait Loci
Rare Diseases / genetics
Regulatory Sequences, Nucleic Acid
Sequence Analysis, RNA
Whole Genome Sequencing
Journal
Nature genetics
ISSN: 1546-1718
Abbreviated title: Nat Genet
Country: United States
NLM ID: 9216904
Publication information
Publication date:
03 2021
History:
received: 04 10 2019
accepted: 25 01 2021
pubmed: 6 3 2021
medline: 10 4 2021
entrez: 5 3 2021
Status:
ppublish
Abstract
Induced pluripotent stem cells (iPSCs) are an established cellular system to study the impact of genetic variants in derived cell types and developmental contexts. However, in their pluripotent state, the disease impact of genetic variants is less well known. Here, we integrate data from 1,367 human iPSC lines to comprehensively map common and rare regulatory variants in human pluripotent cells. Using this population-scale resource, we report hundreds of new colocalization events for human traits specific to iPSCs, and find increased power to identify rare regulatory variants compared with somatic tissues. Finally, we demonstrate how iPSCs enable the identification of causal genes for rare diseases.
Identifiers
pubmed: 33664507
doi: 10.1038/s41588-021-00800-7
pii: 10.1038/s41588-021-00800-7
pmc: PMC7944648
mid: NIHMS1666585
Chemical substances
Bbs2 protein, human
0
CACNA1A protein, human
0
Calcium Channels
0
Proteins
0
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
313-321
Grants
Agency: NIDDK NIH HHS
ID: U01 DK105541
Country: United States
Agency: NIDDK NIH HHS
ID: DP3 DK112155
Country: United States
Agency: NHGRI NIH HHS
ID: U01 HG007708
Country: United States
Agency: NIDDK NIH HHS
ID: R01 DK107437
Country: United States
Agency: NHGRI NIH HHS
ID: U01 HG010218
Country: United States
Agency: NHLBI NIH HHS
ID: R01 HL142015
Country: United States
Agency: NHGRI NIH HHS
ID: U01 HG009080
Country: United States
Agency: NIDDK NIH HHS
ID: R01 DK116750
Country: United States
Agency: NIA NIH HHS
ID: R01 AG066490
Country: United States
Agency: NIDDK NIH HHS
ID: P30 DK116074
Country: United States
Agency: NHLBI NIH HHS
ID: U01 HL107388
Country: United States
Agency: NLM NIH HHS
ID: T32 LM012409
Country: United States
Agency: Wellcome Trust
ID: WT098503
Country: United Kingdom
Agency: NHGRI NIH HHS
ID: U01 HG009431
Country: United States
Agency: NHGRI NIH HHS
ID: R01 HG008150
Country: United States
Agency: NIDDK NIH HHS
ID: R01 DK106236
Country: United States
Agency: NHLBI NIH HHS
ID: U01 HL107442
Country: United States
Agency: NIDDK NIH HHS
ID: R01 DK120565
Country: United States
Agency: NLM NIH HHS
ID: T15 LM007033
Country: United States
Agency: Medical Research Council
Country: United Kingdom
Agency: Wellcome Trust
ID: WT090851
Country: United Kingdom
Agency: NIH HHS
ID: S10 OD023452
Country: United States
Agency: Wellcome Trust
Country: United Kingdom
Investigators
Marc Jan Bonder (M)
Daniel Seaton (D)
David A Jakubosky (DA)
Christopher D Brown (CD)
YoSon Park (Y)
Comments and corrections
Type: CommentIn
References
Westra, H.-J. et al. Systematic identification of trans-eQTLs as putative drivers of known disease associations. Nat. Genet. 45, 1238–1243 (2013).
pubmed: 24013639
pmcid: 3991562
doi: 10.1038/ng.2756
Bonder, M. J. et al. Disease variants alter transcription factor levels and methylation of their binding sites. Nat. Genet. 49, 131–138 (2017).
pubmed: 27918535
doi: 10.1038/ng.3721
Zhernakova, D. V. et al. Identification of context-dependent expression quantitative trait loci in whole blood. Nat. Genet. 49, 139–145 (2017).
pubmed: 27918533
doi: 10.1038/ng.3737
GTEx Consortium. Genetic effects on gene expression across human tissues. Nature 550, 204–213 (2017).
pmcid: 5776756
doi: 10.1038/nature24277
Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013).
pubmed: 24037378
pmcid: 3918453
doi: 10.1038/nature12531
Alasoo, K. et al. Shared genetic effects on chromatin and gene expression indicate a role for enhancer priming in immune response. Nat. Genet. 50, 424–431 (2018).
pubmed: 29379200
pmcid: 6548559
doi: 10.1038/s41588-018-0046-7
Schwartzentruber, J. et al. Molecular and functional variation in iPSC-derived sensory neurons. Nat. Genet. 50, 54–61 (2018).
pubmed: 29229984
doi: 10.1038/s41588-017-0005-8
Cuomo, A. S. E. et al. Single-cell RNA sequencing of differentiating iPS cells reveals dynamic genetic effects on gene expression. Nat. Commun. 11, 810 (2020).
pubmed: 32041960
pmcid: 7010688
doi: 10.1038/s41467-020-14457-z
Jerber, J. et al. Population-scale single-cell RNA-seq profiling across dopaminergic neuron differentiation. Nat. Genet. https://doi.org/10.1038/s41588-021-00801-6 (2021).
Sun, N. et al. Patient-specific induced pluripotent stem cells as a model for familial dilated cardiomyopathy. Sci. Transl. Med. 4, 130ra47 (2012).
pubmed: 22517884
pmcid: 3657516
doi: 10.1126/scitranslmed.3003552
Lan, F. et al. Abnormal calcium handling properties underlie familial hypertrophic cardiomyopathy pathology in patient-specific induced pluripotent stem cells. Cell Stem Cell 12, 101–113 (2013).
pubmed: 23290139
pmcid: 3638033
doi: 10.1016/j.stem.2012.10.010
Lee, J. et al. Activation of PDGF pathway links LMNA mutation to dilated cardiomyopathy. Nature 572, 335–340 (2019).
pubmed: 31316208
pmcid: 6779479
doi: 10.1038/s41586-019-1406-x
Kodo, K. et al. iPSC-derived cardiomyocytes reveal abnormal TGF-β signalling in left ventricular non-compaction cardiomyopathy. Nat. Cell Biol. 18, 1031–1042 (2016).
pubmed: 27642787
pmcid: 5042877
doi: 10.1038/ncb3411
Wu, H. et al. Modelling diastolic dysfunction in induced pluripotent stem cell-derived cardiomyocytes from hypertrophic cardiomyopathy patients. Eur. Heart J. 40, 3685–3695 (2019).
pubmed: 31219556
doi: 10.1093/eurheartj/ehz326
pmcid: 7963137
Dubois, N. C. et al. SIRPA is a specific cell-surface marker for isolating cardiomyocytes derived from human pluripotent stem cells. Nat. Biotechnol. 29, 1011–1018 (2011).
pubmed: 22020386
pmcid: 4949030
doi: 10.1038/nbt.2005
Sterneckert, J. L., Reinhardt, P. & Schöler, H. R. Investigating human disease using stem cell models. Nat. Rev. Genet. 15, 625–639 (2014).
pubmed: 25069490
doi: 10.1038/nrg3764
Kilpinen, H. et al. Common genetic variation drives molecular heterogeneity in human iPSCs. Nature 546, 370–375 (2017).
pubmed: 28489815
pmcid: 5524171
doi: 10.1038/nature22403
Panopoulos, A. D. et al. iPSCORE: a resource of 222 iPSC lines enabling functional characterization of genetic variation across a variety of cell types. Stem Cell Rep. 8, 1086–1100 (2017).
doi: 10.1016/j.stemcr.2017.03.012
Pashos, E. E. et al. Large, diverse population cohorts of hiPSCs and derived hepatocyte-like cells reveal functional genetic variation at blood lipid-associated loci. Cell Stem Cell 20, 558–570 (2017).
pubmed: 28388432
pmcid: 5476422
doi: 10.1016/j.stem.2017.03.017
Banovich, N. E. et al. Impact of regulatory variation across human iPSCs and differentiated cells. Genome Res. 28, 122–131 (2018).
pubmed: 29208628
pmcid: 5749177
doi: 10.1101/gr.224436.117
Carcamo-Orive, I. et al. Analysis of transcriptional variability in a large human iPSC library reveals genetic and non-genetic determinants of heterogeneity. Cell Stem Cell 20, 518–532.e9 (2017).
pubmed: 28017796
doi: 10.1016/j.stem.2016.11.005
Frésard, L. et al. Identification of rare-disease genes using blood transcriptome sequencing and large control cohorts. Nat. Med. 25, 911–919 (2019).
pubmed: 31160820
pmcid: 6634302
doi: 10.1038/s41591-019-0457-8
Li, X. et al. The impact of rare variation on gene expression across tissues. Nature 550, 239–243 (2017).
pubmed: 29022581
pmcid: 5877409
doi: 10.1038/nature24267
Choi, J. et al. A comparison of genetically matched cell lines reveals the equivalence of human iPSCs and ESCs. Nat. Biotechnol. 33, 1173–1181 (2015).
pubmed: 26501951
pmcid: 4847940
doi: 10.1038/nbt.3388
Thomas, S. M. et al. Reprogramming LCLs to iPSCs results in recovery of donor-specific gene expression signature. PLoS Genet. 11, e1005216 (2015).
pubmed: 25950834
pmcid: 4423863
doi: 10.1371/journal.pgen.1005216
Donovan, M. K. R., D’Antonio-Chronowska, A., D’Antonio, M. & Frazer, K. A. Cellular deconvolution of GTEx tissues powers eQTL studies to discover thousands of novel disease and cell-type associated regulatory variants. Nat. Commun. 11, 955 (2020).
pubmed: 32075962
pmcid: 7031340
doi: 10.1038/s41467-020-14561-0
Forbes, S. A. et al. COSMIC: mining complete cancer genomes in the catalogue of somatic mutations in cancer. Nucleic Acids Res. 39, D945–D950 (2010).
pubmed: 20952405
pmcid: 3013785
doi: 10.1093/nar/gkq929
Gerrard, D. T. et al. An integrative transcriptomic atlas of organogenesis in human embryos. eLife 5, e15657 (2016).
pubmed: 27557446
pmcid: 4996651
doi: 10.7554/eLife.15657
Urbut, S. M., Wang, G., Carbonetto, P. & Stephens, M. Flexible statistical methods for estimating and testing effects in genomic studies with multiple conditions. Nat. Genet. 51, 187–195 (2019).
pubmed: 30478440
doi: 10.1038/s41588-018-0268-8
Buniello, A. et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 47, D1005–D1012 (2019).
pubmed: 30445434
doi: 10.1093/nar/gky1120
Jakubosky, D. et al. Discovery and quality analysis of a comprehensive set of structural variants and short tandem repeats. Nat. Commun. 11, 2928 (2020).
pubmed: 32522985
pmcid: 7287045
doi: 10.1038/s41467-020-16481-5
Zhao, J. et al. A burden of rare variants associated with extremes of gene expression in human peripheral blood. Am. J. Hum. Genet. 98, 299–309 (2016).
pubmed: 26849112
pmcid: 4746369
doi: 10.1016/j.ajhg.2015.12.023
Li, X. et al. Transcriptome sequencing of a large human family identifies the impact of rare noncoding variants. Am. J. Hum. Genet. 95, 245–256 (2014).
pubmed: 25192044
pmcid: 4157143
doi: 10.1016/j.ajhg.2014.08.004
Stegle, O., Parts, L., Piipari, M., Winn, J. & Durbin, R. Using probabilistic estimation of expression residuals to obtain increased power and interpretability of gene expression analyses. Nat. Protoc. 7, 500–507 (2012).
pubmed: 22343431
pmcid: 3398141
doi: 10.1038/nprot.2011.457
Ferraro, N. M. et al. Transcriptomic signatures across human tissues identify functional rare genetic variation. Science 369, eaaz5900 (2020).
Cummings, B. B. et al. Improving genetic diagnosis in Mendelian disease with transcriptome sequencing. Sci. Transl. Med. 9, eaal5209 (2017).
Kremer, L. S. et al. Genetic diagnosis of Mendelian disorders via RNA sequencing. Nat. Commun. 8, 15824 (2017).
pubmed: 28604674
pmcid: 5499207
doi: 10.1038/ncomms15824
Kernohan, K. D. et al. Whole-transcriptome sequencing in blood provides a diagnosis of spinal muscular atrophy with progressive myoclonic epilepsy. Hum. Mutat. 38, 611–614 (2017).
pubmed: 28251733
pmcid: 5889109
doi: 10.1002/humu.23211
McKusick, V. A. Mendelian Inheritance in Man: a Catalog of Human Genes and Genetic Disorders (JHU Press, 1998).
Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
pubmed: 26773131
pmcid: 4866522
doi: 10.1093/bioinformatics/btw018
Hormozdiari, F. et al. Colocalization of GWAS and eQTL signals detects target genes. Am. J. Hum. Genet. 99, 1245–1260 (2016).
pubmed: 27866706
pmcid: 5142122
doi: 10.1016/j.ajhg.2016.10.003
Kamat, M. A. et al. PhenoScanner V2: an expanded tool for searching human genotype–phenotype associations. Bioinformatics https://doi.org/10.1093/bioinformatics/btz469 (2019).
Liu, B., Gloudemans, M. J., Rao, A. S., Ingelsson, E. & Montgomery, S. B. Abundant associations with gene expression complicate GWAS follow-up. Nat. Genet. 51, 768–769 (2019).
pubmed: 31043754
pmcid: 6904208
doi: 10.1038/s41588-019-0404-0
van der Harst, P. & Verweij, N. Identification of 64 novel genetic loci provides an expanded view on the genetic architecture of coronary artery disease. Circ. Res. 122, 433–443 (2018).
pubmed: 29212778
pmcid: 5805277
doi: 10.1161/CIRCRESAHA.117.312086
Liu, J. Z. et al. Dense fine-mapping study identifies new susceptibility loci for primary biliary cirrhosis. Nat. Genet. 44, 1137–1141 (2012).
pubmed: 22961000
pmcid: 3459817
doi: 10.1038/ng.2395
Teslovich, T. M. et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 466, 707–713 (2010).
pubmed: 20686565
pmcid: 3039276
doi: 10.1038/nature09270
Pongor, L. et al. A genome-wide approach to link genotype to clinical outcome by utilizing next-generation sequencing and gene chip data of 6,697 breast cancer patients. Genome Med. 7, 104 (2015).
pubmed: 26474971
pmcid: 4609150
doi: 10.1186/s13073-015-0228-1
Yengo, L. et al. Meta-analysis of genome-wide association studies for height and body mass index in ∼700000 individuals of European ancestry. Hum. Mol. Genet. 27, 3641–3649 (2018).
pubmed: 30124842
pmcid: 6488973
doi: 10.1093/hmg/ddy271
Sanchez, E. et al. POLR1B and neural crest cell anomalies in Treacher Collins syndrome type 4. Genet. Med. 22, 547–556 (2020).
pubmed: 31649276
doi: 10.1038/s41436-019-0669-9
Howson, J. M. M. et al. Fifteen new risk loci for coronary artery disease highlight arterial-wall-specific mechanisms. Nat. Genet. 49, 1113–1119 (2017).
pubmed: 28530674
pmcid: 5555387
doi: 10.1038/ng.3874
Pankratz, N. et al. Meta-analysis of Parkinson’s disease: identification of a novel locus, RIT2. Ann. Neurol. 71, 370–384 (2012).
pubmed: 22451204
pmcid: 3354734
doi: 10.1002/ana.22687
Lambert, J. C. et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease. Nat. Genet. 45, 1452–1458 (2013).
pubmed: 24162737
pmcid: 3896259
doi: 10.1038/ng.2802
Scott, R. A. et al. An expanded genome-wide association study of type 2 diabetes in Europeans. Diabetes 66, 2888–2902 (2017).
pubmed: 28566273
pmcid: 5652602
doi: 10.2337/db16-1253
Webb, G. J., Siminovitch, K. A. & Hirschfield, G. M. The immunogenetics of primary biliary cirrhosis: a comprehensive review. J. Autoimmun. 64, 42–52 (2015).
pubmed: 26250073
pmcid: 5014907
doi: 10.1016/j.jaut.2015.07.004
1000 Genomes Project Consortium et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
doi: 10.1038/nature15393
Streeter, I. et al. The human-induced pluripotent stem cell initiative-data resources for cellular genetics. Nucleic Acids Res. 45, D691–D697 (2017).
pubmed: 27733501
doi: 10.1093/nar/gkw928
D’Antonio, M. et al. Insights into the mutational burden of human induced pluripotent stem cells from an integrative multi-omics approach. Cell Rep. 24, 883–894 (2018).
pubmed: 30044985
pmcid: 6467479
doi: 10.1016/j.celrep.2018.06.091
DeBoever, C. et al. Large-scale profiling reveals the influence of genetic variation on gene expression in human induced pluripotent stem cells. Cell Stem Cell 20, 533–546.e7 (2017).
pubmed: 28388430
pmcid: 5444918
doi: 10.1016/j.stem.2017.03.009
Knowles, J. W., Hao, K., Xie, W., Weedon, M. N. & Zhang, Z. Genetic and functional analyses identify NAT2 as a human insulin sensitivity gene. Circulation 128, A10906 (2013).
Casale, F. P., Rakitsch, B., Lippert, C. & Stegle, O. Efficient set tests for the genetic analysis of correlated traits. Nat. Methods 12, 755–758 (2015).
pubmed: 26076425
doi: 10.1038/nmeth.3439
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559–575 (2007).
pubmed: 17701901
pmcid: 1950838
doi: 10.1086/519795
Ongen, H., Buil, A., Brown, A. A., Dermitzakis, E. T. & Delaneau, O. Fast and efficient QTL mapper for thousands of molecular phenotypes. Bioinformatics 32, 1479–1485 (2016).
pubmed: 26708335
doi: 10.1093/bioinformatics/btv722
Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003).
pubmed: 12883005
doi: 10.1073/pnas.1530509100
pmcid: 170937
Saha, A. & Battle, A. False positives in trans-eQTL and coexpression analyses arising from RNA-sequencing alignment errors. F1000Res. 7, 1860 (2018).
pubmed: 30613398
doi: 10.12688/f1000research.17145.1
Pedersen, B. S., Layer, R. M. & Quinlan, A. R. Vcfanno: fast, flexible annotation of genetic variants. Genome Biol. 17, 118 (2016).
pubmed: 27250555
pmcid: 4888505
doi: 10.1186/s13059-016-0973-5
Hall, C. L. et al. Frequency of genetic variants associated with arrhythmogenic right ventricular cardiomyopathy in the genome aggregation database. Eur. J. Hum. Genet. 26, 1312–1318 (2018).
pubmed: 29802319
pmcid: 6117313
doi: 10.1038/s41431-018-0169-4
Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47, D886–D894 (2019).
doi: 10.1093/nar/gky1016
pubmed: 30371827
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
doi: 10.1093/bioinformatics/btp352
Churchhouse, C. & Neale, B. Rapid GWAS of thousands of phenotypes for 337,000 samples in the UK Biobank. https://www.nealelab.is/blog/2017/7/19/rapid-gwas-of-thousands-of-phenotypes-for-337000-samples-in-the-uk-biobank/ (2017).