A joint NCBI and EMBL-EBI transcript set for clinical genomics and research.


Journal

Nature
ISSN: 1476-4687
Abbreviated title: Nature
Country: England
NLM ID: 0410462

Publication information

Publication date:
04 2022
History:
received: 13 07 2021
accepted: 07 02 2022
pubmed: 8 4 2022
medline: 16 4 2022
entrez: 7 4 2022
Status: ppublish

Abstract

Comprehensive genome annotation is essential to understand the impact of clinically relevant variants. However, the absence of a standard for clinical reporting and browser display complicates the process of consistent interpretation and reporting. To address these challenges, Ensembl/GENCODE

Identifiers

pubmed: 35388217
doi: 10.1038/s41586-022-04558-8
pii: 10.1038/s41586-022-04558-8
pmc: PMC9007741

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

310-315

Grants

Agency: Wellcome Trust
ID: WT108749/Z/15/Z
Country: United Kingdom
Agency: Medical Research Council
ID: MC_PC_19024
Country: United Kingdom
Agency: Wellcome Trust
ID: WT200990/A/16/Z
Country: United Kingdom
Agency: NHGRI NIH HHS
ID: U41 HG007234
Country: United States
Agency: NHGRI NIH HHS
ID: U24 HG007234
Country: United States
Agency: Wellcome Trust
ID: WT200990/Z/16/Z
Country: United Kingdom
Agency: Wellcome Trust
Country: United Kingdom

Copyright information

© 2022. This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply.

Authors

Joannella Morales (J)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Shashikant Pujar (S)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Jane E Loveland (JE)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Alex Astashyn (A)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Ruth Bennett (R)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Andrew Berry (A)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Eric Cox (E)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Claire Davidson (C)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Olga Ermolaeva (O)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Catherine M Farrell (CM)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Reham Fatima (R)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Laurent Gil (L)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Tamara Goldfarb (T)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Jose M Gonzalez (JM)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Diana Haddad (D)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Matthew Hardy (M)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Toby Hunt (T)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

John Jackson (J)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Vinita S Joardar (VS)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Michael Kay (M)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Vamsi K Kodali (VK)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Kelly M McGarvey (KM)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Aoife McMahon (A)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Jonathan M Mudge (JM)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Daniel N Murphy (DN)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Michael R Murphy (MR)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Bhanu Rajput (B)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Sanjida H Rangwala (SH)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Lillian D Riddick (LD)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Françoise Thibaud-Nissen (F)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Glen Threadgold (G)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Anjana R Vatsan (AR)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Craig Wallin (C)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

David Webb (D)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Paul Flicek (P)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Ewan Birney (E)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Kim D Pruitt (KD)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Adam Frankish (A)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Fiona Cunningham (F)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Terence D Murphy (TD)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA. murphyte@ncbi.nlm.nih.gov.
