GENCODE: reference annotation for the human and mouse genomes in 2023.


Journal

Nucleic acids research
ISSN: 1362-4962
Abbreviated title: Nucleic Acids Res
Country: England
NLM ID: 0411011

Publication information

Publication date:
06 01 2023
History:
accepted: 07 11 2022
revised: 15 10 2022
received: 22 09 2022
pubmed: 25 11 2022
medline: 12 1 2023
entrez: 24 11 2022
Status: ppublish

Abstract

GENCODE produces high-quality gene and transcript annotation for the human and mouse genomes. All GENCODE annotation is supported by experimental data and serves as a reference for genome biology and clinical genomics. The GENCODE consortium generates targeted experimental data, develops bioinformatic tools and carries out analyses that, along with externally produced data and methods, support the identification and annotation of transcript structures and the determination of their function. Here, we present an update on the annotation of human and mouse genes, including developments in the tools, data, analyses and major collaborations which underpin this progress. For example, we report the creation of a set of non-canonical ORFs identified in GENCODE transcripts, the LRGASP collaboration to assess the use of long transcriptomic data to build transcript models, the progress in collaborations with RefSeq and UniProt to increase convergence in the annotation of human and mouse protein-coding genes, the propagation of GENCODE across the human pan-genome and the development of new tools to support annotation of regulatory features by GENCODE. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.

Identifiers

pubmed: 36420896
pii: 6845433
doi: 10.1093/nar/gkac1071
pmc: PMC9825462

Publication types

Journal Article; Research Support, Non-U.S. Gov't; Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Pagination

D942-D949

Grants

Agency: NHGRI NIH HHS
ID: R01 HG004037
Country: United States
Agency: NHGRI NIH HHS
ID: U41 HG007234
Country: United States
Agency: Wellcome Trust
Country: United Kingdom

Copyright information

© The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.

Authors

Adam Frankish (A)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Sílvia Carbonell-Sala (S)

Department of Bioinformatics and Genomics, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Catalonia, Spain.

Mark Diekhans (M)

UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.

Irwin Jungreis (I)

MIT Computer Science and Artificial Intelligence Laboratory, 32 Vassar St, Cambridge, MA 02139, USA.
Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA.

Jane E Loveland (JE)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Jonathan M Mudge (JM)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Cristina Sisu (C)

Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA.
Department of Life Sciences, Brunel University London, Uxbridge UB8 3PH, UK.

James C Wright (JC)

Functional Proteomics, Division of Cancer Biology, Institute of Cancer Research, 237 Fulham Road, London SW3 6JB, UK.

Carme Arnan (C)

Department of Bioinformatics and Genomics, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Catalonia, Spain.

If Barnes (I)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Abhimanyu Banerjee (A)

Department of Genetics, Stanford University, Palo Alto, CA, USA.
Department of Computer Science, Stanford University, Palo Alto, CA, USA.

Ruth Bennett (R)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Andrew Berry (A)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Alexandra Bignell (A)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Carles Boix (C)

MIT Computer Science and Artificial Intelligence Laboratory, 32 Vassar St, Cambridge, MA 02139, USA.
Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA.

Ferriol Calvet (F)

Department of Bioinformatics and Genomics, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Catalonia, Spain.

Daniel Cerdán-Vélez (D)

Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernandez Almagro, 3, 28029 Madrid, Spain.

Fiona Cunningham (F)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Claire Davidson (C)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Sarah Donaldson (S)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Cagatay Dursun (C)

Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA.
Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA.

Reham Fatima (R)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Stefano Giorgetti (S)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Carlos García Giron (CG)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Jose Manuel Gonzalez (JM)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Matthew Hardy (M)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Peter W Harrison (PW)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Thibaut Hourlier (T)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Zoe Hollis (Z)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Toby Hunt (T)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Benjamin James (B)

MIT Computer Science and Artificial Intelligence Laboratory, 32 Vassar St, Cambridge, MA 02139, USA.
Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA.

Yunzhe Jiang (Y)

Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA.

Rory Johnson (R)

Department of Medical Oncology, Bern University Hospital, Murtenstrasse 35, 3008 Bern, Switzerland.
School of Biology and Environmental Science, University College Dublin, Belfield, Dublin 4, D04 V1W8, Ireland.

Mike Kay (M)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Julien Lagarde (J)

Department of Bioinformatics and Genomics, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Catalonia, Spain.

Fergal J Martin (FJ)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Laura Martínez Gómez (LM)

Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernandez Almagro, 3, 28029 Madrid, Spain.

Surag Nair (S)

Department of Genetics, Stanford University, Palo Alto, CA, USA.
Department of Computer Science, Stanford University, Palo Alto, CA, USA.

Pengyu Ni (P)

Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA.
Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA.

Fernando Pozo (F)

Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernandez Almagro, 3, 28029 Madrid, Spain.

Vivek Ramalingam (V)

Department of Genetics, Stanford University, Palo Alto, CA, USA.
Department of Computer Science, Stanford University, Palo Alto, CA, USA.

Magali Ruffier (M)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Bianca M Schmitt (BM)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Jacob M Schreiber (JM)

Department of Genetics, Stanford University, Palo Alto, CA, USA.
Department of Computer Science, Stanford University, Palo Alto, CA, USA.

Emily Steed (E)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Marie-Marthe Suner (MM)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Dulika Sumathipala (D)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Irina Sycheva (I)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Barbara Uszczynska-Ratajczak (B)

Computational Biology of Noncoding RNA, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland.

Elizabeth Wass (E)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Yucheng T Yang (YT)

Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA.
Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China.

Andrew Yates (A)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Zahoor Zafrulla (Z)

Department of Genetics, Stanford University, Palo Alto, CA, USA.
Department of Computer Science, Stanford University, Palo Alto, CA, USA.

Jyoti S Choudhary (JS)

Functional Proteomics, Division of Cancer Biology, Institute of Cancer Research, 237 Fulham Road, London SW3 6JB, UK.

Mark Gerstein (M)

Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA.
Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA.

Roderic Guigo (R)

Department of Bioinformatics and Genomics, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Catalonia, Spain.
Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra (UPF), Barcelona, E-08003 Catalonia, Spain.

Tim J P Hubbard (TJP)

Department of Medical and Molecular Genetics, King's College London, Guys Hospital, Great Maze Pond, London SE1 9RT, UK.

Manolis Kellis (M)

MIT Computer Science and Artificial Intelligence Laboratory, 32 Vassar St, Cambridge, MA 02139, USA.
Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, MA 02142, USA.

Anshul Kundaje (A)

Department of Genetics, Stanford University, Palo Alto, CA, USA.
Department of Computer Science, Stanford University, Palo Alto, CA, USA.

Benedict Paten (B)

UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA 95064, USA.

Michael L Tress (ML)

Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), Calle Melchor Fernandez Almagro, 3, 28029 Madrid, Spain.

Paul Flicek (P)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
