The status of the human gene catalogue.


Journal

ArXiv
ISSN: 2331-8422
Abbreviated title: ArXiv
Country: United States
NLM ID: 101759493

Publication information

Publication date:
24 Mar 2023
History:
medline: 31 Mar 2023
entrez: 30 Mar 2023
pubmed: 31 Mar 2023
Status: epublish

Abstract

Scientists have been trying to identify all of the genes in the human genome since the initial draft of the genome was published in 2001. Over the intervening years, much progress has been made in identifying protein-coding genes, and the estimated number has shrunk to fewer than 20,000, although the number of distinct protein-coding isoforms has expanded dramatically. The invention of high-throughput RNA sequencing and other technological breakthroughs have led to an explosion in the number of reported non-coding RNA genes, although most of them do not yet have any known function. A combination of recent advances offers a path forward to identifying these functions and towards eventually completing the human gene catalogue. However, much work remains to be done before we have a universal annotation standard that includes all medically significant genes, maintains their relationships with different reference genomes, and describes clinically relevant genetic variants.

Identifiers

pubmed: 36994150
pii: 2303.13996
pmc: PMC10055485

Publication types

Preprint

Languages

eng

Comments and corrections

Type: UpdateIn


Authors

Paulo Amaral (P)

INSPER Institute of Education and Research, São Paulo, SP, Brasil.

Silvia Carbonell-Sala (S)

Centre for Genomic Regulation (CRG), Dr. Aiguader 88, 08003, Barcelona, Catalonia, Spain.

Francisco M De La Vega (FM)

Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA; Tempus Labs, Inc., Chicago, IL.

Tiago Faial (T)

Nature Genetics, San Francisco, CA, USA.

Adam Frankish (A)

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Thomas Gingeras (T)

Department of Functional Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.

Roderic Guigo (R)

Centre for Genomic Regulation (CRG), Dr. Aiguader 88, 08003, Barcelona, Catalonia, Spain.
Universitat Pompeu Fabra (UPF), Barcelona, Catalonia, Spain.

Jennifer L Harrow (JL)

Centre for Genomics Research, Discovery Sciences, AstraZeneca, Da Vinci Building. Melbourn Science Park, Royston UK SG8 6HB.

Artemis G Hatzigeorgiou (AG)

University of Thessaly, Department of Computer Science and Biomedical Informatics, Lamia, Greece; Hellenic Pasteur Institute, Athens, Greece.

Rory Johnson (R)

School of Biology and Environmental Science, University College Dublin, D04 V1W8 Dublin, Ireland; Conway Institute of Biomedical and Biomolecular Research, University College Dublin, D04 V1W8 Dublin, Ireland; Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, 3010 Bern, Switzerland; Department for BioMedical Research, University of Bern, 3008 Bern, Switzerland.

Terence D Murphy (TD)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.

Mihaela Pertea (M)

Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA.
Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.

Kim D Pruitt (KD)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.

Shashikant Pujar (S)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.

Hazuki Takahashi (H)

Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences, Yokohama Kanagawa 230-0045 Japan.

Igor Ulitsky (I)

Department of Immunology and Regenerative Biology; Department of Molecular Neuroscience, Weizmann Institute of Science, Rehovot 76100, Israel.

Ales Varabyou (A)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.

Christine A Wells (CA)

Stem Cell Systems, Department of Anatomy and Physiology, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Parkville 3010 Vic Australia.

Mark Yandell (M)

Department of Human Genetics, Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA.

Piero Carninci (P)

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.
Human Technopole, via Rita Levi Montalcini 1, Milan 20157 Italy.

Steven L Salzberg (SL)

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA.
Department of Immunology and Regenerative Biology; Department of Molecular Neuroscience, Weizmann Institute of Science, Rehovot 76100, Israel.
Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA.

MeSH classifications