A complete reference genome improves analysis of human genetic variation.


Journal

Science (New York, N.Y.)
ISSN: 1095-9203
Abbreviated title: Science
Country: United States
NLM ID: 0404511

Publication information

Publication date:
04 2022
History:
entrez: 31 3 2022
pubmed: 1 4 2022
medline: 6 4 2022
Status: ppublish

Abstract

Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 million base pairs of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. We show how this reference universally improves read mapping and variant calling for 3202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of variants per sample in previously unresolved regions, showcasing the promise of the T2T-CHM13 reference for evolutionary and biomedical discovery. Simultaneously, this reference eliminates tens of thousands of spurious variants per sample, including reduction of false positives in 269 medically relevant genes by up to a factor of 12. Because of these improvements in variant discovery coupled with population and functional genomic resources, T2T-CHM13 is positioned to replace GRCh38 as the prevailing reference for human genetics.

Identifiers

pubmed: 35357935
doi: 10.1126/science.abl3533
pmc: PMC9336181
mid: NIHMS1814268

Publication types

Journal Article
Research Support, N.I.H., Intramural
Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

IM

Pagination

eabl3533

Grants

Agency: NHGRI NIH HHS
ID: U01 HG010961
Country: United States
Agency: NIMH NIH HHS
ID: DP2 MH119424
Country: United States
Agency: NHGRI NIH HHS
ID: U41 HG010972
Country: United States
Agency: NIH HHS
ID: OT2 OD026682
Country: United States
Agency: NHGRI NIH HHS
ID: R21 HG010548
Country: United States
Agency: NHGRI NIH HHS
ID: U01 HG010971
Country: United States
Agency: NHGRI NIH HHS
ID: U24 HG011853
Country: United States
Agency: NHGRI NIH HHS
ID: UM1 HG008898
Country: United States
Agency: NHGRI NIH HHS
ID: U24 HG010263
Country: United States
Agency: NHGRI NIH HHS
ID: R01 HG006677
Country: United States
Agency: NIGMS NIH HHS
ID: R35 GM133747
Country: United States
Agency: NHGRI NIH HHS
ID: R00 HG009532
Country: United States
Agency: NCI NIH HHS
ID: U01 CA253481
Country: United States
Agency: NHGRI NIH HHS
ID: R01 HG011274
Country: United States
Agency: NHGRI NIH HHS
ID: U24 HG006620
Country: United States
Agency: NHGRI NIH HHS
ID: R01 HG010485
Country: United States
Agency: NIDDK NIH HHS
ID: R24 DK106766
Country: United States

Comments and corrections

Type: CommentIn

Authors

Sergey Aganezov (S)

Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.

Stephanie M Yan (SM)

Department of Biology, Johns Hopkins University, Baltimore, MD, USA.

Daniela C Soto (DC)

Department of Biochemistry and Molecular Medicine, Genome Center, MIND Institute, University of California, Davis, CA, USA.

Melanie Kirsche (M)

Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.

Samantha Zarate (S)

Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.

Pavel Avdeyev (P)

Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA.

Dylan J Taylor (DJ)

Department of Biology, Johns Hopkins University, Baltimore, MD, USA.

Kishwar Shafin (K)

UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA.

Alaina Shumate (A)

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.

Chunlin Xiao (C)

National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, USA.

Justin Wagner (J)

National Institute of Standards and Technology, Gaithersburg, MD, USA.

Jennifer McDaniel (J)

National Institute of Standards and Technology, Gaithersburg, MD, USA.

Nathan D Olson (ND)

National Institute of Standards and Technology, Gaithersburg, MD, USA.

Michael E G Sauria (MEG)

Department of Biology, Johns Hopkins University, Baltimore, MD, USA.

Mitchell R Vollger (MR)

Department of Genome Sciences, University of Washington, Seattle, WA, USA.

Arang Rhie (A)

Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA.

Melissa Meredith (M)

UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA.

Skylar Martin (S)

Department of Computer Science and Biofrontiers Institute, University of Colorado, Boulder, CO, USA.

Joyce Lee (J)

Bionano Genomics, San Diego, CA, USA.

Sergey Koren (S)

Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA.

Jeffrey A Rosenfeld (JA)

Cancer Institute of New Jersey, New Brunswick, NJ, USA.

Benedict Paten (B)

UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA.

Ryan Layer (R)

Department of Computer Science and Biofrontiers Institute, University of Colorado, Boulder, CO, USA.

Chen-Shan Chin (CS)

DNAnexus, Mountain View, CA, USA.

Fritz J Sedlazeck (FJ)

Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.

Nancy F Hansen (NF)

Comparative Genomics Analysis Unit, National Human Genome Research Institute, Rockville, MD, USA.

Danny E Miller (DE)

Department of Genome Sciences, University of Washington, Seattle, WA, USA.
Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children's Hospital, Seattle, WA, USA.

Adam M Phillippy (AM)

Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA.

Karen H Miga (KH)

UC Santa Cruz Genomics Institute, University of California, Santa Cruz, CA, USA.

Rajiv C McCoy (RC)

Department of Biology, Johns Hopkins University, Baltimore, MD, USA.

Megan Y Dennis (MY)

Department of Biochemistry and Molecular Medicine, Genome Center, MIND Institute, University of California, Davis, CA, USA.

Justin M Zook (JM)

National Institute of Standards and Technology, Gaithersburg, MD, USA.

Michael C Schatz (MC)

Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
Department of Biology, Johns Hopkins University, Baltimore, MD, USA.
Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
