Scalable Nanopore sequencing of human genomes provides a comprehensive view of haplotype-resolved variation and methylation.
Journal
bioRxiv : the preprint server for biology
Abbreviated title: bioRxiv
Country: United States
NLM ID: 101680187
Publication information
Publication date:
05 Apr 2023
History:
pubmed: 31 Jan 2023
medline: 31 Jan 2023
entrez: 30 Jan 2023
Status:
epublish
Abstract
Long-read sequencing technologies substantially overcome the limitations of short reads but, to date, have not been considered a feasible replacement at scale because they are too expensive, not scalable enough, or too error-prone. Here, we develop an efficient and scalable wet lab and computational protocol for Oxford Nanopore Technologies (ONT) long-read sequencing that seeks to provide a genuine alternative to short reads for large-scale genomics projects. We applied our protocol to cell lines and brain tissue samples as part of a pilot project for the NIH Center for Alzheimer's and Related Dementias (CARD). Using a single PromethION flow cell, we can detect SNPs with an F1-score better than that of Illumina short-read sequencing. Small indel calling remains difficult inside homopolymers and tandem repeats, but is comparable to Illumina calls elsewhere. Further, we can discover structural variants with an F1-score comparable to state-of-the-art methods involving Pacific Biosciences HiFi sequencing and trio information (but at a lower cost and greater throughput). Using ONT-based phasing, we can then combine and phase small and structural variants at megabase scales. Our protocol also produces highly accurate, haplotype-specific methylation calls. Overall, this makes large-scale long-read sequencing projects feasible; the protocol is currently being used to sequence thousands of brain-based genomes as part of the NIH CARD initiative. We provide the protocol and software as open-source integrated pipelines for generating phased variant calls and assemblies.
Identifiers
pubmed: 36711673
doi: 10.1101/2023.01.12.523790
pmc: PMC9882142
Publication types
Preprint
Languages
eng
Grants
Agency: NHGRI NIH HHS
ID: U01 HG010961
Country: United States
Agency: NIA NIH HHS
ID: P01 AG000538
Country: United States
Agency: NIA NIH HHS
ID: P30 AG072980
Country: United States
Agency: NHLBI NIH HHS
ID: OT3 HL142481
Country: United States
Agency: NIH HHS
ID: OT2 OD033761
Country: United States
Agency: NHGRI NIH HHS
ID: R01 HG011274
Country: United States
Agency: NHGRI NIH HHS
ID: U24 HG010262
Country: United States
Agency: NHGRI NIH HHS
ID: U24 HG011853
Country: United States
Agency: Intramural NIH HHS
ID: ZIA NS003154
Country: United States
Agency: NHGRI NIH HHS
ID: R01 HG010485
Country: United States
Agency: NINDS NIH HHS
ID: U24 NS072026
Country: United States
Agency: NIA NIH HHS
ID: P30 AG019610
Country: United States
Agency: Intramural NIH HHS
ID: ZIA AG000538
Country: United States
Comments and corrections
Type: UpdateIn