Local read haplotagging enables accurate long-read small variant calling.

DeepVariant; Long Reads; Nanopore; PacBio HiFi; PacBio Revio; Variant Calling

Journal

bioRxiv : the preprint server for biology
Abbreviated title: bioRxiv
Country: United States
NLM ID: 101680187

Publication information

Publication date:
12 Sep 2023
History:
pubmed: 25 Sep 2023
medline: 25 Sep 2023
entrez: 25 Sep 2023
Status: epublish

Abstract

Long-read sequencing technology has enabled variant detection in difficult-to-map regions of the genome and rapid genetic diagnosis in clinical settings. Providers of rapidly evolving third-generation sequencing technology, such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), are introducing newer platforms and data types. It has been demonstrated that variant calling methods based on deep neural networks can use local haplotyping information with long reads to improve genotyping accuracy. However, using local haplotype information creates overhead, because variant calling needs to be performed multiple times, which ultimately makes it difficult to extend these methods to new data types and platforms as they are introduced. In this work, we have developed a local haplotype approximation method that enables state-of-the-art variant calling performance with multiple sequencing platforms, including the PacBio Revio system and ONT R10.4 simplex and duplex data. This addition of local haplotype approximation makes DeepVariant a universal variant calling solution for long-read sequencing platforms.

Identifiers

pubmed: 37745389
doi: 10.1101/2023.09.07.556731
pmc: PMC10515762

Publication types

Preprint

Languages

eng

Grants

Agency: NHGRI NIH HHS
ID: U01 HG010961
Country: United States
Agency: NHLBI NIH HHS
ID: OT3 HL142481
Country: United States
Agency: NIH HHS
ID: OT2 OD033761
Country: United States
Agency: NHGRI NIH HHS
ID: R01 HG011274
Country: United States
Agency: NHGRI NIH HHS
ID: U24 HG010262
Country: United States
Agency: NHGRI NIH HHS
ID: U01 HG010971
Country: United States
Agency: NHGRI NIH HHS
ID: U24 HG011853
Country: United States
Agency: NHGRI NIH HHS
ID: R01 HG010485
Country: United States

Conflict of interest statement

A.K., P.C., K.S., D.C., M.N., and A.C. are employees of Google LLC and own Alphabet stock as part of the standard compensation package. E.A.A. is the founder of Personalis Inc. and Deepcell Inc., and an advisor to Pacific Biosciences and SequenceBio. E.A.A. has received support in kind from Illumina, Oxford Nanopore, and Pacific Biosciences, and is a stockholder in Pacific Biosciences and Oxford Nanopore. K.H.M. is a science advisory board member of Centaura and has received travel funds to speak at events hosted by Oxford Nanopore Technologies. J.G. holds stock in ONT and PacBio. J.G., K.S., and S.G. have accepted bursaries to attend and speak at conferences on behalf of ONT.

Authors

Alexey Kolesnikov (A)

Google Inc, 1600 Amphitheatre Pkwy, Mountain View, CA.

Daniel Cook (D)

Google Inc, 1600 Amphitheatre Pkwy, Mountain View, CA.

Maria Nattestad (M)

Google Inc, 1600 Amphitheatre Pkwy, Mountain View, CA.

Brandy McNulty (B)

UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California, USA.

John Gorzynski (J)

Stanford University, Stanford, CA, USA.

Sneha Goenka (S)

Stanford University, Stanford, CA, USA.

Euan A Ashley (EA)

Stanford University, Stanford, CA, USA.

Miten Jain (M)

Northeastern University, Boston, MA, USA.

Karen H Miga (KH)

UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California, USA.

Benedict Paten (B)

UC Santa Cruz Genomics Institute, University of California, Santa Cruz, California, USA.

Pi-Chuan Chang (PC)

Google Inc, 1600 Amphitheatre Pkwy, Mountain View, CA.

Andrew Carroll (A)

Google Inc, 1600 Amphitheatre Pkwy, Mountain View, CA.

Kishwar Shafin (K)

Google Inc, 1600 Amphitheatre Pkwy, Mountain View, CA.

MeSH classifications