DNA-m6A calling and integrated long-read epigenetic and genetic analysis with fibertools.


Journal

Genome Research
ISSN: 1549-5469
Abbreviated title: Genome Res
Country: United States
NLM ID: 9518021

Publication information

Publication date:
07 Jun 2024
History:
received: 09 Feb 2024
accepted: 21 May 2024
medline: 08 Jun 2024
pubmed: 08 Jun 2024
entrez: 07 Jun 2024
Status: ahead of print

Abstract

Long-read DNA sequencing has recently emerged as a powerful tool for studying both genetic and epigenetic architectures at single-molecule and single-nucleotide resolution. Long-read epigenetic studies encompass both the direct identification of native cytosine methylation and the identification of exogenously placed DNA N6-methyladenine (DNA-m6A). However, detecting DNA-m6A modifications using single-molecule sequencing, as well as coprocessing single-molecule genetic and epigenetic architectures, is limited by computational demands and a lack of supporting tools. Here, we introduce fibertools, a state-of-the-art toolkit that features a semisupervised convolutional neural network for fast and accurate identification of m6A-marked bases using PacBio single-molecule long-read sequencing, as well as the coprocessing of long-read genetic and epigenetic data produced using either PacBio or Oxford Nanopore sequencing platforms. We demonstrate accurate DNA-m6A identification (>90% precision and recall) along >20 kilobase long DNA molecules with a ~1,000-fold improvement in speed. In addition, we demonstrate that fibertools can readily integrate genetic and epigenetic data at single-molecule resolution, including the seamless conversion between molecular and reference coordinate systems, allowing for accurate genetic and epigenetic analyses of long-read data within structurally and somatically variable genomic regions.

Identifiers

pubmed: 38849157
pii: gr.279095.124
doi: 10.1101/gr.279095.124

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Copyright information

Published by Cold Spring Harbor Laboratory Press.

Authors

Stephanie C Bohaczuk (SC)

University of Washington School of Medicine.

Yizi Mao (Y)

University of Washington School of Medicine.

Jane Ranchalis (J)

University of Washington School of Medicine.

Benjamin J Mallory (BJ)

University of Washington.

Alan T Min (AT)

University of Washington.

Morgan O Hamm (MO)

University of Washington.

Elliott Swanson (E)

University of Washington.

Danilo Dubocanin (D)

Stanford University School of Medicine.

Connor Finkbeiner (C)

University of Washington.

Tony Li (T)

University of Washington.

Dale Whittington (D)

University of Washington.

William Stafford Noble (WS)

University of Washington.

Andrew Ben Stergachis (AB)

University of Washington, University of Washington School of Medicine, Brotman Baty Institute for Precision Medicine.

Mitchell R Vollger (MR)

University of Washington School of Medicine; mvollger@uw.edu.

MeSH classifications