DNA 5-methylcytosine detection and methylation phasing using PacBio circular consensus sequencing.


Journal

Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555

Publication information

Publication date:
8 July 2023
History:
received: 20 November 2022
accepted: 22 June 2023
medline: 10 July 2023
pubmed: 9 July 2023
entrez: 8 July 2023
Status: epublish

Abstract

Long single-molecular sequencing technologies, such as PacBio circular consensus sequencing (CCS) and nanopore sequencing, are advantageous in detecting DNA 5-methylcytosine in CpGs (5mCpGs), especially in repetitive genomic regions. However, existing methods for detecting 5mCpGs using PacBio CCS are less accurate and robust. Here, we present ccsmeth, a deep-learning method to detect DNA 5mCpGs using CCS reads. We sequence polymerase-chain-reaction treated and M.SssI-methyltransferase treated DNA of one human sample using PacBio CCS for training ccsmeth. Using long (≥10 Kb) CCS reads, ccsmeth achieves 0.90 accuracy and 0.97 Area Under the Curve on 5mCpG detection at single-molecule resolution. At the genome-wide site level, ccsmeth achieves >0.90 correlations with bisulfite sequencing and nanopore sequencing using only 10× reads. Furthermore, we develop a Nextflow pipeline, ccsmethphase, to detect haplotype-aware methylation using CCS reads, and then sequence a Chinese family trio to validate it. ccsmeth and ccsmethphase can be robust and accurate tools for detecting DNA 5-methylcytosines.

Identifiers

pubmed: 37422489
doi: 10.1038/s41467-023-39784-9
pii: 10.1038/s41467-023-39784-9
pmc: PMC10329642

Chemical substances

5-Methylcytosine 6R795CQT4H
DNA 9007-49-2

Publication types

Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

IM

Pagination

4054

Copyright information

© 2023. The Author(s).

Authors

Peng Ni (P)

School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
Xiangjiang Laboratory, Changsha, 410205, China.
Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China.

Fan Nie (F)

School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
Xiangjiang Laboratory, Changsha, 410205, China.
Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China.

Zeyu Zhong (Z)

School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China.

Jinrui Xu (J)

School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China.

Neng Huang (N)

School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China.

Jun Zhang (J)

School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China.

Haochen Zhao (H)

School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China.

You Zou (Y)

School of Computer Science and Engineering, Central South University, Changsha, 410083, China.
Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China.

Yuanfeng Huang (Y)

Bioinformatics Center, National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, 410000, China.

Jinchen Li (J)

Bioinformatics Center, National Clinical Research Centre for Geriatric Disorders, Department of Geriatrics, Xiangya Hospital, Central South University, Changsha, 410000, China.
Centre for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, 410000, China.

Chuan-Le Xiao (CL)

State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-sen University, #7 Jinsui Road, Tianhe District, Guangzhou, China. xiaochuanle@126.com.

Feng Luo (F)

School of Computing, Clemson University, Clemson, SC, 29634-0974, USA. luofeng@clemson.edu.

Jianxin Wang (J)

School of Computer Science and Engineering, Central South University, Changsha, 410083, China. jxwang@mail.csu.edu.cn.
Xiangjiang Laboratory, Changsha, 410205, China. jxwang@mail.csu.edu.cn.
Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, China. jxwang@mail.csu.edu.cn.
