Comprehensive benchmarking of software for mapping whole genome bisulfite data: from read alignment to DNA methylation analysis.
Benchmarking / methods
Chromosome Mapping / methods
DNA Methylation / genetics
DNA, Plant / drug effects
Epigenesis, Genetic
Epigenomics / methods
Fragaria / genetics
Genome, Plant
Poaceae / genetics
Sequence Alignment / methods
Software
Sulfites / pharmacology
Thlaspi / genetics
Whole Genome Sequencing / methods
DNA methylation
WGBS mapping software
benchmark
epigenetics
non-model plants
whole genome bisulfite sequencing
Journal
Briefings in bioinformatics
ISSN: 1477-4054
Abbreviated title: Brief Bioinform
Country: England
NLM ID: 100912837
Publication information
Publication date: 2 Sep 2021
History:
received: 10 Nov 2020
revised: 30 Dec 2020
pubmed: 25 Feb 2021
medline: 23 Nov 2021
entrez: 24 Feb 2021
Status:
ppublish
Abstract
Whole genome bisulfite sequencing is currently at the forefront of epigenetic analysis, facilitating the nucleotide-level resolution of 5-methylcytosine (5mC) on a genome-wide scale. Specialized software has been developed to accommodate the unique difficulties in aligning such sequencing reads to a given reference, building on the knowledge acquired from model organisms such as human or Arabidopsis thaliana. As the field of epigenetics expands its purview to non-model plant species, new challenges arise which bring into question the suitability of previously established tools. Herein, nine short-read aligners are evaluated: Bismark, BS-Seeker2, BSMAP, BWA-meth, ERNE-BS5, GEM3, GSNAP, Last and segemehl. Precision-recall of simulated alignments, in comparison to real sequencing data obtained from three natural accessions, reveals that, on balance, BWA-meth and BSMAP are able to make the best use of the data during mapping. The influence of difficult-to-map regions, characterized by deviations in sequencing depth over repeat annotations, is evaluated in terms of the mean absolute deviation of the resulting methylation calls in comparison to a realistic methylome. Downstream methylation analysis is responsive to the handling of multi-mapping reads relative to mapping quality (MAPQ), and potentially susceptible to bias arising from the increased sequence complexity of densely methylated reads.
Identifiers
pubmed: 33624017
pii: 6146770
doi: 10.1093/bib/bbab021
pmc: PMC8425420
Chemical substances
DNA, Plant (registry number: 0)
Sulfites (registry number: 0)
hydrogen sulfite (registry number: OJ9787WBLU)
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Comments and corrections
Type: ErratumIn
Copyright information
© The Author(s) 2021. Published by Oxford University Press.