Peak calling by Sparse Enrichment Analysis for CUT&RUN chromatin profiling.
CUT&RUN
Epigenome profiling
Peak calling
Journal
Epigenetics & chromatin
ISSN: 1756-8935
Abbreviated title: Epigenetics Chromatin
Country: England
NLM ID: 101471619
Publication information
Publication date:
12 07 2019
History:
received: 28 06 2019
accepted: 03 07 2019
entrez: 14 7 2019
pubmed: 14 7 2019
medline: 17 6 2020
Status:
epublish
Abstract
BACKGROUND
CUT&RUN is an efficient epigenome profiling method that identifies sites of DNA binding protein enrichment genome-wide with high signal to noise and low sequencing requirements. Currently, the analysis of CUT&RUN data is complicated by its exceptionally low background, which renders programs designed for analysis of ChIP-seq data vulnerable to oversensitivity in identifying sites of protein binding.
RESULTS
Here we introduce Sparse Enrichment Analysis for CUT&RUN (SEACR), an analysis strategy that uses the global distribution of background signal to calibrate a simple threshold for peak calling. SEACR discriminates between true and false-positive peaks with near-perfect specificity from "gold standard" CUT&RUN datasets and efficiently identifies enriched regions for several different protein targets. We also introduce a web server ( http://seacr.fredhutch.org ) for plug-and-play analysis with SEACR that facilitates maximum accessibility across users of all skill levels.
CONCLUSIONS
SEACR is a highly selective peak caller that definitively validates the accuracy of CUT&RUN for datasets with known true negatives. Its ease of use and performance in comparison with existing peak calling strategies make it an ideal choice for analyzing CUT&RUN data.
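The abstract describes calibrating a single peak-calling threshold from the global distribution of background signal. The following is a minimal sketch of that general idea only, not SEACR's actual implementation: it assumes the signal has already been summarized as one total value per contiguous block, and the function names, the tuple layout, and the "largest gap between target and control fractions" criterion are all illustrative assumptions.

```python
from bisect import bisect_right


def calibrate_threshold(target_signal, control_signal):
    """Pick the cutoff that best separates target blocks from control blocks.

    Hypothetical helper: both arguments are lists of per-block total signal.
    The cutoff returned is the one where the fraction of target blocks above
    it exceeds the fraction of control blocks above it by the widest margin.
    """
    if not target_signal or not control_signal:
        raise ValueError("both signal lists must be non-empty")
    target_sorted = sorted(target_signal)
    control_sorted = sorted(control_signal)
    best_cutoff, best_gap = None, float("-inf")
    # Every observed signal value is a candidate cutoff.
    for cutoff in sorted(set(target_sorted) | set(control_sorted)):
        # Fraction of blocks in each track with signal strictly above the cutoff.
        target_frac = 1 - bisect_right(target_sorted, cutoff) / len(target_sorted)
        control_frac = 1 - bisect_right(control_sorted, cutoff) / len(control_sorted)
        gap = target_frac - control_frac
        if gap > best_gap:  # ties keep the lowest cutoff
            best_cutoff, best_gap = cutoff, gap
    return best_cutoff


def call_peaks(target_blocks, control_signal):
    """Keep target blocks (start, end, signal) whose signal exceeds the cutoff."""
    cutoff = calibrate_threshold([s for _, _, s in target_blocks], control_signal)
    return [block for block in target_blocks if block[2] > cutoff]


# Toy data (invented for illustration): two weak blocks and two strong ones.
target_blocks = [(0, 100, 1.0), (200, 300, 2.0), (500, 650, 40.0), (900, 1000, 55.0)]
control_signal = [1.0, 1.5, 2.0, 3.0]
peaks = call_peaks(target_blocks, control_signal)
```

With this toy input, the cutoff settles just at the top of the control signal range, so only the two strong blocks are retained. A sparse, low-background track is what makes a single global cutoff of this kind plausible.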
Identifiers
pubmed: 31300027
doi: 10.1186/s13072-019-0287-4
pii: 10.1186/s13072-019-0287-4
pmc: PMC6624997
Chemical substances
Chromatin
0
DNA-Binding Proteins
0
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
42
Grants
Agency: Howard Hughes Medical Institute
ID: Henikoff
Country: United States
Agency: NHGRI NIH HHS
ID: 1R01HG010492
Country: United States