Peak calling by Sparse Enrichment Analysis for CUT&RUN chromatin profiling.


Journal

Epigenetics & chromatin
ISSN: 1756-8935
Abbreviated title: Epigenetics Chromatin
Country: England
NLM ID: 101471619

Publication information

Publication date:
2019-07-12
History:
received: 2019-06-28
accepted: 2019-07-03
entrez: 2019-07-14
pubmed: 2019-07-14
medline: 2020-06-17
Status: epublish

Abstract

CUT&RUN is an efficient epigenome profiling method that identifies sites of DNA binding protein enrichment genome-wide with high signal to noise and low sequencing requirements. Currently, the analysis of CUT&RUN data is complicated by its exceptionally low background, which renders programs designed for analysis of ChIP-seq data vulnerable to oversensitivity in identifying sites of protein binding. Here we introduce Sparse Enrichment Analysis for CUT&RUN (SEACR), an analysis strategy that uses the global distribution of background signal to calibrate a simple threshold for peak calling. SEACR discriminates between true and false-positive peaks with near-perfect specificity from "gold standard" CUT&RUN datasets and efficiently identifies enriched regions for several different protein targets. We also introduce a web server ( http://seacr.fredhutch.org ) for plug-and-play analysis with SEACR that facilitates maximum accessibility across users of all skill levels. SEACR is a highly selective peak caller that definitively validates the accuracy of CUT&RUN for datasets with known true negatives. Its ease of use and performance in comparison with existing peak calling strategies make it an ideal choice for analyzing CUT&RUN data.
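
The abstract describes calibrating a simple peak-calling threshold from the global distribution of background signal, without giving the procedure. The sketch below is only a toy illustration of that general idea, not SEACR's actual algorithm: it assumes that candidate regions ("blocks") have already been summarized as a total signal value, picks the threshold as an empirical quantile of the control (background) blocks, and keeps target blocks above it. The names `Block`, `calibrate_threshold`, `call_peaks`, and the parameter `max_control_fraction` are hypothetical and do not come from the paper or the SEACR software.

```python
from typing import List, Sequence, Tuple

# Hypothetical block record: (chromosome, start, end, total signal in the block).
Block = Tuple[str, int, int, float]


def calibrate_threshold(control_signals: Sequence[float],
                        max_control_fraction: float = 0.01) -> float:
    """Pick a threshold from the empirical distribution of control signal.

    Returns an observed control value t such that at most
    `max_control_fraction` of the control blocks have signal strictly
    greater than t. This is an assumed, simplified calibration rule.
    """
    if not control_signals:
        raise ValueError("control_signals must not be empty")
    if not 0.0 <= max_control_fraction < 1.0:
        raise ValueError("max_control_fraction must be in [0, 1)")
    ordered = sorted(control_signals)
    n = len(ordered)
    # Number of control blocks that are allowed to exceed the threshold.
    allowed_above = int(max_control_fraction * n)
    return ordered[n - allowed_above - 1]


def call_peaks(target_blocks: Sequence[Block], threshold: float) -> List[Block]:
    """Keep the target blocks whose total signal is strictly above the threshold."""
    return [block for block in target_blocks if block[3] > threshold]


# Made-up example values, for illustration only.
control = [1.0, 2.0, 3.0, 4.0, 50.0]
target = [
    ("chr1", 100, 200, 3.5),
    ("chr1", 500, 700, 12.0),
    ("chr2", 50, 90, 4.0),
]
threshold = calibrate_threshold(control, max_control_fraction=0.2)
peaks = call_peaks(target, threshold)
```

Because the threshold is taken from the control data itself, a sparser background yields a lower cutoff without any hand tuning; the strict comparison in `call_peaks` means a target block that only ties the threshold is not reported.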

Identifiers

pubmed: 31300027
doi: 10.1186/s13072-019-0287-4
pii: 10.1186/s13072-019-0287-4
pmc: PMC6624997

Chemical substances

Chromatin 0
DNA-Binding Proteins 0

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

42

Grants

Agency: Howard Hughes Medical Institute
ID: Henikoff
Country: United States
Agency: NHGRI NIH HHS
ID: 1R01HG010492
Country: United States

Authors

Michael P Meers (MP)

Basic Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, 98109, USA.

Dan Tenenbaum (D)

Scientific Computing, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, 98109, USA.

Steven Henikoff (S)

Basic Sciences Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N, Seattle, WA, 98109, USA. steveh@fhcrc.org.
Howard Hughes Medical Institute Research Laboratory, Seattle, USA. steveh@fhcrc.org.
