maxATAC: Genome-scale transcription-factor binding prediction from ATAC-seq with deep neural networks.


Journal

PLoS computational biology
ISSN: 1553-7358
Abbreviated title: PLoS Comput Biol
Country: United States
NLM ID: 101238922

Publication information

Publication date:
01 2023
History:
received: 06 07 2022
accepted: 10 01 2023
revised: 10 02 2023
pubmed: 1 2 2023
medline: 15 2 2023
entrez: 31 1 2023
Status: epublish

Abstract

Transcription factors read the genome, fundamentally connecting DNA sequence to gene expression across diverse cell types. Determining how, where, and when TFs bind chromatin will advance our understanding of gene regulatory networks and cellular behavior. The 2017 ENCODE-DREAM in vivo Transcription-Factor Binding Site (TFBS) Prediction Challenge highlighted the value of chromatin accessibility data to TFBS prediction, establishing state-of-the-art methods for TFBS prediction from DNase-seq. However, the more recent Assay-for-Transposase-Accessible-Chromatin (ATAC)-seq has surpassed DNase-seq as the most widely-used chromatin accessibility profiling method. Furthermore, ATAC-seq is the only such technique available at single-cell resolution from standard commercial platforms. While ATAC-seq datasets grow exponentially, suboptimal motif scanning is unfortunately the most common method for TFBS prediction from ATAC-seq. To enable community access to state-of-the-art TFBS prediction from ATAC-seq, we (1) curated an extensive benchmark dataset (127 TFs) for ATAC-seq model training and (2) built "maxATAC", a suite of user-friendly, deep neural network models for genome-wide TFBS prediction from ATAC-seq in any cell type. With models available for 127 human TFs, maxATAC is the largest collection of high-performance TFBS prediction models for ATAC-seq. maxATAC performance extends to primary cells and single-cell ATAC-seq, enabling improved TFBS prediction in vivo. We demonstrate maxATAC's capabilities by identifying TFBS associated with allele-dependent chromatin accessibility at atopic dermatitis genetic risk loci.

Identifiers

pubmed: 36719906
doi: 10.1371/journal.pcbi.1010863
pii: PCOMPBIOL-D-22-01037
pmc: PMC9917285

Chemical substances

Chromatin 0
Deoxyribonucleases EC 3.1.-

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

e1010863

Grants

Agency: NIAID NIH HHS
ID: U01 AI130830
Country: United States
Agency: NIAMS NIH HHS
ID: R01 AR073228
Country: United States
Agency: NIAID NIH HHS
ID: R01 AI153442
Country: United States
Agency: NIAID NIH HHS
ID: P01 AI150585
Country: United States
Agency: NHGRI NIH HHS
ID: R01 HG010730
Country: United States
Agency: NHGRI NIH HHS
ID: U01 HG011172
Country: United States
Agency: NIAID NIH HHS
ID: R01 AI024717
Country: United States
Agency: NIAID NIH HHS
ID: R01 AI148276
Country: United States
Agency: NINDS NIH HHS
ID: R01 NS099068
Country: United States
Agency: NIDDK NIH HHS
ID: R01 DK107502
Country: United States
Agency: NIAID NIH HHS
ID: U01 AI150748
Country: United States
Agency: NIAID NIH HHS
ID: R21 AI156185
Country: United States
Agency: NIGMS NIH HHS
ID: R01 GM055479
Country: United States
Agency: NIAMS NIH HHS
ID: P30 AR070549
Country: United States
Agency: NIAID NIH HHS
ID: U19 AI070235
Country: United States

Copyright information

Copyright: © 2023 Cazares et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Conflict of interest statement

I have read the journal’s policy and the authors of this manuscript have the following competing interests: AB is a co-founder of Datirium, LLC.

Authors

Tareian A Cazares (TA)

Immunology Graduate Program, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America.

Faiz W Rizvi (FW)

Systems Biology and Physiology Graduate Program, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America.

Balaji Iyer (B)

Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.
Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, Ohio, United States of America.

Xiaoting Chen (X)

The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.

Michael Kotliar (M)

Division of Allergy and Immunology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.

Anthony T Bejjani (AT)

Molecular and Developmental Biology Graduate Program, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America.

Joseph A Wayman (JA)

Division of Immunobiology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.

Omer Donmez (O)

The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.

Benjamin Wronowski (B)

Division of Allergy and Immunology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.

Sreeja Parameswaran (S)

The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.

Leah C Kottyan (LC)

The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.
Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America.
Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.

Artem Barski (A)

Division of Allergy and Immunology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.
Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America.
Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.

Matthew T Weirauch (MT)

Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.
The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.
Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America.
Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.
Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.

V B Surya Prasath (VBS)

Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.
Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, Ohio, United States of America.
Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America.

Emily R Miraldi (ER)

Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.
Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, Ohio, United States of America.
Division of Immunobiology, Cincinnati Children's Hospital Medical Center, Cincinnati, Ohio, United States of America.
Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America.
