Systematic benchmarking of single-cell ATAC-sequencing protocols.
Journal
Nature biotechnology
ISSN: 1546-1696
Abbreviated title: Nat Biotechnol
Country: United States
NLM ID: 9604648
Publication information
Publication date:
03 Aug 2023
History:
received: 18 Jan 2022
accepted: 22 Jun 2023
pubmed: 04 Aug 2023
medline: 04 Aug 2023
entrez: 03 Aug 2023
Status:
aheadofprint
Abstract
Single-cell assay for transposase-accessible chromatin by sequencing (scATAC-seq) has emerged as a powerful tool for dissecting regulatory landscapes and cellular heterogeneity. However, an exploration of systemic biases among scATAC-seq technologies has remained absent. In this study, we benchmark the performance of eight scATAC-seq methods across 47 experiments using human peripheral blood mononuclear cells (PBMCs) as a reference sample and develop PUMATAC, a universal preprocessing pipeline, to handle the various sequencing data formats. Our analyses reveal significant differences in sequencing library complexity and tagmentation specificity, which impact cell-type annotation, genotype demultiplexing, peak calling, differential region accessibility and transcription factor motif enrichment. Our findings underscore the importance of sample extraction, method selection, data processing and total cost of experiments, offering valuable guidance for future research. Finally, our data and analysis pipeline encompasses 169,000 PBMC scATAC-seq profiles and a best practices code repository for scATAC-seq data analysis, which are freely available to extend this benchmarking effort to future protocols.
Identifiers
pubmed: 37537502
doi: 10.1038/s41587-023-01881-x
pii: 10.1038/s41587-023-01881-x
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: Fonds Wetenschappelijk Onderzoek (Research Foundation Flanders)
ID: 1S80920N
Agency: Fonds Wetenschappelijk Onderzoek (Research Foundation Flanders)
ID: G0B5619N
Agency: Fonds Wetenschappelijk Onderzoek (Research Foundation Flanders)
ID: G094121N
Copyright information
© 2023. The Author(s).