Accelerating 3D genomics data analysis with Microcket.


Journal

Communications biology
ISSN: 2399-3642
Abbreviated title: Commun Biol
Country: England
NLM ID: 101719179

Publication information

Publication date:
01 Jun 2024
History:
received: 27 03 2024
accepted: 24 05 2024
medline: 2 6 2024
pubmed: 2 6 2024
entrez: 1 6 2024
Status: epublish

Abstract

The three-dimensional (3D) organization of the genome is fundamental to cell biology. To explore the 3D genome, emerging high-throughput approaches have produced billions of sequencing reads, which are challenging and time-consuming to analyze. Here we present Microcket, a package for mapping and extracting interacting pairs from 3D genomics data, including Hi-C, Micro-C, and derivative protocols. Microcket uses a unique read-stitch strategy that takes advantage of the long read cycles in modern DNA sequencers. Benchmark evaluations show that Microcket runs much faster than current tools while improving mapping efficiency, and thus shows high potential for accelerating and enhancing biological investigations into the 3D genome. Microcket is freely available at https://github.com/hellosunking/Microcket.
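
The abstract does not spell out how the read-stitch step works. As a rough illustration only, the sketch below assumes that stitching means merging a read pair whose ends overlap into one longer sequence, in the spirit of overlap-based mergers such as FLASH. The function names (`reverse_complement`, `stitch_pair`) and the parameters `min_overlap` and `max_mismatch_rate` are hypothetical and are not taken from Microcket.

```python
from typing import Optional

# Complement table for DNA bases; N stays N.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def stitch_pair(read1: str, read2: str,
                min_overlap: int = 10,
                max_mismatch_rate: float = 0.1) -> Optional[str]:
    """Merge a read pair into one sequence if their ends overlap.

    Hypothetical sketch: read2 is reverse-complemented so that both reads
    are on the same strand, then the longest suffix of read1 that matches
    a prefix of the mate (within the mismatch tolerance) is used to join
    them. Returns None when no acceptable overlap exists.
    """
    mate = reverse_complement(read2)
    longest = min(len(read1), len(mate))
    # Try the longest candidate overlap first, then shrink.
    for overlap in range(longest, min_overlap - 1, -1):
        tail = read1[-overlap:]
        head = mate[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_rate * overlap:
            return read1 + mate[overlap:]
    return None


# Toy 40-base fragment sequenced with 25-base reads from each end,
# so the two reads share a 10-base overlap in the middle.
fragment = "A" * 15 + "CGTACGTTAG" + "T" * 15
read1 = fragment[:25]
read2 = reverse_complement(fragment[15:])
stitched = stitch_pair(read1, read2)
```

Under this assumption, longer read cycles make it more likely that the two reads of a pair overlap, so more pairs can be handled as one sequence. A pair with no acceptable overlap returns `None` and would need to be processed as two separate reads.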

Identifiers

pubmed: 38824179
doi: 10.1038/s42003-024-06382-4
pii: 10.1038/s42003-024-06382-4

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

675

Grants

Agency: National Natural Science Foundation of China (National Science Foundation of China)
ID: 82101763
Agency: National Natural Science Foundation of China (National Science Foundation of China)
ID: 32270587, 32100673

Copyright information

© 2024. The Author(s).

Authors

Yu Zhao (Y)

Molecular Cancer Research Center, School of Medicine, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, 518107, China.

Mengqi Yang (M)

Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, 518132, China.
Department of Chemical and Biological Engineering, Division of Life Science, Hong Kong University of Science and Technology, Hong Kong SAR, 999077, China.

Fanglei Gong (F)

Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, 518132, China.

Yuqi Pan (Y)

Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, 518132, China.
Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, 518055, China.

Minghui Hu (M)

Molecular Cancer Research Center, School of Medicine, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, 518107, China.

Qin Peng (Q)

Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518132, China.

Leina Lu (L)

Department of Genetics and Genome Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, 44106, USA.

Xiaowen Lyu (X)

State Key Laboratory of Cellular Stress Biology, Fujian Provincial Key Laboratory of Reproductive Health Research, Fujian Provincial Key Laboratory of Organ and Tissue Regeneration, School of Medicine, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen, 361102, China.

Kun Sun (K)

Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, 518132, China. sunkun@szbl.ac.cn.
