GCparagon: evaluating and correcting GC biases in cell-free DNA at the fragment level.


Journal

NAR genomics and bioinformatics
ISSN: 2631-9268
Abbreviated title: NAR Genom Bioinform
Country: England
NLM ID: 101756213

Publication information

Publication date:
Dec 2023
History:
received: 09 05 2023
revised: 18 09 2023
accepted: 07 11 2023
medline: 29 11 2023
pubmed: 29 11 2023
entrez: 29 11 2023
Status: epublish

Abstract

Analyses of cell-free DNA (cfDNA) are increasingly being employed for various diagnostic and research applications. Many technologies aim to increase resolution, e.g. for detecting early-stage cancer or minimal residual disease. However, these efforts may be confounded by inherent base composition biases of cfDNA, specifically the over- and underrepresentation of guanine (G) and cytosine (C) sequences. Currently, there is no universally applicable tool to correct these effects on sequencing read-level data. Here, we present GCparagon, a two-stage algorithm for computing and correcting GC biases in cfDNA samples. In the first step, fragment length and GC base count parameters are determined; in doing so, the algorithm minimizes the inclusion of known problematic genomic regions, such as low-mappability regions, in its calculations. In the second step, GCparagon computes weights that counterbalance the distortion of cfDNA attributes (the correction matrix). These fragment weights are added to a binary alignment map (BAM) file as alignment tags for individual reads. Either the GC correction matrix or the tagged BAM file can be used for downstream analyses. Parallel computing allows GC bias estimation in under 1 min. We demonstrate that GCparagon vastly improves the analysis of regulatory regions, which frequently show specific GC composition patterns, and will contribute to standardized cfDNA applications.
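The idea of a correction matrix indexed by fragment length and GC base count can be illustrated with a minimal sketch. This is not GCparagon's implementation: the function names `fragment_key` and `correction_weights`, the toy counts, and the choice of "expected fraction divided by observed fraction" as the weight are all assumptions made for illustration, and the sketch does not handle region filtering or BAM tagging.

```python
from collections import Counter


def fragment_key(sequence):
    """Return (fragment length, number of G/C bases) for one fragment sequence."""
    seq = sequence.upper()
    return len(seq), seq.count("G") + seq.count("C")


def correction_weights(observed, expected):
    """Compute one weight per (length, GC count) bin.

    Assumed weight definition: expected fraction / observed fraction.
    Bins lacking observed or expected fragments get a neutral weight of 1.0.
    """
    obs_total = sum(observed.values())
    exp_total = sum(expected.values())
    weights = {}
    for key in set(observed) | set(expected):
        obs = observed.get(key, 0)
        exp = expected.get(key, 0)
        if obs == 0 or exp == 0:
            weights[key] = 1.0
        else:
            weights[key] = (exp / exp_total) / (obs / obs_total)
    return weights


# Toy counts (made up): 166 bp fragments with 60 or 90 G/C bases.
# The GC-rich bin is overrepresented in the observed sample.
observed = Counter({(166, 60): 50, (166, 90): 150})
expected = Counter({(166, 60): 100, (166, 90): 100})

weights = correction_weights(observed, expected)

# Applying the weights rebalances the observed counts toward the expectation.
corrected = {key: observed[key] * weights[key] for key in observed}
print(weights)
print(corrected)
```

In a real workflow the per-fragment weight would be looked up with `fragment_key` and stored alongside each alignment, as the abstract describes for BAM alignment tags; how the expected distribution is obtained is left open here.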

Identifiers

pubmed: 38025047
doi: 10.1093/nargab/lqad102
pii: lqad102
pmc: PMC10657415

Publication types

Journal Article

Languages

eng

Pagination

lqad102

Copyright information

© The Author(s) 2023. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics.

Authors

Benjamin Spiegl (B)

Institute of Human Genetics, Diagnostic and Research Center for Molecular BioMedicine, Medical University of Graz, 8010 Graz, Austria.

Faruk Kapidzic (F)

Institute of Human Genetics, Diagnostic and Research Center for Molecular BioMedicine, Medical University of Graz, 8010 Graz, Austria.

Sebastian Röner (S)

Exploratory Diagnostic Sciences, Berlin Institute of Health (BIH) at Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany.

Martin Kircher (M)

Exploratory Diagnostic Sciences, Berlin Institute of Health (BIH) at Charité-Universitätsmedizin Berlin, 10117 Berlin, Germany.
Institute of Human Genetics, University Medical Center Schleswig-Holstein (UKSH), University of Lübeck, 23562 Lübeck, Germany.

Michael R Speicher (MR)

Institute of Human Genetics, Diagnostic and Research Center for Molecular BioMedicine, Medical University of Graz, 8010 Graz, Austria.
BioTechMed-Graz, 8010 Graz, Austria.

MeSH classifications