Comparative analysis of somatic variant calling on matched FF and FFPE WGS samples.
Biomarkers, Tumor / genetics
DNA, Neoplasm / analysis
Feasibility Studies
Formaldehyde / chemistry
Gene Expression Profiling
Gene Expression Regulation, Neoplastic
Humans
Lung Neoplasms / genetics
Male
Mutation
Neoplasm Recurrence, Local / genetics
Paraffin Embedding / methods
Prognosis
Prostatic Neoplasms / genetics
Specimen Handling
Tissue Fixation / methods
Whole Genome Sequencing
Cohort studies
FFPE
Precision oncology
Somatic variants
WGS
Journal
BMC medical genomics
ISSN: 1755-8794
Abbreviated title: BMC Med Genomics
Country: England
NLM ID: 101319628
Publication information
Publication date: 06 07 2020
History:
received: 29 08 2019
accepted: 22 06 2020
entrez: 8 7 2020
pubmed: 8 7 2020
medline: 11 5 2021
Status: epublish
Abstract
BACKGROUND
Research-grade Fresh Frozen (FF) DNA material is not yet routinely collected in clinical practice. Many hospitals, however, collect and store Formalin-Fixed Paraffin-Embedded (FFPE) tumor samples. Consequently, the sample size of whole-genome cancer cohort studies could be increased tremendously by including FFPE samples, although the presence of artefacts might confound variant calling. To assess whether FFPE material can be used for cohort studies, we performed an in-depth comparison of somatic SNVs called on matching FF and FFPE whole genome sequencing (WGS) samples extracted from the same tumor.
METHODS
Four variant callers (Strelka2, Mutect2, VarScan2 and Shimmer) were used to call somatic variants on matching FF and FFPE WGS samples from a metastatic prostate tumor. Using the variants identified by these callers, we developed a heuristic to maximize the overlap between the FF sample and its FFPE counterpart in terms of sensitivity and precision. The proposed variant calling approach was then validated on nine matched primary samples. Finally, we assessed what fraction of the discrepancy could be attributed to intra-tumor heterogeneity (ITH) by comparing the overlap in clonal and subclonal somatic variants.
RESULTS
We first compared variants between an FF and an FFPE sample from a metastatic prostate tumor, showing that on average 50% of the calls in the FF sample are recovered in the FFPE sample, with notable differences between callers. Combining the variants of the different callers using a simple heuristic increases both the precision and the sensitivity of the variant calling. Validating the heuristic on nine additional matched FF-FFPE samples resulted in an average F1-score of 0.58, outperforming each of the individual callers. In addition, we could show that part of the discrepancy between the FF and the FFPE samples can be attributed to ITH.
CONCLUSION
This study illustrates that, with an appropriate variant calling strategy, the majority of clonal SNVs can be recovered in an FFPE sample with high precision and sensitivity. These results suggest that somatic variants derived from WGS of FFPE material can be used in cohort studies.
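The overlap measures named in the abstract (precision, sensitivity, F1-score) can be illustrated with a minimal sketch. It assumes that the FF call set is treated as the reference and that a variant is identified by chromosome, position, reference allele and alternate allele. The `consensus_calls` voting rule and its `min_callers` parameter are hypothetical stand-ins: the abstract does not specify the authors' heuristic. The caller names and all variants below are invented toy data.

```python
from collections import Counter

def overlap_metrics(ff_calls, ffpe_calls):
    """Score FFPE calls against FF calls, treating the FF set as the reference."""
    ff, ffpe = set(ff_calls), set(ffpe_calls)
    shared = len(ff & ffpe)
    precision = shared / len(ffpe) if ffpe else 0.0   # fraction of FFPE calls also in FF
    sensitivity = shared / len(ff) if ff else 0.0     # fraction of FF calls recovered in FFPE
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return precision, sensitivity, f1

def consensus_calls(calls_per_caller, min_callers=2):
    """Hypothetical voting rule (an assumption, not the published heuristic):
    keep variants reported by at least `min_callers` callers."""
    votes = Counter(v for calls in calls_per_caller.values() for v in set(calls))
    return {v for v, n in votes.items() if n >= min_callers}

# Toy data: (chromosome, position, ref, alt). All values are invented.
v1, v2 = ("chr1", 100, "C", "T"), ("chr1", 200, "G", "A")
v3, v4 = ("chr2", 300, "A", "G"), ("chr3", 400, "T", "C")
artefact1, artefact2 = ("chr5", 500, "C", "T"), ("chr7", 700, "G", "A")

ff_calls = {v1, v2, v3, v4}
ffpe_by_caller = {
    "caller_a": [v1, v2, artefact1],
    "caller_b": [v1, v2, v3],
    "caller_c": [v1, artefact2],
}

ffpe_consensus = consensus_calls(ffpe_by_caller, min_callers=2)
precision, sensitivity, f1 = overlap_metrics(ff_calls, ffpe_consensus)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f} F1={f1:.2f}")
```

In this toy case the vote discards the two single-caller artefacts but also drops `v3`, which shows the trade-off between precision and sensitivity that any such combination rule has to balance.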
Identifiers
pubmed: 32631411
doi: 10.1186/s12920-020-00746-5
pii: 10.1186/s12920-020-00746-5
pmc: PMC7336445
Chemical substances
Biomarkers, Tumor (registry number: 0)
DNA, Neoplasm (registry number: 0)
Formaldehyde (registry number: 1HG84L3525)
Publication types
Comparative Study
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
94