DeltaMSI: artificial intelligence-based modeling of microsatellite instability scoring on next-generation sequencing data.

Keywords: DNA mismatch repair deficiency; Immunotherapy; Indel length distribution analysis; Logistic regression; Machine learning; Microsatellite instability; Support vector machine; Targeted resequencing

Journal

BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194

Publication information

Publication date:
01 Mar 2023
History:
received: 11 May 2022
accepted: 14 Feb 2023
entrez: 1 Mar 2023
pubmed: 2 Mar 2023
medline: 4 Mar 2023
Status: epublish

Abstract

BACKGROUND
DNA mismatch repair deficiency (dMMR) testing is crucial for detection of microsatellite unstable (MSI) tumors. MSI is detected by aberrant indel length distributions of microsatellite markers, either by visual inspection of PCR-fragment length profiles or by automated bioinformatic scoring on next-generation sequencing (NGS) data. The former is time-consuming and low-throughput, while the latter typically relies on simplified binary scoring of a single parameter of the indel distribution. The purpose of this study was to use machine learning to process the full complexity of indel distributions and integrate it into a robust script for screening of dMMR on small gene panel-based NGS data of clinical tumor samples without paired normal tissue.
METHODS
Scikit-learn was used to train 7 models on normalized read depth data of 36 microsatellite loci in a cohort of 133 MMR proficient (pMMR) and 46 dMMR tumor samples, taking loss of MLH1/MSH2/PMS2/MSH6 protein expression as the reference method. After selection of the optimal model and microsatellite panel, the two top-performing models per locus (logistic regression and support vector machine) were integrated into a novel script (DeltaMSI) for combined prediction of MSI status on 28 marker loci at sample level. Diagnostic performance of DeltaMSI was compared to that of mSINGS, a widely used script for MSI detection on unpaired tumor samples. The robustness of DeltaMSI was evaluated on 1072 unselected, consecutive solid tumor samples in a real-world setting sequenced using capture chemistry, and 116 solid tumor samples sequenced by amplicon chemistry. Likelihood ratios were used to select result intervals with clinical validity.
RESULTS
DeltaMSI achieved higher robustness at equal diagnostic power (AUC = 0.950; 95% CI 0.910-0.975) compared with mSINGS (AUC = 0.876; 95% CI 0.823-0.918). Its sensitivity of 90% at 100% specificity indicated its clinical potential for high-throughput MSI screening in all tumor types.
Clinical Trial Number/IRB B1172020000040, Ethical Committee, AZ Delta General Hospital.
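
The workflow summarized in the abstract (per-locus models on normalized read depth, a combined sample-level prediction, and likelihood ratios over result intervals) can be illustrated with a minimal sketch. This is not the DeltaMSI code: the data are synthetic, the locus names, the bin count, the vote-fraction score and all function names are assumptions made for illustration, and the abstract does not state how DeltaMSI actually combines locus-level predictions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

N_BINS = 15                                # hypothetical number of indel-length bins
LOCI = ["locus_A", "locus_B", "locus_C"]   # hypothetical marker names
rng = np.random.default_rng(0)


def normalize_depth(counts):
    """Convert per-length read counts to fractions so each profile sums to 1."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=-1, keepdims=True)


def simulate_counts(n, unstable):
    """Synthetic data only: unstable loci get a broader, shifted length profile."""
    centre, spread = (5.0, 2.5) if unstable else (7.0, 1.0)
    profile = np.exp(-0.5 * ((np.arange(N_BINS) - centre) / spread) ** 2)
    return rng.multinomial(500, profile / profile.sum(), size=n)


def train_locus_models(X, y):
    """Fit the two model types named in the abstract on one locus."""
    models = [
        make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        make_pipeline(StandardScaler(), SVC()),
    ]
    return [m.fit(X, y) for m in models]


def sample_score(models_by_locus, counts_by_locus):
    """Assumed combination rule: fraction of (locus, model) votes for 'unstable'."""
    votes = [
        m.predict(normalize_depth(counts_by_locus[locus])[None, :])[0]
        for locus, models in models_by_locus.items()
        for m in models
    ]
    return float(np.mean(votes))


def interval_likelihood_ratio(scores, labels, low, high):
    """P(low <= score <= high | dMMR) / P(low <= score <= high | pMMR)."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    inside = (scores >= low) & (scores <= high)
    p_dmmr = inside[labels == 1].mean()
    p_pmmr = inside[labels == 0].mean()
    if p_pmmr == 0:
        # No pMMR result falls in the interval: ratio is unbounded (or undefined).
        return float("inf") if p_dmmr > 0 else float("nan")
    return float(p_dmmr / p_pmmr)


# Train one pair of models per locus on 133 pMMR-like and 46 dMMR-like profiles,
# mirroring the cohort sizes in the abstract (the profiles themselves are simulated).
y_train = np.array([0] * 133 + [1] * 46)
models_by_locus = {}
for locus in LOCI:
    counts = np.vstack([simulate_counts(133, False), simulate_counts(46, True)])
    models_by_locus[locus] = train_locus_models(normalize_depth(counts), y_train)

stable_sample = {locus: simulate_counts(1, False)[0] for locus in LOCI}
unstable_sample = {locus: simulate_counts(1, True)[0] for locus in LOCI}
print("pMMR-like sample score:", sample_score(models_by_locus, stable_sample))
print("dMMR-like sample score:", sample_score(models_by_locus, unstable_sample))
```

In this sketch, `interval_likelihood_ratio` would be applied to sample scores from a cohort with known MMR status; intervals with a very high or very low ratio are the ones that could support a confident call, while intermediate intervals would be flagged for confirmatory testing.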


Identifiers

pubmed: 36859168
doi: 10.1186/s12859-023-05186-3
pii: 10.1186/s12859-023-05186-3
pmc: PMC9976396

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

73

Copyright information

© 2023. The Author(s).

Authors

Koen Swaerts (K)

Department of Laboratory Medicine, AZ Delta General Hospital, Deltalaan 1, 8800, Roeselare, Belgium.
RADar Innovation Center, AZ Delta General Hospital, Roeselare, Belgium.

Franceska Dedeurwaerdere (F)

Department of Pathology, AZ Delta General Hospital, Roeselare, Belgium.

Dieter De Smet (D)

Department of Laboratory Medicine, AZ Delta General Hospital, Deltalaan 1, 8800, Roeselare, Belgium.
RADar Innovation Center, AZ Delta General Hospital, Roeselare, Belgium.

Peter De Jaeger (P)

RADar Innovation Center, AZ Delta General Hospital, Roeselare, Belgium.

Geert A Martens (GA)

Department of Laboratory Medicine, AZ Delta General Hospital, Deltalaan 1, 8800, Roeselare, Belgium. geert.martens@azdelta.be.
RADar Innovation Center, AZ Delta General Hospital, Roeselare, Belgium. geert.martens@azdelta.be.
Department of Biomolecular Medicine, Ghent University, Ghent, Belgium. geert.martens@azdelta.be.
