Predicting breast cancer 5-year survival using machine learning: A systematic review.


Journal

PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081

Publication information

Publication date:
2021
History:
received: 2021-01-21
accepted: 2021-04-06
entrez: 2021-04-16
pubmed: 2021-04-17
medline: 2021-09-30
Status: epublish

Abstract

BACKGROUND
Accurately predicting the survival rate of breast cancer patients is a major issue for cancer researchers. Machine learning (ML) has attracted much attention with the hope that it could provide accurate results, but its modeling methods and prediction performance remain controversial. The aim of this systematic review is to identify and critically appraise current studies regarding the application of ML in predicting the 5-year survival rate of breast cancer.
METHODS
In accordance with the PRISMA guidelines, two researchers independently searched the PubMed (including MEDLINE), Embase, and Web of Science Core databases from inception to November 30, 2020. The search terms included breast neoplasms, survival, machine learning, and specific algorithm names. Studies were included if they used ML to build a breast cancer survival prediction model and reported model performance that could be measured from the validation results. Studies were excluded if the modeling process was not clearly explained or if the reported information was incomplete. The extracted information covered the publication, the database, data preparation and the modeling process, model construction and performance evaluation, and the candidate predictors.
RESULTS
Thirty-one studies met the inclusion criteria, most of them published after 2013. The most frequently used ML methods were decision trees (19 studies, 61.3%), artificial neural networks (18 studies, 58.1%), support vector machines (16 studies, 51.6%), and ensemble learning (10 studies, 32.3%). The median sample size was 37,256 patients (range 200 to 659,820), and the median number of predictors was 16 (range 3 to 625). Accuracy, reported in 29 studies, ranged from 0.510 to 0.971. Sensitivity, reported in 25 studies, ranged from 0.037 to 1. Specificity, reported in 24 studies, ranged from 0.008 to 0.993. The AUC, reported in 20 studies, ranged from 0.500 to 0.972. Precision, reported in 6 studies, ranged from 0.549 to 1. All of the models were internally validated, and only one was externally validated.
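Apart from the AUC, which needs ranked prediction scores, the performance measures listed above all follow from a binary confusion matrix. The sketch below is only an illustration: the helper names are hypothetical, 5-year survival is assumed to be the positive class, and the counts are made up rather than taken from any reviewed study. It also recomputes the method shares from the stated study counts.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard metrics from confusion-matrix counts (hypothetical helper)."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "precision": tp / (tp + fp),    # positive predictive value
    }


def share_percent(n_studies: int, total_studies: int = 31) -> float:
    """Percentage of the 31 included studies, rounded to one decimal."""
    return round(100 * n_studies / total_studies, 1)


# Illustrative counts only: an imbalanced cohort in which most patients survive
# and the model labels almost everyone as a survivor.
metrics = classification_metrics(tp=900, fp=80, tn=20, fn=0)

# Study counts per method as stated in the results.
shares = {
    "decision trees": share_percent(19),
    "artificial neural networks": share_percent(18),
    "support vector machines": share_percent(16),
    "ensemble learning": share_percent(10),
}
```

With these made-up counts, accuracy and sensitivity are high while specificity is low. Imbalance of this kind is one possible reason for a wide specificity range, although the abstract does not state the cause.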
CONCLUSIONS
Overall, ML models do not necessarily perform better than traditional statistical methods, and this area of research is still limited by missing data preprocessing steps, large differences in sample feature selection, and issues related to validation. Further optimization of the proposed models' performance is needed, which will require more standardization and subsequent validation.
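Internal validation, which every reviewed model received, is often carried out with k-fold cross-validation. The abstract does not say which scheme each study used, so the following is only a generic sketch: the function name, the choice of 10 folds, and the seed are assumptions for illustration, and the sample size of 200 simply mirrors the smallest cohort mentioned above.

```python
import random


def k_fold_indices(n_samples: int, k: int, seed: int = 0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds
    for i, test_idx in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# Each patient index is held out exactly once across the 10 splits.
splits = list(k_fold_indices(n_samples=200, k=10))
```

External validation differs in that the fitted model is evaluated on data from a separate source, such as another hospital or registry, instead of on held-out folds of the same dataset.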

Identifiers

pubmed: 33861809
doi: 10.1371/journal.pone.0250370
pii: PONE-D-21-02286
pmc: PMC8051758

Publication types

Journal Article; Research Support, Non-U.S. Gov't; Systematic Review

Languages

eng

Citation subsets

IM

Pagination

e0250370

Conflict of interest statement

The authors have declared that no competing interests exist.

Authors

Jiaxin Li (J)

School of Nursing, Jilin University, Jilin, China.

Zijun Zhou (Z)

Breast Surgery, Jilin Province Tumor Hospital, Jilin, China.

Jianyu Dong (J)

School of Nursing, Jilin University, Jilin, China.

Ying Fu (Y)

School of Nursing, Jilin University, Jilin, China.

Yuan Li (Y)

School of Nursing, Jilin University, Jilin, China.

Ze Luan (Z)

School of Nursing, Jilin University, Jilin, China.

Xin Peng (X)

School of Nursing, Jilin University, Jilin, China.
