Comparative study on the performance of different classification algorithms, combined with pre- and post-processing techniques to handle imbalanced data, in the diagnosis of adult patients with familial hypercholesterolemia.


Journal

PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081

Publication information

Publication date:
2022
History:
received: 30 10 2021
accepted: 26 05 2022
entrez: 24 6 2022
pubmed: 25 6 2022
medline: 29 6 2022
Status: epublish

Abstract

Familial Hypercholesterolemia (FH) is an inherited disorder of cholesterol metabolism. Current criteria for FH diagnosis, such as the Simon Broome (SB) criteria, lead to high false positive rates. The aim of this work was to explore alternative classification procedures for FH diagnosis, based on different biological and biochemical indicators. For this purpose, logistic regression (LR), naive Bayes classifier (NB), random forest (RF) and extreme gradient boosting (XGB) algorithms were combined with Synthetic Minority Oversampling Technique (SMOTE), or threshold adjustment by maximizing the Youden index (YI), and compared. The models were evaluated through a 10 × 10 repeated k-fold cross-validation design. The LR model presented an overall better performance, as assessed by the areas under the receiver operating characteristic (AUROC) and precision-recall (AUPRC) curves, and several operating characteristics (OC), regardless of the strategy to cope with class imbalance. When adopting either data processing technique, significantly higher accuracy (Acc), G-mean and F1 score values were found for all classification algorithms, compared to SB criteria (p < 0.01), revealing a more balanced predictive ability for both classes, and higher effectiveness in classifying FH patients. Adjustment of the cut-off values through pre- or post-processing methods revealed a considerable gain in sensitivity (Sens) values (p < 0.01). Although the performance of pre- and post-processing strategies was similar, SMOTE does not cause the model's parameters to lose interpretability. These results suggest an LR model combined with SMOTE can be an optimal approach to be used as a widespread screening tool.
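
As an illustration of the two imbalance-handling strategies named in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation (the abstract does not state which software was used). It assumes a simplified SMOTE-style interpolation (pre-processing) and a threshold search that maximizes the Youden index J = Sens + Spec - 1 (post-processing), and it computes Sens, Spec, Acc, G-mean and F1 for a given cut-off. All function names and the toy scores are hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Simplified SMOTE-style oversampling (assumption: not the reference
    algorithm). Each synthetic point lies on the segment between a randomly
    chosen minority point and one of its k nearest minority neighbours.
    Requires at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # relative position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


def operating_characteristics(y_true, scores, threshold):
    """Metrics when a score >= threshold is classified as positive (1).
    Assumes both classes are present in y_true."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        positive = s >= threshold
        if y == 1 and positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    prec = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
    return {
        "sens": sens,
        "spec": spec,
        "acc": (tp + tn) / len(y_true),
        "g_mean": math.sqrt(sens * spec),
        "f1": f1,
        "youden": sens + spec - 1,
    }


def youden_threshold(y_true, scores):
    """Observed score that maximizes the Youden index (lowest one on ties)."""
    candidates = sorted(set(scores))
    return max(
        candidates,
        key=lambda t: operating_characteristics(y_true, scores, t)["youden"],
    )


if __name__ == "__main__":
    # Toy imbalanced example: six negatives, two positives (made-up scores).
    y_true = [0, 0, 0, 0, 0, 0, 1, 1]
    scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.45, 0.35, 0.60]
    best = youden_threshold(y_true, scores)
    print("default 0.5:", operating_characteristics(y_true, scores, 0.5))
    print("Youden", best, ":", operating_characteristics(y_true, scores, best))
    print(smote_like([(1.0, 1.0), (2.0, 2.0), (3.0, 1.0)], n_new=3))
```

On this toy data, moving the cut-off from 0.5 to the Youden-optimal value raises sensitivity at a small cost in specificity, which is the kind of trade-off the abstract reports. In a real evaluation the threshold or the synthetic samples would be derived inside each training fold only, so that the held-out fold is not contaminated.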

Identifiers

pubmed: 35749402
doi: 10.1371/journal.pone.0269713
pii: PONE-D-21-34690
pmc: PMC9231719

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

e0269713

Conflict of interest statement

The authors have declared that no competing interests exist.

Authors

João Albuquerque (J)

Departamento de Biomedicina, Unidade de Bioquímica, Faculdade de Medicina, Universidade do Porto, Porto, Portugal.
Centro de Estatística e Aplicações, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal.
Grupo de Investigação Cardiovascular, Departamento de Promoção da Saúde e Prevenção de Doenças Não Transmissíveis, Instituto Nacional de Saúde Doutor Ricardo Jorge, Lisboa, Portugal.

Ana Margarida Medeiros (AM)

Grupo de Investigação Cardiovascular, Departamento de Promoção da Saúde e Prevenção de Doenças Não Transmissíveis, Instituto Nacional de Saúde Doutor Ricardo Jorge, Lisboa, Portugal.
Instituto de Biossistemas e Ciências Integrativas, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal.

Ana Catarina Alves (AC)

Grupo de Investigação Cardiovascular, Departamento de Promoção da Saúde e Prevenção de Doenças Não Transmissíveis, Instituto Nacional de Saúde Doutor Ricardo Jorge, Lisboa, Portugal.
Instituto de Biossistemas e Ciências Integrativas, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal.

Mafalda Bourbon (M)

Grupo de Investigação Cardiovascular, Departamento de Promoção da Saúde e Prevenção de Doenças Não Transmissíveis, Instituto Nacional de Saúde Doutor Ricardo Jorge, Lisboa, Portugal.
Instituto de Biossistemas e Ciências Integrativas, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal.

Marília Antunes (M)

Centro de Estatística e Aplicações, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal.
Departamento de Estatística e Investigação Operacional, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal.

MeSH classifications