Multi-classifier prediction of knee osteoarthritis progression from incomplete imbalanced longitudinal data.


Journal

Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
21 05 2020
History:
received: 25 10 2019
accepted: 20 04 2020
entrez: 23 5 2020
pubmed: 23 5 2020
medline: 15 12 2020
Status: epublish

Abstract

Conventional inclusion criteria used in osteoarthritis clinical trials are not very effective in selecting patients who would benefit from the therapy being tested. Typically, the majority of selected patients show little or no disease progression during the trial period. As a consequence, the effect of the tested treatment cannot be observed, and the efforts and resources invested in running the trial are not rewarded. This could be avoided if the selection criteria were more predictive of future disease progression. In this article, we formulated the patient selection problem as a multi-class classification task, with classes based on clinically relevant measures of progression (over a time scale typical for clinical trials). Using data from two long-term knee osteoarthritis studies, OAI and CHECK, we tested multiple algorithms and learning process configurations (including multi-classifier approaches, cost-sensitive learning, and feature selection) to identify the best-performing machine learning models. We examined the behaviour of the best models, with respect to prediction errors and the impact of the features used, to confirm their clinical relevance. We found that model-based selection outperforms the conventional inclusion criteria, reducing the number of patients who show no progression by 20-25%. This result might lead to more efficient clinical trials.
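
The kind of learning setup the abstract describes (multi-class classification on incomplete, imbalanced data, with cost-sensitive learning and feature selection) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' pipeline: the data are synthetic rather than OAI or CHECK, the four classes and their proportions are arbitrary, and median imputation, a random forest with balanced class weights, and the macro-averaged F1 score are stand-ins chosen because they are readily available in scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.impute import SimpleImputer
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for cohort data (NOT OAI or CHECK): four imbalanced
# classes, with class 0 playing the role of "no progression" (assumption).
X, y = make_classification(
    n_samples=600,
    n_features=20,
    n_informative=6,
    n_classes=4,
    weights=[0.6, 0.2, 0.1, 0.1],
    random_state=0,
)

# Simulate incomplete records by blanking about 10% of the values at random.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.10] = np.nan

model = make_pipeline(
    # Fill missing values with the per-feature median.
    SimpleImputer(strategy="median"),
    # Feature selection: keep features whose forest importance is at or
    # above the median importance.
    SelectFromModel(
        RandomForestClassifier(n_estimators=100, random_state=0),
        threshold="median",
    ),
    # Cost-sensitive learning: errors on rare classes are weighted more.
    RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=0
    ),
)

# Stratified folds keep the class proportions similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
y_pred = cross_val_predict(model, X, y, cv=cv)

# Macro-averaged F1 treats every class equally, regardless of its size.
macro_f1 = f1_score(y, y_pred, average="macro")
print(f"macro F1: {macro_f1:.3f}")
```

Wrapping imputation and feature selection in the pipeline means both are refitted inside each cross-validation fold, so the held-out fold does not leak into the preprocessing steps.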

Identifiers

pubmed: 32439879
doi: 10.1038/s41598-020-64643-8
pii: 10.1038/s41598-020-64643-8
pmc: PMC7242357

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

8427

Authors

Paweł Widera (P)

School of Computing Science, Newcastle University, 1 Science Square, Newcastle, NE4 5TG, UK.

Paco M J Welsing (PMJ)

Department of Rheumatology & Clinical Immunology, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Netherlands.

Christoph Ladel (C)

Merck, Frankfurter Str. 250, 64293, Darmstadt, Germany.

John Loughlin (J)

Biosciences Institute, Newcastle University, International Centre for Life, Newcastle, NE1 3BZ, UK.

Floris P F J Lafeber (FPFJ)

Department of Rheumatology & Clinical Immunology, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Netherlands.

Florence Petit Dop (F)

Immuno-inflammation Center of Therapeutic Innovation, Institut de Recherches Internationales Servier, Suresnes, France.

Jonathan Larkin (J)

Novel Human Genetics Research Unit, GlaxoSmithKline, Collegeville, PA, 19426, USA.

Harrie Weinans (H)

Department of Orthopedics, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Netherlands.
Department of Biomechanical Engineering, Delft University of Technology, Mekelweg 2, 2628 CD, Delft, Netherlands.

Ali Mobasheri (A)

Department of Regenerative Medicine, State Research Institute Centre for Innovative Medicine, Santariskiu 5, 08661, Vilnius, Lithuania.
Research Unit of Medical Imaging, Physics and Technology, University of Oulu, Aapistie 5A, FIN-90230, Oulu, Finland.
Centre for Sport, Exercise and Osteoarthritis Research Versus Arthritis, Queen's Medical Centre, Nottingham, NG7 2UH, UK.

Jaume Bacardit (J)

School of Computing Science, Newcastle University, 1 Science Square, Newcastle, NE4 5TG, UK. jaume.bacardit@newcastle.ac.uk.
