Machine learning predicts upper secondary education dropout as early as the end of primary school.

Academic outcomes; Comprehensive education; Education dropout; Kindergarten; Longitudinal data; Machine learning; Upper secondary education

Journal

Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
05 06 2024
History:
received: 27 02 2024
accepted: 30 05 2024
medline: 6 6 2024
pubmed: 6 6 2024
entrez: 5 6 2024
Status: epublish

Abstract

Education plays a pivotal role in alleviating poverty, driving economic growth, and empowering individuals, thereby significantly influencing societal and personal development. However, the persistent issue of school dropout poses a significant challenge, with its effects extending beyond the individual. While previous research has employed machine learning for dropout classification, these studies often suffer from a short-term focus, relying on data collected only a few years into the study period. This study expanded the modeling horizon by utilizing a 13-year longitudinal dataset, encompassing data from kindergarten to Grade 9. Our methodology incorporated a comprehensive range of parameters, including students' academic and cognitive skills, motivation, behavior, well-being, and officially recorded dropout data. The machine learning models developed in this study demonstrated notable classification ability, achieving a mean area under the curve (AUC) of 0.61 with data up to Grade 6 and an improved AUC of 0.65 with data up to Grade 9. Further data collection and independent correlational and causal analyses are crucial. In future iterations, such models may have the potential to proactively support educators' processes and existing protocols for identifying at-risk students, thereby potentially aiding in the reinvention of student retention and success strategies and ultimately contributing to improved educational outcomes.
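
The abstract reports performance as a mean AUC. As a minimal sketch of what that metric measures (not the authors' pipeline, which this record does not describe), the AUC can be computed as the fraction of positive/negative pairs that a model ranks correctly, then averaged over evaluation folds. The function names `roc_auc` and `mean_auc` and the toy folds are hypothetical, chosen only for illustration.

```python
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (e.g. dropout), 0 otherwise.
    scores: model scores, higher meaning "more likely positive".
    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(folds):
    """Average the per-fold AUC over an iterable of (labels, scores) pairs."""
    return mean(roc_auc(y, s) for y, s in folds)


# Hypothetical toy folds: made-up labels and scores, not data from the study.
toy_folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6]),
    ([0, 0, 1, 1], [0.1, 0.7, 0.8, 0.3]),
]

if __name__ == "__main__":
    print(mean_auc(toy_folds))
```

On this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives a scale for the reported values of 0.61 and 0.65.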

Identifiers

pubmed: 38839872
doi: 10.1038/s41598-024-63629-0
pii: 10.1038/s41598-024-63629-0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

12956

Grants

Agency: Research Council of Finland
ID: 339418
Agency: Research Council of Finland
ID: 276239
Agency: Research Council of Finland
ID: 323773
Agency: Strategic Research Council
ID: 352648

Copyright information

© 2024. The Author(s).

Authors

Maria Psyridou (M)

Department of Psychology, University of Jyväskylä, 40014, Jyväskylä, Finland. maria.m.psyridou@jyu.fi.

Fabi Prezja (F)

Faculty of Information Technology, University of Jyväskylä, 40014, Jyväskylä, Finland.

Minna Torppa (M)

Department of Teacher Education, University of Jyväskylä, 40014, Jyväskylä, Finland.

Marja-Kristiina Lerkkanen (MK)

Department of Teacher Education, University of Jyväskylä, 40014, Jyväskylä, Finland.

Anna-Maija Poikkeus (AM)

Department of Teacher Education, University of Jyväskylä, 40014, Jyväskylä, Finland.

Kati Vasalampi (K)

Department of Education, University of Jyväskylä, 40014, Jyväskylä, Finland.
