Natural frequency tree- versus conditional probability formula-based training for medical students' estimation of screening test predictive values: a randomized controlled trial.
Conditional probability
Medical students
Natural frequency
Predictive value of screening tests
Probabilistic reasoning
Journal
BMC Medical Education
ISSN: 1472-6920
Abbreviated title: BMC Med Educ
Country: England
NLM ID: 101088679
Publication information
Publication date: 24 Oct 2024
History:
received: 01 Feb 2024
accepted: 16 Oct 2024
medline: 25 Oct 2024
pubmed: 25 Oct 2024
entrez: 25 Oct 2024
Status: epublish
Abstract
BACKGROUND
Medical students and professionals often struggle to understand medical test results, which can lead to poor medical decisions. Natural frequency tree-based training (NF-TT) has been suggested to help people correctly estimate the predictive value of medical tests. We aimed to compare the effectiveness of NF-TT with conventional conditional probability formula-based training (CP-FT) and investigate student variables that may influence NF-TT's effectiveness.
METHODS
We conducted a parallel-group randomized controlled trial of NF-TT vs. CP-FT at two medical schools in South Korea (1:1 allocation ratio). Participants were randomly assigned to watch either the NF-TT or the CP-FT video at individual computer stations. The NF-TT video showed how to translate relevant probabilistic information into natural frequencies using a tree structure to estimate the predictive values of screening tests. The CP-FT video showed how to plug the same information into a mathematical formula to calculate predictive values. Both videos were 15 min long. The primary outcome was accuracy in estimating the predictive value of screening tests, assessed using multiple-choice questions at baseline, post-intervention (i.e., immediately after training), and one-month follow-up. The secondary outcome was the accuracy of conditional probabilistic reasoning in non-medical contexts, also assessed using multiple-choice questions, but only at follow-up, as a measure of transfer of learning. A total of 231 medical students completed their participation.
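The two training approaches described above can be contrasted with a small worked example. The sketch below is illustrative only: the prevalence, sensitivity, and specificity values and the function names are assumptions chosen for this example, not figures or materials from the trial. It computes the positive predictive value of a hypothetical screening test once with the conditional probability formula (Bayes' theorem) and once by counting people along a natural frequency tree.

```python
from fractions import Fraction

# Hypothetical screening test (assumed values, not taken from the trial).
prevalence = Fraction(1, 100)    # 1% of people have the disease
sensitivity = Fraction(90, 100)  # 90% of diseased people test positive
specificity = Fraction(91, 100)  # 91% of healthy people test negative


def ppv_formula(prevalence, sensitivity, specificity):
    """Conditional probability formula: P(disease | positive test)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def ppv_frequency_tree(population, prevalence, sensitivity, specificity):
    """Natural frequency tree: split a concrete population into counts."""
    diseased = population * prevalence        # first branch: disease status
    healthy = population - diseased
    true_pos = diseased * sensitivity         # second branch: test result
    false_pos = healthy * (1 - specificity)
    return true_pos, false_pos, true_pos / (true_pos + false_pos)


formula_ppv = ppv_formula(prevalence, sensitivity, specificity)
tp, fp, tree_ppv = ppv_frequency_tree(10_000, prevalence, sensitivity, specificity)

print(f"Formula: PPV = {float(formula_ppv):.1%}")
print(f"Tree: {tp} true positives out of {tp + fp} positive tests, "
      f"PPV = {float(tree_ppv):.1%}")
```

Both routes give the same answer (about 9.2% under these assumed values). The tree version makes the base rate visible as counts: of 10,000 people screened, 90 true positives are set against 891 false positives.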
RESULTS
Overall, NF-TT was not more effective than CP-FT in improving predictive value estimation accuracy at post-intervention (NF-TT: 87.13%, CP-FT: 86.03%, p = .86) or at follow-up (NF-TT: 72.39%, CP-FT: 68.10%, p = .40), or in facilitating transfer of learning (NF-TT: 75.54%, CP-FT: 71.43%, p = .41). However, among participants without relevant prior training, NF-TT was more effective than CP-FT in improving estimation accuracy at follow-up (NF-TT: 74.86%, CP-FT: 58.71%, p = .02) and in facilitating transfer of learning (NF-TT: 82.86%, CP-FT: 66.13%, p = .04).
CONCLUSIONS
Introducing NF-TT early in the medical school curriculum, before students are exposed to a pervasive conditional probability formula-based approach, would offer the greatest benefit.
TRIAL REGISTRATION
Korea Disease Control and Prevention Agency Clinical Research Information Service, KCT0004246 (first registered 27/08/2019). The full trial protocol is available at: https://cris.nih.go.kr/cris/search/detailSearch.do?seq=15616&search_page=L
Identifiers
pubmed: 39449124
doi: 10.1186/s12909-024-06209-0
pii: 10.1186/s12909-024-06209-0
Publication types
Journal Article
Randomized Controlled Trial
Comparative Study
Languages
eng
Citation subsets
IM
Pagination
1207
Grants
Agency: National Research Foundation of Korea
ID: 0411-20180026
Copyright information
© 2024. The Author(s).