Personalized pain management: The relationship between clinical relevance and reliability of measurements.

Humans; Clinical Relevance; Pain; Reproducibility of Results

Journal

European journal of pain (London, England)

ISSN: 1532-2149

Abbreviated title: Eur J Pain

Country: England

NLM ID: 9801774

Publication information

Publication date:
10 2023

History:

revised: 05 03 2023

received: 11 01 2023

accepted: 08 03 2023

medline: 5 9 2023

pubmed: 24 3 2023

entrez: 23 3 2023

Status: ppublish

Abstract

Reliability is a topic in health science in which critical appraisal of the magnitudes of the measurements is often set aside in favour of a formulaic analysis. Furthermore, the relationship between clinical relevance and the reliability of measurements is often overlooked. In this context, the aim of the present article is to provide an overview of the design and analysis of reliability studies, the interpretation of the reliability of measurements, and its relationship to clinical significance in the context of pain research and management. The article is divided into two sections. The first contains a step-by-step guide with simple and straightforward recommendations for the design and analysis of a reliability study, with a relevant example involving a commonly used assessment measure in pain research. The second provides deeper insight into the interpretation of the results of a reliability study and the association between the reliability of measurements and their experimental and clinical relevance.

SIGNIFICANCE: Reliability studies quantify the measurement error in experimental or clinical setups and should be interpreted as a continuous outcome. The assessment of measurement error is useful for designing and interpreting future experimental studies and clinical interventions. Reliability and clinical relevance are inextricably linked, as measurement error should be considered in the interpretation of minimal detectable change and minimal clinically important differences.
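As an illustration of how measurement error feeds into the minimal detectable change, the sketch below computes a few common test-retest summaries from paired measurements. It is not the authors' analysis: the function names and the example values are hypothetical, and it assumes the widely used textbook definitions (standard error of measurement as the SD of the test-retest differences divided by the square root of 2, a 95% minimal detectable change of 1.96 times the square root of 2 times that standard error, and Bland-Altman 95% limits of agreement as bias plus or minus 1.96 SD of the differences).

```python
import math
import statistics


def reliability_summary(test, retest):
    """Summarise test-retest measurement error for paired measurements.

    Assumes one test and one retest value per participant, in the same units.
    """
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need at least two paired measurements")

    # Session-to-session difference for each participant.
    diffs = [second - first for first, second in zip(test, retest)]

    bias = statistics.mean(diffs)        # systematic change between sessions
    sd_diff = statistics.stdev(diffs)    # sample SD of the differences

    # Standard error of measurement (within-subject SD) for two sessions.
    sem = sd_diff / math.sqrt(2)

    # Minimal detectable change at the 95% level.
    mdc95 = 1.96 * math.sqrt(2) * sem

    # Bland-Altman 95% limits of agreement.
    limits = (bias - 1.96 * sd_diff, bias + 1.96 * sd_diff)

    return {"bias": bias, "sd_diff": sd_diff, "sem": sem,
            "mdc95": mdc95, "limits_of_agreement": limits}


def change_exceeds_mdc(change, mdc95):
    """Return True when an observed individual change is larger than the MDC."""
    return abs(change) > mdc95


# Hypothetical example values (invented for illustration, arbitrary units).
session_1 = [10, 12, 14, 16]
session_2 = [11, 11, 15, 17]

summary = reliability_summary(session_1, session_2)
for name, value in summary.items():
    print(name, value)
```

A change smaller than the minimal detectable change cannot be distinguished from measurement error, so a clinically important difference below that value would not be reliably detectable in an individual with this measure.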

Identifiers

DOI: 10.1002/ejp.2110 PMID: 36951044

Publication types

Journal Article Review

Languages

eng

Pagination

1056-1064

Authors

Christian Ariel Mista (CA)

Leonardo Intelangelo (L)

José Biurrun Manresa (J)
