Development of Reliable and Valid Negative Mood Screening Tools for Orthopaedic Patients with Musculoskeletal Pain.
Journal
Clinical orthopaedics and related research
ISSN: 1528-1132
Abbreviated title: Clin Orthop Relat Res
Country: United States
NLM ID: 0075674
Publication information
Publication date: 2022-02-01
History:
received: 2021-05-31
accepted: 2021-11-10
pubmed: 2021-12-09
medline: 2022-02-15
entrez: 2021-12-08
Status: ppublish
Abstract
Abstract sections
BACKGROUND
Negative mood is an important risk factor for poor clinical outcomes among individuals with musculoskeletal pain. Screening for negative mood can aid in identifying those who may need additional psychological interventions. Limitations of current negative mood screening tools include (1) high response burden, (2) a focus on single dimensions of negative mood, (3) poor precision for identifying individuals with low or high negative mood levels, and/or (4) design not specific for use in populations with orthopaedic conditions and musculoskeletal pain.
QUESTIONS/PURPOSES
(1) Can item response theory methods be used to construct screening tools for negative mood (such as depression, anxiety, and anger) in patients undergoing physical therapy for orthopaedic conditions? (2) Do these tools demonstrate reliability and construct validity when used in a clinical setting?
METHODS
This was a cross-sectional study of outpatients receiving physical therapy in tertiary-care settings. A total of 431 outpatients with neck (n = 93), shoulder (n = 108), low back (n = 119), or knee (n = 111) conditions were enrolled between December 2014 and December 2015; 24% (103 of 431) were seeking care after orthopaedic surgery. Participants completed three validated psychological questionnaires measuring negative mood, yielding 39 candidate items for item response theory analysis. Factor analysis was used to identify the dimensions (factors) assessed by the candidate items and to select items that loaded on the main factor of interest (negative mood), establishing a unidimensional item set. Unidimensionality of an item set suggests that the items assess one main factor or trait, allowing unbiased score estimates. The identified items were assessed for their fit to the graded item response theory model. This model allows items to vary in the level of difficulty they represent and in their ability to discriminate between patients at different levels of the trait being assessed (here, negative mood). Finally, a hierarchical bifactor model, in which multiple subfactors are allowed to load on an overall factor, was used to confirm that the items identified as a unidimensional set explained the large majority of the variance of the overall factor, providing additional support for essential unidimensionality. Using the final item bank, we constructed a computer adaptive test administration mode and selected reduced item sets to create short forms comprising the items with the highest information (reliability) at targeted score levels of the trait being measured, while also considering clinical content.
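The graded response model and the information-based item selection described above can be illustrated with a minimal Python sketch. It assumes the standard logistic form of Samejima's graded response model; the item parameters, the `select_next_item` helper, and all function names are hypothetical and are not taken from the study or its software.

```python
import numpy as np


def _cumulative_probs(theta, a, b):
    """P(X >= k) for k = 0..K, padded with 1 at the bottom and 0 at the top."""
    b = np.asarray(b, dtype=float)
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return np.concatenate(([1.0], p_star, [0.0]))


def grm_category_probs(theta, a, b):
    """Category probabilities for one item under the graded response model.

    theta: latent trait level (here, negative mood)
    a:     item discrimination
    b:     increasing category thresholds ("difficulty"), length K - 1
    """
    cum = _cumulative_probs(theta, a, b)
    return cum[:-1] - cum[1:]


def grm_item_information(theta, a, b):
    """Fisher information of one item at theta: sum over k of P_k'^2 / P_k."""
    cum = _cumulative_probs(theta, a, b)
    probs = cum[:-1] - cum[1:]
    # Derivative of each cumulative curve with respect to theta.
    d_cum = a * cum * (1.0 - cum)
    d_probs = d_cum[:-1] - d_cum[1:]
    return float(np.sum(d_probs ** 2 / probs))


def select_next_item(theta_hat, items, administered):
    """Pick the unadministered item with maximum information at theta_hat.

    items is a list of (a, b) tuples; administered is a set of item indices.
    """
    remaining = [i for i in range(len(items)) if i not in administered]
    return max(remaining,
               key=lambda i: grm_item_information(theta_hat, *items[i]))


# Hypothetical item parameters, for illustration only.
example_items = [
    (1.0, [-1.0, 0.0, 1.0]),
    (2.5, [-0.5, 0.5, 1.5]),
    (0.8, [0.0, 1.0, 2.0]),
]

if __name__ == "__main__":
    print(grm_category_probs(0.5, *example_items[1]))
    print(select_next_item(0.5, example_items, administered=set()))
```

Choosing the unadministered item with the greatest information at the current trait estimate is one common adaptive testing rule; the abstract does not state which selection or stopping rules the authors used.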
RESULTS
We identified a 12-item bank for assessment of negative mood; eight-item and four-item short-form versions were developed to reduce administrative burden. Computer adaptive test administration used a mean ± SD of 8 ± 1 items. The item bank's reliability (0 = no reliability; 1 = perfect reliability) was 0.89 for the computer adaptive test administration, 0.86 for the eight-item short form, and 0.71 for the four-item short form. Reliability values equal to or greater than 0.7 are considered acceptable for group-level measures. Construct validity sufficient for clinical practice was supported by more severe negative mood scores among individuals with a previous episode of pain in the involved anatomical region, pain and activity limitations during the past 3 months, a work-related injury, education less than a college degree, and income less than or equal to USD 50,000.
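To give a sense of scale for these reliability values, the sketch below converts reliability to a standard error of measurement and to test information. It assumes the common item response theory convention that the latent trait is scaled to unit variance, so that reliability = 1 - SE^2 and information = 1 / SE^2. The abstract does not say how reliability was computed, so the function names and the conversion itself are illustrative assumptions only.

```python
import math


def se_from_reliability(reliability, trait_variance=1.0):
    """Standard error of measurement implied by a reliability coefficient.

    Assumes reliability = 1 - SE^2 / trait_variance (an assumption, not a
    method stated in the source).
    """
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be in [0, 1)")
    return math.sqrt(trait_variance * (1.0 - reliability))


def information_from_reliability(reliability, trait_variance=1.0):
    """Test information implied by a reliability coefficient (1 / SE^2)."""
    se = se_from_reliability(reliability, trait_variance)
    return 1.0 / se ** 2


if __name__ == "__main__":
    # Reliability values reported in the abstract, plus the 0.7 threshold.
    for label, rel in [("computer adaptive test", 0.89),
                       ("eight-item short form", 0.86),
                       ("four-item short form", 0.71),
                       ("group-level threshold", 0.70)]:
        print(f"{label}: SE = {se_from_reliability(rel):.3f}, "
              f"information = {information_from_reliability(rel):.2f}")
```

Under this assumption, higher reliability corresponds to a smaller standard error around each patient's score, which is why the four-item form trades precision for a lower response burden.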
CONCLUSION
These newly derived tools include short-form and computer adaptive test options for reliable and valid negative mood assessment in outpatient orthopaedic populations. Future research should determine the responsiveness of these measures to change and establish score thresholds for clinical decision-making.
CLINICAL RELEVANCE
Orthopaedic providers can use these tools to inform prognosis, establish clinical benchmarks, and identify patients who may benefit from psychological and/or behavioral treatments.
Identifiers
pubmed: 34878414
doi: 10.1097/CORR.0000000000002082
pii: 00003086-202202000-00018
pmc: PMC8747611
Publication types
Journal Article
Validation Study
Languages
eng
Citation subsets
IM
Pagination
313-324
Copyright information
Copyright © 2021 by the Association of Bone and Joint Surgeons.
Conflict of interest statement
All ICMJE Conflict of Interest Forms for authors and Clinical Orthopaedics and Related Research® editors and board members are on file with the publication and can be viewed on request.