Ensuring cross-cultural data comparability by means of anchoring vignettes in heterogeneous refugee samples.

Keywords: Anchoring vignettes; Comparative statistics; Health system responsiveness; Measurement invariance; Refugees; Response category related differential item functioning; Sample heterogeneity

Journal

BMC Medical Research Methodology
ISSN: 1471-2288
Abbreviated title: BMC Med Res Methodol
Country: England
NLM ID: 100968545

Publication information

Publication date:
2023-09-28
History:
received: 2022-05-11
accepted: 2023-08-08
medline: 2023-09-29
pubmed: 2023-09-28
entrez: 2023-09-27
Status: epublish

Abstract


BACKGROUND
Configural, metric, and scalar measurement invariance are treated as indicators that statistical cross-group comparisons are free of bias, although they are difficult to verify in data. Low comparability of translated questionnaires, or respondents' differing understanding of response formats, might lead to rejection of measurement invariance and point to comparability bias in multi-language surveys. Anchoring vignettes have been proposed as a method to control for respondents' differing understanding of response categories (referred to as differential item functioning related to response categories or rating scales: RC-DIF). We examine whether the cross-cultural comparability of data can be assured by means of anchoring vignettes or, as an alternative approach, by considering socio-demographic heterogeneity.
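For orientation, the three invariance levels named above can be written as nested constraints on a multiple-group factor model. This is the standard textbook formulation, not notation taken from the article; the symbols (intercepts \tau_g, loadings \Lambda_g, latent factor \xi_{ig}, residual \varepsilon_{ig}, group index g for the survey language) are assumptions introduced here for illustration.

```latex
\begin{align}
  x_{ig} &= \tau_g + \Lambda_g \, \xi_{ig} + \varepsilon_{ig}
    && \text{(model for respondent $i$ in group $g$)} \\
  \text{configural:}\quad & \text{same pattern of free and fixed loadings in every group} \\
  \text{metric:}\quad & \Lambda_g = \Lambda \quad \text{for all } g \\
  \text{scalar:}\quad & \Lambda_g = \Lambda \ \text{ and } \ \tau_g = \tau \quad \text{for all } g
\end{align}
```

With ordered categorical indicators, item thresholds typically take the place of the intercepts in the scalar constraint.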
METHODS
We used the Health System Responsiveness (HSR) questionnaire and collected survey data in English (n = 183) and Arabic (n = 121) in a random sample of refugees in the third-largest German federal state. We conducted multiple-group confirmatory factor analyses (MGCFA) to analyse measurement invariance and compared the results when 1) using data rescaled on the basis of anchoring vignettes (non-parametric approach), 2) including information on RC-DIF from the analyses with anchoring vignettes as covariates (parametric approach), and 3) including socio-demographic covariates.
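The non-parametric rescaling in approach 1 can be illustrated with a minimal Python sketch. It is not the authors' code: it follows one common convention for vignette-relative recoding (the self-rating is located relative to the respondent's own vignette ratings on a scale from 1 to 2J + 1 for J vignettes, with ties or order violations yielding a range instead of a single value). The function name `vignette_relative_rank` is hypothetical, and the sketch assumes that higher codes mean better ratings and that vignettes are listed from the worst to the best scenario.

```python
from typing import Sequence, Tuple


def vignette_relative_rank(y: int, z: Sequence[int]) -> Tuple[int, int]:
    """Locate self-rating y relative to the respondent's own vignette ratings z.

    Assumptions (illustrative only): higher codes are better ratings and
    z is ordered from the worst to the best vignette scenario.
    Returns (low, high) on the 1..2J+1 scale; low == high when the
    vignette ratings are untied and consistently ordered.
    """
    if not z:
        raise ValueError("at least one vignette rating is required")
    n_vignettes = len(z)
    hits = []
    if y < z[0]:                       # below the worst vignette
        hits.append(1)
    for j in range(n_vignettes):
        if y == z[j]:                  # tied with vignette j+1
            hits.append(2 * (j + 1))
        if j + 1 < n_vignettes and z[j] < y < z[j + 1]:
            hits.append(2 * (j + 1) + 1)   # strictly between two vignettes
    if y > z[-1]:                      # above the best vignette
        hits.append(2 * n_vignettes + 1)
    return min(hits), max(hits)


# Two respondents with different raw self-ratings (3 vs. 4) who use the
# response scale differently end up at the same vignette-relative position.
respondent_a = vignette_relative_rank(3, [2, 4])
respondent_b = vignette_relative_rank(4, [3, 5])
print(respondent_a, respondent_b)
```

Tied or inconsistently ordered vignette ratings produce a range instead of a point value, which is why an analysis built on such rescaled data has to decide how to handle interval-valued cases.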
RESULTS
For the HSR, every level of measurement invariance between the Arabic and English language versions was rejected. Rescaling or modelling on the basis of anchoring vignettes gave better results than the initial MGCFA analysis: configural, metric, and (for ordered categorical analyses) scalar invariance could not be rejected. Considering socio-demographic variables did not yield such an improvement.
CONCLUSIONS
Survey researchers may consider anchoring vignettes as a method to assess the cross-cultural comparability of data, whereas socio-demographic variables cannot be used as a standalone method to improve data comparability. More research on the efficient implementation of anchoring vignettes, and further development of methods to incorporate them when modelling measurement invariance, are needed.

Identifiers

pubmed: 37759183
doi: 10.1186/s12874-023-02015-2
pii: 10.1186/s12874-023-02015-2
pmc: PMC10536699

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

213

Copyright information

© 2023. BioMed Central Ltd., part of Springer Nature.

Authors

Natalja Menold (N)

Dept. of Methods in Empirical Social Research, Technische Universität Dresden, Dresden, Germany. Natalja.Menold@tu-dresden.de.

Louise Biddle (L)

Charité - Universitätsmedizin Berlin, Institute of International Health, Berlin, Germany.

Hagen von Hermanni (H)

Dept. of Methods in Empirical Social Research, Technische Universität Dresden, Dresden, Germany.

Jasmin Kadel (J)

Dept. of Methods in Empirical Social Research, Technische Universität Dresden, Dresden, Germany.

Kayvan Bozorgmehr (K)

University Hospital Heidelberg, Section for Health Equity Studies & Migration, Heidelberg, Germany.
Dept. of Population Medicine and Health Services Research, Bielefeld University, Bielefeld, Germany.
