Parsimonious item response theory modeling with the negative log-log link: The role of inflection point shift.

Keywords: Asymmetric model; Generalized linear models; Inflection point shift; Item response theory; Measurement; Model complexity; Psychometrics; Upper asymptote parameter

Journal

Behavior research methods

ISSN: 1554-3528

Abbreviated title: Behav Res Methods

Country: United States

NLM ID: 101244316

Publication information

Publication date:
03 Aug 2023

History:

accepted: 30 Jun 2023

medline: 04 Aug 2023

pubmed: 04 Aug 2023

entrez: 03 Aug 2023

Status: ahead of print

Abstract

In item response theory (IRT) modeling, the magnitude of the lower and upper asymptote parameters determines the degree to which the inflection point shifts above or below P = 0.50. The current study examines the one-parameter negative log-log model (NLLM), which is characterized by a downward shift in the inflection point, among other distinctive psychometric properties. After detailing the statistical foundations of the NLLM, we present a series of simulation studies to establish item and person parameter estimation accuracy and to demonstrate that this parsimonious model addresses the "slipping" effect (i.e., unexpectedly incorrect answers) via an inflection point < 0.50 rather than through computationally difficult estimation of the upper asymptote. We then provide further support for these simulation results through empirical data analysis. Finally, we discuss how the NLLM contributes to recent methodological literature on the utility of asymmetric IRT models.
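The inflection point shift described above can be illustrated numerically. The sketch below is not the authors' code: it assumes the standard negative log-log inverse link, P = exp(-exp(-(θ - b))), as the form of the one-parameter NLLM (the abstract does not state the parameterization), and it compares that curve with a four-parameter logistic curve whose upper asymptote d is set to a hypothetical 0.9. The function names and parameter values are illustrative only. Under the assumed form, the NLLM curve is steepest at θ = b, where its height is 1/e ≈ 0.368, while the 4PL curve is steepest at height (c + d)/2.

```python
import math


def nllm_prob(theta, b):
    """Probability of a correct response under a negative log-log link.

    Assumed parameterization (not taken from the paper):
    P = exp(-exp(-(theta - b))).
    """
    return math.exp(-math.exp(-(theta - b)))


def four_pl_prob(theta, a, b, c, d):
    """Four-parameter logistic curve with lower asymptote c and upper asymptote d."""
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))


def inflection_height(curve, lo=-6.0, hi=6.0, steps=12001, h=1e-5):
    """Return the curve's height at its steepest point on a grid over [lo, hi]."""
    best_slope = float("-inf")
    best_height = None
    for i in range(steps):
        theta = lo + (hi - lo) * i / (steps - 1)
        # Central-difference approximation of the slope at theta.
        slope = (curve(theta + h) - curve(theta - h)) / (2.0 * h)
        if slope > best_slope:
            best_slope = slope
            best_height = curve(theta)
    return best_height


# Illustrative values only: b = 0 for both curves, d = 0.9 for the 4PL curve.
nllm_height = inflection_height(lambda t: nllm_prob(t, b=0.0))
fpl_height = inflection_height(
    lambda t: four_pl_prob(t, a=1.0, b=0.0, c=0.0, d=0.9)
)
print(f"NLLM inflection height: {nllm_height:.3f}")
print(f"4PL (d = 0.9) inflection height: {fpl_height:.3f}")
```

Both curves place the inflection point below P = 0.50, but the NLLM does so with a single item parameter, whereas the 4PL curve needs an explicitly specified or estimated upper asymptote.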

Identifiers

DOI: 10.3758/s13428-023-02189-z

PMID: 37537489

pii: 10.3758/s13428-023-02189-z

Publication types

Journal Article

Languages

eng

Authors

Hyejin Shim (H)

Wes Bonifay (W)

Wolfgang Wiedermann (W)
