Combining Clickstream Analyses and Graph-Modeled Data Clustering for Identifying Common Response Processes.

Keywords: action sequences; cluster editing; complex problem solving; response times

Journal

Psychometrika
ISSN: 1860-0980
Abbreviated title: Psychometrika
Country: United States
NLM ID: 0376503

Publication information

Publication date: March 2021
History:
received: 3 April 2020
revised: 10 December 2020
accepted: 15 December 2020
pubmed: 6 February 2021
medline: 18 September 2021
entrez: 5 February 2021
Status: ppublish

Abstract

Complex interactive test items are becoming more widely used in assessments. Being computer-administered, assessments using interactive items allow logging time-stamped action sequences. These sequences constitute a rich source of information that may facilitate investigating how examinees approach an item and arrive at their given response. There is a rich body of research leveraging action sequence data for investigating examinees' behavior. However, the associated timing data have been considered mainly at the item level, if at all. Considering timing data at the action level in addition to action sequences, however, has vast potential to support a more fine-grained assessment of examinees' behavior. We provide an approach that jointly considers action sequences and action-level times for identifying common response processes. In doing so, we integrate tools from clickstream analyses and graph-modeled data clustering with psychometrics. In our approach, we (a) provide similarity measures that are based on both actions and the associated action-level timing data and (b) subsequently employ cluster edge deletion for identifying homogeneous, interpretable, well-separated groups of action patterns, each describing a common response process. Guidelines on how to apply the approach are provided. The approach and its utility are illustrated on a complex problem-solving item from PIAAC 2012.
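The two ingredients of the abstract — a similarity measure based on actions and action-level times, and a cluster-editing step on the resulting similarity graph — can be sketched in miniature as follows. This is a hedged toy illustration, not the authors' implementation: the time-weighting scheme (`1 / (1 + |Δt|)`), the function names, and the brute-force partition search are all assumptions chosen for brevity; the paper's actual measures and its cluster edge deletion formulation differ in detail and scale.

```python
# Toy sketch (assumptions, not the paper's exact method): a time-weighted
# longest-common-subsequence similarity between action sequences, and a
# brute-force cluster editing step that finds the partition of nodes whose
# induced cluster graph disagrees with the given graph on the fewest edges.
from itertools import combinations

def lcs_time_weighted(seq_a, seq_b):
    """Similarity in [0, 1] between two sequences of (action, time) pairs.
    Matched actions contribute a weight near 1 when their action-level
    times agree and less when they diverge (hypothetical weighting)."""
    n, m = len(seq_a), len(seq_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            (act_a, t_a), (act_b, t_b) = seq_a[i - 1], seq_b[j - 1]
            best = max(dp[i - 1][j], dp[i][j - 1])
            if act_a == act_b:
                w = 1.0 / (1.0 + abs(t_a - t_b))  # identical times -> weight 1
                best = max(best, dp[i - 1][j - 1] + w)
            dp[i][j] = best
    return dp[n][m] / max(n, m)  # normalize by the longer sequence

def cluster_editing(nodes, edges):
    """Exact cluster editing by exhaustive search: return the partition of
    `nodes` minimizing the number of edge insertions/deletions needed to
    turn `edges` into the corresponding disjoint union of cliques.
    Exponential in |nodes|; suitable only for toy instances."""
    def partitions(items):
        if not items:
            yield []
            return
        first, rest = items[0], items[1:]
        for part in partitions(rest):
            for i, block in enumerate(part):
                yield part[:i] + [block + [first]] + part[i + 1:]
            yield [[first]] + part
    best, best_cost = None, float("inf")
    for part in partitions(list(nodes)):
        block_of = {v: i for i, block in enumerate(part) for v in block}
        # Count pairs where graph adjacency and cluster membership disagree.
        cost = sum(
            1 for u, v in combinations(nodes, 2)
            if (frozenset((u, v)) in edges) != (block_of[u] == block_of[v])
        )
        if cost < best_cost:
            best, best_cost = part, cost
    return best, best_cost
```

In use, one would compute pairwise similarities between examinees' action patterns, connect pairs whose similarity exceeds some threshold, and read each block of the resulting cluster-editing partition as one common response process; the threshold choice and any relaxations (e.g., restricting to edge deletions only, as in cluster edge deletion) are where the paper's actual guidelines come in.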

Identifiers

pubmed: 33544300
doi: 10.1007/s11336-020-09743-0
pii: 10.1007/s11336-020-09743-0
pmc: PMC8035117

Publication types

Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

IM

Pagination

190-214

References

Albert, D., & Steinberg, L. (2011). Age differences in strategic planning as indexed by the Tower of London. Child Development, 82(5), 1501–1517. https://doi.org/10.1111/j.1467-8624.2011.01613.x .
doi: 10.1111/j.1467-8624.2011.01613.x pubmed: 21679178
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education and Joint Committee on Standards for Educational and Psychological Testing. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
Balart, P., Oosterveen, M., & Webbink, D. (2018). Test scores, noncognitive skills and economic growth. Economics of Education Review, 63, 134–153. https://doi.org/10.1016/j.econedurev.2017.12.004 .
doi: 10.1016/j.econedurev.2017.12.004
Banerjee, A., & Ghosh, J. (2001). Clickstream clustering using weighted longest common subsequences. In Proceedings of the web mining workshop at the first SIAM conference on data mining (pp. 33–40).
Bansal, N., Blum, A., & Chawla, S. (2004). Correlation clustering. Machine Learning, 56(1–3), 89–113. https://doi.org/10.1023/B:MACH.0000033116.57574.95 .
doi: 10.1023/B:MACH.0000033116.57574.95
Berkelaar, M. et al. (2020). lp_solve (Version 5.5). http://lpsolve.sourceforge.net/5.5/ .
Böcker, S., & Baumbach, J. (2013). Cluster editing. In The nature of computation. logic, algorithms, applications—9th conference on computability in Europe (CiE 2013) (pp. 33–44). Berlin: Springer. https://doi.org/10.1007/978-3-642-39053-1_5 .
Chen, J., Molter, H., Sorge, M., & Suchy, O. (2018). Cluster editing in multi-layer and temporal graphs. In 29th international symposium on algorithms and computation (ISAAC 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik.
Chen, Y., Li, X., Liu, J., & Ying, Z. (2019). Statistical analysis of complex problem-solving process data: An event history analysis approach. Frontiers in Psychology. https://doi.org/10.3389/fpsyg.2019.00486 .
De Boeck, P., & Jeon, M. (2019). An overview of models for response times and processes in cognitive tests. Frontiers in Psychology. https://doi.org/10.3389/fpsyg.2019.00102 .
De Boeck, P., & Scalise, K. (2019). Collaborative problem solving: Processing actions, time, and performance. Frontiers in Psychology, 10, 1280. https://doi.org/10.3389/fpsyg.2019.01280 .
doi: 10.3389/fpsyg.2019.01280 pubmed: 31231281 pmcid: 6566913
Eichmann, B., Goldhammer, F., Greiff, S., Pucite, L., & Naumann, J. (2019). The role of planning in complex problem solving. Computers & Education, 128, 1–12. https://doi.org/10.1016/j.compedu.2018.08.004 .
doi: 10.1016/j.compedu.2018.08.004
Eichmann, B., Greiff, S., Naumann, J., Brandhuber, L., & Goldhammer, F. (2020). Exploring behavioural patterns during complex problem-solving. Journal of Computer Assisted Learning, 36(6), 933–956. https://doi.org/10.1111/jcal.12451 .
doi: 10.1111/jcal.12451
Fox, J.-P., & Marianti, S. (2016). Joint modeling of ability and differential speed using responses and response times. Multivariate Behavioral Research, 51(4), 540–553. https://doi.org/10.1080/00273171.2016.1171128 .
doi: 10.1080/00273171.2016.1171128 pubmed: 27269482
Goldhammer, F., Naumann, J., & Keßel, Y. (2013). Assessing individual differences in basic computer skills. European Journal of Psychological Assessment, 29(4), 263–275. https://doi.org/10.1027/1015-5759/a000153 .
doi: 10.1027/1015-5759/a000153
Goldhammer, F., Naumann, J., Stelter, A., Tóth, K., Rölke, H., & Klieme, E. (2014). The time on task effect in reading and problem solving is moderated by task difficulty and skill: Insights from a computer-based large-scale assessment. Journal of Educational Psychology, 106(3), 608–626. https://doi.org/10.1037/a0034716 .
doi: 10.1037/a0034716
Greiff, S., Niepel, C., Scherer, R., & Martin, R. (2016). Understanding students’ performance in a computer-based assessment of complex problem solving: An analysis of behavioral data from computer-generated log files. Computers in Human Behavior, 61, 36–46. https://doi.org/10.1016/j.chb.2016.02.095 .
doi: 10.1016/j.chb.2016.02.095
Greiff, S., Wüstenberg, S., & Avvisati, F. (2015). Computer-generated log-file analyses as a window into students’ minds? A showcase study based on the PISA 2012 assessment of problem solving. Computers & Education, 91, 92–105. https://doi.org/10.1016/j.compedu.2015.10.018 .
doi: 10.1016/j.compedu.2015.10.018
Grötschel, M., & Wakabayashi, Y. (1989). A cutting plane algorithm for a clustering problem. Mathematical Programming, 45(1–3), 59–96. https://doi.org/10.1007/BF01589097 .
doi: 10.1007/BF01589097
Guo, J., Komusiewicz, C., Niedermeier, R., & Uhlmann, J. (2010). A more relaxed model for graph-based data clustering: s-plex cluster editing. SIAM Journal on Discrete Mathematics, 24(4), 1662–1683. https://doi.org/10.1137/090767285 .
Gurobi Optimization, LLC. (2019). Gurobi optimizer (Version 9.0). https://www.gurobi.com .
Hao, J., Shu, Z., & von Davier, A. (2015). Analyzing process data from game/scenario-based tasks: An edit distance approach. Journal of Educational Data Mining, 7(1), 33–50.
Hartung, S., & Hoos, H. H. (2015). Programming by optimisation meets parameterised algorithmics: A case study for cluster editing. In International conference on learning and intelligent optimization (pp. 43–58). Berlin: Springer.
He, Q., Borgonovi, F., & Paccagnella, M. (2019). Using process data to understand adults’ problem-solving behaviour in the programme for the international assessment of adult competencies (PIAAC): Identifying generalised patterns across multiple tasks with sequence mining. OECD Education Working Papers, No. 205, OECD Publishing, Paris, France. https://doi.org/10.1787/650918f2-en .
He, Q., Liao, D., & Jiao, H. (2019). Clustering behavioral patterns using process data in PIAAC problem-solving items. In Theoretical and practical advances in computer-based educational measurement (pp. 189–212). Berlin: Springer.
He, Q., & von Davier, M. (2015). Identifying feature sequences from process data in problem-solving items with n-grams. In Quantitative psychology research (pp. 173–190). Berlin: Springer.
He, Q., & von Davier, M. (2016). Analyzing process data from problem-solving items with n-grams: Insights from a computer-based large-scale assessment. In A. L. van der Ark, D. M. Bolt, W.-C. Wang, J. A. Douglas, & S.-M. Chow (Eds.), Quantitative psychology research: The 79th annual meeting of the Psychometric Society, Madison, Wisconsin, 2014 (pp. 750–777). New York, NY: Springer.
Hitt, C., Trivitt, J., & Cheng, A. (2016). When you say nothing at all: The predictive power of student effort on surveys. Economics of Education Review, 52, 105–119. https://doi.org/10.1016/j.econedurev.2016.02.001 .
doi: 10.1016/j.econedurev.2016.02.001
IBM Corp. (2016). IBM ILOG CPLEX Optimization Studio: CPLEX user’s manual (Version 12.7). https://www.ibm.com/support/knowledgecenter/SSSA5P_12.7.0/ilog.odms.studio.help/pdf/usrcplex.pdf .
Komusiewicz, C. (2011). Parameterized algorithmics for network analysis: Clustering & querying. Doctoral dissertation, Technische Universität Berlin.
Komusiewicz, C., & Uhlmann, J. (2012). Cluster editing with locally bounded modifications. Discrete Applied Mathematics, 160(15), 2259–2270. https://doi.org/10.1016/j.dam.2012.05.019 .
doi: 10.1016/j.dam.2012.05.019
Krivánek, M., & Morávek, J. (1986). NP-hard problems in hierarchical-tree clustering. Acta Informatica, 23(3), 311–323. https://doi.org/10.1007/BF00289116 .
doi: 10.1007/BF00289116
Kyllonen, P., & Zu, J. (2016). Use of response time for measuring cognitive ability. Journal of Intelligence. https://doi.org/10.3390/jintelligence4040014 .
LaMar, M. M. (2018). Markov decision process measurement model. Psychometrika, 83(1), 67–88. https://doi.org/10.1007/s11336-017-9570-0 .
doi: 10.1007/s11336-017-9570-0 pubmed: 28447309
Liao, D., He, Q., & Jiao, H. (2019). Mapping background variables with sequential patterns in problem-solving environments: An investigation of us adults’ employment status in PIAAC. Frontiers in Psychology, 10, 646. https://doi.org/10.3389/fpsyg.2019.00646 .
doi: 10.3389/fpsyg.2019.00646 pubmed: 30971986 pmcid: 6445889
Livne, O. E., Han, L., Alkorta-Aranburu, G., Wentworth-Sheilds, W., Abney, M., Ober, C., et al. (2015). PRIMAL: Fast and accurate pedigree-based imputation from sequence data in a founder population. PLOS Computational Biology, 11(3), 1–14. https://doi.org/10.1371/journal.pcbi.1004139 .
doi: 10.1371/journal.pcbi.1004139
Molenaar, D., Oberski, D., Vermunt, J., & De Boeck, P. (2016). Hidden Markov item response theory models for responses and response times. Multivariate Behavioral Research, 51(5), 606–626. https://doi.org/10.1080/00273171.2016.1192983 .
doi: 10.1080/00273171.2016.1192983 pubmed: 27712114
Naumann, J., & Goldhammer, F. (2017). Time-on-task effects in digital reading are non-linear and moderated by persons’ skills and tasks’ demands. Learning and Individual Differences, 53, 1–16. https://doi.org/10.1016/j.lindif.2016.10.002 .
doi: 10.1016/j.lindif.2016.10.002
OECD. (2013). Technical report of the survey of adult skills (PIAAC). OECD Publishing. Paris, France. https://www.oecd.org/skills/piaac/_TechnicalReport_17OCT13.pdf .
OECD. (2017). PISA 2015 technical report. OECD Publishing. Paris, France. https://www.oecd.org/pisa/sitedocument/PISA-2015-technical-report-final.pdf
Partchev, I., & De Boeck, P. (2012). Can fast and slow intelligence be differentiated? Intelligence, 40(1), 23–32. https://doi.org/10.1016/j.intell.2011.11.002 .
doi: 10.1016/j.intell.2011.11.002
Python Software Foundation. (2019). Python language reference (Version 3.8.1). https://www.python.org .
Qiao, X., & Jiao, H. (2018). Data mining techniques in analyzing process data: A didactic. Frontiers in Psychology, 9, 2231. https://doi.org/10.3389/fpsyg.2018.02231 .
doi: 10.3389/fpsyg.2018.02231 pubmed: 30532716 pmcid: 6265513
Salles, F., Dos Santos, R., & Keskpaik, S. (2020). When didactics meet data science: Process data analysis in large-scale mathematics assessment in France. Large-scale Assessments in Education, 8(7), 1–20. https://doi.org/10.1186/s40536-020-00085-y .
doi: 10.1186/s40536-020-00085-y
Salvador, S., & Chan, P. (2004). Determining the number of clusters/segments in hierarchical clustering/segmentation algorithms. In 16th IEEE international conference on tools with artificial intelligence (pp. 576–584). IEEE.
Scherer, R., Greiff, S., & Hautamäki, J. (2015). Exploring the relation between time on task and ability in complex problem solving. Intelligence, 48, 37–50. https://doi.org/10.1016/j.intell.2014.10.003 .
doi: 10.1016/j.intell.2014.10.003
Shamir, R., Sharan, R., & Tsur, D. (2004). Cluster graph modification problems. Discrete Applied Mathematics, 144(1–2), 173–182. https://doi.org/10.1016/j.dam.2004.01.007 .
doi: 10.1016/j.dam.2004.01.007
Stadler, M., Fischer, F., & Greiff, S. (2019). Taking a closer look: An exploratory analysis of successful and unsuccessful strategy use in complex problems. Frontiers in Psychology. https://doi.org/10.3389/fpsyg.2019.00777 .
Stelter, A., Goldhammer, F., Naumann, J., & Rölke, H. (2015). Die Automatisierung prozeduralen Wissens: Eine Analyse basierend auf Prozessdaten [The automation of procedural knowledge: An analysis based on process data]. In J. Stiller & C. Laschke (Eds.), (pp. 111–131). Frankfurt: Peter Lang Edition.
Sukkarieh, J. Z., von Davier, M., & Yamamoto, K. (2012). From biology to education: Scoring and clustering multilingual text sequences and other sequential tasks. ETS Research Report Series, 2012(2), i–43.
Tang, X., Wang, Z., He, Q., Liu, J., & Ying, Z. (2020). Latent feature extraction for process data via multidimensional scaling. Psychometrika, 85(2), 378–397. https://doi.org/10.1007/s11336-020-09708-3 .
doi: 10.1007/s11336-020-09708-3 pubmed: 32572672
Tang, X., Wang, Z., Liu, J., & Ying, Z. (2020). An exploratory analysis of the latent structure of process data via action sequence autoencoders. British Journal of Mathematical and Statistical Psychology. https://doi.org/10.1111/bmsp.12203 .
doi: 10.1111/bmsp.12203
Trivedi, S., Pardos, Z., Sárközy, G., & Heffernan, N. (2011). Spectral clustering in educational data mining. In Proceedings of the 4th international conference on educational data mining, 2011 (pp. 129–138).
Tsur, D. (2019). Cluster deletion revisited. arXiv: 1907.08399 .
Ulitzsch, E., von Davier, M., & Pohl, S. (2019). Using response times for joint modeling of response and omission behavior. Multivariate Behavioral Research. https://doi.org/10.1080/00273171.2019.1643699 .
doi: 10.1080/00273171.2019.1643699 pubmed: 31448968
Ulitzsch, E., von Davier, M., & Pohl, S. (2020). A hierarchical latent response model for inferences about examinee engagement in terms of guessing and item-level nonresponse. British Journal of Mathematical and Statistical Psychology, 73, 83–112. https://doi.org/10.1111/bmsp.12188 .
doi: 10.1111/bmsp.12188
van Bevern, R., Froese, V., & Komusiewicz, C. (2018). Parameterizing edge modification problems above lower bounds. Theory of Computing Systems, 62(3), 739–770. https://doi.org/10.1007/s00224-016-9746-5 .
Vista, A., Care, E., & Awwal, N. (2017). Visualising and examining sequential actions as behavioural paths that can be interpreted as markers of complex behaviours. Computers in Human Behavior, 76, 656–671. https://doi.org/10.1016/j.chb.2017.01.027 .
doi: 10.1016/j.chb.2017.01.027
von Davier, A. A., Zhu, M., & Kyllonen, P. C. (2017). Innovative assessment of collaboration. In Introduction: Innovative assessment of collaboration (pp. 1–18). Berlin: Springer.
von Davier, M., Khorramdel, L., He, Q., Shin, H. J., & Chen, H. (2019). Developments in psychometric population models for technology-based large-scale assessments: An overview of challenges and opportunities. Journal of Educational and Behavioral Statistics, 44(6), 671–705. https://doi.org/10.3102/1076998619881789 .
doi: 10.3102/1076998619881789
Wang, C., & Xu, G. (2015). A mixture hierarchical model for response times and response accuracy. British Journal of Mathematical and Statistical Psychology, 68(3), 456–477. https://doi.org/10.1111/bmsp.12054 .
doi: 10.1111/bmsp.12054
Wang, C., Xu, G., Shang, Z., & Kuncel, N. (2018). Detecting aberrant behavior and item preknowledge: A comparison of mixture modeling method and residual method. Journal of Educational and Behavioral Statistics, 43(4), 469–501. https://doi.org/10.3102/1076998618767123 .
doi: 10.3102/1076998618767123
Wang, Z., Tang, X., Liu, J., & Ying, Z. (2020). Subtask analysis of process data through a predictive model. http://scientifichpc.com/processdata/docs/subtask.pdf .
Weeks, J. P., von Davier, M., & Yamamoto, K. (2016). Using response time data to inform the coding of omitted responses. Psychological Test and Assessment Modeling, 58(4), 671–701.
Wise, S. L. (2017). Rapid-guessing behavior: Its identification, interpretation, and implications. Educational Measurement: Issues and Practice, 36(4), 52–61. https://doi.org/10.1111/emip.12165 .
doi: 10.1111/emip.12165
Wittkop, T., Baumbach, J., Lobo, F. P., & Rahmann, S. (2007). Large scale clustering of protein sequences with FORCE: A layout based heuristic for weighted cluster editing. BMC Bioinformatics, 8, 396. https://doi.org/10.1186/1471-2105-8-396 .
doi: 10.1186/1471-2105-8-396 pubmed: 17941985 pmcid: 2147039
Wollack, J. A., & Maynes, D. D. (2017). Detection of test collusion using cluster analysis. In Handbook of quantitative methods for detecting cheating on tests (pp. 124–150). New York, NY: Routledge.
Xu, H., Fang, G., Chen, Y., Liu, J., & Ying, Z. (2018). Latent class analysis of recurrent events in problem-solving items. Applied Psychological Measurement, 42(6), 478–498. https://doi.org/10.1177/0146621617748325 .
doi: 10.1177/0146621617748325 pubmed: 30787489 pmcid: 6373852
Zamarro, G., Cheng, A., Shakeel, M. D., & Hitt, C. (2018). Comparing and validating measures of non-cognitive traits: Performance task measures and self-reports from a nationally representative internet panel. Journal of Behavioral and Experimental Economics, 72, 51–60. https://doi.org/10.1016/j.socec.2017.11.005 .
doi: 10.1016/j.socec.2017.11.005
Zhu, M., Shu, Z., & von Davier, A. A. (2016). Using networks to visualize and analyze process data for educational assessment. Journal of Educational Measurement, 53(2), 190–211. https://doi.org/10.1111/jedm.12107 .
doi: 10.1111/jedm.12107

Authors

Esther Ulitzsch (E)

Educational Measurement, IPN - Leibniz Institute for Science and Mathematics Education, Olshausenstraße 62, 24118, Kiel, Germany. ulitzsch@leibniz-ipn.de.

Qiwei He (Q)

Educational Testing Service, Princeton, USA.

Vincent Ulitzsch (V)

Technische Universität Berlin, Berlin, Germany.

Hendrik Molter (H)

Technische Universität Berlin, Berlin, Germany.

André Nichterlein (A)

Technische Universität Berlin, Berlin, Germany.

Rolf Niedermeier (R)

Technische Universität Berlin, Berlin, Germany.

Steffi Pohl (S)

Freie Universität Berlin, Berlin, Germany.

Similar articles

Humans Middle Aged Female Male Surveys and Questionnaires
Adolescent Child Female Humans Male
Humans Male Female Intensive Care Units COVID-19

MeSH classifications