A measure of reliability convergence to select and optimize cognitive tasks for individual differences research.
Journal
Communications psychology
ISSN: 2731-9121
Abbreviated title: Commun Psychol
Country: England
NLM ID: 9918716686206676
Publication information
Publication date:
04 Jul 2024
History:
received: 19 Jul 2023
accepted: 18 Jun 2024
medline: 7 Sep 2024
pubmed: 7 Sep 2024
entrez: 6 Sep 2024
Status:
epublish
Abstract
Surging interest in individual differences has faced setbacks in light of recent replication crises in psychology, for example in brain-wide association studies exploring brain-behavior correlations. A crucial component of replicability for individual differences studies, which is often assumed but not directly tested, is the reliability of the measures we use. Here, we evaluate the reliability of different cognitive tasks on a dataset with over 250 participants, who each completed a multi-day task battery. We show how reliability improves as a function of number of trials, and describe the convergence of the reliability curves for the different tasks, allowing us to score tasks according to their suitability for studies of individual differences. We further show the effect on reliability of measuring over multiple time points, with tasks assessing different cognitive domains being differentially affected. Data collected over more than one session may be required to achieve trait-like stability.
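As a rough illustration of why reliability tends to rise and then level off as trials are added, the sketch below applies the classical Spearman-Brown prediction formula. This is a textbook approximation under the assumption of parallel trials, not the convergence measure developed in the paper; the function names and the example values (a reliability of 0.5 at 50 trials) are hypothetical.

```python
from typing import List, Sequence


def spearman_brown(r_base: float, length_factor: float) -> float:
    """Predicted reliability when a test is lengthened by `length_factor`.

    Classical Spearman-Brown formula: r_k = k * r / (1 + (k - 1) * r).
    Assumes the added trials are parallel to the existing ones.
    """
    return length_factor * r_base / (1.0 + (length_factor - 1.0) * r_base)


def reliability_curve(
    r_base: float, base_trials: int, trial_counts: Sequence[int]
) -> List[float]:
    """Predicted reliability at each trial count, given `r_base` at `base_trials`."""
    return [spearman_brown(r_base, n / base_trials) for n in trial_counts]


# Hypothetical example: reliability of 0.5 observed with 50 trials.
trial_counts = [50, 100, 200, 400]
curve = reliability_curve(0.5, 50, trial_counts)

for n, r in zip(trial_counts, curve):
    print(f"{n:4d} trials -> predicted reliability {r:.3f}")
```

Under these assumptions each doubling of the trial count yields a smaller gain than the previous one, which is the kind of flattening that makes it meaningful to compare how quickly different tasks approach their ceiling.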
Identifiers
pubmed: 39242856
doi: 10.1038/s44271-024-00114-4
pii: 10.1038/s44271-024-00114-4
Publication types
Journal Article
Languages
eng
Pagination
64
Grants
Agency: Israel Science Foundation (ISF)
ID: 829/22
Copyright information
© 2024. The Author(s).