A measure of reliability convergence to select and optimize cognitive tasks for individual differences research.


Journal

Communications Psychology
ISSN: 2731-9121
Abbreviated title: Commun Psychol
Country: England
NLM ID: 9918716686206676

Publication information

Publication date:
04 Jul 2024
History:
received: 19 07 2023
accepted: 18 06 2024
medline: 7 9 2024
pubmed: 7 9 2024
entrez: 6 9 2024
Status: epublish

Abstract

Surging interest in individual differences has faced setbacks in light of recent replication crises in psychology, for example in brain-wide association studies exploring brain-behavior correlations. A crucial component of replicability for individual differences studies, which is often assumed but not directly tested, is the reliability of the measures we use. Here, we evaluate the reliability of different cognitive tasks on a dataset with over 250 participants, who each completed a multi-day task battery. We show how reliability improves as a function of number of trials, and describe the convergence of the reliability curves for the different tasks, allowing us to score tasks according to their suitability for studies of individual differences. We further show the effect on reliability of measuring over multiple time points, with tasks assessing different cognitive domains being differentially affected. Data collected over more than one session may be required to achieve trait-like stability.
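The relationship between trial count and reliability described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or their convergence measure: it assumes simulated binary accuracy data, a permutation-based split-half estimate with the Spearman-Brown correction, and hypothetical function names (`simulate_accuracy`, `split_half_reliability`). The sample size of 250 only echoes the dataset size mentioned above.

```python
import numpy as np


def simulate_accuracy(n_participants=250, n_trials=400, seed=0):
    """Simulate 0/1 accuracy data (assumed data model, not the real dataset).

    Each participant has a fixed true accuracy; trials are independent draws.
    """
    rng = np.random.default_rng(seed)
    ability = rng.uniform(0.55, 0.95, size=n_participants)
    hits = rng.random((n_participants, n_trials)) < ability[:, None]
    return hits.astype(float)


def split_half_reliability(trials, n_splits=100, seed=None):
    """Mean Spearman-Brown corrected split-half correlation over random splits.

    trials: array of shape (n_participants, n_trials).
    """
    rng = np.random.default_rng(seed)
    n_trials = trials.shape[1]
    half = n_trials // 2
    estimates = []
    for _ in range(n_splits):
        order = rng.permutation(n_trials)
        # Score each participant on two disjoint random halves of the trials.
        score_a = trials[:, order[:half]].mean(axis=1)
        score_b = trials[:, order[half:2 * half]].mean(axis=1)
        r = np.corrcoef(score_a, score_b)[0, 1]
        # Spearman-Brown correction: project half-length to full-length reliability.
        estimates.append(2 * r / (1 + r))
    return float(np.mean(estimates))


data = simulate_accuracy()
# Reliability estimated from the first n trials of each participant.
curve = {n: split_half_reliability(data[:, :n], seed=1)
         for n in (20, 50, 100, 200, 400)}

for n, rel in curve.items():
    print(f"{n:4d} trials: estimated reliability {rel:.2f}")
```

Under these assumptions the estimate rises steeply at low trial counts and flattens as trials accumulate, which is the kind of curve whose convergence the abstract refers to. Real tasks may behave differently, particularly across sessions.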

Identifiers

pubmed: 39242856
doi: 10.1038/s44271-024-00114-4
pii: 10.1038/s44271-024-00114-4

Publication types

Journal Article

Languages

eng

Pagination

64

Grants

Agency: Israel Science Foundation (ISF)
ID: 829/22

Copyright information

© 2024. The Author(s).

Authors

Jan Kadlec (J)

Department of Brain Sciences, Weizmann Institute of Science, Rehovot, Israel.

Catherine R Walsh (CR)

Department of Psychology, University of California, Los Angeles, CA, USA.
Section on Functional Imaging Methods, National Institute of Mental Health, Bethesda, MD, USA.

Uri Sadé (U)

Faculty of Physics, Weizmann Institute of Science, Rehovot, Israel.

Ariel Amir (A)

Faculty of Physics, Weizmann Institute of Science, Rehovot, Israel.

Jesse Rissman (J)

Department of Psychology, University of California, Los Angeles, CA, USA.
Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA, USA.

Michal Ramot (M)

Department of Brain Sciences, Weizmann Institute of Science, Rehovot, Israel. michal.ramot@weizmann.ac.il.

MeSH classifications