Perceptions of artificial intelligence systems' aptitude to judge morality and competence amidst the rise of chatbots.
Artificial intelligence
Chatbots
Competence
Impression formation
Large language models
Morality
Social evaluation
Journal
Cognitive Research: Principles and Implications
ISSN: 2365-7464
Abbreviated title: Cogn Res Princ Implic
Country: England
NLM ID: 101697632
Publication information
Publication date: 18 Jul 2024
History:
received: 29 Sep 2023
accepted: 2 Jul 2024
medline: 18 Jul 2024
pubmed: 18 Jul 2024
entrez: 17 Jul 2024
Status: epublish
Abstract
This paper examines how humans judge the capabilities of artificial intelligence (AI) to evaluate human attributes, focusing on two key dimensions of human social evaluation: morality and competence. It further investigates how exposure to advanced Large Language Models affects these perceptions. In three studies (combined N = 200), we tested the hypothesis that people find it less plausible that AI is capable of judging the morality conveyed by a behavior than of judging its competence. Participants estimated the plausibility of AI origin for a set of written impressions of positive and negative behaviors related to morality and competence. Studies 1 and 3 supported our hypothesis that people would be more inclined to attribute AI origin to competence-related impressions than to morality-related ones. In Study 2, we found this effect only for impressions of positive behaviors. Additional exploratory analyses clarified that the differential attribution of AI origin to competence versus morality judgments persisted throughout the first half year after the public launch of a popular AI chatbot (i.e., ChatGPT) and could not be explained by participants' general attitudes toward AI or by the actual source of the impressions (i.e., AI or human). These findings suggest an enduring belief that AI is less adept at assessing the morality than the competence of human behavior, even as AI capabilities continue to advance.
Identifiers
pubmed: 39019988
doi: 10.1186/s41235-024-00573-7
pii: 10.1186/s41235-024-00573-7
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
47
Copyright information
© 2024. The Author(s).