Perceptions of artificial intelligence systems' aptitude to judge morality and competence amidst the rise of chatbots.
Artificial intelligence
Chatbots
Competence
Impression formation
Large language models
Morality
Social evaluation
Journal
Cognitive Research: Principles and Implications
ISSN: 2365-7464
Abbreviated title: Cogn Res Princ Implic
Country: England
NLM ID: 101697632
Publication information
Publication date: 18 Jul 2024
History:
received: 29 Sep 2023
accepted: 2 Jul 2024
medline: 18 Jul 2024
pubmed: 18 Jul 2024
entrez: 17 Jul 2024
Status: epublish
Abstract
This paper examines how humans judge the capabilities of artificial intelligence (AI) to evaluate human attributes, focusing on two key dimensions of human social evaluation: morality and competence. It further investigates how exposure to advanced Large Language Models affects these perceptions. In three studies (combined N = 200), we tested the hypothesis that people find it less plausible that AI is capable of judging the morality conveyed by a behavior than of judging its competence. Participants estimated the plausibility of AI origin for a set of written impressions of positive and negative behaviors related to morality and competence. Studies 1 and 3 supported our hypothesis that people would be more inclined to attribute AI origin to competence-related impressions than to morality-related ones. In Study 2, we found this effect only for impressions of positive behaviors. Additional exploratory analyses clarified that the differential attribution of AI origin to competence versus morality judgments persisted throughout the first half year after the public launch of a popular AI chatbot (i.e., ChatGPT) and could not be explained by participants' general attitudes toward AI or by the actual source of the impressions (i.e., AI or human). These findings suggest an enduring belief that AI is less adept at assessing the morality than the competence of human behavior, even as AI capabilities continue to advance.
Identifiers
pubmed: 39019988
doi: 10.1186/s41235-024-00573-7
pii: 10.1186/s41235-024-00573-7
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
47
Copyright information
© 2024. The Author(s).