Human detection of political speech deepfakes across transcripts, audio, and video.

MeSH terms: Humans; Politics; Speech; Female; Male; Video Recording; Adult; Young Adult; Communication; Algorithms

Journal

Nature Communications

ISSN: 2041-1723

Abbreviated title: Nat Commun

Country: England

NLM ID: 101528555

Publication information

Publication date:
02 Sep 2024

History:

received: 14 Sep 2022

accepted: 22 Aug 2024

medline: 3 Sep 2024

pubmed: 3 Sep 2024

entrez: 2 Sep 2024

Status: epublish

Abstract

Recent advances in technology for hyper-realistic visual and audio effects provoke the concern that deepfake videos of political speeches will soon be indistinguishable from authentic video. We conduct 5 pre-registered randomized experiments with N = 2215 participants to evaluate how accurately humans distinguish real political speeches from fabrications across base rates of misinformation, audio sources, question framings with and without priming, and media modalities. We do not find that base rates of misinformation have statistically significant effects on discernment. We find deepfakes with audio produced by state-of-the-art text-to-speech algorithms are harder to discern than the same deepfakes with voice actor audio. Moreover, across all experiments and question framings, we find audio and visual information enables more accurate discernment than text alone: human discernment relies more on how something is said, the audio-visual cues, than what is said, the speech content.

Identifiers

DOI: 10.1038/s41467-024-51998-z PMID: 39223110

pii: 10.1038/s41467-024-51998-z

Publication types

Journal Article

Languages

eng

Pagination

7629

Authors

Matthew Groh (M)

Aruna Sankaranarayanan (A)

Nikhil Singh (N)

Dong Young Kim (DY)

Andrew Lippman (A)

Rosalind Picard (R)
