Human detection of political speech deepfakes across transcripts, audio, and video.
Journal
Nature Communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555
Publication information
Publication date:
02 Sep 2024
History:
received: 14 Sep 2022
accepted: 22 Aug 2024
medline: 3 Sep 2024
pubmed: 3 Sep 2024
entrez: 2 Sep 2024
Status:
epublish
Abstract
Recent advances in technology for hyper-realistic visual and audio effects provoke the concern that deepfake videos of political speeches will soon be indistinguishable from authentic video. We conduct 5 pre-registered randomized experiments with N = 2215 participants to evaluate how accurately humans distinguish real political speeches from fabrications across base rates of misinformation, audio sources, question framings with and without priming, and media modalities. We do not find that base rates of misinformation have statistically significant effects on discernment. We find that deepfakes with audio produced by state-of-the-art text-to-speech algorithms are harder to discern than the same deepfakes with voice actor audio. Moreover, across all experiments and question framings, we find that audio and visual information enables more accurate discernment than text alone: human discernment relies more on how something is said (the audio-visual cues) than on what is said (the speech content).
Identifiers
pubmed: 39223110
doi: 10.1038/s41467-024-51998-z
pii: 10.1038/s41467-024-51998-z
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
7629
Copyright information
© 2024. The Author(s).