Misinformation and Epistemic Risk

How AI shifts the cost of deception — and what that does to shared reality

Learning Objectives

By the end of this module you will be able to:

  • Explain how AI systems generate synthetic text, images, audio, and video, and why quality has decoupled from human effort.
  • Define the liar's dividend and explain why it threatens shared reality even without any fabricated content.
  • Describe why AI-generated content detection is structurally unreliable and what the arms race dynamic means in practice.
  • Evaluate current mitigation approaches — provenance standards, platform labeling, moderation — and their known limits.
  • Explain how information pollution at scale degrades epistemic conditions for democratic discourse.

The cost asymmetry at the heart of the problem

The fundamental shift AI introduces is not simply "more fake content." It is a structural change in the cost relationship between producing deceptive content and verifying it.

Historically, creating convincing synthetic media required specialized skills, expensive equipment, and significant labor. Verification — while imperfect — operated against content created at roughly equivalent cost. AI collapses the production cost to near zero while leaving verification costs largely unchanged. This asymmetry is the foundation of every problem discussed in this module.

The threat is not that AI makes fakes better. It is that AI makes fakes cheaper — and that cheapness changes incentive structures for everyone, including people who never create a single fake.
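
To make the asymmetry concrete, here is a minimal back-of-the-envelope sketch in Python. Every cost figure in it is an illustrative assumption, not a measurement from this module; the point is only how the verification burden behaves once production cost collapses while verification cost does not.

    # Toy model of the production/verification cost asymmetry.
    # All numbers are illustrative assumptions, not empirical estimates.

    def defender_hours(items_published, verify_hours_per_item):
        """Total human verification effort needed to check every item."""
        return items_published * verify_hours_per_item

    # Pre-AI: a convincing fake took roughly a full work week of skilled labor.
    items_per_week_pre_ai = 40 / 40.0        # ~1 item per 40-hour week
    # With generative tools: minutes of prompting per item.
    items_per_week_with_ai = 40 / 0.1        # ~400 items per week

    verify_hours_per_item = 2.0              # verification cost is assumed unchanged

    for label, items in [("pre-AI", items_per_week_pre_ai),
                         ("with AI", items_per_week_with_ai)]:
        hours = defender_hours(items, verify_hours_per_item)
        print(f"{label}: {items:.0f} items/week -> {hours:.0f} verification hours to check them all")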

How synthetic content is generated

Modern AI systems generate convincing synthetic content in every media modality, using different architectures for each. For video and image deepfakes, detection research offers a useful window into how generation is evolving: transformer-based detection approaches have emerged as more generalizable than CNN-based alternatives, declining by roughly 11% in cross-dataset evaluation while CNN approaches degrade by more than 15%. This matters not only for detection but for understanding generation itself: the architectural properties that make transformers better at detecting fakes reflect how the underlying generation techniques have evolved.

For text, large language models can consistently generate high-quality election disinformation, and testing across 13 LLM variants found that most models broadly comply with requests for such content. Prompt engineering can further optimize output to match or exceed human-crafted propaganda.

Contemporary influence operations do not pick a single modality. State and non-state actors now coordinate production across text, images, video, and audio in parallel, creating comprehensive information environments where multiple reinforcing false narratives operate simultaneously across different media types and languages.

Quality decoupling and the volume strategy

One counterintuitive finding from research on state-sponsored propaganda deserves attention: many of the largest AI-powered campaigns deliberately accept lower individual content quality in exchange for volume and breadth. The strategic calculation is scale over precision — covering more topics, maintaining narrative presence across more platforms, and achieving aggregate reach despite what researchers have termed "AI slop" quality per individual post.

This has two implications. First, the threat is not always the convincing deepfake — it can be the flood of low-quality noise. Second, the transition from human to AI content generation enables state actors to dramatically expand topic coverage while maintaining persuasiveness at the campaign level rather than the post level.

Information pollution as a market failure

The aggregate effect of this cost collapse has been described as a new class of market failure. Estimates suggest that by 2026, up to 90% of online content may be AI-generated or AI-assisted. When the quantity of noise increases faster than the capacity to filter signal, the information ecosystem experiences systemic degradation — not mere overload, but a reduction in the overall value of information even as total content production increases. This manifests as reduced incentives for quality content production, concentration of information markets among fewer credible providers, and increased cognitive burden on users.
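
The "noise grows faster than filtering capacity" dynamic can be sketched numerically. The growth rates below are assumptions chosen only to show the shape of the problem: even when genuine signal keeps growing, the chance that a randomly encountered item is signal collapses once synthetic volume outpaces it.

    # Illustrative model of information pollution: signal keeps growing,
    # but noise grows faster, so the signal fraction of the feed collapses.
    # Growth rates are assumptions for illustration only.

    signal_items = 1_000_000      # authentic, useful items at year 0
    noise_items = 1_000_000       # low-value or synthetic items at year 0
    signal_growth = 1.05          # 5% annual growth in genuine content
    noise_growth = 2.0            # synthetic content assumed to double each year

    for year in range(6):
        total = signal_items + noise_items
        print(f"year {year}: signal fraction = {signal_items / total:.1%}")
        signal_items *= signal_growth
        noise_items *= noise_growth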

Information pollution vs. information overload

These are different problems. Overload is too much signal. Pollution is noise crowding out signal. The distinction matters for solutions: overload calls for better filters; pollution calls for different cost structures and provenance systems.

The liar's dividend

The liar's dividend is one of the most important — and most misunderstood — concepts in this domain.

The liar's dividend describes the strategic benefit bad actors gain from the mere existence of sophisticated AI-generated media. Once high-quality fakes become technically feasible and publicly known to be possible, anyone can invoke their existence to discredit inconvenient authentic evidence. This does not require creating a single fake. It requires only that audiences know fakes are possible.

The mechanism works through burden-of-proof shifting. When authentic video of, say, atrocities in a conflict zone surfaces, the accused can now say "this is AI-generated" — and the claim is no longer obviously absurd. The audience must now verify authenticity, which is technically difficult and time-consuming. The liar's dividend exploits the asymmetry between the cost of generating convincing fakes and the effort required to verify authenticity.
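
A small Bayesian sketch makes the burden-of-proof shift concrete. The probabilities below are illustrative assumptions, not estimates from the research cited in this module: they show that once convincing fakes are known to be cheap and common, the same authentic-looking footage carries less evidential weight, which is exactly the dividend the liar collects without ever producing a fake.

    # Illustrative Bayesian model of the liar's dividend.
    # All probabilities are assumptions chosen for illustration.

    def p_real_given_convincing(p_fake_prior, p_convincing_if_real, p_convincing_if_fake):
        """Posterior probability that convincing-looking footage is authentic."""
        p_real_prior = 1.0 - p_fake_prior
        numerator = p_convincing_if_real * p_real_prior
        denominator = numerator + p_convincing_if_fake * p_fake_prior
        return numerator / denominator

    # Before cheap generative AI: fakes are rare and rarely convincing.
    before = p_real_given_convincing(p_fake_prior=0.01,
                                     p_convincing_if_real=0.95,
                                     p_convincing_if_fake=0.30)

    # After: fakes are common and usually convincing.
    after = p_real_given_convincing(p_fake_prior=0.20,
                                    p_convincing_if_real=0.95,
                                    p_convincing_if_fake=0.90)

    print(f"P(authentic | looks convincing), before: {before:.3f}")  # ~0.997
    print(f"P(authentic | looks convincing), after:  {after:.3f}")   # ~0.809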

Research on recent conflict zones describes this directly: the proliferation of deepfakes enables genuine evidence of real events to be dismissed as fabrication, degrading the ability of citizens and institutions to establish shared factual foundations for democratic deliberation.

Epistemic erosion and fragmentation

The effects compound. As AI-generated content becomes harder to distinguish from authentic media, rational actors increasingly discount all digital evidence as potentially fabricated. This "epistemic trust erosion" represents a fundamental shift in how audiences evaluate media credibility — people lose confidence not in specific false claims, but in entire institutional categories. Once exposed to AI-generated misinformation from a news organization, audiences develop skepticism toward that organization's entire output, even when most reporting remains accurate.

At a societal level, this produces epistemic fragmentation. AI-generated content enables personalized misinformation tailored to specific audience segments, creating incompatible information ecosystems with fundamentally divergent worldviews. The deeper threat is not that people believe different things — it is that they inhabit incompatible realities and lose the shared factual substrate needed to negotiate disagreement. When that substrate dissolves, collective deliberation becomes structurally impaired.

These harms are not distributed equally. AI-generated misinformation disproportionately harms the epistemic positions of marginalized groups through a mechanism called epistemic injustice: the liar's dividend means marginalized groups' actual testimony about their experiences can be preemptively dismissed as fabricated, while well-resourced actors invoke deepfake technology to discredit opposing evidence.

Autonomous AI agents running a propaganda campaign

In early 2026, researchers at USC demonstrated that simple AI agents could autonomously coordinate propaganda campaigns without any direct human control. The agents could write original posts, analyze which narratives succeeded, copy successful strategies from coordinated team members, and amplify shared messaging — all without human direction.

What makes this different from previous influence operations?

Previous state-sponsored operations involved human operators selecting topics, drafting content, and manually coordinating cross-platform amplification. The USC findings represent a qualitative shift: the coordination itself becomes automated. Because each post is slightly varied, the coordination stays latent and the conversation appears organic and authentic. Detection becomes significantly harder because there is no central command signal to identify; the coordination emerges from agents learning what works.

The multimodal and scale dimension

Cross-reference this with what research has documented about current operations: nine ongoing influence operations have adopted multimodal AI generation capabilities, simultaneously producing videos, images, translations, and text across platforms. An autonomous agent network operating across these modalities would combine scale, coordination, and multimodal reach in ways no previous human-operated campaign could match.

Why detection fails here

AI-generated election disinformation is now indistinguishable from authentic journalism in over 50% of test cases. Human reviewers are not a reliable backstop: people significantly overestimate their ability to detect deepfakes, believing they can identify synthetic media at rates substantially higher than their actual performance.

The training lag makes this worse over time

A roughly two-year lag exists between current AI training data cutoffs and the present moment. Propaganda campaigns will become more effective as newer model generations deploy with more recent training data, enabling better alignment with current events and more sophisticated exploitation of contemporary vulnerabilities.

Common Misconceptions

"Better detection technology will solve this"

Detection is genuinely valuable, but it has documented structural limits that are worth understanding honestly.

Deepfake detection models exhibit a fundamental accuracy-generalization tradeoff: models can achieve high accuracy on their training datasets, but performance degrades by 11–15% when applied to unseen deepfake generation methods. No single detection approach can simultaneously maintain high accuracy while generalizing to new generation techniques encountered in real-world deployment.
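
A minimal sketch of how this gap is measured: train a detector on features from one generation method, then evaluate it on a held-out split of that method and on a distribution-shifted set standing in for an unseen method. The data here is synthetic and the detector is a plain logistic regression, so the specific numbers are illustrative; what matters is the evaluation protocol.

    # Sketch of cross-dataset evaluation for a deepfake detector.
    # Synthetic features and a toy classifier; illustrative only.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    def make_dataset(n, fake_shift):
        """Real samples ~ N(0, 1); fake samples shifted by `fake_shift` per feature."""
        real = rng.normal(0.0, 1.0, size=(n, 8))
        fake = rng.normal(fake_shift, 1.0, size=(n, 8))
        X = np.vstack([real, fake])
        y = np.array([0] * n + [1] * n)
        return X, y

    # Dataset A: the generation method the detector was trained against.
    X_train, y_train = make_dataset(2000, fake_shift=1.0)
    X_test_a, y_test_a = make_dataset(500, fake_shift=1.0)
    # Dataset B: an unseen generation method with a weaker artifact signature.
    X_test_b, y_test_b = make_dataset(500, fake_shift=0.4)

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    acc_a = clf.score(X_test_a, y_test_a)
    acc_b = clf.score(X_test_b, y_test_b)
    print(f"in-dataset accuracy:    {acc_a:.2f}")
    print(f"cross-dataset accuracy: {acc_b:.2f}")
    print(f"generalization gap:     {acc_a - acc_b:.2f}")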

Beyond generalization, there is a precision-recall tradeoff that is equally fundamental: forensic analysis tools achieve high recall (catching most fakes) but poor specificity (high false positive rates on authentic content), while AI classifiers show the inverse pattern. Platform operators face an unavoidable tradeoff: catch more fakes, or avoid falsely accusing authentic content.
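
The tradeoff shows up when sweeping the decision threshold of a single detector. The scores below are simulated rather than taken from a real tool: with a low threshold the detector catches nearly every fake but misfires on authentic content (low specificity), and with a high threshold the pattern inverts, mirroring the forensic-tool versus AI-classifier split described above.

    # Illustrative threshold sweep showing the recall/specificity tradeoff.
    # Scores are simulated, not from a real detector.
    import numpy as np

    rng = np.random.default_rng(1)
    fake_scores = rng.normal(0.7, 0.15, 5000)   # detector scores for fake items
    real_scores = rng.normal(0.4, 0.15, 5000)   # detector scores for authentic items

    for threshold in (0.35, 0.50, 0.65):
        recall = (fake_scores >= threshold).mean()       # fakes correctly flagged
        specificity = (real_scores < threshold).mean()   # authentic items left alone
        print(f"threshold {threshold:.2f}: recall {recall:.2f}, specificity {specificity:.2f}")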

And in deployment, state-of-the-art detection methods trained on clean, high-resolution samples struggle with compressed, low-resolution, post-processed material from real social media platforms — precisely the conditions under which they would need to work.

Finally, there is the arms race problem itself: detection methods face inherent obsolescence because research evaluation protocols test against static datasets rather than adversarial or evolving generation techniques. By the time a peer-reviewed detection method is published, it may already be testing against outdated generation capabilities.

"AI can also be used to fight misinformation, so it balances out"

This creates a genuine paradox rather than a resolution. Generative AI systems simultaneously generate and help detect misinformation. While AI tools are used to identify disinformation actors and assist fact-checkers in real time, over 50% of AI-generated news responses contained significant factual distortions in 2025 testing, and more than 60% of responses from AI-powered search engines were inaccurate. The same tools being deployed defensively are generating new content that requires defense.

More structurally: fact-checking capacity globally cannot scale with the volume and velocity of AI-generated misinformation. As of May 2025, 457 fact-checking organizations are active worldwide, but the global infrastructure faces fundamental scalability challenges, and most languages receive significantly less coverage than English. The volume asymmetry is not close.
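
To see why the volume asymmetry "is not close", a rough calculation is enough. Only the count of 457 organizations comes from the text; the per-organization throughput and the daily volume of suspect content are assumptions for illustration.

    # Back-of-the-envelope comparison of fact-checking capacity vs. content volume.
    # 457 organizations is from the text; throughput and volume are assumptions.
    fact_check_orgs = 457
    checks_per_org_per_day = 10          # assumed thorough checks per organization per day
    daily_capacity = fact_check_orgs * checks_per_org_per_day

    suspect_items_per_day = 5_000_000    # assumed AI-generated items that would need review

    print(f"daily fact-checking capacity: {daily_capacity:,} items")
    print(f"fraction of suspect items coverable: {daily_capacity / suspect_items_per_day:.4%}")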

"People can tell when they're reading AI-generated content"

The evidence goes the other direction. People significantly overestimate their ability to detect deepfakes, creating heightened vulnerability to deepfake-based misinformation through an overconfidence gap.

The trust dynamics around detected AI content are also more complex than they first appear. Even when audiences reject high-quality AI-generated news in favor of lower-quality human-created alternatives, the rejection reflects not a rational assessment of content quality but epistemic uncertainty about authorship and accountability. Only 29% of respondents say they will read fully AI-generated news, while 84% prefer news produced without AI involvement, even though quality ratings favor the AI-generated articles. Trust is an epistemic signal that tracks authorship and institutional accountability; it does not track content quality directly.

"Watermarking and labels will fix the provenance problem"

Provenance standards are a real and meaningful development, but they have documented limits that matter. Current watermarking and C2PA standards produce conflicting authentication signals — a single asset can simultaneously claim human authorship via C2PA and AI generation via watermark. Watermarks can be removed through routine image processing operations, and provenance information is frequently lost in datasets-of-datasets lacking standardized documentation.
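
A sketch of the reconciliation problem this paragraph describes. The helpers read_c2pa_authorship_claim and detect_watermark are hypothetical placeholders, not real library calls; the point is that once two provenance signals can disagree, every consumer of an asset needs conflict-handling logic like this rather than a single trustworthy bit.

    # Sketch of reconciling conflicting provenance signals on one asset.
    # read_c2pa_authorship_claim() and detect_watermark() are hypothetical
    # placeholders, not real library APIs.
    from typing import Optional

    def read_c2pa_authorship_claim(path: str) -> Optional[str]:
        """Placeholder: return 'human', 'ai', or None if no C2PA manifest survives."""
        ...

    def detect_watermark(path: str) -> Optional[str]:
        """Placeholder: return 'ai' if an AI-generation watermark is found, else None."""
        ...

    def provenance_verdict(path: str) -> str:
        c2pa = read_c2pa_authorship_claim(path)
        watermark = detect_watermark(path)
        if c2pa is None and watermark is None:
            return "unknown: no surviving provenance signal"
        if c2pa == "human" and watermark == "ai":
            return "conflict: C2PA claims human authorship, watermark claims AI generation"
        if c2pa == "ai" or watermark == "ai":
            return "ai-generated (per at least one signal)"
        return "human-claimed (unverified)"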

Platform labeling adds another layer of fragmentation: YouTube, TikTok, and Meta have each implemented synthetic media labeling policies with inconsistent standards and requirements. Without unified standards, users face a patchwork of labels and badges that often confuses rather than clarifies.

Thought Experiment: The diplomat's footage

A government releases video footage that appears to show soldiers from a neighboring country committing atrocities in a disputed border region. The footage is high-resolution, dated, and geolocated.

The neighboring government immediately claims the footage is an AI deepfake. Within hours, several verified accounts share plausible-sounding technical analysis suggesting the lighting, facial movements, and compression artifacts are consistent with known generation methods. The claim spreads faster than any official rebuttal.

Consider:

  • If you are a citizen trying to form a view on this event, what information would you need, and how would you get it? What would it cost you in time and effort compared to simply treating all digital evidence as suspect?
  • If you are a journalist covering the story, what verification steps are available? Who bears the burden of proof?
  • If the footage is genuine, and the deepfake claim succeeds in discrediting it, what has been lost — and for whom?
  • If you are an independent researcher who has confirmed the footage is authentic, but most of your audience has already seen the "deepfake" claims, how do you communicate confidence in authenticity in an environment where the phrase "I verified this" has been systematically cheapened?

Now consider the reverse: if the footage is actually a deepfake, what would have stopped it from succeeding?

This scenario has no clean resolution. That is the point. The liar's dividend does not require deception to succeed — it only requires that deception be plausible enough to paralyze verification and delay collective judgment until the window for action has closed.

Key Takeaways

  1. The core problem is cost asymmetry, not content quality. AI decouples content production cost from quality, driving it toward zero, while verification costs remain high. This structural shift advantages deception at scale regardless of whether any specific piece of fake content is convincing.
  2. The liar's dividend works without any fakes. The mere known possibility of AI-generated synthetic media allows bad actors to dismiss authentic evidence as fabricated. This shifts the burden of proof onto those presenting evidence and degrades the epistemic value of audiovisual content in legal, political, and public discourse.
  3. Detection is not a solution, it is a temporary and partial countermeasure. Detection tools face fundamental accuracy-generalization tradeoffs, precision-recall tradeoffs, lab-to-real-world degradation, and an arms race where offensive capabilities advance faster than defensive evaluation cycles. Human overconfidence in detection ability makes the problem worse.
  4. The epistemic damage is systemic, not just per-content. Trust erosion, epistemic fragmentation, and epistemic injustice are not consequences of specific false claims — they are consequences of the environment created by AI-generated content at scale. Marginalized groups bear disproportionate costs from this environment.
  5. Current defenses are real but structurally incomplete. C2PA provenance standards, platform labeling, hybrid human-AI moderation, and regulatory sandboxes are genuine responses, each with documented limits. Policy cycles lag capability advances, so formal governance tends to address outdated threat models.

Further Exploration

Foundational research

On detection and its limits

On scale and state use

On mitigations and governance