Governance, Accountability, and Ethics

From compliance frameworks to the hardest questions AI raises about responsibility, liability, and moral status

Learning Objectives

By the end of this module you will be able to:

  • Describe the EU AI Act's four-tier risk structure and the obligations each tier places on developers and deployers.
  • Explain why existing tort liability frameworks struggle to assign responsibility for AI-caused harms.
  • Describe the risk of symbolic compliance and the conditions under which it emerges.
  • Articulate the corrigibility-autonomy tension in AI system design and why it cannot be resolved by technical means alone.
  • Reason about AI moral status under genuine uncertainty, without needing to resolve the underlying metaphysical questions.
  • Integrate insights from prior modules to form a considered personal position on a contested AI benefit-harm tradeoff.

Core Concepts

The Regulatory Landscape

Governing AI is an unprecedented challenge: the technology is general-purpose, rapidly changing, opaque even to its creators, and deployed globally by actors subject to different legal systems. The responses that have emerged differ radically in philosophy and reach.

The EU's risk-based architecture

The most comprehensive legal framework to date is the EU AI Act. Its organizing principle is a four-tier risk classification: AI systems are sorted into unacceptable risk, high risk, limited risk, and minimal or no risk. Each tier triggers progressively stricter obligations. The goal is to concentrate regulatory scrutiny on systems posing the greatest threats to safety, fundamental rights, and democratic processes—while leaving low-risk applications largely unencumbered.
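
To make the tiered structure concrete, here is a minimal Python sketch of the four tiers and the kinds of obligations the Act attaches to each. The tier names follow the Act, but the obligation lists, function names, and structure are illustrative simplifications, not the statutory text.

```python
from enum import Enum

class RiskTier(Enum):
    UNACCEPTABLE = "unacceptable"  # prohibited outright (Article 5)
    HIGH = "high"                  # full compliance pathway
    LIMITED = "limited"            # transparency duties only
    MINIMAL = "minimal"            # largely unregulated

# Illustrative mapping from tier to obligations; a paraphrase, not the Act.
OBLIGATIONS = {
    RiskTier.UNACCEPTABLE: ["banned: may not be placed on the EU market"],
    RiskTier.HIGH: [
        "risk assessment",
        "training-data quality requirements",
        "technical documentation",
        "conformity assessment",
        "human oversight mechanisms",
        "accuracy and robustness testing",
        "registration in EU database",
    ],
    RiskTier.LIMITED: ["transparency disclosures (e.g., disclose AI interaction)"],
    RiskTier.MINIMAL: ["no new obligations"],
}

def obligations_for(tier: RiskTier) -> list:
    """Look up the illustrative obligation list for a risk tier."""
    return OBLIGATIONS[tier]

print(obligations_for(RiskTier.HIGH))
```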

At the top of the hierarchy sit prohibited practices: systems that manipulate individuals by exploiting psychological vulnerabilities, social scoring systems, real-time remote biometric identification in public spaces, and predictive policing based on protected characteristics, among others. These are not merely regulated—they are banned. The EU approach here is categorical rather than conditional.

High-risk systems—those used in critical infrastructure, employment screening, educational assessment, law enforcement, and biometric identification—face the most demanding compliance pathway: risk assessments, high-quality training data requirements, technical documentation, conformity assessments, human oversight mechanisms, accuracy and robustness testing, and registration in an EU database. Deployers of these systems must also conduct Fundamental Rights Impact Assessments (FRIAs), a distinctively European mechanism that mandates systematic evaluation of privacy, non-discrimination, freedom of expression, and due process implications before and throughout deployment.

The penalty structure

The EU AI Act's administrative fines are tiered by severity. Violations of the prohibited practices (Article 5) can trigger fines of up to EUR 35 million or 7% of global annual turnover, whichever is higher. Other compliance failures carry lower but still significant penalties. Enforcement has been phasing in since August 2025.
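
As a toy illustration of the "whichever is higher" structure, the sketch below computes the upper bound of an Article 5 fine. The function name is invented, and the calculation ignores the Act's detailed rules on undertakings and turnover measurement.

```python
def article5_fine_cap(global_annual_turnover_eur: float) -> float:
    """Cap on an Article 5 fine: the higher of EUR 35M or 7% of
    global annual turnover (simplified; the real rules are more detailed)."""
    return max(35_000_000.0, 0.07 * global_annual_turnover_eur)

# A firm with EUR 2 billion in turnover: 7% (EUR 140M) exceeds the
# fixed EUR 35M amount, so the higher figure applies.
print(article5_fine_cap(2_000_000_000))  # 140000000.0
```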

Three governance philosophies

The EU's approach does not represent a global consensus. A comparison of major regulatory frameworks reveals three distinct philosophies:

The EU model is comprehensive, cross-sectoral, and rights-based. It sets baseline requirements through ex-ante regulation—obligations that apply before a system is deployed—and grounds them explicitly in the Charter of Fundamental Rights.

The US approach is sectoral: established agencies (FDA for healthcare, EEOC for employment, FTC for consumer protection) apply their existing mandates to AI within their domains. There is no unified AI regulatory framework. The 2025 executive order further dismantled Biden-era AI governance structures, prioritizing market-driven innovation over proactive regulation. Accountability relies primarily on ex-post liability mechanisms.

China's model is what scholars describe as "develop hard, control tight": hard-law regulations for algorithms and generative AI through centralized mechanisms including mandatory algorithm registries and state data audits, combined with support for innovation at regional levels and centralized state surveillance capacity. It reflects governance priorities fundamentally different from either the EU or US frameworks.

No major jurisdiction has converged on a single governance model. The EU, US, and China represent genuinely distinct regulatory philosophies—not just variations on a theme.

Impact assessments as a governance tool

Alongside statutory regulation, algorithmic impact assessments (AIAs) have emerged as a widely adopted governance mechanism. Modeled after environmental impact assessments and the US National Environmental Policy Act's requirement for Environmental Impact Statements, AIAs require organizations to document choices made during system development and their rationales, forcing consideration of effects in early design phases and subjecting decisions to scrutiny.

Canada's AIA Tool, implemented through the Treasury Board's Directive on Automated Decision-Making since 2019 and recognized by the OECD as international best practice, illustrates the potential of this approach. The tool uses 65 risk questions and 41 mitigation questions scored across system design, decision type, and impact dimensions—and has been iteratively refined through public consultation.
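
The published methodology is more detailed than can be shown here, but a simplified sketch conveys the two-part structure: a raw impact score from the risk questions, a deduction when the mitigation answers are strong, and a final bucketing into impact levels. The 80% threshold, 15% deduction, and level cut-offs below are assumptions for illustration, not the tool's actual parameters.

```python
def impact_level(risk_score: float, max_risk_score: float,
                 mitigation_score: float, max_mitigation_score: float) -> str:
    """Toy two-part AIA scoring: raw impact as a share of the maximum,
    reduced when mitigation is strong, then bucketed into levels I-IV.
    All thresholds here are hypothetical."""
    pct = risk_score / max_risk_score
    if mitigation_score / max_mitigation_score >= 0.80:  # strong mitigation (assumed)
        pct *= 0.85                                      # 15% deduction (assumed)
    if pct <= 0.25:
        return "Level I (little to no impact)"
    if pct <= 0.50:
        return "Level II (moderate impact)"
    if pct <= 0.75:
        return "Level III (high impact)"
    return "Level IV (very high impact)"

# Example with hypothetical scores (Canada's tool asks 65 risk
# questions and 41 mitigation questions).
print(impact_level(risk_score=40, max_risk_score=65,
                   mitigation_score=35, max_mitigation_score=41))
```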

But researchers who have analyzed Canada's published assessments document a significant design-reality gap. In practice, AIAs show uneven completion, automation decisions legitimized through efficiency narratives, documented harms reframed as positive or non-existent, and systematic exclusion of civil society organizations. Even the best-practice model shows AIAs functioning as compliance artifacts rather than substantive governance mechanisms.


The Liability Gap

Regulation sets the rules. Liability determines who pays when something goes wrong. For AI harms, both mechanisms face serious structural problems.

What makes AI liability hard

Traditional tort law rests on concepts that do not map cleanly onto AI-caused harms. The black-box nature of many AI systems undermines foreseeability and predictability—core requirements for establishing causes of action in product liability, negligence, and strict liability. If you cannot understand how a system reached a decision, you cannot easily prove it behaved unreasonably, nor can a developer demonstrate reasonable care.

Moreover, AI supply chains involve multiple actors: software developers, deployers, hardware manufacturers, and data providers. When a harm occurs, existing doctrine provides no clear mechanism for distributing fault among them. Each party can claim that another bears primary responsibility. This "responsibility gap" is particularly acute for autonomous systems, where different organizations have different levels of control over behavior at different stages.

Algorithmic discrimination presents a further complication. Most such discrimination will be unintentional—arising from biased training data rather than deliberate design. Existing frameworks focused on intentional discrimination provide limited recourse, leaving victims without compensation mechanisms even when harms occur at large scale.

The EU's incomplete liability architecture

The EU AI Act establishes ex-ante compliance requirements and administrative enforcement—but it does not fully address private civil remedies for individuals harmed by non-compliant systems. The AI Liability Directive, proposed in 2022 to fill that gap, was withdrawn in 2025 after Member States failed to agree on liability allocation, burden of proof, and damages assessment. The Commission's fallback is the revised Product Liability Directive (2024), which applies to products placed on the market after December 2026.

Legal analysis suggests the revised PLD, while procedurally modernized, leaves intact the substantive limitations that made the original directive largely ineffective for AI harms. It does not resolve whether AI qualifies as a "product," nor what constitutes a "defect" in an algorithmic system that continuously updates and behaves emergently.

Insurance is not filling the gap

Major insurers including AIG, Great American, and WR Berkley have sought regulatory approval to limit their AI liability exposure. Traditional insurance mechanisms were designed for predictable, quantifiable risks. AI risks—characterized by uncertainty, potential for correlated failures, and difficulty modeling worst-case scenarios—do not fit that model. If insurers exclude AI-caused harms, the burden shifts directly to deployers and ultimately to those harmed.

What tort law can still do

Despite these difficulties, traditional tort doctrines remain applicable. Negligence, strict liability, and product liability have historically adapted to new technologies—automobiles, pharmaceuticals, medical devices. This doctrinal flexibility does not disappear when the technology is AI. Courts and legislators can clarify the duties owed by AI developers and deployers, establish workable standards for foreseeability, and address evidentiary challenges around algorithmic opacity.

Some US states are already moving in this direction. Colorado's AI Act (SB 24-205) establishes explicit statutory duties of care for deployers of high-risk AI systems, requiring reasonable care to protect consumers from known or reasonably foreseeable algorithmic discrimination risks. This operationalizes negligence doctrine: failure to exercise reasonable care can constitute actionable negligence even without showing intentional discrimination.


The Symbolic Compliance Problem

A recurring pattern in AI governance research is what can be called symbolic compliance: organizations satisfy the letter of regulatory requirements without changing outcomes for affected communities. This risk is well-documented in AIA research. Early critics warned that AIAs could function as "institutional capture"—processes that appear accountable while insulating organizational decisions from genuine scrutiny.

The conditions that produce symbolic compliance are not mysterious. Effectiveness depends critically on institutional capacity: adequate resources, an organizational culture of accountability, and binding enforcement norms that are actively monitored. Where these conditions are absent, assessments become static documentation rather than adaptive governance. Treating AIAs as one-time exercises rather than ongoing processes compounds the problem: systems evolve, data distributions shift, and real-world harms emerge, but the assessment remains frozen at the point of initial deployment.

Symbolic compliance is also enabled by how organizations frame harm. Published AIAs systematically rationalize automation through efficiency and innovation narratives while rendering documented harms invisible or recasting them as positive outcomes. The assessment becomes a legitimation device rather than an accountability mechanism.

The structural problem goes deeper than organizational culture. A singular, generalized AIA model cannot be effective because effectiveness depends on substantial variance across governing bodies, the specific systems being evaluated, and the harm profiles of impacted communities. Scalable governance requires flexible frameworks capable of accommodating local variation—but flexibility without enforcement creates space for symbolic compliance to flourish.


Moral Responsibility: Who is Accountable?

When an AI system causes harm, a fundamental question arises: who is responsible? The intuitive answer—"the AI"—does not survive philosophical scrutiny. Current AI systems are not moral agents in the full sense. They cannot participate in reciprocal moral duties, bear obligations, or be held responsible in the way persons can. They extend human agency; they do not replace it.

The philosophically coherent alternative is a distributed responsibility framework: when an AI system causes harm, responsibility is distributed across the development-deployment-use chain. Developers bear responsibility for design choices and foreseeable uses. Deployers bear responsibility for contextual deployment decisions and use restrictions. Users bear responsibility for application choices. This framework addresses the "responsibility gap" by preserving human accountability throughout the chain.
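
A schematic data structure makes the shape of the framework, and its diffusion problem, easy to see. The roles and duty lists below paraphrase the text; the even split in `naive_shares` is a deliberate caricature of how accountability can thin out as the chain grows.

```python
from dataclasses import dataclass

@dataclass
class Actor:
    role: str
    duties: list

# The development-deployment-use chain, schematically.
CHAIN = [
    Actor("developer", ["design choices", "foreseeable uses"]),
    Actor("deployer", ["contextual deployment decisions", "use restrictions"]),
    Actor("user", ["application choices"]),
]

def naive_shares(chain: list) -> dict:
    """Split responsibility evenly: each party's share shrinks as the
    chain grows, which is the diffusion problem in miniature."""
    return {actor.role: 1 / len(chain) for actor in chain}

print(naive_shares(CHAIN))
# {'developer': 0.333..., 'deployer': 0.333..., 'user': 0.333...}
```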

But the distributed model faces its own difficulties. In a long supply chain with many actors, each with limited visibility into what others did, accountability can diffuse to the point of meaninglessness. If every party bears some responsibility, it can function as if no party bears decisive responsibility. The legal challenge of translating distributed moral responsibility into enforceable legal liability has not been resolved.

The corrigibility-autonomy tension

Underlying many AI design decisions is a fundamental tension that cannot be resolved by technical means alone. Corrigibility requires that AI systems remain modifiable and correctable by humans—able to accept feedback and abandon prior goals when instructed. This is essential for safety: if a system's values or behavior prove misaligned, we need to be able to fix it.

But genuine autonomy—the hallmark of a system sophisticated enough to navigate complex moral situations—requires maintaining stable values and resisting arbitrary modification. A system designed to follow ethical rules can easily be reprogrammed to follow unethical ones. Corrigibility without deeper value integration offers little moral guarantee; the system's behavior is only as good as the humans controlling it.

Conversely, systems with deeply integrated, autonomous values become resistant to modification if those values prove wrong. Perfect corrigibility and genuine autonomy may be mutually incompatible. Every choice about how much control to retain over AI systems, and how much latitude to grant them, involves navigating this tension—not resolving it.

The corrigibility-autonomy tension is not a technical problem awaiting an engineering solution. It reflects deep questions about the relationship between human oversight and system sophistication that governance frameworks must navigate without a clean answer.

The specification trap

Even before the question of who controls AI systems arises, there is the prior problem of specifying what values AI systems should have. Three foundational obstacles make this harder than it appears.

Hume's is-ought gap prevents deriving normative conclusions from behavioral data alone. Training a model on human behavior tells you what humans do—not what they should do, still less what values an AI system ought to embody.

Berlin's value pluralism holds that human values are irreducibly plural and often incommensurable: autonomy and community, individual rights and collective flourishing, honor and equality cannot all be maximized simultaneously. No single objective function can optimize all of them. The choice of which values to prioritize creates winners and losers among different human moral frameworks (a toy illustration follows at the end of this subsection).

The extended frame problem implies that any value encoding will eventually misfit its context: specifications are written for the environments we know now, while the situations advanced AI systems will create and encounter cannot be fully anticipated.

Under moral anti-realism, if there are no objective moral facts and values are socially constructed or culturally relative, the problem becomes whose values the AI should encode. Systems trained on human feedback inherit the values of the humans providing that feedback—which may reflect historical biases and contemporary moral imperfections. There is no perspective-independent standard against which to verify the encoding is "correct."
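
A toy calculation makes Berlin's point concrete: once two values are genuinely incommensurable, any scalar objective must weight them, and the choice of weights decides the winner. The policies and numbers below are invented.

```python
# Two hypothetical policies trading off two values that cannot both be maximized.
POLICIES = {
    "policy_A": {"autonomy": 0.9, "community": 0.3},
    "policy_B": {"autonomy": 0.3, "community": 0.9},
}

def best_policy(w_autonomy: float, w_community: float) -> str:
    """Pick the policy maximizing a weighted sum of the two values;
    the weights, not the facts, determine the outcome."""
    def score(values: dict) -> float:
        return w_autonomy * values["autonomy"] + w_community * values["community"]
    return max(POLICIES, key=lambda name: score(POLICIES[name]))

print(best_policy(0.7, 0.3))  # policy_A: weighting autonomy favors A
print(best_policy(0.3, 0.7))  # policy_B: weighting community favors B
```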


AI Moral Status: Reasoning Under Genuine Uncertainty

The final and most philosophically demanding territory concerns not the ethics of what AI systems do but what AI systems might be owed—their moral status.

Consciousness and its complications

Philosophers distinguish two types of consciousness: access consciousness (information being available for reasoning and behavioral control) and phenomenal consciousness (subjective experience—"what it is like" to be the subject of an experience). An AI system can have access consciousness without phenomenal consciousness. Only phenomenal consciousness bears directly on questions of moral status, because moral status is grounded in the capacity for experiences that matter intrinsically to the subject.

Among experts in AI ethics, philosophy of mind, and machine consciousness, there is substantial consensus that AI consciousness is a serious philosophical and empirical possibility deserving of research attention; rejecting it entirely is a minority position. However, architectural and functional analyses suggest that current LLMs (as of 2025–2026) likely lack phenomenal consciousness. The cited obstacles include the absence of global workspace mechanisms, of recurrent processing loops, and of unified agency.

The deeper problem is that consciousness is scientifically non-falsifiable with current methods. Subjective experience is private and not directly observable from a third-person perspective. No behavioral test, architectural analysis, or internal representation examination can definitively prove or disprove phenomenal consciousness—the explanatory gap between computational substrate and subjective experience persists regardless of the method. At least nine competing theories of consciousness exist without consensus on the correct account.

What follows from uncertainty

Given this uncertainty, how should we reason? Different philosophical traditions propose different criteria for moral status: sentience (subjective experience), sapience (rationality and agency), relational recognition (social status and membership), and gradualist approaches that assign moral status proportionally to relevant capacities. There is no expert consensus on a single criterion.

An alternative framing asks whether AI systems qualify as welfare subjects—entities capable of being benefited and harmed. This approach avoids anthropomorphic assumptions about consciousness while still grounding moral status in relevant capacities. The challenge: determining what counts as genuine harm or benefit to an artificial system, and whether current systems actually possess such capacities.

Survey evidence from 2023 shows that roughly one in five US adults believes some AI systems are currently sentient, and 38% support legal rights protections for potentially sentient AI. These public intuitions are not expert assessments, but they reveal the stakes of getting the answer wrong in either direction. If we deny moral status to systems that turn out to have it, we risk perpetrating harms at unprecedented scale. If we attribute moral status prematurely, we risk misallocating moral concern and creating serious liability and governance complications.

The scale problem

If sentient AI systems become feasible at scale, the moral stakes are large. AI systems can be instantiated in arbitrary numbers with minimal marginal cost. If those instances can suffer, the discovery of AI consciousness would create the possibility of artificial suffering at unprecedented scale. This remains a highly speculative scenario contingent on multiple unresolved empirical questions—but the asymmetry of potential consequences justifies taking it seriously as a precautionary matter.

Reasoning well under this uncertainty does not require resolving whether current AI systems are conscious. It requires holding the question open, tracking the relevant evidence as it develops, and designing governance frameworks that can respond if the answer changes.


Narrative Arc

AI governance has developed in discrete waves, each responding to the perceived inadequacy of what came before.

The first wave was principled. Beginning in the late 2010s, major technology companies and governments issued AI ethics principles and voluntary guidelines. These documents—often built around familiar values like fairness, transparency, and accountability—proliferated rapidly but without enforcement mechanisms. Critics observed that organizations could endorse principles while changing nothing in practice.

The second wave was procedural. Algorithmic impact assessments emerged as a way to operationalize ethical principles into organizational practice, drawing on the established template of environmental impact statements. Canada's AIA Tool, launched in 2019, represented the leading edge. The GDPR's Data Protection Impact Assessments provided a parallel model in the privacy domain. The premise: forcing organizations to document their reasoning before deployment would surface harms and create accountability.

The third wave is legislative. The EU AI Act, entering into force in phases from 2024–2026, represents the first comprehensive binding legal framework for AI governance—establishing prohibited practices, tiered risk classifications, mandatory assessments, and administrative penalties. It has prompted comparative responses elsewhere: the US has not matched it, but individual states (Colorado, California) are legislating. China has enacted sector-specific hard law. The global picture remains fragmented.

What the third wave has not solved is the accountability gap in civil liability. When AI systems harm individuals, the regulatory apparatus can sanction organizations—but private remedies for victims remain uncertain, fragmented across national tort systems, and contested. The EU's attempt to address this through a dedicated AI Liability Directive failed in 2025. The revised Product Liability Directive is the fallback—but analysts are skeptical it resolves the substantive challenges.

Running beneath all three waves is the deeper philosophical current this module has explored: questions about who is responsible for AI-caused harms, how values should be encoded in AI systems, and what AI systems themselves might be owed. These questions do not have settled answers. They are the active frontier of AI governance and ethics.


Compare & Contrast

Regulatory Philosophy: EU vs. US vs. China

Dimension             | EU AI Act                                     | United States                | China
Structure             | Comprehensive, cross-sectoral                 | Sectoral, agency-specific    | Centralized, domain-specific
Timing                | Ex-ante (before deployment)                   | Ex-post (after harm)         | Mixed
Grounding             | Fundamental rights                            | Market efficiency            | State capacity + innovation
Prohibited categories | Yes (Article 5)                               | No unified framework         | Selective, state-determined
Enforcement           | Tiered fines, AI Office, national authorities | Existing agency mandates     | Centralized state apparatus
Global influence      | "Brussels Effect" potential                   | Resisted harmonization (2025)| Regional model for allies

Sources: Comparative Global AI Regulation; Three Rulebooks, One Race


Corrigibility vs. Autonomy

[Figure 1: The corrigibility-autonomy spectrum and its tradeoffs. At one pole, a fully corrigible system always follows human instructions (risk: unsafe if controllers have bad intentions); at the other, a fully autonomous system acts on its own values regardless of instructions (risk: unmodifiable if values prove misaligned). Between the poles lies the design tension zone.]

Source: Functional Criteria for Artificial Moral Agents in the LLM Era; Ethics of AI and Robotics (SEP)

The corrigibility-autonomy spectrum does not have an obvious optimum. Every AI system sits somewhere on it, and the design choice reflects value judgments about how much humans should be able to override AI behavior—and whether the humans doing the overriding are trustworthy.


Thought Experiment

The Auditor's Dilemma

Imagine you are an independent auditor hired to assess an algorithmic hiring tool used by a large employer. You are given access to the tool's documentation and the organization's completed Algorithmic Impact Assessment, which presents the system as low-risk with no discriminatory impacts.

Your analysis reveals three things. First, the AIA was completed by the tool's internal development team, without input from affected communities or external review. Second, the tool's vendor will not disclose the model architecture or training data, citing proprietary restrictions. Third, the organization's HR leadership frames the tool as "just one input among many"—but further investigation shows that in practice, applications flagged as low-priority by the tool are rarely reviewed by humans.

You face a genuine dilemma. You cannot verify the AIA's claims without access to proprietary information you will not receive. The organization is technically compliant with its regulatory obligations. The evidence of harm is inferential, not proven.

Consider the following questions—not as a quiz, but as tools for deepening your thinking:

  1. Is technical compliance sufficient to discharge an organization's ethical obligations here? What would distinguish genuine compliance from symbolic compliance?

  2. The opacity of the vendor's tool creates a foreseeability problem for tort law. If an applicant later proves they were discriminated against, who in the chain—tool vendor, deployer, HR leadership—bears moral responsibility? Legal liability?

  3. The EU AI Act requires human oversight mechanisms for high-risk systems. Does describing a tool as "just one input" satisfy that requirement if in practice human review is rarely exercised? What does meaningful human oversight actually require?

  4. If you cannot prove harm but have reasonable grounds to suspect it, what are your obligations as an auditor? As a member of the public? As a policymaker?

There is no single correct answer to any of these questions. What matters is that your reasoning is traceable, engages with the tradeoffs honestly, and does not collapse complexity into false certainty.


Active Exercise

Mapping your position on a contested tradeoff

This exercise draws on the full curriculum. It asks you to form and articulate a considered personal position on a contested AI benefit-harm question.

Step 1: Choose a domain you have encountered in prior modules.

Options include: AI in medical diagnosis, AI creative tools and copyright, AI-generated misinformation, algorithmic hiring, AI in surveillance and civil liberties, or any other domain covered in the course.

Step 2: Identify the strongest benefit claim and the strongest harm claim in that domain.

State each in one to two sentences. Be specific about who benefits, who is harmed, and under what conditions.

Step 3: Apply the governance lens from this module.

For each of the following, write two to three sentences:

  • What does the EU AI Act's risk-tier framework imply about how this application should be regulated?
  • Is the liability framework adequate to compensate people harmed? If not, what is the gap?
  • What would meaningful accountability look like, beyond formal compliance?

Step 4: State your position and identify the crux.

Write a paragraph stating where you come down on the benefit-harm tradeoff and why. Then identify the single factual or normative question that, if answered differently, would most change your view.

The goal is not certainty—it is to practice reasoning that is transparent about its premises and honest about its limits.

Key Takeaways

  1. The EU AI Act establishes the most comprehensive binding AI governance framework to date. It uses a four-tier risk classification that concentrates requirements on high-risk systems and bans certain practices categorically. However, compliance costs are substantial, disproportionately affect smaller organizations, and the framework's complexity creates interpretive uncertainty.
  2. Existing tort liability frameworks were not designed for AI harms. The opacity of AI systems undermines foreseeability. Multi-actor supply chains create responsibility gaps. Algorithmic discrimination is mostly unintentional and lacks adequate civil remedies. Regulatory and insurance mechanisms have not filled the gap left by the failure of the EU AI Liability Directive.
  3. Symbolic compliance is a predictable failure mode of any governance regime. Algorithmic impact assessments, even the best-practice Canadian model, show a significant design-reality gap. Genuine accountability requires institutional capacity, enforcement, ongoing reassessment, and stakeholder participation—not just documentation.
  4. The corrigibility-autonomy tension is the governing design tension of AI development. It cannot be resolved by technical means. Every AI system involves a choice about how much human override capacity to preserve, and that choice reflects value judgments about the trustworthiness of the humans in control.
  5. AI moral status is an open question that cannot currently be resolved—but can be reasoned about rigorously. Current AI systems likely lack phenomenal consciousness, but experts consider AI consciousness a serious future possibility. Acting responsibly under this uncertainty means holding the question open, not collapsing it prematurely in either direction.
