How Organizations Learn — and Why They Usually Don't
Single-loop learning, double-loop learning, and the sensemaking gap in engineering organizations
Learning Objectives
By the end of this module you will be able to:
- Distinguish single-loop from double-loop learning and explain why double-loop learning is rare in practice.
- Describe the espoused-theory/theory-in-use gap and identify it in a real engineering culture scenario.
- Explain Weick's sensemaking properties and why identity is the central sensemaking resource.
- Articulate why blameless postmortems are necessary but not sufficient for double-loop learning.
- Describe what structural and interpersonal conditions enable or block double-loop learning in engineering organizations.
Core Concepts
The Two Loops of Organizational Learning
When a system produces an outcome that doesn't match intentions, something has to change. The question is: what?
Argyris and Schön identified two fundamentally different answers to that question. Single-loop learning occurs when members detect a mismatch between intention and outcome, then correct it by modifying their actions — while leaving the underlying governing variables (values, beliefs, norms, and assumptions) entirely unexamined. The loop closes around the action, not around the rule that generated it.
Double-loop learning goes further. It occurs when members not only correct the error but also examine and potentially alter the governing variables themselves — the foundational assumptions and decision-making premises that produced the strategy in the first place. The second loop bends back past the action and interrogates the frame that selected it.
Both mechanisms are needed. Single-loop learning enables incremental improvement and operational efficiency — it is the right tool when the existing frame is sound and only the execution is off. Double-loop learning is the tool for when the frame itself is the problem.
The Theory-in-Use Gap
Argyris and Schön introduced a related distinction that explains much of the friction in organizational learning: the gap between espoused theory and theory-in-use.
Espoused theory is what people say they do — the values and principles they articulate in meetings, on slides, in postmortems. Theory-in-use is the actual mental map governing their real-time behavior, often invisible to the person holding it. The gap between the two is not hypocrisy; it is typically genuine. People sincerely believe they act as they espouse. They rationalize their actual behavior as consistent with their stated principles without perceiving the divergence.
Closing this gap — aligning what people espouse with what they actually do — requires examining and revising underlying mental models. That is itself a form of double-loop learning.
An engineering team espouses: "we design for failure." Their theory-in-use, revealed when a service goes down: route the on-call engineer to fix it quietly so nobody escalates, then close the ticket. The assumption driving actual behavior ("outages are embarrassing, contain them") was never stated as a governing value — and therefore never examined.
Model I and Model II
Argyris' Model II framework describes the organizational learning system capable of double-loop learning. Model II governing values include: accessing high-quality information, integrating diverse perspectives, and making theories and assumptions explicit for collective testing. Failures are treated as learning opportunities, not as threats to identity.
The default, Model I, works in the opposite direction. A core Model I strategy is unilateral control: making untested attributions about others, advocating positions without openness to challenge, and designing goals without genuine dialogue. Unilateral control provides short-term efficiency and a sense of security. It also systematically prevents the two-way communication needed to surface and examine underlying assumptions — which is the exact mechanism that double-loop learning requires.
Why Sensemaking Matters
Where Argyris and Schön focus on error correction and assumption-revision, Karl Weick's sensemaking theory addresses the earlier question: how do people construct meaning from what is happening around them in the first place?
Weick identifies seven intertwined properties of organizational sensemaking: identity, retrospection, enactment, sociality, ongoing process, extracted cues, and plausibility. These are not sequential steps; they form an integrated system. The most consequential is identity — who people understand themselves to be in their organizational context directly shapes which events trigger attention, which cues get noticed, and how retrospective sense is constructed. Identity is the organizing principle that channels all other sensemaking properties.
The second property critical for engineering contexts is retrospection: meaning is not constructed in advance of action but emerges after the fact. Weick defines sensemaking as "the ongoing retrospective development of plausible images that rationalize what people are doing." During an incident or ambiguous event, people cannot rely on pre-prepared interpretations. They must act first, then make sense of their actions and outcomes afterward. The quality and speed of retrospective sense-construction directly affects what gets learned.
A third property is crucial and frequently misunderstood: plausibility beats accuracy. Sensemaking stops when a sufficiently plausible interpretation is found — not when the most accurate interpretation is found. Organizations satisfice for meaning just as they satisfice for decisions (as covered in the bounded rationality module). This has a direct implication: the first plausible narrative of an incident to circulate tends to become the accepted one, regardless of how complete or accurate it is.
Sensemaking does not stop when you find the truth. It stops when you find something plausible enough to act on.
Sensemaking is also shaped significantly by organizational hierarchy. Whose interpretations gain collective acceptance, which cues are framed and circulated, which narratives become organizational memory — all are mediated by hierarchical position. Leaders shape sensemaking through the discourses they initiate and endorse. But the relationship is not unidirectional: middle managers and front-line staff initiate sensemaking episodes that can reframe hierarchical authority itself. The quality of organizational sensemaking depends on whether dialogue crosses hierarchical boundaries.
Shared Mental Models and Team Performance
Sensemaking operates at the level of interpretation under ambiguity. Shared mental models (SMMs) operate at the level of stable, distributed cognitive structures that coordinate team behavior over time.
Shared mental models have a positive, empirically demonstrable relationship with team performance across diverse team contexts — co-located teams, distributed software development teams, military units, and emergency response teams. The mechanism runs through team processes: SMMs improve coordination and communication, and this behavioral change mediates the performance relationship.
SMMs include multiple distinct types of knowledge: task/equipment models (procedural and tool knowledge), team interaction models (roles and responsibilities), and team models (teammate characteristics, skills, and preferences). Team-related models more strongly predict team processes; task-related models show stronger connections to task performance outcomes.
Critically, SMMs are structured relational representations, not just collections of agreed facts. How concepts link to one another — the causal structure of the model — matters more for coordination than content agreement. Research consistently shows that measurement methods capturing structural properties predict team processes better than methods measuring content similarity alone. A team that agrees on facts but organizes them differently cannot coordinate as fluidly as a team that shares the same relational structure.
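The structure-versus-content distinction can be made concrete with a toy sketch. This is an illustration of the idea, not a validated SMM measurement instrument: each team's mental model is reduced to a set of concepts (content) plus a set of directed causal links between them (structure), and similarity is computed with Jaccard overlap. The concept and link names are invented for the example.

```python
# Toy illustration (not a validated SMM instrument): two teams can
# agree on exactly the same concepts yet link them differently.

def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    return len(a & b) / len(a | b) if a | b else 1.0

# Content: the concepts each team considers relevant to reliability.
concepts_a = {"load", "latency", "timeouts", "retries"}
concepts_b = {"load", "latency", "timeouts", "retries"}

# Structure: directed causal links (cause, effect) each team draws.
links_a = {("load", "latency"), ("latency", "timeouts"), ("timeouts", "retries")}
links_b = {("load", "latency"), ("latency", "timeouts"), ("retries", "load")}

content_similarity = jaccard(concepts_a, concepts_b)     # 1.0 — identical facts
structural_similarity = jaccard(links_a, links_b)        # 0.5 — divergent causal structure
```

Team B believes retries feed back into load (a causal loop Team A's model lacks), so despite perfect content agreement the two teams will anticipate incident dynamics differently — which is what structural measurement methods are designed to detect.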
Compare & Contrast
What blameless postmortems do — and don't do
Blameless postmortems shift the focus from individual blame to systemic factors, creating the psychological safety that makes teams more willing to communicate incidents, share information, and propose prevention strategies. This is necessary. Organizations with mature postmortem cultures show measurably fewer repeat incidents and faster recovery.
But a postmortem is a sensemaking episode. It constructs retrospective meaning from ambiguous events. Whether that meaning then changes governing variables — or only action strategies — determines whether the learning is single-loop or double-loop.
| | Single-loop result | Double-loop result |
|---|---|---|
| What changes | The action (e.g., add a retry mechanism) | The governing assumption (e.g., reconsider the reliability target that made that action necessary) |
| What's questioned | "Did we execute correctly?" | "Is what we're optimizing for the right thing?" |
| Typical postmortem outcome | Action items assigned | Architectural or process assumptions revisited |
| Threat to identity? | Low | High — requires admitting the frame was wrong |
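The table's first row can be sketched in code. The example below is hypothetical (the function names and the flaky dependency are invented): the retry wrapper is the archetypal single-loop action item — it changes what the system does while leaving the governing assumption (that this dependency must be called synchronously on every request, at the current reliability target) entirely unexamined.

```python
import time

def call_with_retries(fn, attempts=3, backoff_s=0.1):
    """Single-loop fix: retry a flaky call with exponential backoff.

    The action changes; the governing variable does not. Whether the
    reliability target that made retries necessary is itself right —
    the double-loop question — lives outside this code entirely.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            time.sleep(backoff_s * (2 ** attempt))
```

A postmortem that ships this wrapper has closed the action loop. The double-loop version of the same postmortem asks whether the call should be asynchronous, cached, or removed — questions that put the architecture, not the execution, in scope.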
The gap is explained by threat-rigidity theory. When organizations face perceived threat — including the implicit threat of a major incident — they respond with cognitive and behavioral rigidity: leaders restrict information processing to prior knowledge, authority centralizes, and formalization increases. These are adaptive responses for short-term survival. They are precisely the opposite of what double-loop learning requires — which is openness to having the governing frame challenged.
A blameless postmortem removes the threat of individual blame. It does not remove the threat to organizational identity and existing architectural assumptions. That second threat, which is the relevant one for double-loop learning, requires the structural conditions of Model II: psychological safety that extends to the governing assumptions themselves, not just to the people who acted within them.
A common failure mode: an organization runs blameless postmortems (good), generates action items (good), assigns owners (good), and then declares itself a learning organization. None of this ensures that governing assumptions were examined. If the postmortem's implicit frame is "we need to operate this system better," rather than "should this system exist in its current form?", it is still single-loop — regardless of how blame-free the discussion was.
Routines as learning infrastructure — and as inertia
Organizational routines encode knowledge and compensate for individual cognitive limits. They transfer across members, persist through turnover, and provide the stability that makes experimentation possible. In this sense, they are essential infrastructure for learning.
But the same mechanism that makes routines stable makes them persist even when environmental conditions change. Routines encode past solutions. When those solutions were calibrated to a different environment, the routine becomes a cognitive barrier — filtering out feedback that challenges the worldview embedded in it. Dated strategic frames, accumulated from past successes, get automatically reused to interpret current information without critical evaluation, even when those frames are inadequate for novel situations.
This is the intersection of bounded rationality and organizational learning: the same cognitive economizing that makes organizations tractable also makes them resistant to the kind of feedback that would trigger double-loop revision.
Annotated Case Study
The architecture decision that survived its own postmortems
Scenario: A platform engineering team runs a monolithic core service that has become a reliability bottleneck. Over 18 months, the team runs six blameless postmortems, each triggered by a breach of the service's five-nines availability target. Each postmortem produces action items: add circuit breakers, improve observability, increase test coverage, refine deployment procedures. Metrics improve. Repeat incidents decline.
The head of infrastructure declares the postmortem process a success. The core service remains monolithic.
What this illustrates — annotation by concept:
- Single-loop learning (Argyris): Each postmortem corrected action strategies — improved execution within an existing frame. The governing variable — "a monolith is the appropriate architecture for this service" — was never examined. No postmortem asked whether the architectural assumption generating the fragility should be revised.
- Espoused theory vs. theory-in-use: The team espouses: "we design for reliability." The theory-in-use, revealed by behavior: "we optimize for deployment simplicity, because rewriting the monolith would require cross-team coordination we have no budget for." This second assumption governed real decisions but was never surfaced in a postmortem, because postmortems focused on incidents, not on architectural premises.
- Sensemaking (Weick): The first plausible narrative — "we had operational problems and fixed them" — circulated and stuck. Because it was plausible and actionable, sensemaking terminated there. Nobody was incentivized to construct a less comfortable narrative: "we have an architectural problem we are managing around rather than addressing."
- Identity as the central barrier: Acknowledging the architectural assumption would have required the team — and its leadership — to admit that a significant past decision was structurally flawed. Identity is the organizing principle that channels which cues get noticed and how retrospective sense is made. The team's identity as capable engineers who design reliable systems made it cognitively easier to frame each incident as an operational execution problem than as evidence of an architectural mistake.
- Threat-rigidity: Each incident, by creating operational pressure, triggered the exact conditions under which organizations restrict information processing to prior knowledge and centralize control. High-stakes moments — precisely when double-loop learning is most needed — are structurally the moments it is least likely to occur.
- What would double-loop look like?: A postmortem that surfaced and examined the governing assumption: "We treat this service as permanently monolithic. Is that assumption still valid given current scale and team structure?" This question threatens the frame, not just the execution. It requires what Model II provides: the organizational conditions to make assumptions explicit and test them collectively, without identity being on the line.
Common Misconceptions
"Double-loop learning is just more thorough root cause analysis."
Root cause analysis asks: what caused this failure? Double-loop learning asks: are the values and assumptions that drove our design and our response to this failure the right ones? A thorough RCA can be entirely single-loop — producing an accurate causal chain that leads back to an action change, while the underlying governing frame is never touched. The distinction is not depth of analysis; it is what is placed in scope for questioning.
"If we make postmortems blameless, double-loop learning follows."
Blameless postmortems reduce the threat to individuals. Double-loop learning requires removing the threat to organizational identity and architectural assumptions. These are different threats. An organization can be perfectly blameless about individual execution while remaining completely defensive about its foundational choices. Psychological safety is the enabling condition for double-loop learning, but it must extend to the governing variables — not just to the people who acted within them.
"Shared mental models mean the team agrees on everything."
Shared mental models are organized relational knowledge structures, not factual consensus. Two people can agree on all the same facts and hold incompatible mental models if they organize those facts differently — connecting different causes to different effects, assigning different weights to different risks. Conversely, a team with a well-calibrated shared model can disagree on surface details while maintaining the shared relational structure needed for implicit coordination. What matters is structural alignment, not content agreement.
"Double-loop learning is rare because people resist change."
Double-loop learning is rare because it is operationally difficult to implement, not primarily because people are resistant to change. The theory lacks concrete mechanisms for structuring the necessary interactions. Organizations do not have reliable tooling for making governing variables explicit, testing them collectively, and revising them without the process collapsing into defensiveness. The problem is one of operationalization — and resistance to change is a downstream effect of that operationalization failure, not its root cause.
"A learning organization runs more retrospectives."
Frequency of retrospectives is a single-loop intervention by default. More retrospectives generate more opportunities to close the action loop — which is valuable. But the widespread theoretical enthusiasm for learning organizations has not translated into substantive organizational change, and one reason is precisely this conflation: cadence is mistaken for depth. A team running weekly retrospectives that never examine governing assumptions is a team running single-loop learning at high velocity. That is not a learning organization in the sense Argyris and Schön meant.
Key Takeaways
- Single-loop learning corrects actions; double-loop learning questions assumptions. Most organizational learning — including blameless postmortems, retrospectives, and incident reviews — operates in the single loop. This is appropriate when the governing frame is sound. It is insufficient when the frame is the problem.
- The espoused-theory/theory-in-use gap is the primary diagnostic tool. The difference between what an engineering culture claims to value and what behavior actually reveals is where double-loop learning must begin. The gap is usually invisible to insiders — which is why surfacing it requires structural conditions, not just good intentions.
- Sensemaking stops at plausibility, not accuracy. The first credible narrative of an ambiguous event tends to stick. In incident reviews, this means the framing that feels actionable and non-threatening is the one that shapes organizational memory — regardless of how accurate or complete it is. Identity pressure is the primary driver of which narratives survive.
- Threat-rigidity works against double-loop learning precisely when it is most needed. High-stakes events create the organizational conditions — centralized control, restricted information processing, reliance on prior knowledge — that are the opposite of what double-loop inquiry requires. This is structural, not a failure of individual courage.
- Shared mental models enable implicit coordination, but structure matters more than content. Teams that share the same relational organization of knowledge — not just the same facts — coordinate more fluidly, anticipate each other's behavior, and adapt faster under pressure. Building SMMs is an active practice: cross-training, shared visual representations, and explicit role modeling are interventions with empirical support.
Further Exploration
Foundational texts
- Double Loop Learning in Organizations — Argyris (1977) — The original paper. Short, dense, and worth reading directly.
- Chris Argyris: theories of action, double-loop learning and organizational learning — The most accessible secondary synthesis of Argyris and Schön's full framework.
- Sensemaking in Organizations: Reflections on Karl Weick — An accessible entry point to Weick's framework with organizational context.
Research and reviews
- Revitalizing double-loop learning in organizational contexts: A systematic review — A 2023 review of 128 studies (1974–2021). Essential for understanding why the theory has had limited practical impact and what the research agenda looks like going forward.
- Threat Rigidity Effects in Organizational Behavior: A Multilevel Analysis — The foundational paper explaining why organizations become rigid under threat. Directly relevant to why incidents don't produce double-loop learning.
- The Influence of Shared Mental Models on Team Process and Performance — Primary empirical work on the SMM-performance relationship.
- Measuring Shared Team Mental Models: A Meta-Analysis — Why how you measure SMMs changes what you find, and what that means for practice.
Engineering and practice
- Google SRE: Blameless Postmortem for System Resilience — The canonical engineering practice document. Read it in conjunction with this module to notice what it does and does not address at the governing-variable level.
- Sensing from the middle: middle managers' sensemaking of change processes — How staff engineers and tech leads, not just leadership, are active sensemakers whose interpretations shape organizational meaning.