Philosophy

Knowledge, Cognition, and Codebases

Your codebase is a distributed cognitive system. Designing it well means designing for how minds actually work.

Learning Objectives

By the end of this module you will be able to:

  • Distinguish tacit from explicit knowledge and explain why certain architectural knowledge resists documentation.
  • Apply the three types of cognitive load (intrinsic, extraneous, germane) to evaluate the onboarding cost of a codebase or architectural decision.
  • Explain distributed cognition and identify the cognitive artifacts — tests, ADRs, naming conventions — that extend a team's working memory.
  • Use legitimate peripheral participation to design onboarding paths that build expertise, not just familiarity.
  • Diagnose knowledge silos and knowledge decay as epistemic failures with organizational causes, not merely documentation failures.

Core Concepts

1. Tacit vs. Explicit Knowledge

The philosopher Michael Polanyi argued that "we can know more than we can tell." This asymmetry is the foundation of the tacit/explicit distinction, and it has direct consequences for how software teams function.

Explicit knowledge is information that can be articulated, written down, and transmitted: API specifications, architecture diagrams, runbooks, README files. It travels easily between people and survives staff turnover.

Tacit knowledge is harder to externalise. It includes the experiential understanding of why a particular service boundary was drawn the way it was, what failure modes the team learned the hard way, and which parts of the codebase are safe to refactor versus which are secretly load-bearing. This knowledge lives primarily in developers' minds — and it leaves when they do.

Most software systems depend heavily on tacit knowledge. Research estimates that 74% of organizations lack formal methods to capture and retain technical knowledge, resulting in losses estimated at $31.5 billion annually across Fortune 500 companies. Pair programming, code review, and mentorship are the primary mechanisms for converting tacit understanding into shareable, durable form.

The epistemological origin of technical debt

Ward Cunningham's 1992 formulation of technical debt was fundamentally epistemological, not cosmetic. He defined debt as accruing when code no longer reflects the team's current understanding of the problem domain. Debt, in this frame, is a knowledge problem: the gap between what the team now knows and what the code currently encodes. Refactoring is not cleanup — it is knowledge synchronization.

2. Cognitive Load Theory

Cognitive Load Theory, developed by John Sweller, explains how human working memory is consumed during learning and problem-solving. It identifies three distinct sources of load:

Load type  | Source                                                           | Can design change it?
Intrinsic  | The inherent complexity of the problem domain and code structure | No; it eases only through learning
Extraneous | Poor naming conventions, inconsistent patterns, bad tooling      | Yes; aim to reduce it
Germane    | Productive effort building durable mental schemas                | Yes; aim to increase it

Code comprehension requires sustained mental effort, and empirical research has established measurable correlations between code characteristics and developer cognitive load, using techniques including EEG, eye-tracking, and functional neuroimaging.

The key design insight: you cannot eliminate intrinsic load (complex domains are complex), but you can aggressively eliminate extraneous load. Poor source code readability, inconsistent naming, and unclear lexicon measurably increase the neural activation associated with comprehension tasks. They also produce concrete negative outcomes: elevated bug rates, extended onboarding times, and reduced developer confidence in making changes.

Divergence amplifies load

When code diverges from team understanding — through accumulated technical debt, stale documentation, or undocumented architectural decisions — it imposes elevated cognitive load on every developer who touches it. The effect is especially acute for newer team members and those maintaining unfamiliar modules.

3. Working Memory, Chunking, and Expert Schemas

Working memory has a limited capacity — traditionally characterised as 7±2 meaningful units. The key word is "meaningful." A chunk can be a single token or an entire design pattern, depending on what the reader already knows.

This is why expert and novice developers experience the same codebase so differently. Expert developers possess substantially more domain-specific schemas stored in long-term memory. They recognise familiar patterns — code idioms, architectural structures, design patterns — as single coherent units, drastically reducing working memory consumption. Novice developers, lacking those schemas, must process code element by element. The difference directly parallels chess expertise: experts recognise board configurations as meaningful patterns; novices see individual pieces.

The experience gap is not primarily about intelligence or effort. It is about the size and richness of long-term memory schemas available for pattern recognition.

The practical consequence: design patterns, naming conventions, and consistent module structure do not just make code "cleaner." They make it compressible into chunks that fit inside working memory simultaneously. Consistency is cognitive infrastructure.
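As a small illustration of chunking, the two functions below compute the same total. Everything here is hypothetical: the names total_manual and total_idiomatic and the (description, amount in cents) tuple shape are assumptions made for the sketch. A reader who already holds the "sum over a generator" idiom can take in the second function as one unit, while the first has to be traced step by step.

```python
from typing import Iterable, Tuple

LineItem = Tuple[str, int]  # (description, amount in cents); hypothetical shape


def total_manual(items: Iterable[LineItem]) -> int:
    # Element-by-element version: the reader tracks an accumulator,
    # a loop variable, and a tuple index at the same time.
    result = 0
    for item in items:
        result = result + item[1]
    return result


def total_idiomatic(items: Iterable[LineItem]) -> int:
    # Idiomatic version: "sum of the amounts" reads as a single unit
    # for anyone who already knows the idiom.
    return sum(amount for _, amount in items)


items = [("coffee", 350), ("bagel", 275)]
assert total_manual(items) == total_idiomatic(items) == 625
```

The behavior is identical; what differs is how many separate things a reader must hold in working memory to confirm it.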

4. Distributed Cognition

Edwin Hutchins demonstrated, through studies of ship navigation and cockpit operation, that cognitive processes are not confined to individual minds. They are distributed across people, tools, and environmental structures.

Applied to software development, this means: the team's working knowledge is not the sum of what each individual knows. It is constituted by individuals together with the artifacts they work in and around. Code structure, tests, documentation, type annotations, linters, and architecture diagrams all participate in the cognitive system.

Design patterns and architectural artifacts function as external cognitive artifacts that reduce individual cognitive load and encode proven solutions. When you write an expressive type signature, you are not decorating your code — you are offloading reasoning work from future readers to the artifact itself. Static analysis tools, linters, and type systems represent the same principle: they externalise error-detection and reasoning tasks that would otherwise consume human cognitive capacity.
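A minimal sketch of what offloading reasoning to a type signature can look like. The names Cents, SettlementId, and find_settlement_amount, and the in-memory ledger, are hypothetical and exist only for illustration.

```python
from typing import Dict, NewType, Optional

# Hypothetical domain types; NewType makes the unit part of the signature.
Cents = NewType("Cents", int)
SettlementId = NewType("SettlementId", str)

# Stand-in for a real data store (assumption for this sketch).
_LEDGER: Dict[SettlementId, Cents] = {
    SettlementId("stl-001"): Cents(12_500),
}


def find_settlement_amount(settlement_id: SettlementId) -> Optional[Cents]:
    """Return the settled amount in cents, or None if the id is unknown.

    The signature alone answers two questions a reader would otherwise
    have to keep in mind: the unit (cents) and whether a missing
    settlement raises an error or returns None.
    """
    return _LEDGER.get(settlement_id)


amount = find_settlement_amount(SettlementId("stl-001"))
if amount is not None:  # a static type checker can flag code that skips this check
    print(f"{amount / 100:.2f}")
```

The design choice is to let the artifact carry the answer: a future reader does not need to open the function body, or ask a colleague, to learn the unit or the missing-value behavior.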

The unit of analysis for team knowledge is not the individual developer. It is the team-artifact system.

Fig 1. Distributed cognitive system. Three components: Team (shared mental models), Codebase (names, tests, types), Artifacts (ADRs, docs, diagrams).
Knowledge is distributed across the team-artifact system. Removing a developer and degrading an artifact both reduce the system's cognitive capacity.

5. Legitimate Peripheral Participation and Situated Learning

Jean Lave and Etienne Wenger's theory of Legitimate Peripheral Participation (LPP) reframes learning as a fundamentally social, contextual process — not the transmission of information from expert to novice.

In LPP, newcomers do not passively receive knowledge. They actively construct understanding through guided participation in meaningful but low-risk tasks that are nonetheless productive and necessary to the community's goals. Learning occurs as newcomers gradually move from peripheral participation toward full participation, developing both practical skills and professional identity along the way.

Three features of LPP are particularly relevant to how software teams work:

Trajectory: The arc from newcomer to old-timer is not simply about accumulating skills. It involves deepening understanding of the community's history, norms, and practices — and developing identity within that community.

Legitimacy: The social structure of a community determines what kinds of participation are available to newcomers. Senior engineers who gatekeep code review access, shield newcomers from production incidents, or fail to assign meaningful work are not being cautious — they are restricting learning trajectories.

Situatedness: Knowledge is not abstract but embedded in the specific context and activity in which it is used. Developers learn code navigation by actually navigating code in authentic project contexts, not through abstract instruction about the codebase. Onboarding structured around observation and tutorials delays genuine learning; onboarding structured around guided contribution accelerates it.

Communities of practice in software organizations

In large distributed enterprises, multiple overlapping communities of practice typically exist — organized around subsystems, domains, or technical disciplines. These communities are not optional social structures. They are the primary mechanism through which architectural knowledge, design heuristics, and institutional context are transmitted across the organization.

6. Knowledge Silos and Knowledge Decay

When knowledge becomes concentrated in individuals or sub-groups rather than distributed across the team-artifact system, knowledge silos form. Silos can be individual (one developer holds system knowledge no one else has), departmental (teams fail to communicate), technological (knowledge fragmented across incompatible systems), or cultural (norms that discourage transparency).

The impact is not theoretical: silos create bottlenecks, delay inter-team progress, and compromise code quality. They also make teams brittle. Team member turnover creates acute knowledge-code divergence by removing knowledge from the distributed system without transferring it. 76% of legacy systems exhibit poor documentation and rely heavily on tacit knowledge held by long-tenured maintainers — knowledge that simply disappears when those engineers leave.

Knowledge decay is the temporal counterpart: documentation, architecture diagrams, and other artifacts become stale when not actively maintained alongside code changes. In fast-moving development contexts, documentation maintenance competes with feature development — and usually loses.


Annotated Case Study

The Departure of the Last Person Who Knew Why

Consider a hypothetical that is, in practice, extremely common.

A payments team built a microservice in 2019. At the time, they made an unusual choice: they bypassed the company's standard event bus and implemented direct HTTP callbacks for settlement notifications. The original architect knew why — the event bus at the time had a delivery guarantee model incompatible with financial settlement requirements. She left the company in 2021.

By 2023, the service has three new maintainers. A platform team proposes migrating all services to the company's updated event bus. The payments service is flagged for migration. No one on the current team objects — they don't know why the original decision was made.

The migration proceeds. Three months later, settlement notifications begin to fail silently under load.

What went wrong epistemically:

  • The original architectural decision encoded tacit domain knowledge about settlement semantics that was never made explicit.
  • No ADR existed to capture the context, the constraint, and the rejected alternative.
  • The distributed cognitive system failed: the artifact (the code) did not preserve the reasoning; the community did not maintain it through succession.

What the annotated case shows:

  1. Tacit knowledge without capture mechanisms is ephemeral. The architect's understanding of settlement requirements was real and correct. Its tacit form made it invisible to successors. Architecture decision records exist precisely for this: to capture the context, rationale, and consequences of decisions so that future engineers can understand why a constraint exists, not just that it does.

  2. Knowledge decay is active, not passive. The absence of an ADR was not simply the original team's failure to document. It was the gradual drift of a distributed cognitive system whose artifact layer stopped representing reality. Artifacts that are not actively maintained become obstacles rather than aids.

  3. Cognitive load conceals missing knowledge. The platform team's proposal generated no pushback because comprehension of the payments service was difficult — the codebase's unusual structure was interpreted as legacy oddness rather than intentional design. When code diverges from understanding, developers are more likely to make sub-optimal changes, not because they are careless but because they cannot distinguish intentional from accidental complexity.

  4. Onboarding structured as observation failed. The three engineers who inherited the service had been walked through it during onboarding. They received an explanation of the what, not the why. Legitimate peripheral participation would have had them working inside the service's constraints on authentic tasks, which would have surfaced the unusual design and prompted investigation while its rationale could still be recovered.
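To make the missing artifact concrete, here is a sketch of the decision record the case study lacked. The field layout follows a common ADR structure and is an assumption, not a format the team in the case is known to have used. The wording of each field is illustrative and paraphrases only what the case states.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DecisionRecord:
    # Field names follow a common ADR layout (assumption for this sketch).
    title: str
    status: str
    context: str
    decision: str
    rejected_alternatives: List[str]
    consequences: str

    def render(self) -> str:
        """Render the record as plain text suitable for a docs folder."""
        lines = [
            f"Title: {self.title}",
            f"Status: {self.status}",
            f"Context: {self.context}",
            f"Decision: {self.decision}",
            "Rejected alternatives:",
        ]
        lines += [f"  - {alt}" for alt in self.rejected_alternatives]
        lines.append(f"Consequences: {self.consequences}")
        return "\n".join(lines)


adr = DecisionRecord(
    title="Settlement notifications use direct HTTP callbacks",
    status="Accepted (2019)",
    context=(
        "The company event bus has a delivery guarantee model that is "
        "incompatible with financial settlement requirements."
    ),
    decision=(
        "Bypass the event bus and send settlement notifications "
        "via direct HTTP callbacks."
    ),
    rejected_alternatives=[
        "Publish settlement notifications on the standard event bus",
    ],
    consequences=(
        "This service diverges from the platform standard on purpose. "
        "Re-check the bus delivery guarantees before any migration."
    ),
)
print(adr.render())
```

A record this short would have given the 2023 maintainers the one thing they lacked: a reason to ask whether the updated event bus actually met the original constraint.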


Key Principles

1. Treat the codebase as epistemic infrastructure, not just technical infrastructure.

A codebase is not merely a set of executable instructions. It is a distributed cognitive system in which knowledge is encoded in names, tests, architecture, comments, and the gaps between them. Every design decision you make either adds or removes knowledge from the system. Naming something processor.handle() instead of payment.settle() is not aesthetic — it destroys domain context.
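The naming contrast above can be sketched as follows. The Payment and Processor classes are hypothetical and deliberately trivial; the point is that both calls do the same work, but only one of them tells the reader what that work means in the domain.

```python
from dataclasses import dataclass


@dataclass
class Payment:
    amount_cents: int
    settled: bool = False

    def settle(self) -> None:
        # Domain-level name: the call site states the business operation.
        self.settled = True


class Processor:
    def handle(self, obj: Payment) -> None:
        # Implementation-level name: nothing at the call site says
        # what is being processed or what "handling" means.
        obj.settled = True


first = Payment(amount_cents=500)
Processor().handle(first)

second = Payment(amount_cents=500)
second.settle()

# Same behavior, different amount of knowledge visible to the reader.
assert first == second
```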

2. Extraneous load is a tax on everyone who touches the system.

Inconsistent naming conventions, undocumented architectural divergences, and obsolete comments are not cosmetic problems. They consume working memory that should be directed at understanding the domain. Every engineer who reads the affected code pays this tax on every interaction, so reducing extraneous load pays off repeatedly: once for each future reader and each future read.

3. Onboarding is a learning design problem, not a documentation problem.

You cannot onboard someone into a complex system by giving them documentation. Documentation is explicit knowledge; the system also contains tacit knowledge that must be acquired through participation. Design onboarding as a trajectory from peripheral to full participation: start with low-risk but real contributions, provide access to the artifact layer (ADRs, architecture diagrams, test suites), and ensure new team members have legitimate access to code review and production systems early.

4. Knowledge concentration is organizational fragility.

When the understanding of a system or subsystem is held by one person, the team is one departure away from a knowledge loss event. Identify your knowledge silos not by asking who is the expert, but by asking: if this person were unavailable tomorrow, what would the team not know? The answer defines the knowledge debt.
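One rough way to start answering that question is to look at who has changed what. The sketch below assumes you can export (path, author) pairs from your version control history. The function names, the sample data, and the 0.75 threshold are all hypothetical. Change authorship is only a proxy for understanding, so treat the output as a prompt for conversation rather than a measurement.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def concentration_by_path(
    changes: Iterable[Tuple[str, str]],
) -> Dict[str, Tuple[str, float]]:
    """Map each path to (top author, that author's share of changes)."""
    authors_per_path: Dict[str, Counter] = defaultdict(Counter)
    for path, author in changes:
        authors_per_path[path][author] += 1
    result: Dict[str, Tuple[str, float]] = {}
    for path, counts in authors_per_path.items():
        top_author, top_count = counts.most_common(1)[0]
        result[path] = (top_author, top_count / sum(counts.values()))
    return result


def likely_silos(
    changes: Iterable[Tuple[str, str]], threshold: float = 0.75
) -> List[str]:
    # The threshold is an arbitrary illustrative cut-off, not an established norm.
    return sorted(
        path
        for path, (_, share) in concentration_by_path(changes).items()
        if share >= threshold
    )


# Hypothetical change history: (path, author)
history = [
    ("payments/settle.py", "ana"),
    ("payments/settle.py", "ana"),
    ("payments/settle.py", "ana"),
    ("payments/settle.py", "ana"),
    ("payments/settle.py", "ben"),
    ("api/routes.py", "ana"),
    ("api/routes.py", "ben"),
    ("api/routes.py", "chi"),
]
print(likely_silos(history))
```

Paths flagged this way are candidates for pairing, review rotation, or an ADR, which are the capture mechanisms discussed earlier in the module.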

5. Artifacts must be maintained to remain cognitive infrastructure.

ADRs, architecture diagrams, and dependency maps that are not updated alongside code changes do not become neutral — they become actively misleading. Build artifact maintenance into the definition of done for architectural decisions.
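A check like the following could be one way to make artifact maintenance visible. It is a sketch under stated assumptions: the function name, the three input mappings, the sample names and dates, and the 30-day grace period are all hypothetical, and how you obtain last-changed dates depends on your tooling.

```python
from datetime import date, timedelta
from typing import Dict, List


def stale_artifacts(
    code_last_changed: Dict[str, date],
    artifact_last_updated: Dict[str, date],
    covers: Dict[str, str],
    grace: timedelta = timedelta(days=30),
) -> List[str]:
    """Return artifacts whose covered code changed more than `grace` after them.

    `covers` maps an artifact name to the code path it documents.
    """
    stale = []
    for artifact, path in covers.items():
        lag = code_last_changed[path] - artifact_last_updated[artifact]
        if lag > grace:
            stale.append(artifact)
    return sorted(stale)


# Hypothetical data for illustration only.
code_last_changed = {"billing/": date(2023, 6, 15), "api/": date(2023, 6, 10)}
artifact_last_updated = {
    "docs/billing-architecture-diagram": date(2021, 3, 1),
    "docs/api-overview": date(2023, 6, 1),
}
covers = {
    "docs/billing-architecture-diagram": "billing/",
    "docs/api-overview": "api/",
}
flagged = stale_artifacts(code_last_changed, artifact_last_updated, covers)
print(flagged)
```

A date gap does not prove an artifact is wrong, only that nobody has confirmed it since the code moved. That is enough to justify a review step in the definition of done.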


Active Exercise

Cognitive Load Audit

This exercise produces a concrete diagnostic of the epistemic health of a codebase or subsystem you currently work in.

Part 1: Identify the load types (30 min)

Pick a module or subsystem you know moderately well — not one you built, not one that is completely foreign. Walk through it as if you were onboarding a new engineer.

For each point of confusion you encounter, classify it:

  • Intrinsic: The confusion comes from the domain's genuine complexity. No design change would make it easier — only deeper domain knowledge would.
  • Extraneous: The confusion comes from the way the code is presented — poor naming, inconsistent structure, surprising divergence from patterns used elsewhere in the codebase.
  • Germane: This is a moment where working through the confusion builds a lasting mental model. You are learning something that will make future navigation faster.

Tally the distribution. A healthy system should produce some intrinsic load (complexity is real), minimal extraneous load, and meaningful germane load.
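If you record your observations as you go, the tally can be computed with a few lines. This is a minimal sketch: the Load enum, the load_distribution function, and the sample notes are hypothetical.

```python
from collections import Counter
from enum import Enum
from typing import Dict, Iterable, Tuple


class Load(Enum):
    INTRINSIC = "intrinsic"
    EXTRANEOUS = "extraneous"
    GERMANE = "germane"


def load_distribution(
    observations: Iterable[Tuple[str, Load]],
) -> Dict[Load, float]:
    """Return each load type's share of all observations (0.0 if there are none)."""
    counts = Counter(load for _, load in observations)
    total = sum(counts.values())
    return {load: (counts[load] / total if total else 0.0) for load in Load}


# Hypothetical audit notes: (what confused you, how you classified it)
observations = [
    ("retry rules differ per card network", Load.INTRINSIC),
    ("helper named do_stuff2", Load.EXTRANEOUS),
    ("two config loaders with different key casing", Load.EXTRANEOUS),
    ("traced the refund state machine end to end", Load.GERMANE),
]
shares = load_distribution(observations)
for load, share in shares.items():
    print(f"{load.value}: {share:.0%}")
```

Keeping the free-text note next to each classification matters more than the percentages: the notes are what you will use in Part 3.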

Part 2: Find the tacit knowledge gap (20 min)

Identify one architectural decision in this module whose rationale is not captured anywhere you can find — no ADR, no comment, no commit message, no document. It may be a naming choice, a service boundary, a dependency direction, or a deliberate limitation.

Then ask: who, if anyone, still knows why? What would happen to that knowledge if that person left?

Part 3: Design one improvement (20 min)

Based on Parts 1 and 2, propose one concrete change that addresses the most significant knowledge risk you found. This could be:

  • An ADR for an undocumented architectural decision.
  • A renaming that replaces an implementation-level name with a domain-level name.
  • An onboarding task sequence that gives new engineers legitimate peripheral participation in this module's maintenance.

Write the proposal in one paragraph. Focus on what knowledge it preserves or transmits, and for whom.

Key Takeaways

  1. Tacit knowledge is the primary medium of engineering expertise and it is structurally resistant to documentation. Active practices (pair programming, code review, mentorship) are necessary to convert it into explicit, shareable form. Organizations without these practices experience continuous, largely invisible knowledge loss.
  2. Cognitive load is the cost of reading your decisions. The three types — intrinsic (domain complexity), extraneous (presentation failures), germane (productive schema-building) — give you a vocabulary for diagnosing and improving the comprehensibility of a codebase. Reducing extraneous load is design work, not cleanup.
  3. A codebase is a distributed cognitive system. Knowledge is not located only in individual minds; it is distributed across people, code structure, tests, type annotations, and documentation artifacts. Designing the artifact layer — naming, ADRs, architecture — is designing the team's extended working memory.
  4. Expertise is schema-depth, not mere experience. Expert developers process familiar patterns as single chunks, dramatically reducing working memory consumption. This means consistency and convention are cognitive infrastructure: they make the codebase compressible.
  5. Knowledge silos and decay are organizational failures, not individual ones. When knowledge concentrates in individuals or documentation drifts from reality, the cause is systemic — inadequate knowledge capture practices, onboarding as observation rather than participation, lack of artifact maintenance disciplines. The remedies are structural, not motivational.
