Engineering

How LLMs Process Language

Why your obviously clear prompt produces garbage — a linguistic model

Learning Objectives

By the end of this module you will be able to:

  • Classify prompt instructions using Searle's five illocutionary categories and explain why the category affects how an LLM should respond.
  • Predict where indirect or implicature-heavy prompts will cause LLM failures, using the RSA framework as a model.
  • Describe the LLM theory-of-mind gap and its consequences for multi-turn and persona-driven prompts.
  • Distinguish between what a prompt literally says and what it implicates, and write prompts that close that gap deliberately.

Core Concepts

Speech Acts: Language as Action

When you write a prompt, you are not just transmitting information — you are performing an act. Linguists call this a speech act. Searle's taxonomy identifies five fundamental categories of what speakers do with utterances:

Fig 1

| Category | Definition | Prompt example |
| --- | --- | --- |
| Assertive | Commits to the truth of a proposition | "The capital of France is Paris." |
| Directive | Attempts to get the hearer to perform an action | "Summarize this text in three bullet points." |
| Commissive | Commits the speaker to a future course of action | "Always respond in JSON format." |
| Expressive | Expresses a psychological state | "I find verbose answers unhelpful." |
| Declaration | Brings about institutional reality through utterance | "You are a senior security engineer." |

Searle's five illocutionary categories and their prompt analogues

The category matters because it sets the expected response mode. A directive ("List the edge cases") is a task instruction — the model should produce an enumeration. A declaration ("You are an expert in X") is a role-assignment with no output of its own. Mixing them silently — for instance, burying a directive inside an assertive framing — is one of the most common sources of ambiguous prompts.

Directives dominate most prompts

The vast majority of prompt instructions are directives. Every time you say "do X", "write Y", "list Z", you are issuing a directive. Keep that speech act clean: make the requested action unambiguous, make the object of that action explicit, and make any constraints on completion explicit.
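One way to keep a directive clean is to assemble the prompt from explicitly named parts, so the action, its object, and the completion constraints cannot be buried in prose. The helper below is an illustrative sketch; `build_directive` and its parameter names are hypothetical, not a standard API.

```python
# Illustrative sketch: keep the directive speech act "clean" by naming
# the action, its object, and the completion constraints explicitly.
# build_directive and its parameters are hypothetical, not a standard API.

def build_directive(action, target, constraints):
    """Assemble a direct, imperative prompt from explicit parts."""
    lines = [f"{action} {target}."]
    lines += [f"Constraint: {c}." for c in constraints]
    return "\n".join(lines)

prompt = build_directive(
    action="Summarize",
    target="the following incident report",
    constraints=[
        "Use at most three bullet points",
        "Do not speculate about the root cause",
    ],
)
print(prompt)
```

The point is not the helper itself but the discipline it enforces: every requirement becomes a stated constraint rather than an implicature.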

Implicature: The Gap Between Said and Meant

Grice's conversational implicature draws a sharp distinction between two layers of meaning:

  • What is said — the conventional, logical-form meaning of the words.
  • What is conversationally implicated — additional meaning the hearer infers by assuming the speaker is being cooperative.

The classic example: if you ask "Is Mary intelligent?" and someone answers "She has a pleasant personality," the literal answer says nothing about intelligence. But the implicature — the inference a cooperative listener draws — is that Mary is not intelligent, because a cooperative speaker would have said something relevant if they believed otherwise.

Communication succeeds despite the persistent gap between literal meaning and intended meaning — because human hearers reason about speaker intent. LLMs do this poorly.

This mechanism is critical because prompt engineers frequently rely on implicature without realizing it. When you write "Be concise," you implicate a preference about length and density that you never state explicitly. When you write "Don't use bullet points," you implicate a broader preference for prose without ever stating what you do want instead. Every implicature in your prompt is a place where meaning depends on the model correctly reasoning about your intent — which it may not.

Indirect Speech Acts and Politeness

An indirect speech act performs one illocutionary act by means of another. "Could you pass the salt?" is grammatically a question about ability, but functionally a request for action. The hearer recovers the indirect meaning through Gricean reasoning: taking it literally would violate relevance, so the hearer infers the indirect directive.

Indirect speech acts are commonly deployed as politeness strategies: posing a request as a question about ability ("Could you...?") hedges the imposition, preserving the hearer's conversational agency. The indirection creates plausible deniability. This is deeply ingrained in how people write — and it flows into prompts without the writer noticing.

Politeness-motivated indirection is a prompt liability

When you write "It would be great if the output included..." or "Could you possibly make this more concise?", you are using politeness indirection. For human readers, the indirect request is transparent. For LLMs, the indirect force may not be recovered. Write directives directly.
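Because politeness indirection is highly conventionalized, much of it can be caught with shallow pattern matching before a prompt is sent. The sketch below is a rough heuristic; the hedge patterns are hypothetical examples, not an exhaustive taxonomy of indirect speech acts.

```python
import re

# Illustrative sketch: flag politeness indirection in a prompt before
# sending it. The hedge patterns below are hypothetical examples, not
# an exhaustive taxonomy of indirect speech acts.
HEDGE_PATTERNS = [
    r"^could you( possibly)?\b",
    r"^can you\b",
    r"^would you mind\b",
    r"^it would be (great|nice) if\b",
]

def find_hedges(prompt: str) -> list[str]:
    """Return the hedge patterns that match the start of the prompt."""
    text = prompt.strip().lower()
    return [p for p in HEDGE_PATTERNS if re.search(p, text)]

print(find_hedges("Could you possibly make this more concise?"))  # one match
print(find_hedges("Rewrite this in at most 100 words."))          # []
```

A linter like this cannot recover your intended directive for you; it can only tell you where the indirect form should be rewritten as an imperative.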

The Rational Speech Acts Framework

The Rational Speech Act (RSA) framework provides a computational model of how pragmatic meaning works. It treats communication as recursive Bayesian reasoning:

  1. A literal listener interprets utterances at face value.
  2. A pragmatic speaker chooses utterances that maximize informativeness relative to the literal listener's interpretation.
  3. A pragmatic listener reasons about what a rational speaker would have said, integrating the utterance with prior beliefs about the world.

This recursion is what enables disambiguation. If a speaker says "Some of the students passed," the literal meaning is compatible with "all of them passed." But a pragmatic listener infers "not all passed" — because a cooperative, informative speaker would have said "all" if they meant all. The implicature is recovered through reasoning about what utterance a rational speaker would have chosen.

Crucially, RSA defines informativeness in information-theoretic terms: speakers optimize utterances for maximum signal-to-noise ratio given context, and listeners weight the prior probability of interpretations against the speaker's signal. Meaning emerges not from lexical content alone, but from the relative efficiency of utterances in distinguishing among alternative interpretations.

For prompt engineering, RSA offers a useful mental model: the model is (attempting to) behave like a pragmatic listener, reasoning about what a rational prompter would have said. The failure modes emerge precisely where the model's prior about "what a rational prompter would say" diverges from your actual intent.
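The three levels above can be made concrete with a tiny computational sketch of the "some of the students passed" example. The world names, the uniform prior, and the rationality setting (alpha = 1) are illustrative choices, not part of the RSA definition.

```python
# Minimal Rational Speech Acts (RSA) sketch of the scalar implicature
# "Some of the students passed" -> "not all passed".
# World and utterance names are illustrative; alpha is fixed at 1.

WORLDS = ["some_not_all", "all"]            # possible states of the world
UTTERANCES = ["some", "all"]                # messages the speaker can send
PRIOR = {"some_not_all": 0.5, "all": 0.5}   # uniform prior over worlds

def is_true(u, w):
    """Literal semantics: is utterance u true in world w?"""
    if u == "some":
        return True        # "some" is true whether or not all passed
    return w == "all"      # "all" is true only in the all-passed world

def normalize(d):
    total = sum(d.values())
    return {k: v / total for k, v in d.items()}

# Level 1: literal listener L0(w | u) proportional to [[u]](w) * P(w)
def literal_listener(u):
    return normalize({w: PRIOR[w] * is_true(u, w) for w in WORLDS})

# Level 2: pragmatic speaker S1(u | w) proportional to L0(w | u)
def pragmatic_speaker(w):
    return normalize({u: literal_listener(u)[w] for u in UTTERANCES})

# Level 3: pragmatic listener L1(w | u) proportional to S1(u | w) * P(w)
def pragmatic_listener(u):
    return normalize({w: PRIOR[w] * pragmatic_speaker(w)[u] for w in WORLDS})

print(literal_listener("some"))    # {'some_not_all': 0.5, 'all': 0.5}
print(round(pragmatic_listener("some")["some_not_all"], 2))  # 0.75
```

The literal listener is undecided (0.5/0.5), but the pragmatic listener shifts most of its belief to "not all passed", recovering the implicature purely by reasoning about what a rational speaker would have said.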

The Theory-of-Mind Gap

LLMs fundamentally lack Theory of Mind (ToM) — an internal model of other minds necessary for reasoning about speakers' beliefs, intentions, and knowledge states. Instead, they rely on statistical mimicry of patterns in training data. This architectural absence has direct consequences:

  • Irony and sarcasm are handled through surface pattern matching, not by reasoning about the speaker's actual beliefs relative to what they said.
  • Indirect speech acts are handled well when they are conventionalized ("Can you...?") — because those forms are frequent in training data — but break down on non-conventionalized or context-dependent indirect requests.
  • Multi-turn context is not tracked through a model of what the user believes and wants; it is tracked as a token sequence. The model has no persistent representation of your goals across turns.
  • Persona instructions ("You are an expert in X") are declarations that the model cannot fully ground, because it has no internal belief state to update.

Common Misconceptions

"The model understands what I mean." LLMs do not understand in the intentional sense. Performance on pragmatic tasks like implicature relies on statistical patterns, not reasoning about communicative intent. When a model appears to understand an indirect request, it is almost always because that request pattern appears frequently in training data, not because the model is reasoning about your intent.

"Being polite makes prompts more effective." Politeness strategies in natural language are typically achieved through indirection — hedges, conditional forms, questions about ability. These forms exist to preserve face and signal cooperative intent to human hearers. To an LLM, they are noise on top of the directive. Being clear is more effective than being polite.

"Adding more context always helps." Context helps when it resolves ambiguity that the model cannot infer. Context hurts when it introduces implicatures that the model mishandles. A long prompt full of indirect hints, hedges, and implied constraints may perform worse than a shorter, more direct one. The RSA framework suggests that every token you add signals something to the model — make sure what it signals is what you intend.

"The model remembers what you agreed earlier in the conversation." Because LLMs lack ToM, there is no persistent model of user intent across turns. The model only has access to the token history. If you established a constraint ten turns ago and it has drifted out of the context window, or has been diluted by subsequent tokens, it may be silently ignored.

Worked Example

Scenario: You want a model to review a pull request description and flag any missing information, without rewriting it.

First attempt (implicature-heavy):

"Can you take a look at this PR description and let me know if anything is missing or could be clearer?"

This prompt has multiple problems:

  • "Can you take a look" is an indirect speech act — grammatically a question about ability, indirectly a directive. The model will likely handle this one correctly since the form is highly conventionalized. But it is still noise.
  • "Let me know if anything is missing or could be clearer" conflates two very different tasks: (1) identifying missing information and (2) proposing rewrites. The implicature you intend — "flag but don't rewrite" — is not stated.
  • "Could be clearer" is a subjective criterion with no reference point. The model has no access to your quality bar.

Revised prompt (direct, explicit):

"Review the following pull request description. Identify any information a reviewer would need that is missing. Do not rewrite or improve the description — only list what is absent. Output a numbered list. If nothing is missing, say 'Complete'."

What changed:

  • The directive is explicit: "identify... list."
  • The constraint is stated, not implicated: "Do not rewrite."
  • The output format is specified.
  • The termination condition is defined.
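As an illustration, the revised prompt can be assembled so that each requirement is its own stated part. The structure below is a sketch of that discipline, not a required format.

```python
# Illustrative sketch: the revised prompt assembled part by part, so every
# requirement is stated rather than implicated.
REVIEW_PROMPT = "\n".join([
    "Review the following pull request description.",                        # directive
    "Identify any information a reviewer would need that is missing.",       # task object
    "Do not rewrite or improve the description — only list what is absent.", # constraint
    "Output a numbered list.",                                               # format
    "If nothing is missing, say 'Complete'.",                                # termination
])
print(REVIEW_PROMPT)
```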

The revision principle

For each sentence in your prompt, ask: am I stating this, or am I expecting the model to infer it? Every inference you expect the model to make is a potential failure point. If it matters, state it.

Compare & Contrast

Direct vs. Indirect Directives

|  | Direct directive | Indirect directive |
| --- | --- | --- |
| Form | Imperative: "List the edge cases." | Question: "Could you list the edge cases?" |
| Human interpretation | Unambiguous | Transparent via convention |
| LLM interpretation | Reliable | Reliable only if highly conventionalized |
| Politeness function | None | Face-saving for the human speaker |
| Prompt recommendation | Preferred | Avoid unless testing social behavior |

Literal Meaning vs. Implicature

| Prompt | What is said | What is implicated |
| --- | --- | --- |
| "Be concise." | Produce output that is concise | Unknown length threshold, unknown format preferences |
| "This code seems complex." | The code has some complexity | There is a problem; simplify it |
| "Experienced engineers prefer..." | A claim about a preference | You should behave as an experienced engineer would |

The right-hand column is where your prompt's meaning lives in your head. The left-hand column is what the model reliably sees. Implicature recovery requires reasoning about speaker intent — something LLMs do at roughly 60% accuracy versus 86% for humans. Every cell in the right-hand column is a potential failure point.

Conventionalized vs. Non-Conventionalized Indirect Requests

LLMs handle conventionalized indirect speech acts with reasonable success but degrade significantly on non-conventionalized or context-dependent indirect requests. This has a direct practical implication:

|  | Conventionalized | Non-conventionalized |
| --- | --- | --- |
| Example | "Can you summarize this?" | "This is a lot of text." (implying: summarize it) |
| LLM handling | Generally reliable | Brittle — depends on context and training distribution |
| Why it works / fails | High-frequency pattern in training data | Requires pragmatic inference about speaker intent |
| Recommendation | Acceptable, but still prefer the imperative | Avoid entirely; state the directive explicitly |

Key Takeaways

  1. Prompts are speech acts. Every prompt instruction belongs to Searle's taxonomy — most are directives. Identify the illocutionary category and make sure the form matches the intended force.
  2. Implicature is a reliability risk. Grice's distinction between what is said and what is implicated maps directly onto prompt failures. Anything your prompt implies rather than states is a potential breakdown point. State it.
  3. Indirection is a liability, not a courtesy. Indirect speech acts work for human hearers via Gricean reasoning. LLMs handle conventionalized indirect requests tolerably but degrade on non-conventionalized forms. Prefer direct imperatives.
  4. LLMs have no Theory of Mind. The theory-of-mind gap means that LLMs cannot reason about your beliefs, goals, or knowledge state. They pattern-match on form. Multi-turn context is a token history, not a model of your intent.
  5. The RSA mental model is diagnostic. Think of the model as a pragmatic listener reasoning about what a rational speaker would have said. If your prompt's unusual choices could be misread as meaningful signals, clarify them. Every token signals something — make sure it signals what you intend.

Further Exploration

  • Core frameworks
  • Pragmatics & implicature
  • LLMs and pragmatics