Semaphore

From optical telegraphs to concurrency primitives — three centuries of signal-carrying

Lead Summary

The word semaphore carries at least three distinct meanings that span over two hundred years and two unrelated fields. In its oldest sense it names a tower-based visual telegraph system, its term coined in 1801 by Claude Chappe from Greek roots meaning "sign-carrier." That heritage branched into the railway semaphore arm — the pivoting mechanical signal that governed train movements from the 1840s until the last UK mainline example was decommissioned in November 2025 — and into the handheld maritime flag semaphore that emerged in 1866 for ship-to-ship communication. A century after the towers, Edsger Dijkstra borrowed the name for something entirely different: an integer synchronization primitive for concurrent programs, equipped with two atomic operations (P and V) that coordinate thread access to shared resources. What ties all three together is the underlying abstraction of a signal that carries a small integer state — one that can only be read or changed under controlled conditions.

Etymology & Terminology

Claude Chappe coined the word semaphore in 1801 to name his optical telegraph system. The construction is Greek: σῆμα (sêma, "sign") and φορός (phorós, "carrying"), yielding "sign-carrier." The French Army preferred the word télégraphe (far-writer) for institutional use, but semaphore persisted in English as the generic term for all optical angle-based systems.

The term's domain then split. On railways it came to mean the pivoted arm signal. At sea it named the handheld flag practice that descends from the same angular encoding principle. In computing, Dijkstra appropriated it as a metaphor: the semaphore "carries a signal" about the state of a shared resource, just as a railway semaphore arm signals whether a block is clear.

Dijkstra's operations themselves derive from Dutch. P stands for probeer te verlagen or passeren ("try to reduce" or "to pass"); V stands for verhogen ("to increase"). These names sometimes confuse English speakers, who encounter them rendered as wait/signal, acquire/release, or down/up in different textbooks and APIs.

Historical Development

The Chappe optical telegraph (1792–mid-19th century)

Claude Chappe built the first operational optical telegraph line in France in 1792. Each station in his network consisted of a tower with a two-arm semaphore mounted on top, capable of producing 49 distinct angular combinations. Towers were spaced approximately 5 to 15 kilometres apart — close enough for operators to read the adjacent station's arms through specially designed telescopes. Messages hopped from station to station at a speed no mounted courier could match.

The design choice of movable arms over simple panels was deliberate: it was easier to discern the angle of a rod at distance than to determine the presence or absence of a panel element. The encoding exploited this perceptual advantage fully.

Railway semaphore signals (1841–2025)

Charles Hutton Gregory introduced the railway semaphore signal on the London and Croydon Railway at New Cross in 1841, adapting military telegraph hardware for railway safety. The design spread quickly: by 1870 the semaphore had become the dominant visual signal on British and North American rail networks.

By the early 2020s, semaphore signals had been almost completely replaced by colour-light and LED signals. The final operational semaphores on the UK national mainline were decommissioned on 1 November 2025 at Manea station in Cambridgeshire.

Maritime flag semaphore (1866–present)

Flag semaphore originated in 1866 as a handheld adaptation of the fixed optical telegraph tradition. Where Chappe's towers relayed signals across the continent, a single person with two flags could communicate ship-to-ship over a few miles. The system persisted alongside Morse flashing-light long after radio became the primary medium, surviving because it requires no power, no infrastructure, and no electromagnetic spectrum allocation — the exact niche radio and satellite cannot fill when silence is required.

Computing semaphores (1962–present)

Edsger Dijkstra invented the semaphore synchronization primitive in the early 1960s while developing the Electrologica X8 operating system. The formal publication came in his 1965 paper "Cooperating Sequential Processes" (EWD 123). The abstraction became foundational to concurrent programming theory and practice, and was subsequently formalized in the POSIX 1003.1b standard.

Flag Semaphore: Mechanism & Encoding

Maritime flag semaphore uses eight angular positions for each arm: cardinal and intercardinal directions relative to the operator's body. Two arms combined yield a large combinatorial space — sufficient to encode all 26 letters of the alphabet plus control characters.

Control characters

Flag semaphore includes two dedicated control signals: the Numbers signal and the Letters signal. Transmitting the Numbers signal switches the receiver into numeric mode, where letters A through K represent digits 1 through 0. The Letters signal switches back to alphabetic encoding.

Flag colors follow international maritime conventions. At sea, red-and-yellow (Oscar) flags are used; on land, white-and-blue (Papa) flags are standard. This color distinction is part of the broader International Code of Signals flag designation system.

Distinctions within maritime signaling

Flag semaphore is easily confused with two related but separate systems:

International Code of Signals (ICS): Drafted in 1855 and published in 1857 as the Commercial Code by the British Board of Trade, the ICS uses 18 flags hoisted on halyards to encode standardized maritime phrases and words — not individual letters. The ICS can produce over 70,000 distinct messages from those 18 flags. It is a phrase-book system; flag semaphore is a letter-by-letter alphabet.
Railway semaphore: Uses fixed pivoted arms on posts rather than handheld flags, controlling train movement rather than conveying linguistic information. Both systems use angular positions for encoding, but the physical mechanism and purpose are entirely distinct.

Railway Semaphore: Design & Operation

Arm positions and meaning

Railway semaphore signals communicated through discrete arm positions. In lower-quadrant designs, the arm held horizontal meant danger, inclined downward at 45° meant caution, and the arm vertical (hidden inside the post) meant clear. Upper-quadrant designs reversed the vertical direction: arm horizontal was danger, 45° upward was caution, and vertical (90° upward) was clear. Three-aspect signals using all three positions were standardized on American railways around 1900.

Visibility by design

Railway semaphore design incorporated specific features to maximize legibility at distance. Arms were painted red for contrast against most backgrounds, with contrasting stripes or markings applied to enhance visibility further. The rear face was typically white with black markings. Where poor background contrast was unavoidable, a white sighting board was mounted behind the signal.

For night operation, semaphores initially used kerosene lamps with coloured lenses. Once tungsten filaments were perfected, railroads quickly replaced kerosene lamps with electric bulbs, enabling brighter and more reliable night-time operation.

Fail-safe engineering

A system failure should produce the safest possible outcome, not the most permissive one.

Upper-quadrant semaphore arms were designed so that if the signal wire breaks or ice weighs down the arm, gravity returns it to the horizontal "danger" position. This gravity-driven fail-safe meant component failure defaulted to stop rather than proceed. British railway companies standardized on upper-quadrant designs from the 1920s onward specifically to leverage this property. Lower-quadrant signals could fail in the "off" (clear) position — making them inferior from a safety standpoint.

The railway semaphore is also conceptually distinct from two related infrastructure systems. The interlocking prevents unsafe signal/point combinations by enforcing mechanical or electrical constraints on which signals can display favorable aspects simultaneously. The block system allocates track sections one train at a time. The semaphore arm is the user-facing indicator; interlocking and block systems are the safety logic behind it.

Decline and replacement

Electric colour-light signals began displacing semaphores from the 1920s onward. Colour-light signals offered superior visibility in fog, snow, and ice; required less maintenance; and could be remotely controlled from centralized signal boxes. The enabling technology was the electric lamp bright enough for daylight visibility, which became available starting in 1904. By the 1980s–90s, widespread replacement was underway; only scattered heritage railways retain operational semaphores today.

Computing Semaphore: Core Concepts

The abstract data structure

A computing semaphore is an integer value (always non-negative) paired with two atomic operations. Dijkstra's 1965 formalization defined them precisely:

P (wait / acquire): Atomically decrements the semaphore counter. If the result would be negative, the calling thread blocks and is added to a wait queue. The decrement and the check are indivisible — they constitute a single atomic operation, which is what prevents race conditions.

V (signal / release): Atomically increments the semaphore counter. If one or more threads are blocked waiting, V wakes one of them, which then completes its pending P operation. V never blocks.

Why atomicity matters

Without atomicity, two threads could both read the counter value as 1, both conclude they may proceed, and both decrement it — producing -1 and allowing two threads simultaneously into a section that was supposed to be exclusive. Atomic P and V prevent this interleaving by making the test-and-update indivisible.

Binary vs. counting semaphores

A binary semaphore holds only values 0 or 1 and enforces mutual exclusion for a single resource. A counting semaphore holds any non-negative integer and controls access to multiple identical instances of a resource: if the semaphore is initialized to N, at most N threads may hold it simultaneously.

The critical distinction between a binary semaphore and a mutex is ownership. A mutex enforces that only the thread which locked it can unlock it. A binary semaphore has no such constraint — any thread can call V, regardless of whether it called P. The POSIX 1003.1b standard formalizes this through separate APIs: pthread_mutex_unlock called by a non-owning thread is an error; sem_post from any thread is normal usage.

Implementation: hardware foundations

Kernel implementations of P and V must themselves be made atomic. The mechanism employed is the spinlock, built on hardware atomic instructions:

Test-and-set (TAS): An atomic instruction that sets a memory location to 1 and returns its previous value in a single indivisible operation. Used to implement the spinlock that guards semaphore state.
Compare-and-swap (CAS): A more general atomic instruction taking a memory location, an expected value, and a new value. If the location matches the expected value, it is replaced atomically; otherwise the operation has no effect. TAS is a special case of CAS; CAS is more widely used in modern semaphore implementations.

By wrapping P and V in spinlocks, the kernel ensures atomicity even on multiprocessor systems, while allowing threads to sleep (block) rather than busy-wait — distinguishing semaphores from pure spinlock-based synchronization.

Classic Problems and Patterns

Producer-consumer (bounded buffer)

The producer-consumer problem is solved using two counting semaphores: one (empty) initialized to the buffer size tracking available slots, and one (full) initialized to zero tracking filled slots. Producers call P on empty before inserting, then V on full. Consumers call P on full before removing, then V on empty. A third binary semaphore provides mutual exclusion over the buffer structure itself.

Dining philosophers

Dijkstra's solution to the dining philosophers problem eliminates deadlock by breaking the resource-holding condition: each philosopher atomically acquires both forks or waits — no philosopher may hold exactly one fork outside the critical section. The solution uses one mutex semaphore, one binary semaphore per philosopher, and one state variable per philosopher.

Cross-thread signaling

Because semaphores lack ownership semantics, they are the correct tool when the thread that signals (V) and the thread that waits (P) are different. A semaphore initialized to 0 allows thread B to wait until thread A posts a V, without A ever having called P. A mutex cannot implement this pattern — the lock-holder and lock-releaser must be the same thread.

Rate limiting and concurrency capping

Modern distributed systems repurpose counting semaphores for rate limiting. A semaphore initialized to N caps the number of simultaneously active requests, protecting downstream resources (database connections, file uploads, API rate limits) from exhaustion. This is a shift from strict mutual exclusion toward bounding in-flight operations — a practical evolution of the original abstraction.

Tuning semaphore counts

Soft resource allocations set through semaphore-like permit counts greatly impact throughput. Too-small values create queues and reduce concurrency; too-large values cause excessive context switching and resource waste. Tuning requires empirical measurement, not guesswork.

Fairness and Starvation

The classical semaphore definition says nothing about which waiting thread wakes on a V operation. An arbitrary selection policy can starve individual threads indefinitely. Using a FIFO wait queue guarantees no starvation (assuming no deadlock), making queue-based semaphores preferable when fairness is a requirement. Between 1979 and 1986, three distinct algorithms were proposed to achieve starvation-free mutual exclusion using extended "blocked-set" semaphore primitives. These algorithms have been formally verified using mechanized proofs.

Semaphores vs. Mutexes: When to Use Which

	Mutex	Semaphore
Ownership	Thread that locks must unlock	Any thread may signal
Initial value	1 (binary state)	Any non-negative integer
Best for	Protecting owned state	Counting resources; cross-thread signaling
POSIX API	`pthread_mutex_*`	`sem_*`

The consistent guidance from operating systems literature: use mutexes when protecting state with clear ownership semantics (only one thread modifies the resource at a time, and always the same thread); use semaphores for counting or for producer-consumer-style coordination where signaler and waiter are different threads.

Priority Inheritance

In real-time systems, semaphores interact with thread scheduling through priority inversion: a high-priority thread blocks waiting for a semaphore held by a low-priority thread, which may itself be preempted by medium-priority threads. The priority inheritance protocol, formalized by Sha, Rajkumar, and Lehoczky in 1990, addresses this by temporarily elevating the lock-holder's priority to match the highest-priority waiter. If the lock-holder is itself blocked, priority inheritance propagates transitively through the chain of lock holders.

Distributed Semaphores

Extending semaphores to distributed systems introduces new failure modes. If a permit holder crashes before releasing, the semaphore may remain permanently acquired.

Failure recovery via leases

Distributed semaphore implementations address permit-holder failure through time-based leases. Each permit carries a time token; if the holder crashes and the lease expires, other waiting parties can acquire it. Redis's RPermitExpirableSemaphore explicitly adds this mechanism.

Consensus-based implementations

etcd implements distributed locks and semaphore-like primitives using the Raft consensus protocol, with a global revision counter that linearizes all operations across the cluster. Linearizable semantics ensure the observed order matches a single-threaded execution — suitable for critical applications requiring strict ordering. The tradeoff against eventually-consistent approaches like Redis is performance cost for stronger guarantees.

Algorithm families

Distributed semaphore algorithms cluster into three approaches:

Replication-based: Semaphore state is replicated across nodes; operations require acknowledgement from a quorum.
Token-based: The semaphore is embodied as a token that must be physically acquired to perform operations.
Consensus-based: Operations require consensus from peers before they are permitted.

Each trades off message complexity, fault tolerance, and consistency guarantees differently. Parameterized extensions allow k-out-of-M synchronization primitives, enabling patterns like bounded reader-writer access that binary mutual exclusion cannot express.

Key Takeaways

The word semaphore has three distinct meanings across domains. The oldest refers to optical telegraph towers (Chappe, 1801), the second to railway signal arms (1840s), and the third to a computing synchronization primitive (Dijkstra, 1960s). All three share the underlying abstraction of an integer state that carries signals under controlled conditions.
Railway semaphores used discrete arm positions with fail-safe design. Horizontal meant danger, 45-degree angles meant caution, and vertical meant clear. Upper-quadrant designs (standardized in Britain from the 1920s) leveraged gravity to return arms to danger if the signal wire broke, making safety the default state on failure.
Computing semaphores are integer counters protected by atomic operations. The P operation (wait) atomically decrements the counter and blocks if it would become negative. The V operation (signal) atomically increments and wakes a waiting thread. Atomicity prevents race conditions where multiple threads might read the same counter value and proceed simultaneously.
Binary and counting semaphores serve different purposes. Binary semaphores (values 0 or 1) enforce mutual exclusion on a single resource. Counting semaphores hold any non-negative integer to control access to multiple identical resources. Unlike mutexes, any thread can call V on a semaphore, regardless of whether it called P.
Distributed semaphores require additional failure recovery mechanisms. Time-based leases allow permit recovery if a holder crashes. Consensus-based implementations using protocols like Raft provide linearizable semantics but at higher performance cost than eventually-consistent approaches like Redis.

Further Exploration

Historical Origins and Physical Systems

Chappe telegraph — Wikipedia — Optical telegraph network, tower spacing, and origin of the semaphore name
Railway semaphore signal — Wikipedia — Arm-position conventions, fail-safe design, lighting evolution, and replacement timeline
Flag semaphore — Wikipedia — Encoding system, flag colors, control characters, and distinction from International Code of Signals

Computing Semaphore Fundamentals

E.W. Dijkstra Archive: Cooperating Sequential Processes (EWD 123) — Dijkstra's original 1965 formalization of semaphores and P/V operations
OSTEP: Semaphores chapter — Rigorous treatment of semaphore semantics, implementation, and classic problems
Origin of P-V (NYU) — Etymology of P and V and history of Dijkstra's naming choices

Advanced Topics

Starvation-free mutual exclusion with semaphores (ACM) — Formal treatment of starvation-freedom with mechanized verification
Priority Inheritance Protocols (Sha, Rajkumar, Lehoczky 1990) — Foundational paper on priority inheritance for real-time systems
Algorithms to implement semaphores in distributed environments (ACM) — Survey of replication, token, and consensus approaches to distributed semaphores

Quick reference

Origin of term Claude Chappe, 1801 (Britannica)

Etymology Greek sêma (sign) + phorós (carrying)

Physical forms Optical tower, railway arm, handheld flags

Computing form Integer counter with atomic P/V operations

Invented by (CS) Edsger Dijkstra, ~1962–1965 (EWD 123)

Standardized in POSIX 1003.1b (POSIX)

Key variants Binary semaphore, counting semaphore, distributed semaphore

Railway lifespan 1841–2025 (UK mainline)