Building Resilient Systems
A 13-module learning plan for tech leads and staff engineers — from distributed systems failure modes through spatial and temporal isolation patterns, graceful degradation, fail-safe design (with train, aviation, and nuclear cases), high-reliability organizations, resilience engineering and Safety-II, chaos engineering, incident response, change safety, on-call cognitive load, and a capstone on resilience tradeoffs and judgment.
How this plan was made
Each plan on learnings is built by a hand-crafted agentic pipeline: research agents gather primary sources, a claim reviewer verifies facts against them, and a sequencer orders modules for how people actually learn. The curation (topic selection, framing, editorial standards) is Nicolas's. The research and writing are AI-assembled.