System prompts at scale fail the same way massive config files fail. Shopify’s production experience with Sidekick, its merchant-facing AI assistant, bears this out:
> This growth led to what we call “Death by a Thousand Instructions”: our system prompt became an unwieldy collection of special cases, conflicting guidance, and edge case handling that slowed down the system and made it nearly impossible to maintain.
Every new edge case gets appended. Every tool addition needs guidance. Every merchant-facing feature requires context. The prompt becomes a god object that touches everything, that no one fully understands, and that can’t be modified without breaking something else. I’ve seen this pattern play out with environment variables, feature flags, and now with agent context.
Peter Steinberger’s description of his agent file captures the same unfortunate trend: “My Agent file is currently ~800 lines long and feels like a collection of organizational scar tissue.” Each fix, each model quirk, each edge case gets appended. Worse, the instructions become model-specific: what works for Claude breaks GPT-5, making the files impossible to share across different agents. The context cost compounds: every tool you add, every instruction you include, is “a constant cost and garbage in my context.”
Shopify’s solution mirrors how the industry learned to decompose monolithic applications:
> Our breakthrough came with implementing Just-in-Time (JIT) instructions. Instead of cramming all guidance into the system prompt, we return relevant instructions alongside tool data exactly when they’re needed.
The architecture maps directly to microservices patterns. Localized guidance. Clear boundaries. Composition over monoliths. The same principles that work for code decomposition work for context decomposition.
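As a concrete illustration, here’s a minimal sketch of the mechanism in Python. All names are hypothetical (Shopify hasn’t published Sidekick’s internals); the point is that a tool returns its guidance alongside its data, so the agent pays for an instruction only when it actually uses the tool.

```python
from dataclasses import dataclass

@dataclass
class ToolResult:
    data: dict          # the tool's actual payload
    instructions: str   # JIT guidance, surfaced only when the tool runs

# Guidance lives next to the tool it governs, not in the system prompt.
DISCOUNT_GUIDANCE = (
    "When creating discounts, confirm the merchant's currency and "
    "never stack percentage and fixed-amount discounts."
)

def create_discount(params: dict) -> ToolResult:
    discount = {"id": "disc_123", **params}  # stand-in for the real API call
    return ToolResult(data=discount, instructions=DISCOUNT_GUIDANCE)

# The agent loop injects instructions into the conversation only when the
# tool actually fires, so unused guidance never costs context tokens.
def handle_tool_call(tool, params: dict, messages: list[dict]) -> ToolResult:
    result = tool(params)
    messages.append({"role": "tool", "content": str(result.data)})
    if result.instructions:
        messages.append({"role": "system", "content": result.instructions})
    return result
```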
The benefits track what we’d expect from proper modularity:
- Localized Guidance: Instructions appear only when relevant, keeping the core system prompt focused on fundamental agent behavior
- Cache Efficiency: We can dynamically adjust instructions without breaking LLM prompt caches
- Modularity: Different instructions can be served based on beta flags, model versions, or page context
Cache efficiency is particularly interesting. When context is monolithic, any change invalidates the entire cache. With JIT instructions, you can evolve specific domains without breaking unrelated caching. It’s the same reason we prefer many small services over one giant deployment: blast radius and independent evolution.
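Here’s a sketch of why ordering matters, assuming a prefix-matching prompt cache (how both Anthropic’s and OpenAI’s caching behave): keep the stable system prompt first and the volatile JIT guidance last, so edits to one instruction module never touch the cached prefix.

```python
STABLE_SYSTEM_PROMPT = "You are a commerce assistant. ..."  # rarely edited

def build_messages(history: list[dict], jit_instructions: list[str]) -> list[dict]:
    # 1. Stable prefix first: cached across requests, and untouched when
    #    any individual instruction module changes.
    messages = [{"role": "system", "content": STABLE_SYSTEM_PROMPT}]
    # 2. Conversation history next: a shared prefix within a session.
    messages += history
    # 3. Volatile JIT guidance last, so editing one domain's instructions
    #    invalidates only this suffix, never the cached prefix above it.
    for text in jit_instructions:
        messages.append({"role": "system", "content": text})
    return messages
```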
What clicked for me is how this extends the ETTO principle. Monolithic prompts optimize for completeness: cram everything in so the agent always has full context, and fail catastrophically on maintainability. JIT instructions acknowledge the trade-off and build infrastructure to manage it. You’re not hoping engineers will “carefully maintain the system prompt.” You’re making unmaintainable prompts architecturally impossible.
The pattern shows up in how organizations handle agent tool access too. I’ve seen teams give agents massive tool catalogs “just in case,” then wonder why behavior is unpredictable. The cognitive load problem exists for agents just like it exists for humans: too many options, unclear boundaries, emergent failures. JIT instructions solve this by scoping tool guidance to actual usage context.
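The same decomposition works for the catalog itself. A sketch with hypothetical page contexts and tool names: scope what the agent can see to where the merchant actually is, and let flags adjust the scope.

```python
# Hypothetical page contexts and tool names.
TOOL_SCOPES = {
    "orders_page":    ["lookup_order", "refund_order"],
    "discounts_page": ["create_discount", "list_discounts"],
    "analytics_page": ["run_report"],
}

def tools_for(page_context: str, beta_flags: set[str]) -> list[str]:
    tools = list(TOOL_SCOPES.get(page_context, []))
    # Modularity hook: flags (or model version) widen or narrow the scope.
    if "bulk_editing" in beta_flags:
        tools.append("bulk_update_products")
    return tools

# An agent on the orders page never sees discount tooling, so it can't
# reach for it by mistake, and that guidance costs zero tokens here.
print(tools_for("orders_page", beta_flags=set()))
# -> ['lookup_order', 'refund_order']
```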
The discipline isn’t writing better prompts. It’s treating context as a system with architectural properties that need the same design rigor we apply to code.