Three Layers Of A Working Chatbot
Most chatbots fail because they collapse the system, persona, and conversation into a single prompt. A working bot separates the three. Here is the architecture.
L1, L2, L3
Every prompt in a production bot has three jobs. Job one is to set the rules of engagement: who the bot is, what it can and cannot do, what data it has access to. Job two is to render the brand voice and tone: how it speaks, what words it avoids, what it sounds like at 11 p.m. when a customer is frustrated. Job three is to handle the turn it is currently in: the user's last message, the conversation so far, the immediate next move.
We name them L1, L2, L3. Layer one is the system. Layer two is the persona. Layer three is the conversation. Most chatbots concatenate all three into one mega-prompt. The result is a bot that answers off-domain questions, drifts in tone, and forgets the context after eight turns.
A prompt that does three jobs at once does each one badly. A prompt that does one job, three times, does each one well.
Why Separation Matters
When you separate the layers, you can iterate them independently. You change L2 because the brand updates its voice; L1 and L3 are untouched. You change L3 because the bot is missing context after long sessions; L1 and L2 are untouched. You add a new tool to L1; the persona and turn logic do not need to know about it.
When you do not separate them, every change is a regression risk. A copy edit to one paragraph quietly breaks a tool call defined on the other side of the same string. The bot stops working and the team has no isolation to debug.
[Three concentric prompts: the outer rule layer, the middle voice layer, the inner turn layer. Each is editable without breaking the next.]
What Each Layer Owns
L1 owns the contract: tools, permissions, hard rules, refusals. Short. Static across deploys. Audited by engineering.
L2 owns the voice: tone, style, banned phrases, signature constructions. Updated by marketing. Versioned with the brand guide.
L3 owns the moment: the latest message, the running summary, the tool outputs from the current turn. Generated dynamically per request.
When an output is wrong, the question becomes which layer failed. The answer is fast because the layers are separable. Without that separability, every failure is a search.
The Forge Pattern
Every bot The Forge ships uses this layering. It is not a clever optimization. It is the only architecture that survives a year of edits, a brand refresh, and three new tool integrations without becoming a tangle that nobody trusts to deploy.
If your current bot is one giant prompt, the rebuild is a week. The maintainability return shows up the first time someone needs to change the voice without breaking the rules.
From reading to installing.
Field Notes diagnose the friction. The Sprint and the Install eliminate it.