If you work in AI support long enough, you stop being afraid of “hallucinations.”
Not because they disappear.
Because you realize something uncomfortable:
They’re almost never the AI’s fault.
Over the last year, our AI support agent has completely replaced what used to be a full human role. Not “assisting.” Not “deflecting low-value tickets.”
Actually doing the job.
Most days it resolves everything. No human touches the queue.
At RB2B, we run it through Intercom’s Fin.
And here’s the thing everyone’s too polite to say out loud:
If your AI is hallucinating, you probably built a broken system and blamed the tool.
Let me explain.
The lie we keep telling ourselves
Most teams talk about hallucinations like this:
“AI just makes stuff up sometimes.”
That’s not what’s happening.
What’s actually happening is this: You gave an intelligent pattern-matching system zero structure, no constraints, and vague instructions—then acted shocked when it did exactly what you trained it to do.
LLMs don’t “know” truth. They predict likely next tokens.
When you don’t give them context, they fill gaps with what sounds plausible.
If you design a workflow that rewards guessing, you get guessing.
If you design one that demands grounding, hallucinations nearly vanish.
That difference has nothing to do with the model.
It has everything to do with whether you did your job.
What most teams actually do (and why it fails)
Here’s the pattern I see constantly:
- Turn on AI agent
- Upload help center docs
- Pat yourself on the back
- Get mad when it makes something up
That’s not deploying AI.
That’s dumping work on an intern with zero training and blaming them when they screw up.
Would you hire a junior support rep, throw docs at them, give them no onboarding, and send them straight to customers?
No?
Then why are you doing it with AI?
AI agents aren’t magic. They’re junior employees with perfect memory and zero judgment.
You have to design around that.
Or you get what you deserve.
The models got better. Your process didn’t.
To be fair—models have improved dramatically.
Older generations hallucinated constantly. Especially on citations or edge cases.
Newer reasoning models are miles ahead:
- Bigger context windows
- Better recall across conversations
- More willing to say “I don’t know”
- Actually follow instructions
So yes, raw hallucination rates dropped.
But that’s not why our support AI works.
The tech helped.
The workflow design mattered 10x more.
How we actually think about hallucinations
I stopped treating hallucinations as something to “fix.”
I treat them like:
- Network timeouts
- API failures
- User typos
They’re just a system property.
You design guardrails.
You don’t cross your fingers and hope.
We built everything around one rule:
The AI is never allowed to guess. Ever.
If it doesn’t know, it escalates.
That one constraint eliminates 80% of risk immediately.
The 4 boring layers that actually work
This is the unglamorous part.
It’s also the only part that matters.
1. Behaviour instructions
Before anything else, we change how the agent behaves.
We explicitly tell it:
- If unsure → say you’re unsure
- If info is missing → ask
- Never invent policies
- Never assume
This sounds obvious.
It’s not optional.
If you skip this, the model defaults to “be helpful,” which in practice means: make something up that sounds good.
That guessing is what people call hallucinating.
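None of this is Fin’s actual configuration, but the principle translates to any agent framework: spell the rules out as explicit instructions instead of hoping the model infers them. A minimal sketch, assuming a generic system-prompt setup (the rule text and `build_system_prompt` helper are illustrative, not Fin’s API):

```python
# Illustrative behaviour instructions -- not Fin's actual config.
BEHAVIOUR_RULES = [
    "If you are unsure, say you are unsure.",
    "If information is missing, ask a clarifying question.",
    "Never invent policies. Only state policies found in the provided documents.",
    "Never assume account details; look them up or ask.",
]

def build_system_prompt(rules: list[str]) -> str:
    """Join the behaviour rules into one numbered system-prompt block."""
    numbered = "\n".join(f"{i}. {rule}" for i, rule in enumerate(rules, 1))
    return (
        "You are a support agent. Follow these rules without exception:\n"
        + numbered
    )

print(build_system_prompt(BEHAVIOUR_RULES))
```

The point is less the wording than the default it replaces: without an explicit “say you’re unsure” rule, the model’s fallback behaviour is to sound helpful.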
2. Ground everything in real data
This is the biggest lever.
We don’t let Fin answer from “general knowledge.”
It only answers from:
- Our knowledge base
- Product docs
- Policies
- Internal procedures
If it can’t cite something real, it doesn’t answer.
Once you do this, hallucinations drop dramatically.
Because it’s not inventing anymore. It’s retrieving.
There’s a massive difference between:
“Generate an answer”
and
“Find the answer in these exact documents”
Support should almost always be the second one.
3. Review every mediocre interaction
This one is manual.
And completely unsexy.
We review every single low-rated conversation.
Every. Single. One.
Not weekly. Not “when we have time.”
Every time.
When the AI sounds off or slightly wrong, there’s always a reason:
- Missing doc
- Unclear policy
- Bad instruction
- Edge case we forgot
It’s almost never “the AI randomly lied.”
It’s almost always “we didn’t teach it properly.”
And the fix? Usually documentation. Not prompting.
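The discipline here is coverage, not tooling: every low-rated conversation gets pulled and gets a root-cause label. A sketch of that queue, assuming conversation records are simple dictionaries exported from your support platform (the field names and 1–5 rating scale are assumptions):

```python
# Sketch of the review loop: every low-rated conversation needs a
# root-cause label before it's considered handled. Illustrative only.
LOW_RATING_THRESHOLD = 3  # assumed 1-5 rating scale

# The same four buckets from the list above.
ROOT_CAUSES = {"missing_doc", "unclear_policy", "bad_instruction", "edge_case"}

def needs_review(convo: dict) -> bool:
    """A conversation enters the queue if it was rated at or below threshold."""
    return convo["rating"] is not None and convo["rating"] <= LOW_RATING_THRESHOLD

def log_review(convo: dict, root_cause: str) -> dict:
    """Force the reviewer to pick a known root cause -- no vague write-offs."""
    if root_cause not in ROOT_CAUSES:
        raise ValueError(f"Unknown root cause: {root_cause}")
    return {"id": convo["id"], "root_cause": root_cause}
```

Forcing the label is the point: “the AI randomly lied” is not in the set, so every review ends in a fixable cause.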
4. Add verification loops for high-risk actions
For anything sensitive, we add a second pass.
Examples:
- Billing changes
- Refunds
- Cancellations
- Policy explanations
We do one of three things:
- Require confirmation
- Cite the exact source
- Escalate to a human
If stakes are high, you don’t let a single pass go straight to customers.
Same rule you’d use with a human.
Because it is basically a very literal human.
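The verification loop is just a gate between the drafted reply and the customer. This sketch assumes an intent classifier already tagged the conversation; the intent names and outcomes are illustrative, not Fin’s API:

```python
# Sketch of a verification gate: high-risk intents never go out on a
# single pass. Intent names and gate logic are illustrative.
HIGH_RISK_INTENTS = {"billing_change", "refund", "cancellation", "policy_explanation"}

def gate(intent: str, has_citation: bool, user_confirmed: bool) -> str:
    """Decide what happens to a drafted reply before it reaches the customer."""
    if intent not in HIGH_RISK_INTENTS:
        return "send"                   # low stakes: single pass is fine
    if has_citation and user_confirmed:
        return "send"                   # second pass satisfied
    if not has_citation:
        return "escalate_to_human"      # no grounded source: a human decides
    return "ask_for_confirmation"       # grounded, but confirm before acting
```

So a refund request with a cited policy but no confirmation yet gets `ask_for_confirmation`, and one with no citation at all never reaches the customer, which is exactly the second pass you would demand from a new human rep.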
The part nobody wants to hear
Here’s the uncomfortable truth:
Hallucinations aren’t an AI problem anymore.
They’re a you problem.
If you:
- Dump docs in without structure
- Skip clear instructions
- Don’t review outputs
- Don’t maintain your content
Then yes—AI will make things up.
But that’s not the AI failing.
That’s you not doing the foundational work, then blaming the tool.
It’s like blaming a junior rep for not knowing policies you never wrote down.
What this means if you’re deploying AI support
If you’re evaluating Fin or any AI agent, stop asking:
“Does it hallucinate?”
Every model can.
Ask instead:
“Does our workflow make it impossible for hallucinations to reach customers?”
That’s the real question.
Because once you build the right constraints, hallucinations stop being scary.
They become:
- Detectable
- Reviewable
- Fixable
Just another operational bug.
Not a mysterious AI failure.
My actual mental model now
I don’t think of Fin as “AI.”
I think of it as:
A very fast, very literal, very obedient junior support rep.
It will do exactly what you teach it.
Nothing more.
Nothing less.
If it says something wrong?
That’s usually on me.
Not the model.
And honestly—that’s a way more useful frame.
Because it puts responsibility where it belongs.
So if you’re still fighting hallucinations...
It’s probably not time for a better model.
It’s time for better systems.
That’s where the gains actually are.
The tools work.
The question is: did you?