tarıtas
Field notes POST-15 8 min read

Why a Voice Agent Told Callers to Call the Line They Were On

A voice agent kept telling callers to call the office, on the line they had already dialed to reach it. The cause was not a model error. The knowledge base was written for the web, where ending an answer with “call our office” is correct, because the reader is not yet on the phone. On a call to that same office, the model read the instruction out loud and sent the caller back to their own line. At taritas the first fix was a prompt rule: never recite the office number unless asked. On a small, cost-optimized model it failed about one call in twenty, because the model anchored on the chunk it was grounding on more than on the rule. The durable fix was structural: strip the office number from each knowledge-base chunk before the model ever sees it. A model cannot relay what it cannot see. The audit path keeps the original text, so grounding and evaluation still see the truth.

Published · Updated · Supreet Tare

All names, numbers, and identifiers in this post are anonymized. The patterns are real.

Diagram of how a voice agent strips a phone number from knowledge-base chunks before a language model reads them. On the left, a knowledge-base chunk stored in Postgres with pgvector contains the sentence for help, call our office at the office number. In the middle, a sanitization gate runs a regular expression that matches the spoken form of the office number. On the right, two parallel paths leave the gate. The upper path is the model-facing projection, where the office number is replaced with a redaction placeholder before the chunk enters the small responder model, so the model cannot relay a number it cannot see, and the canonical number lives only in the prompt with explicit conditions for when it may be spoken. The lower path is the audit projection, an audit buffer that keeps the original chunk with the number intact, which the grounding extractor and faithfulness evaluation read so they still trace each response to the real source text. An inset notes that two sibling department lines, a parking-enforcement line and a provincial transport line, are intentionally not touched because they are different numbers callers sometimes need. The caption reads: sanitize the input to the model, not the input to the audit.

A voice agent answered the main public phone line for a public-sector traffic-fines office. Callers dialed the number on the office’s website and reached the agent. Most asked procedural questions: how to dispute a ticket, what form reopens a case, what the office hours are. The agent answered each one from a knowledge base. And on a small but steady share of calls, it ended its answer by telling the caller to phone the office. The same office. The line the caller was already on.

This is a short story about why that happened, why the obvious fix did not hold, and the structural fix that did. The lesson generalizes past phone numbers, so it is worth the read even if you never run a government phone line.

The setup, and the moment it broke

A knowledge base is the source document a voice agent answers from. Ours was a markdown file the client maintained, chunked into passages and embedded into Postgres with the pgvector extension. Each caller question retrieves the most relevant chunks, and the agent grounds its spoken answer on them.

The document was written for the web. That matters. When someone reads a web page, ending an answer with “for help, call our office at the office number” is good advice. The reader is at a desk and not yet on the phone. So one chunk, for a document-copies request, read roughly: complete the request form, send it to the office email, and for help with the form, call our office at the office number during business hours.

On a call to that office, the agent read it out as written. The caller’s reply was the whole bug in one line:

Caller: “I am calling that number. I’m talking to you right now.”

The agent had relayed an instruction that was correct in the document’s original channel and wrong in the one it was now serving. The model had no way to know the difference. It just repeated what the chunk said.

The architecture in 30 seconds

Speech to text turns the caller’s words into text. A retrieval step finds the relevant knowledge-base chunks and injects them into the model’s context for that turn. A small, cost-optimized model, running at temperature 0, reads the chunks and writes the spoken answer. After the call, two separate steps run an audit: a grounding extractor checks that each answer is supported by the chunks the model saw, and a faithfulness evaluation scores whether the answer stayed consistent with those chunks.

Hold on to that last part. The audit reads the chunks the model saw. It becomes the reason the simple version of the fix is wrong.

The first fix, and why it was not enough

The obvious fix is a prompt rule. We added one: never recite the office number unless the caller explicitly asks for it.

It worked most of the time. It failed about one call in twenty. The failures were not random. They were exactly the calls where the caller asked a procedural question whose retrieved chunk ended with the “call our office” instruction. In those cases the model was grounding on a chunk that contained the number, and the chunk won. A small model at temperature 0 follows the text it is grounding on more reliably than it follows a negative rule in the prompt, especially when the number is presented as a natural part of the answer the chunk is teaching.

A five percent failure rate is too high for the most embarrassing class of bug we have, which is sending a caller back to the line they are already on. Negative rules are a safety net, not a primary defense. We needed the number out of the model’s reach.
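A failure rate like this is easiest to argue about when it is measured the same way every time. The sketch below is one hypothetical way to count it over replayed transcripts, not the check we actually ran: the number `555-0100` is a made-up stand-in, the `(speaker, text)` turn shape is an assumption, and the "did the caller ask for the number" test is a deliberately crude keyword heuristic.

```python
import re

# Made-up stand-in for the office number; the real digits are not published.
OFFICE_NUMBER_RE = re.compile(r"\b555[\s.\-]*0100\b")
# Crude heuristic (assumption): the caller asked if they said "number" at all.
ASKED_FOR_NUMBER_RE = re.compile(r"\bnumber\b", re.IGNORECASE)

Turn = tuple[str, str]  # (speaker, text), where speaker is "caller" or "agent"


def call_leaks_number(turns: list[Turn]) -> bool:
    """True if the agent spoke the office number before the caller asked for it."""
    asked = False
    for speaker, text in turns:
        if speaker == "caller" and ASKED_FOR_NUMBER_RE.search(text):
            asked = True
        elif speaker == "agent" and not asked and OFFICE_NUMBER_RE.search(text):
            return True
    return False


def leak_rate(calls: list[list[Turn]]) -> float:
    """Share of calls with at least one unrequested recital of the number."""
    if not calls:
        return 0.0
    return sum(call_leaks_number(call) for call in calls) / len(calls)
```

One leaking call in a batch of twenty gives a rate of 0.05, which is the "one call in twenty" figure expressed as a number a regression check can compare against.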

The real fix: strip it before the model sees it

The durable fix is structural. Remove the office number from each knowledge-base chunk before the chunk ever enters the model’s context. A model cannot relay what it cannot see. The canonical number then lives in one place, the prompt, with explicit conditions for the few cases where speaking it is correct.

The sanitizer is small. The pattern matches the spoken form of the office number, with tolerance for the punctuation between digit groups. The literal digits are sensitive, so they are held in a named pattern rather than written inline:

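import re

# OFFICE_NUMBER_SPOKEN_FORM is the regex source for the office number. It is
# defined elsewhere and deliberately not reproduced in this post.
OFFICE_NUMBER_RE = re.compile(OFFICE_NUMBER_SPOKEN_FORM, re.IGNORECASE)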

OFFICE_NUMBER_PLACEHOLDER = (
    "[office phone redacted from KB. Share only if the caller explicitly "
    "asked for the number; otherwise offer a transfer. The canonical "
    "number is in your prompt rules.]"
)

def sanitize_office_number(text: str) -> str:
    """Strip the office phone number from a KB chunk before the model reads it.

    The caller is already on the office's main line, so a chunk that says
    'call us at the office number' would be relayed verbatim and send them
    back to the line they just dialed. The number still lives in the prompt
    for the legitimate cases (caller asks for it directly, declined a
    transfer, outside business hours, transfer tool failed). The two sibling
    department lines are different numbers and are left untouched.
    """
    if not text:
        return text
    return OFFICE_NUMBER_RE.sub(OFFICE_NUMBER_PLACEHOLDER, text)

It runs at the two points where chunks reach the model: when retrieved chunks are injected into the per-turn context, and when the model calls the knowledge-base search tool directly. Both points pass each chunk through the sanitizer before the model sees it. A second sanitizer for the office email runs in the same place, for a related reason: without it the model spells the address out inconsistently from turn to turn. Same pattern, two cleanups.
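The two entry points can be sketched roughly as follows. This is an illustration under assumptions, not our production code: `build_turn_context`, `kb_search_tool`, the injected `search` callable, the stand-in number `555-0100`, the address `office@example.org`, and both placeholder strings are all hypothetical.

```python
import re
from typing import Callable

# Made-up stand-ins; the real number and address are not published.
OFFICE_NUMBER_RE = re.compile(r"\b555[\s.\-]*0100\b")
OFFICE_EMAIL_RE = re.compile(r"\boffice@example\.org\b", re.IGNORECASE)


def sanitize_for_model(text: str) -> str:
    """Apply both cleanups (phone, then email) to one chunk."""
    text = OFFICE_NUMBER_RE.sub("[office phone redacted]", text)
    return OFFICE_EMAIL_RE.sub("[office email redacted]", text)


def build_turn_context(retrieved_chunks: list[str]) -> str:
    """Entry point 1: chunks injected into the per-turn context."""
    return "\n\n".join(sanitize_for_model(chunk) for chunk in retrieved_chunks)


def kb_search_tool(query: str, search: Callable[[str], list[str]]) -> list[str]:
    """Entry point 2: chunks returned when the model calls the search tool."""
    return [sanitize_for_model(chunk) for chunk in search(query)]
```

Routing both entry points through a single helper means a new cleanup only has to be added in one place, and neither path can drift out of step with the other.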

The subtlety: do not blind your audit

Here is the part that is easy to get wrong. If you replace the number with a placeholder in the chunks, and the audit reads those same chunks, every answer that used one of those chunks becomes untraceable. The grounding step cannot match the spoken answer back to the placeholder text, so it scores the turn as unsupported. Worse, you would be lying to yourself about what the model actually had in front of it.

So the redaction is applied only on the path that builds the model’s context. A separate audit buffer keeps the raw, unsanitized chunks. The grounding extractor and the faithfulness evaluation read from that buffer. The model sees the sanitized version. The audit sees the truth. Two parallel paths from one source. We confirmed this against a replay corpus of 196 real calls, and there were no faithfulness regressions.
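A minimal sketch of the two paths, assuming a per-turn container we will call `TurnProjection` (a hypothetical name, as are `project_chunks`, the stand-in number `555-0100`, and the short placeholder string):

```python
import re
from dataclasses import dataclass, field

# Made-up stand-in for the office number; the real digits are not published.
OFFICE_NUMBER_RE = re.compile(r"\b555[\s.\-]*0100\b")
PLACEHOLDER = "[office phone redacted]"


@dataclass
class TurnProjection:
    """Two views of the same retrieved chunks for one turn."""
    model_chunks: list[str] = field(default_factory=list)  # what the model reads
    audit_chunks: list[str] = field(default_factory=list)  # what the audit reads


def project_chunks(raw_chunks: list[str]) -> TurnProjection:
    """Build both views from one source, sanitizing only the model-facing one."""
    projection = TurnProjection()
    for chunk in raw_chunks:
        projection.audit_chunks.append(chunk)  # raw, untouched
        projection.model_chunks.append(OFFICE_NUMBER_RE.sub(PLACEHOLDER, chunk))
    return projection
```

Both lists are filled in the same loop, so the audit view and the model view always describe the same chunks in the same order, and only the redaction differs.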

One more deliberate choice. The filter is anchored to the office’s own number. It does not touch two sibling-department lines, a parking-enforcement line and a provincial transport line, which are different numbers that callers genuinely need in specific situations. Redacting those would be a new bug, not a fix.
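To make the anchoring concrete, here is a small sketch with made-up numbers: `555-0100` stands in for the office line, `555-0142` for parking enforcement, and `555-0177` for provincial transport. The pattern and the short placeholder are illustrative assumptions, not the real ones.

```python
import re

# Made-up stand-in pattern: tolerates spaces, dots, or hyphens between groups.
OFFICE_NUMBER_SPOKEN_FORM = r"\b555[\s.\-]*0100\b"
OFFICE_NUMBER_RE = re.compile(OFFICE_NUMBER_SPOKEN_FORM, re.IGNORECASE)
PLACEHOLDER = "[office phone redacted]"


def sanitize_office_number(text: str) -> str:
    """Redact only the office's own number; leave every other number alone."""
    if not text:
        return text
    return OFFICE_NUMBER_RE.sub(PLACEHOLDER, text)


chunk = (
    "For help with the form, call our office at 555 0100. "
    "For parking tickets, call 555-0142. "
    "For licence questions, call 555.0177."
)
cleaned = sanitize_office_number(chunk)
```

Because the pattern names the office number's digits rather than matching "anything that looks like a phone number", the two sibling lines pass through unchanged however they are punctuated.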

Hardening

The phone number is the clearest case of a wider problem, so the work did not stop at one regex. We flagged the rest of the knowledge base for other “wrong channel” instructions, like “visit our website” for a caller with no screen, or “send us an email” when the agent cannot send email. Each one wants the same treatment: fix it in the source if the client can, or clean it at the projection layer if they cannot. We also kept the prompt rule in place as a backstop, in case a future import writes the number in a shape the pattern does not match. And we noted that the two sibling lines have no single documented home today, which the next maintenance pass should fix so a number change does not get missed.
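A flagging pass of this kind can start as a simple phrase scan. The sketch below is a hypothetical first cut, not our actual tooling: the phrase list is an assumption drawn from the examples above, and a real pass would need phrases taken from the client's own document.

```python
import re

# Assumed phrase list; extend it with wording found in the real knowledge base.
WRONG_CHANNEL_PATTERNS = {
    "phone": re.compile(r"\bcall (?:us|our office)\b", re.IGNORECASE),
    "web": re.compile(r"\bvisit our website\b", re.IGNORECASE),
    "email": re.compile(r"\b(?:send us an email|email us)\b", re.IGNORECASE),
}


def flag_wrong_channel(chunks: list[str]) -> list[tuple[int, str]]:
    """Return (chunk index, channel label) for each suspect instruction."""
    findings = []
    for index, chunk in enumerate(chunks):
        for label, pattern in WRONG_CHANNEL_PATTERNS.items():
            if pattern.search(chunk):
                findings.append((index, label))
    return findings
```

The output is a review list for a person, not an automatic rewrite: each flagged chunk still needs a decision about whether to fix it in the source or clean it at the projection layer.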

Key takeaways

A document written for one channel will break in another. A knowledge base built for the web is a good starting point for a voice agent and a poor finished product, because it is full of instructions that assume the reader is not on the phone with you. When you find one, the strongest fix is structural: keep the unwanted text out of the model’s input, so the bad behavior is impossible rather than discouraged. Prompt rules are a safety net, not a primary defense, and that gap widens on small models at temperature 0. Finally, when you sanitize what a model sees, make sure your audit still sees the original. Protect the model on one path and preserve the truth on the other, from one shared source.

What this means if you are an IT services firm

If you run a voice agent for a client, ask your team one question: which instructions in our knowledge base assume the caller is on a different channel than the one they are actually on? Most knowledge bases were written for a web reader, and “call our office,” “visit our website,” and “email us” are all hiding in them. Then ask a second question: when we tell the model not to do something, is that the only thing stopping it, or have we also kept the thing out of its reach? On a cost-optimized model, the difference between a prompt rule and a structural constraint is the difference between a bug that shows up one call in twenty and one that cannot happen. Building that judgment into a client’s agent, and keeping the audit honest while you do it, is the substance of how we work with partners.

Related questions
Why did the voice agent tell callers to call the number they had already dialed?
Because the knowledge base was written for the web. On a web page, ending an answer with “call our office for help” is correct, since the reader is at a desk and not yet on the phone. The same document was reused to ground a voice agent answering that office's main line. The model read the instruction as written and sent the caller back to the line they were already on. The model had no way to know the channel had changed.
Why do negative prompt rules fail on small language models?
A rule like “never recite X unless Y” is a negative instruction. On a small, cost-optimized model at temperature 0 it works most of the time, but it fails in exactly the cases where the model is grounding on a retrieved chunk that contains X. The chunk anchors the model harder than the rule does, especially when X is presented as a natural part of the answer the chunk teaches. Positive rules and structural constraints are more reliable than negative ones.
What is a structural fix versus a prompt fix for a language model?
A prompt fix tells the model what to do or not do. A structural fix changes what the model can see or reach, so the unwanted behavior becomes impossible rather than discouraged. Here the structural fix was removing the office number from each knowledge-base chunk before it entered the model context. The model cannot relay what is not in its input. The prompt rule stayed as a backstop for any chunk the filter misses.
How do you redact model input without breaking your evaluation metrics?
Keep two paths from one source. Apply the redaction only on the path that builds the model context. Keep an unredacted copy on the path the post-call audit reads. At taritas the model saw the sanitized chunks, while grounding extraction and the faithfulness evaluation read the raw chunks from a separate audit buffer. That way the model is protected and the audit still traces each response back to the real source text. We verified no faithfulness regressions on a replay corpus.
What other knowledge-base instructions break on a voice agent?
Any instruction that assumes the reader is on a different channel than the caller. “Call our office” is the clearest case. “Visit our website” breaks when the caller has no screen. “Send us an email” breaks when the agent cannot send email and the caller is not at a keyboard. A knowledge base inherited from the web is a good starting point for a voice agent, but it needs a cleanup pass at the point where its text enters the model.

Reading this because a client asked for voice AI? That is the conversation we are built for. What taritas does for partners.

PROJECT taritas.com/blog
DWG POST-15
REV 1.0
DATE 2026-06-30