How We Prevent AI Hallucination in Data Search

Generative AI is remarkable at synthesising language — but that same fluency becomes a liability when users need factual answers from their own data. Ask an unprepared AI "what is the price of SKU LF-100016?" and it will confidently produce a number. It just might not be the right number. This post explains the four-layer system we built to make sure every answer our AI gives is grounded in real, retrieved data — not in its training memory.

The Root Problem: AI Fills Silence with Confidence

Large language models are trained to be helpful. When they don't know something, they don't stop — they extrapolate. In a customer-facing data search product, that means an AI might recommend a product that isn't in your catalogue, quote a price it invented, or describe a record that never existed. We call this the "helpful hallucination" trap.

The solution isn't a smarter model. It's a better pipeline.

Layer 1: Search First, Talk Second

The most important rule is the simplest: the AI never answers from memory if the question is about data.

Before a single token is generated, the system performs a hard database search. Raw rows are fetched directly from the database — up to 50,000 rows per dataset — and the matching results are formatted into a structured table. That table is injected into the AI's prompt inside a clearly delimited === VERIFIED DATA === block. The AI is not asked to recall. It is asked to read and report.
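
To make the ordering concrete, here is a minimal sketch of the search-then-prompt step. It assumes a SQLite table named products with sku, name, and price columns, a closing === END VERIFIED DATA === marker, and an invented sample row. All of these are hypothetical stand-ins, and applying the 50,000-row cap as a SQL LIMIT is one possible reading of the cap rather than a description of the production code.

```python
import sqlite3

MAX_ROWS = 50_000  # per-dataset cap mentioned above; enforcing it via LIMIT is an assumption


def fetch_rows(conn, sku):
    """Run the hard database search before any text is generated."""
    cur = conn.execute(
        "SELECT sku, name, price FROM products WHERE sku = ? LIMIT ?",
        (sku, MAX_ROWS),
    )
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchall()


def format_table(columns, rows):
    """Render rows as a pipe-delimited table, copying values exactly as stored."""
    lines = [" | ".join(columns)]
    lines += [" | ".join(str(value) for value in row) for row in rows]
    return "\n".join(lines)


def build_prompt(question, columns, rows):
    """Wrap the retrieved table in a delimited block the model is told to read from."""
    table = format_table(columns, rows)
    return (
        "Answer using only the data between the markers.\n"
        "=== VERIFIED DATA ===\n"
        f"{table}\n"
        "=== END VERIFIED DATA ===\n"  # hypothetical closing marker
        f"Question: {question}"
    )


# Hypothetical demo data; the table layout and product name are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (sku TEXT, name TEXT, price TEXT)")
conn.execute("INSERT INTO products VALUES ('LF-100016', 'Example Lamp', '49.90')")

columns, rows = fetch_rows(conn, "LF-100016")
prompt = build_prompt("What is the price of SKU LF-100016?", columns, rows)
print(prompt)
```

The price is stored as text in this sketch so that formatting the table cannot alter it, for example by dropping a trailing zero.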

Layer 2: The "No Results" Override

A subtler failure mode is what happens when the search runs but finds nothing. A naive setup hands the empty result to the AI, which then cheerfully pivots to general knowledge: "I couldn't find that in your database, but here are some popular alternatives…"

We block this entirely. When a search was genuinely attempted but returned zero matching records, the AI's instructions are completely replaced with a hard prohibition: it must tell the user nothing was found, and it must not suggest, recommend, or describe anything from its training knowledge as a substitute. The database is the authority. Silence from the database means silence from the AI.
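
A rough sketch of that override follows. The instruction strings and the select_instructions function are illustrative names, not the production prompt. The point it shows is that the zero-result branch swaps the whole instruction set instead of appending a warning to the normal one.

```python
from typing import Sequence

# All three instruction texts are hypothetical wording for illustration.
GENERAL_INSTRUCTIONS = "Answer conversationally."
DATA_INSTRUCTIONS = "Answer using only the rows in the VERIFIED DATA block."
NO_RESULTS_INSTRUCTIONS = (
    "The database search returned no matching records. "
    "Tell the user that nothing was found. "
    "Do not suggest, recommend, or describe anything from training knowledge."
)


def select_instructions(search_attempted: bool, rows: Sequence) -> str:
    """Pick the instruction set for this turn."""
    if not search_attempted:
        # Not a data question, so the override does not apply.
        return GENERAL_INSTRUCTIONS
    if len(rows) == 0:
        # A search genuinely ran and found nothing: replace, do not append.
        return NO_RESULTS_INSTRUCTIONS
    return DATA_INSTRUCTIONS


print(select_instructions(search_attempted=True, rows=[]))
```

Keeping the search_attempted flag separate from the row count matters: an empty result only triggers the prohibition when a search actually ran.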

Layer 3: Entity-Presence Verification

For queries that name a specific entity — a place, a person, a product — there is an additional gate before results reach the AI. The system checks whether that named entity actually appears in any of the retrieved document chunks. If it doesn't, the results are discarded outright, regardless of how high their relevance score is.

This closes off a common route to hallucination: a keyword query can give a high relevance score to a chunk that is topically related but never mentions the thing the user asked about. By requiring the entity to be present in the text, not just related to it, we ensure the AI is always anchored to something real.
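
The gate can be sketched as follows. The Chunk type, the entity_gate function, and the case-insensitive substring match are all assumptions made for illustration. A real system might match aliases or use a tokenizer, and the post does not say which.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    text: str
    score: float  # relevance score from the retriever


def normalise(text: str) -> str:
    """Case-fold and collapse whitespace so matching ignores trivial differences."""
    return " ".join(text.casefold().split())


def entity_gate(entity: str, chunks: List[Chunk]) -> List[Chunk]:
    """Keep the results only if the named entity appears in at least one chunk."""
    needle = normalise(entity)
    if any(needle in normalise(chunk.text) for chunk in chunks):
        return chunks
    # The entity is absent from every chunk: discard everything, whatever the scores.
    return []


related = [Chunk("Hotels near the harbour offer sea views.", 0.93)]
print(entity_gate("Lisbon", related))  # high score, but the entity never appears
```

Note that the score field is never consulted inside the gate. That is deliberate in this sketch: presence in the text is the only criterion.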

Layer 4: Strict Prompt Constraints

Finally, the AI receives a set of non-negotiable rules alongside the verified data:

Use values verbatim. Numbers, prices, SKUs, and URLs must be copied exactly as they appear in the data — never paraphrased, rounded, or estimated.
No recommendations outside the data. If the user asks "which product should I buy?", the AI may only recommend items that appear in the search results. It cannot draw on general training knowledge to suggest brands or alternatives not present in the dataset.
No external-knowledge disclaimers. The AI must not use what it knows about the world to speculate about, or add caveats about, features that seem to be "missing" from a record. The data is treated as complete and authoritative; an empty field is not a gap to be filled.
Conversation context for pronouns only. Prior chat turns are used solely to resolve references like "it" or "that one" — never to filter or colour the current search. Each new factual question is treated independently.
Ambiguity disclosure. When a partial name matches multiple entities in the dataset, the system detects this and tells the AI explicitly, preventing it from silently picking one or claiming data is missing.
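
The ambiguity check in the last rule could look something like the sketch below. The function names, the substring matching, and the wording of the note are assumptions. The idea it illustrates is that the system, not the model, counts the matches and states the result explicitly in the prompt.

```python
from typing import List, Optional


def find_matches(partial: str, names: List[str]) -> List[str]:
    """Return every entity name that contains the partial name, ignoring case."""
    needle = partial.casefold()
    return [name for name in names if needle in name.casefold()]


def ambiguity_note(partial: str, names: List[str]) -> Optional[str]:
    """Build an explicit note for the prompt when more than one entity matches."""
    matches = find_matches(partial, names)
    if len(matches) <= 1:
        return None  # zero or one match: nothing ambiguous to disclose
    listed = "; ".join(matches)
    return (
        f'AMBIGUITY: "{partial}" matches {len(matches)} entities: {listed}. '
        "Do not pick one silently and do not claim the data is missing."
    )


# Hypothetical catalogue names, invented for the example.
catalogue = ["Acme Desk Lamp", "Acme Floor Lamp", "Birch Table"]
print(ambiguity_note("acme", catalogue))
```

When the function returns None, the prompt is left unchanged and the other rules apply as usual.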

The Net Effect

Each layer addresses a different failure mode: the first ensures that answers about data are drawn from retrieved rows rather than from the model's memory; the second stops the AI from inventing alternatives when nothing matches; the third prevents entity misattribution; and the fourth governs how the AI is allowed to present what it found.

No single guard is sufficient on its own. Together, they create a pipeline where the AI's role is to surface and present — not to generate, estimate, or supplement. The result is a search experience users can trust, because every answer traces back to a row in the database.