Most RAG-powered support agents quietly leak customer PII. Not in the dramatic "we got hacked" way. In the subtle, persistent, hard-to-detect way that surfaces at SOC 2 audit time when an auditor finds 47 instances of customer A's data showing up in conversations with customer B.
The mechanism is almost always the same. The fix is two lines. Most teams miss both lines because the RAG tutorials they followed don't mention them.
A support agent retrieves "the most relevant chunks" from a vector store containing all customer records. The chunks are most relevant to the question, not the customer asking it. So a chunk containing Customer A's recent ticket about a refund can be retrieved into Customer B's session — because the words "refund" and "yesterday" happen to vector-match. The LLM then helpfully summarizes details that belong to a different person.
How to detect it in your own stack — 4-minute audit
Open your agent's conversation log. Filter to the last 200 conversations. For each conversation, check:
- Does the agent's response reference any first name, email fragment, or order ID that doesn't appear in the user's own profile or message?
- Does the agent reference dates ("last Tuesday," "your June order") that don't match the user's account timeline?
- Does the response include any phone-number-shaped or address-shaped strings that the user didn't provide?
If you find any of these in 200 conversations, you have the leak. We've audited 60+ RAG support agents and found this in 41 of them. The base rate is roughly two-thirds of production deployments.
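The checks above can be roughed out as a script before you do the manual read-through. This is a minimal sketch, not a real PII detector: the conversation shape (`id`, `response`, `user_message`, `profile_text`), the `ORD-` order-ID format, and the regexes are all assumptions made for illustration. It covers emails, phone-shaped strings, and order IDs only. First names and date mismatches need NER or account-timeline data, and the phone pattern will produce false positives on other long digit runs.

```python
import re

# Illustrative patterns only; the "ORD-12345" order-ID format is a made-up example.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "order_id": re.compile(r"\bORD-\d+\b"),
}


def find_foreign_identifiers(response, known_text):
    """Return identifier-like strings in `response` that never appear in `known_text`."""
    known = known_text.lower()
    hits = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.findall(response):
            if match.lower() not in known:
                hits.append((kind, match))
    return hits


def audit(conversations):
    """Flag conversations whose agent response mentions identifiers the user never supplied.

    Each conversation is assumed to be a dict with the hypothetical keys
    "id", "response", "user_message", and "profile_text".
    """
    flagged = {}
    for convo in conversations:
        known_text = convo["user_message"] + " " + convo["profile_text"]
        hits = find_foreign_identifiers(convo["response"], known_text)
        if hits:
            flagged[convo["id"]] = hits
    return flagged
```

Anything this flags still needs a human look. Treat it as a way to shortlist conversations, not as the verdict.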
The retrieval filter every RAG support agent needs
The fix happens at retrieval, not at generation. Most teams try to fix it at generation ("we'll prompt the LLM not to mix up customers"), which fails in production: once retrieval has injected another customer's chunk into context, the LLM has no "this isn't your data" signal to act on.
You need to filter the vector store query before the chunks reach the LLM. Most vector stores support this at query time: Pinecone, Weaviate, and Qdrant accept a metadata filter on the query call, and pgvector uses an ordinary SQL WHERE clause. Set customer_id as metadata when you upsert. Pass it as a filter when you query.
Line 1: tag every chunk with customer_id at upsert
# DON'T do this:
index.upsert([(chunk_id, embedding, {"text": chunk_text})])

# DO this:
index.upsert([(chunk_id, embedding, {
    "text": chunk_text,
    "customer_id": chunk.customer_id,
    "doc_type": chunk.doc_type,
    "created_at": chunk.created_at,
})])
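Two minutes of work at upsert time. Query overhead is usually small, but not automatically zero: Pinecone indexes metadata fields by default, while stores such as Qdrant and pgvector filter fastest when you create an index on the filtered field. Chunks already in the index need a one-time backfill, since a chunk with no customer_id will never match the filter.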
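One way to make sure no untagged chunk ever reaches the index is to build the record through a helper that refuses chunks without an owner. A minimal sketch, assuming a hypothetical `Chunk` dataclass and `to_record` helper (neither is part of any vector-store client), with the tuple shape taken from the upsert example above:

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Chunk:
    # Hypothetical chunk shape; field names mirror the metadata keys used in this post.
    chunk_id: str
    text: str
    customer_id: str
    doc_type: str
    created_at: datetime


def to_record(chunk, embedding):
    """Build an (id, vector, metadata) tuple, refusing chunks without an owner."""
    if not chunk.customer_id:
        raise ValueError(f"chunk {chunk.chunk_id} has no customer_id; refusing to upsert")
    created_at = chunk.created_at
    if created_at.tzinfo is None:
        # Assumption: naive timestamps are UTC.
        created_at = created_at.replace(tzinfo=timezone.utc)
    metadata = {
        "text": chunk.text,
        "customer_id": chunk.customer_id,
        "doc_type": chunk.doc_type,
        # Stored as Unix seconds so numeric range filters can compare it.
        "created_at": int(created_at.timestamp()),
    }
    return (chunk.chunk_id, embedding, metadata)
```

Usage would look like `index.upsert([to_record(chunk, embedding)])`. Raising at ingest time is a design choice: a loud failure in the pipeline is cheaper than a silent, ownerless chunk in the index.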
Line 2: pass customer_id as a filter on every query
# DON'T do this:
results = index.query(vector=q_embed, top_k=5)

# DO this:
results = index.query(
    vector=q_embed,
    top_k=5,
    filter={"customer_id": {"$eq": session.customer_id}},
)
Now retrieval can only return chunks belonging to the customer asking the question, provided session.customer_id comes from the authenticated session and never from anything the user typed. The LLM never sees other customers' data. The leak is structurally impossible: not "we hope the model behaves," but "the data physically isn't in context."
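"Every query" is easier to guarantee when there is exactly one function allowed to call the index. Here is a hedged sketch of such a wrapper. The name `scoped_query` is hypothetical, and it assumes `index.query` returns a plain list of dicts that each carry a `metadata` dict. Real clients typically return a response object instead (and may need a flag to include metadata), so adapt the result handling to yours.

```python
def scoped_query(index, q_embed, customer_id, top_k=5):
    """Query `index` for chunks owned by `customer_id` only, failing closed.

    `index` is assumed to expose query(vector=..., top_k=..., filter=...) as in
    the example above and to return matches as dicts with a "metadata" dict.
    """
    if not customer_id:
        # Fail closed: a missing ID must never fall back to an unfiltered query.
        raise PermissionError("no customer_id on session; refusing to query")
    matches = index.query(
        vector=q_embed,
        top_k=top_k,
        filter={"customer_id": {"$eq": customer_id}},
    )
    # Defense in depth: drop anything that slipped past the store-side filter.
    return [m for m in matches if m["metadata"].get("customer_id") == customer_id]
```

The post-filter is redundant when the store behaves, which is the point: if a misconfigured index ever ignores the filter, the wrapper still returns nothing foreign.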
The edge cases the tutorial doesn't cover
Shared knowledge-base chunks (FAQs, policies)
Some chunks legitimately belong to all customers — a "how do I reset my password" FAQ entry shouldn't be filtered out. Two patterns we use:
- Sentinel customer_id: tag shared docs with customer_id: "shared". Query with filter={"customer_id": {"$in": [session.customer_id, "shared"]}}.
- Two-namespace pattern: separate indexes for customer-scoped and shared content. Query both, merge results. Slightly more code, much cleaner audit trail.
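The two-namespace pattern is the one that takes "slightly more code," so here is roughly what that code could look like. This is a sketch under assumptions: `query_customer_and_shared` is a hypothetical helper, both indexes are assumed to return lists of dicts with a `score` key, and higher scores are assumed to mean closer matches (true for cosine or dot-product similarity, reversed for distance metrics).

```python
def query_customer_and_shared(customer_index, shared_index, q_embed, customer_id, top_k=5):
    """Query a customer-scoped index and a shared index, then merge by score."""
    own = customer_index.query(
        vector=q_embed,
        top_k=top_k,
        filter={"customer_id": {"$eq": customer_id}},
    )
    # The shared index holds no customer data, so it needs no owner filter.
    shared = shared_index.query(vector=q_embed, top_k=top_k)
    # Record which index each match came from so the audit trail stays explicit.
    tagged = [dict(m, source="customer") for m in own]
    tagged += [dict(m, source="shared") for m in shared]
    tagged.sort(key=lambda m: m["score"], reverse=True)
    return tagged[:top_k]
```

Tagging each match with its source is what buys the cleaner audit trail: every chunk in context can be traced to either this customer or the shared pool.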
Account hierarchies (parent → child accounts)
If Customer A's admin should see Customer A's team's tickets, filter on {"account_id": {"$eq": session.account_id}} rather than customer_id alone, and tag every chunk with account_id at upsert. Otherwise admins can't help their own team. Get the scope wrong and you're back to the leak, just along a different axis.
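Choosing the axis can live in one small function so it is decided once per session rather than at every call site. A minimal sketch, assuming the session is a dict with hypothetical `is_admin`, `account_id`, and `customer_id` keys, and that chunks were tagged with both IDs at upsert:

```python
def build_scope_filter(session):
    """Pick the filter axis from the session: account-wide for admins, else per customer."""
    if session.get("is_admin"):
        if not session.get("account_id"):
            raise PermissionError("admin session without account_id")
        return {"account_id": {"$eq": session["account_id"]}}
    if not session.get("customer_id"):
        raise PermissionError("session without customer_id")
    return {"customer_id": {"$eq": session["customer_id"]}}
```

Both branches fail closed: a session missing the ID for its axis gets an error, never a broader filter.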
Time-bounded retrieval
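For agents that should not reference data older than, say, 90 days (deleted ticket history, GDPR retention windows), add {"created_at": {"$gte": ninety_days_ago_unix}}. This only works if created_at is stored as a numeric Unix timestamp, because in stores like Pinecone the range operators compare numbers, not date strings. Same pattern, same line count, prevents time-leaks.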
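For concreteness, here is one way to compute the cutoff and combine it with the customer filter. A sketch, assuming the store accepts a Mongo-style `$and` operator (Pinecone does; check your own store's filter syntax) and that `created_at` was stored as Unix seconds. `time_bounded_filter` is a hypothetical helper name.

```python
import time

NINETY_DAYS_SECONDS = 90 * 24 * 60 * 60


def time_bounded_filter(customer_id, now=None, max_age_seconds=NINETY_DAYS_SECONDS):
    """Combine the customer filter with a created_at cutoff expressed in Unix seconds."""
    now = time.time() if now is None else now
    cutoff = int(now - max_age_seconds)
    return {
        "$and": [
            {"customer_id": {"$eq": customer_id}},
            {"created_at": {"$gte": cutoff}},
        ]
    }
```

Taking `now` as a parameter keeps the function testable and makes the retention window explicit at the call site.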
Why this isn't fixed by your prompt engineering
You can't reliably stop an LLM from summarizing what's in its context window. The model is trained to be helpful with the information present. If you put another customer's data in front of it, it will use that data. Defensive prompts ("only reference the current customer's account") work in test cases and fail in production once context windows fill up and instruction-following degrades.
The right architectural answer to "is the LLM behaving correctly?" is "the LLM never had the chance to behave incorrectly because the data wasn't there."
Filter at retrieval. Trust the LLM with only the data the customer is allowed to see. That's the entire fix.
How to verify the fix held
After deploying, rerun the audit from the top of this post 7 days later. A fresh sample of 200 post-deploy conversations should show zero cross-customer references. We also recommend permanent monitoring:
- Log every retrieved chunk's customer_id alongside the session's customer_id. Alert when they don't match.
- Use named-entity-recognition (NER) on agent responses to flag any first-names or email-like strings that don't appear in the session profile.
- Run a weekly random-sample audit of 50 conversations until the metric stays at zero for 30 days.
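The first two monitors can start as a few lines in the request path. The sketch below makes several assumptions: matches are dicts with `id` and `metadata` keys, shared content uses a `"shared"` sentinel owner, and a regex stands in for NER on the email check only (flagging first names needs an actual NER model, which is out of scope here). The function and logger names are hypothetical.

```python
import logging
import re

logger = logging.getLogger("rag_leak_monitor")

# Rough email shape; a stand-in for real NER, not a validator.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def check_retrieval(session_customer_id, matches, allowed_shared=("shared",)):
    """Return ids of retrieved chunks owned by neither the session customer nor a shared owner."""
    bad = []
    for m in matches:
        owner = m["metadata"].get("customer_id")
        if owner != session_customer_id and owner not in allowed_shared:
            bad.append(m["id"])
    if bad:
        logger.error("cross-customer retrieval: session=%s chunks=%s", session_customer_id, bad)
    return bad


def unknown_emails(response, profile_text):
    """Email-like strings in the response that are absent from the session profile."""
    known = profile_text.lower()
    return [e for e in EMAIL_RE.findall(response) if e.lower() not in known]
```

Note that a chunk with no customer_id at all is reported as a mismatch too. Wire the error log to whatever alerting you already use.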
For deeper background on the RAG architecture this fix lives inside, see our earlier deep-dive: RAG Configuration Deep Dive. For the broader question of whether your agent layer is actually doing what its dashboard says, see Why Your Follow-Up Agent Should Fail Loudly.