Apr 26, 2026 · 4 min read
Preventing accidental PII exposure in Claude Code MCP workflows
When you use Claude Code with MCP tools, the assistant calls connected services — CRM, support systems, billing, product databases — and uses the raw output to answer your question.
That output often contains customer names, emails, phone numbers, account notes, and other identifiers that were not the point of the question, but end up in model context anyway. Most of the time this happens quietly and no one notices.
We built a small Claude Code hook to intercept MCP tool output before it reaches the model and redact likely PII from it.
Repository: DaymarkAI/privacy-proxy
Before reading further: this is a guardrail, not a compliance tool. It will miss things. The goal is to reduce accidental exposure on common PII patterns, not eliminate all risk. Non-string fields pass through unredacted. It also does not stop data from being sent by MCP servers in the first place — it only acts on output after it arrives from the MCP server.
Why this happens in practice
MCP tools are built to return rich, structured data. That is what makes them useful. But the same richness means that a query like "which leads did not convert last month" may return names, emails, rep notes, and phone numbers alongside the aggregate answer you actually wanted.
None of the scenarios below are unusual:
Campaign and pipeline analysis
"Which high-intent leads from last month did not convert, and what objections were logged?"
The answer may include names, emails, rep call notes, and contact details.
Account research before outreach
"Summarize this account from HubSpot, Zendesk, and product usage."
Merged output from multiple sources may include direct identifiers and support context.
Churn investigation
"Why did these accounts churn in Q1?"
Tool output may include customer-level conversations and contact records.
Support escalation triage
"Group top escalations by segment and show examples."
Examples can contain copied customer text, emails, and personal context.
Raw MCP output vs. what reaches the model
The hook sits between the two. These are the kinds of responses MCP tools commonly return, and what they look like after redaction.
Pipeline follow-up analysis
Without the hook:
Top missed opportunities:
1) Sarah Kim (sarah.kim@acme.com, +1-415-555-0192), account exec note:
"Budget approved after 2026-03-14, asked for security review."
2) Daniel Ortiz (daniel@northwind.io), called from +1-646-555-0133.
With the hook:
Top missed opportunities:
1) <PRIVATE_PERSON> (<PRIVATE_EMAIL>, <PRIVATE_PHONE>), account exec note:
"Budget approved after <PRIVATE_DATE>, asked for security review."
2) <PRIVATE_PERSON> (<PRIVATE_EMAIL>), called from <PRIVATE_PHONE>.
Support escalation summary
Without the hook:
Escalation #1:
- Customer: Priya Nair
- Email: priya.nair@clientco.com
- Account number: ACC-7841-2299
- API key seen in ticket: sk_live_7f4d9c2a...
- Issue location: 40 Market St, San Francisco
With the hook:
Escalation #1:
- Customer: <PRIVATE_PERSON>
- Email: <PRIVATE_EMAIL>
- Account number: <ACCOUNT_NUMBER>
- API key seen in ticket: <SECRET>
- Issue location: <PRIVATE_ADDRESS>
How it works
The hook sits on Claude Code's PostToolUse event and matches any mcp__.* tool.
When an MCP tool returns output, the hook runs before the model processes it.
It uses OpenAI Privacy Filter (OPF) for detection — OPF is a model-based PII detector, not a regex or rules list. It identifies spans of text that look like personal identifiers and replaces them with typed placeholders. If OPF fails, times out, or is missing, the hook blocks the model step entirely rather than letting unredacted output through (fail-closed).
Canonical UUID strings are allowlisted and pass through unchanged.
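The repo's hook script implements this in Python. As an illustration only, here is a minimal sketch of the same fail-closed pattern — the function names, the recursive walk, and the injectable `detect` callable are ours, not the repo's actual code:

```python
import re

# Canonical UUIDs are allowlisted and pass through unchanged.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I
)

def redact_strings(value, detect):
    """Recursively redact string fields; non-strings pass through as-is."""
    if isinstance(value, str):
        if UUID_RE.match(value):
            return value              # allowlisted identifier
        return detect(value)          # model-based detection (OPF in the repo)
    if isinstance(value, dict):
        return {k: redact_strings(v, detect) for k, v in value.items()}
    if isinstance(value, list):
        return [redact_strings(v, detect) for v in value]
    return value                      # ints, bools, None: unredacted (known limitation)

def filter_tool_output(tool_response, detect):
    """Return (blocked, payload). Fail closed if the detector errors out."""
    try:
        return False, redact_strings(tool_response, detect)
    except Exception as exc:
        # Block the model step rather than pass unredacted output through.
        return True, f"privacy filter unavailable: {exc}"
```

The key design choice is in `filter_tool_output`: any detector failure blocks the step instead of silently returning the original text.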
What gets redacted:
- <PRIVATE_PERSON> — person names
- <PRIVATE_EMAIL> — email addresses
- <PRIVATE_PHONE> — phone numbers
- <PRIVATE_DATE> — personal dates
- <PRIVATE_ADDRESS> — physical addresses
- <PRIVATE_URL> — personal URLs
- <ACCOUNT_NUMBER> — account or financial identifiers
- <SECRET> — API keys, tokens, credential-like strings
Setup
Requirements: Python 3, Claude Code
1. Clone the repo and install OPF
For project-level use, set up the repo in your project's .claude directory.
For global use across all Claude Code sessions, set up the repo in the ~/.claude directory.
git clone https://github.com/DaymarkAI/privacy-proxy.git privacy_proxy
cd privacy_proxy
git clone https://github.com/openai/privacy-filter.git privacy-filter
cd privacy-filter
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
cd ..
2. Add the hook to Claude settings
For project-level use, add this to .claude/settings.json in your project:
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "mcp__.*",
        "hooks": [
          {
            "type": "command",
            "command": "python3 \"$CLAUDE_PROJECT_DIR/claude_hooks/post_tool_use_pii_redact.py\""
          }
        ]
      }
    ]
  }
}
For global use across all Claude sessions, add the same block to
~/.claude/settings.json, replacing $CLAUDE_PROJECT_DIR with
$HOME/.claude/privacy_proxy.
3. Restart your Claude Code session
4. Quick sanity check
./privacy-filter/.venv/bin/opf --device cpu "Alice was born on 1990-01-02."
You should see the name and date replaced with typed placeholders.
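If you want to run the same sanity check from a script — say, in a smoke test — one option is to shell out to the CLI. This is our own sketch, assuming only the opf invocation shown above; the hook's real integration may work differently:

```python
import subprocess

def redact_via_cli(text, opf_cmd=("./privacy-filter/.venv/bin/opf", "--device", "cpu")):
    """Run the opf CLI on a string and return its stdout, or raise on failure."""
    result = subprocess.run(
        [*opf_cmd, text], capture_output=True, text=True, timeout=120
    )
    result.check_returncode()  # surfacing failures lets callers fail closed
    return result.stdout.strip()
```

In keeping with the hook's fail-closed behavior, a non-zero exit raises instead of returning the input text unredacted.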
Known limitations
- Only runs on PostToolUse for mcp__.* tools. Other hook events are not covered.
- Only string fields are scanned. Non-string values in tool output pass through unchanged.
- Detection quality is bounded by OPF's model — it will miss some patterns and may flag some non-PII.
- Does not control what MCP servers include in their responses. The hook intercepts output after the MCP tool returns it and before the model sees it — but if a server returns sensitive data, that data still traveled from the source to your machine.
- Not a compliance guarantee.
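The string-only limitation is easy to demonstrate. In this sketch (our own illustration, with a toy regex detector standing in for OPF), a phone number stored as an integer survives redaction untouched:

```python
import re

PHONE_RE = re.compile(r"\+?\d[\d\-\s]{7,}\d")

def toy_detect(text):
    """Stand-in string detector: masks phone-like digit runs only."""
    return PHONE_RE.sub("<PRIVATE_PHONE>", text)

record = {
    "phone_text": "+1-415-555-0192",  # string: gets redacted
    "phone_int": 14155550192,         # int: passes through unchanged
}
redacted = {
    k: toy_detect(v) if isinstance(v, str) else v for k, v in record.items()
}
```

If an MCP server returns identifiers as numbers, booleans, or nested binary blobs, the hook will not touch them.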
