PII Handling in AI: Best Practices

From AISApedia, the AI skills & terms encyclopedia

PII handling in AI contexts refers to the practices and protocols for identifying, sanitising, and protecting personally identifiable information before it enters language model inputs or outputs. Because large language models may retain and surface training data unpredictably, any PII included in prompts — names, emails, financial details — creates privacy risk that extends well beyond the immediate conversation.

What counts as PII when working with AI?

PII encompasses any data that can identify an individual, either directly or in combination with other information. Direct identifiers include names, email addresses, phone numbers, national ID numbers, and financial account details. Indirect identifiers — job titles, geographic locations, dates of birth — become PII when combined in ways that narrow identification to a single person.

In AI workflows, the risk surface expands beyond traditional data handling. A prompt containing a customer's complaint, purchase history, and approximate location may not include a name, yet the combination can uniquely identify them. Language models process all input as context, meaning PII embedded anywhere in a prompt influences the generated output and may be echoed, paraphrased, or recombined in unexpected ways.

Regulatory frameworks such as GDPR, CCPA, and HIPAA — covered in more depth under AI data privacy — each define PII slightly differently, but the practical implication is the same: if data could trace back to an individual, treat it as sensitive before it reaches any AI system. Teams that have not mapped which regulatory frameworks apply to their data are operating with unquantified risk.

A useful exercise is to audit the last ten prompts your team sent to AI tools and highlight every piece of information that could contribute to identifying a person. Most teams find PII in places they did not expect, a common safety guardrails gap: customer names embedded in error logs, email addresses in forwarded threads, or employee details in meeting notes that were pasted in for summarisation.

Why must PII be stripped before AI processing, not after?

Once PII enters a language model's input, the damage window is already open. Even in API-based interactions where the provider claims not to train on user data, the information traverses network infrastructure, may be logged for abuse detection (see data retention policies), and exists in memory during processing. Post-processing sanitisation — redacting PII from the output — addresses only the visible surface while leaving the underlying exposure intact.

Pre-processing sanitisation replaces identifiable data with placeholder tokens before the prompt is sent. This approach preserves the analytical value of the text while removing the direct identifiers that create most of the privacy risk. A customer complaint reading '[CUSTOMER] reports that order [ORDER_ID] arrived damaged at [LOCATION]' gives the model everything it needs for troubleshooting without exposing anyone's identity. The model can still analyse sentiment, identify the issue category, and suggest a resolution.

Teams that rely on after-the-fact redaction frequently discover edge cases where the model weaves PII into its reasoning in ways that are difficult to catch programmatically. The model might mention the customer by name in an analogy, reference their email in a suggested follow-up action, or combine partial details into a re-identifiable profile. The only reliable strategy is to ensure PII never enters the model in the first place.

The legal implications reinforce this principle. Under GDPR, processing personal data through a third-party AI service may constitute a data transfer that requires specific legal bases, data processing agreements, and potentially Data Protection Impact Assessments. Pre-processing sanitisation can reduce or avoid these obligations, but only to the extent that what leaves your systems can no longer be linked back to an individual. Text that remains re-identifiable after placeholder substitution is pseudonymised rather than anonymised, and is still treated as personal data.

How do teams implement PII sanitisation in practice?

The most accessible approach is regex-based pattern matching: regular expressions that detect email addresses, phone numbers, credit card formats, and similar structured identifiers. These patterns run on the client side before any API call, replacing matches with category labels like [EMAIL] or [PHONE]. Regex handles structured PII well but struggles with unstructured identifiers like names embedded in natural language.
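A minimal sketch of this client-side step, assuming three illustrative patterns. The `PATTERNS` table and the `sanitise` function are hypothetical names, and the phone and card expressions are deliberately simple, so treat them as a starting point rather than complete coverage.

```python
import re

# Illustrative patterns only; real data will need broader, tested coverage.
# Order matters: card numbers are replaced before the looser phone pattern runs.
PATTERNS = {
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "CARD": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
    "PHONE": re.compile(r"(?<!\w)\+?\d[\d\s().-]{7,}\d\b"),
}

def sanitise(text: str) -> str:
    """Replace each structured identifier with its category label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

if __name__ == "__main__":
    print(sanitise("Reach me at sarah.j@company.com or +44 20 7946 0958."))
```

Names and other free-text identifiers pass through this function untouched, which is the limitation described above.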

For unstructured PII, named entity recognition (NER) models provide a second layer of protection. Libraries such as spaCy, Presidio (from Microsoft), and cloud-based entity detection services can identify person names, organisations, and locations within free text. Combining regex for structured data with NER for unstructured data covers the majority of PII categories. The NER layer catches cases that regex misses — a customer name mentioned in prose, a company name that could identify an individual at a small firm, or a location specific enough to narrow identification.
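The layering can be sketched as a redactor that accepts any number of detectors and merges their findings. Everything here is an assumption for illustration: the `Span` shape, the `redact` function, and especially `stub_name_detector`, which stands in for a real NER model (in practice that detector would wrap a library such as spaCy or Presidio).

```python
import re
from typing import Callable, Iterable

Span = tuple[int, int, str]  # (start, end, label)
Detector = Callable[[str], Iterable[Span]]

EMAIL = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")

def regex_detector(text: str) -> list[Span]:
    """Structured layer: report every email address as a span."""
    return [(m.start(), m.end(), "EMAIL") for m in EMAIL.finditer(text)]

def stub_name_detector(text: str) -> list[Span]:
    """Stand-in for a real NER model: finds one hard-coded name."""
    name = "Sarah Johnson"
    i = text.find(name)
    return [(i, i + len(name), "PERSON")] if i >= 0 else []

def redact(text: str, detectors: list[Detector]) -> str:
    """Run all detectors, then replace their spans left to right."""
    spans = sorted(s for detect in detectors for s in detect(text))
    out, cursor = [], 0
    for start, end, label in spans:
        if start < cursor:
            # Overlaps a region already redacted: extend it, emit no new label.
            cursor = max(cursor, end)
            continue
        out.append(text[cursor:start])
        out.append(f"[{label}]")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)
```

Collecting spans first and replacing once keeps character offsets valid for every detector, which would not hold if each layer rewrote the text in turn.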

In production pipelines, teams typically implement sanitisation as middleware, a common API integration pattern: a processing step that sits between the user input and the AI API call. Provided every integration routes its calls through that step, no code path can accidentally bypass sanitisation, regardless of which developer writes the integration. The middleware logs what was redacted (by category, not content) for audit compliance without creating a secondary store of sensitive data.
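One way to sketch the middleware idea is a wrapper around whatever function sends the prompt. The names `with_sanitisation` and `call_model` are hypothetical, and only an email pattern is included to keep the example short; the point is that the log records categories and counts, never the redacted values.

```python
import logging
import re
from collections import Counter
from typing import Callable

logger = logging.getLogger("pii_middleware")

PATTERNS = {
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
}

def sanitise(text: str) -> tuple[str, Counter]:
    """Return the sanitised text and a per-category count of replacements."""
    counts: Counter = Counter()
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        if n:
            counts[label] += n
    return text, counts

def with_sanitisation(call_model: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a model call so that every prompt is sanitised before it is sent."""
    def wrapped(prompt: str) -> str:
        clean, counts = sanitise(prompt)
        # Log categories and counts only, never the original values.
        logger.info("redacted before model call: %s", dict(counts))
        return call_model(clean)
    return wrapped
```

If the wrapped function is the only one exported to application code, callers have no direct route to the unsanitised client.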

For teams processing data at scale, dedicated PII detection services offer higher accuracy than DIY regex patterns. These services are trained specifically on PII detection across multiple languages and formats, handling international phone numbers, varied address formats, and cultural name patterns that a simple regex library would miss. The investment is justified when the volume or sensitivity of data makes a missed detection consequential.

When does PII sanitisation fail, and how can you detect gaps?

The most common failure mode is incomplete coverage: a sanitisation pipeline that catches emails and phone numbers but misses account numbers, medical record identifiers, or IP addresses. Each industry has domain-specific PII formats that generic regex patterns will not match. Financial services teams need to handle IBAN numbers, healthcare teams need to address patient identifiers, and e-commerce teams need to sanitise shipping addresses.
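As a rough illustration of extending coverage, the table below adds three domain patterns. The IBAN and IPv4 expressions are simplified (no checksum or range validation, and IBANs printed with spaces are not matched), and the `MRN-` format is entirely hypothetical, since record identifiers vary by organisation.

```python
import re

DOMAIN_PATTERNS = {
    # Two-letter country code, two check digits, then 11 to 30 alphanumerics.
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    # Dotted quad; does not check that each octet is in the 0-255 range.
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    # Hypothetical medical record format; replace with your own scheme.
    "MRN": re.compile(r"\bMRN-\d{6,10}\b"),
}

def sanitise_domain(text: str) -> str:
    """Replace domain-specific identifiers with category labels."""
    for label, pattern in DOMAIN_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```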

Another frequent gap is contextual PII — information that is not identifiable in isolation but becomes so in combination. A message mentioning 'the CEO of [COMPANY] who attended [EVENT] last Tuesday' may have had the name redacted, but the remaining details still uniquely identify the individual. Detecting these combinations requires understanding the data's context, not just its format.

Over-sanitisation creates a different problem: replacing so much text with tokens that the model cannot produce useful output. If every noun is replaced with a placeholder, the prompt becomes incomprehensible. Effective sanitisation balances privacy protection with utility preservation — removing what is identifiable while keeping what is analytically necessary.
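A crude way to notice over-sanitisation is to measure how much of a prompt has become placeholders. This is a heuristic sketch, not an established metric: the `placeholder_ratio` function is a hypothetical helper, and any alert threshold would need tuning on your own data.

```python
import re

# Matches labels such as [EMAIL] or [ORDER_ID].
PLACEHOLDER = re.compile(r"\[[A-Z_]+\]")

def placeholder_ratio(sanitised: str) -> float:
    """Fraction of whitespace-separated tokens that contain a placeholder."""
    tokens = sanitised.split()
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if PLACEHOLDER.search(token))
    return hits / len(tokens)
```

A prompt whose ratio climbs well above the team's usual range is worth reviewing by hand before it is sent.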

Regular auditing is the practical countermeasure. Teams that take data classification for AI seriously run periodic tests, similar to red-teaming LLMs, in which known PII patterns are injected into the pipeline to verify they are caught. Red-team exercises, where testers deliberately craft inputs designed to slip through the sanitisation layer, reveal gaps that routine testing misses. These audits should run at least quarterly, and after any change to the sanitisation pipeline or the data sources it processes.
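The injection test can be sketched as a small harness that feeds synthetic "canary" values through the sanitiser and reports which ones survive. The canary values, the `audit` function, and the deliberately incomplete `email_only` sanitiser are all made up for illustration.

```python
import re
from typing import Callable

# Synthetic values that must never appear in sanitised output.
CANARIES = {
    "EMAIL": "canary.user@example.com",
    "IPV4": "203.0.113.7",
    "IBAN": "GB82WEST12345698765432",
}

def audit(sanitise: Callable[[str], str]) -> list[str]:
    """Return the categories whose canary value survived sanitisation."""
    leaked = []
    for label, value in CANARIES.items():
        prompt = f"Customer note: please check {value} before Friday."
        if value in sanitise(prompt):
            leaked.append(label)
    return leaked

EMAIL = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")

def email_only(text: str) -> str:
    """An intentionally incomplete sanitiser, used to show a failing audit."""
    return EMAIL.sub("[EMAIL]", text)

if __name__ == "__main__":
    print("Leaked categories:", audit(email_only))
```

Running the same harness in continuous integration would flag a regression whenever a pattern is changed or removed.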

Try this yourself

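Before sending any customer data to AI today, run this regex replacer: email pattern: `\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b` → '[EMAIL]'. Test it on real customer messages and verify that every email address is replaced before AI processing. This pattern covers email addresses only; names, phone numbers, and other PII need their own patterns or an NER pass.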

Real-world example

A developer pastes a customer complaint directly into an AI tool: 'Sarah Johnson (sarah.j@company.com) reports transaction ID 4532-1111-2222-3333 failed.' The message contains three pieces of PII (a name, an email address, and a transaction ID) that could be logged or surface in later model outputs. After sanitisation it reads '[CUSTOMER] reports transaction ID [TXN_ID] failed', which keeps the same debugging value while removing the identifying details.