Presidio's built-in recognizers cover the common PII types: names, emails, phone numbers, credit cards, SSNs. But every organization has PII that's specific to its business. Internal employee IDs that follow a custom format. Project codenames that shouldn't leak externally. Customer account numbers that don't match any standard pattern. Medical record numbers, policy IDs, internal ticket references. The built-in recognizers don't know about these.

This part covers four ways to build custom recognizers, from the simplest (a list of words to flag) to the most sophisticated (connecting an external NLP service).

Deny-List Recognizers

The fastest way to add a custom recognizer is a deny list. You give Presidio a list of words or phrases and it flags any exact match as a specific entity type.

Use case: your company has internal project codenames (like "Project Titan," "Sapphire," "Nightingale") that are confidential and should never appear in data sent to external services.