Privacy Request Redaction Patterns
The Privacy Request Redaction Patterns feature allows administrators to automatically mask sensitive dataset, collection, and field names in DSR (Data Subject Request) package reports. This helps protect sensitive information while still providing useful reports for compliance and debugging purposes.
This feature is particularly useful for organizations that need to share DSR reports with external parties while protecting internal system architecture and sensitive data structure information.
How It Works
When a privacy request is processed, the system applies redaction rules to dataset, collection, and field names. Names that match configured patterns are replaced with position-based identifiers:
- Datasets: customer_database → dataset_1, analytics_system → dataset_2
- Collections: users → collection_1, orders → collection_2
- Fields: email → field_1, name → field_2
Names that don't match any redaction patterns remain unchanged.
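The replacement rule above can be sketched in a few lines of Python. This is an illustration only, not the Fides implementation: the function name `redact_names` is hypothetical, and it assumes two things this page does not confirm: that a pattern must match the whole name, and that the number in the identifier is the name's 1-based position among its siblings (so unredacted names still occupy a position).

```python
import re
from typing import Dict, Iterable, List


def redact_names(names: Iterable[str], patterns: List[str], prefix: str) -> Dict[str, str]:
    """Map each name to itself or to a position-based identifier.

    Hypothetical helper: full-match semantics and sibling-position
    numbering are assumptions, not confirmed Fides behavior.
    """
    compiled = [re.compile(p) for p in patterns]
    result: Dict[str, str] = {}
    for position, name in enumerate(names, start=1):
        if any(rx.fullmatch(name) for rx in compiled):
            # Matching names are replaced, e.g. "collection_2"
            result[name] = f"{prefix}_{position}"
        else:
            # Names that match no pattern remain unchanged
            result[name] = name
    return result


collections = redact_names(
    ["users", "sensitive_orders", "audit_private"],
    ["sensitive_.*", ".*_private"],
    prefix="collection",
)
# {'users': 'users', 'sensitive_orders': 'collection_2', 'audit_private': 'collection_3'}
```

If Fides instead numbers only the redacted names, the counter would advance inside the matching branch rather than for every name.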
Two Types of Redaction
1. Global Regex Patterns
Configure regex patterns that apply across all datasets to match and redact names based on text patterns.
2. Entity-Specific Configurations
Apply redaction to specific datasets, collections, or fields using fides_meta.redact: name annotations.
Getting Started
Prerequisites
- Fides admin UI access with appropriate permissions
- For entity-specific redaction: Dataset configuration files (YAML format)
Required Permissions
The feature requires one of these OAuth scopes:
- PRIVACY_REQUEST_REDACTION_PATTERNS_READ (to view patterns)
- PRIVACY_REQUEST_REDACTION_PATTERNS_UPDATE (to modify patterns)
Configuration Methods
Method 1: Global Regex Patterns (Admin UI)
- Navigate to Settings: Go to Settings > Privacy requests in the Fides admin UI
- View Current Patterns: The page displays all currently configured regex patterns
- Add New Patterns: Click "Add regex pattern +" to create a new pattern
- Configure Patterns:
  - Enter regex patterns (e.g., sensitive_.*, .*_private, ^email$)
  - Patterns are validated for correct regex syntax
  - Maximum 100 patterns allowed per system
  - Each pattern limited to 500 characters
- Save Changes: Click Save to apply the new patterns
- Remove Patterns: Click the delete (×) button next to any pattern to remove it
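Before pasting a batch of patterns into the admin UI, it can help to pre-check them locally against the limits listed above. The sketch below is a hypothetical helper, not part of Fides; it uses Python's `re` module for the syntax check, on the assumption that the server's regex dialect is compatible.

```python
import re
from typing import List

MAX_PATTERNS = 100        # limit stated in this document
MAX_PATTERN_LENGTH = 500  # limit stated in this document


def validate_patterns(patterns: List[str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems: List[str] = []
    if len(patterns) > MAX_PATTERNS:
        problems.append(f"too many patterns: {len(patterns)} > {MAX_PATTERNS}")
    for pattern in patterns:
        if len(pattern) > MAX_PATTERN_LENGTH:
            problems.append(f"pattern longer than {MAX_PATTERN_LENGTH} characters: {pattern[:30]}...")
            continue
        try:
            re.compile(pattern)  # raises re.error on invalid syntax
        except re.error as exc:
            problems.append(f"invalid regex {pattern!r}: {exc}")
    return problems


for problem in validate_patterns(["sensitive_.*", ".*_private", "^email$", "(unclosed"]):
    print(problem)
```

The admin UI remains the authority on what is accepted; this only catches obvious mistakes early.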
Example Regex Patterns
sensitive_.*   # Matches anything starting with "sensitive_"
.*_private     # Matches anything ending with "_private"
^email$        # Matches exact field name "email"
customer.*     # Matches anything starting with "customer"
.*user.*       # Matches anything containing "user"
Method 2: Entity-Specific Redaction (Dataset Configuration)
Add fides_meta.redact: name annotations to your dataset YAML files to redact specific entities:
Dataset-Level Redaction
dataset:
  - fides_key: customer_database
    name: Customer Database
    description: Main customer data store
    fides_meta:
      redact: name  # Redacts the dataset name itself
    collections:
      # ... collections
Collection-Level Redaction
dataset:
  - fides_key: customer_database
    name: Customer Database
    collections:
      - name: sensitive_users
        description: User data with sensitive information
        fides_meta:
          redact: name  # Redacts "sensitive_users" → "collection_1"
        fields:
          # ... fields
Field-Level Redaction
dataset:
  - fides_key: customer_database
    collections:
      - name: users
        fields:
          - name: email
            description: User's email address
            fides_meta:
              redact: name  # Redacts "email" → "field_1"
          - name: name
            description: User's full name
            # No redaction - field name remains "name"
Nested Field Redaction
dataset:
  - fides_key: customer_database
    collections:
      - name: users
        fields:
          - name: profile
            fields:
              - name: social_security_number
                fides_meta:
                  redact: name  # Redacts nested field
              - name: public_info
                # No redaction - nested field name remains unchanged
Redaction Precedence
- Entity-specific redaction takes precedence over global patterns
- Global regex patterns are applied when no entity-specific configuration exists
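The precedence rule can be read as a two-step decision. The sketch below is a hypothetical illustration: `should_redact` is not a Fides function, `entity_redact` stands for the value of `fides_meta.redact` on the entity (or `None` if absent), and full-match semantics for global patterns is an assumption.

```python
import re
from typing import List, Optional


def should_redact(name: str, entity_redact: Optional[str], global_patterns: List[str]) -> bool:
    """Decide whether a dataset, collection, or field name is redacted."""
    # Step 1: an entity-specific annotation is consulted first
    if entity_redact == "name":
        return True
    # Step 2: with no entity-specific configuration, fall back to global patterns
    return any(re.fullmatch(pattern, name) for pattern in global_patterns)


print(should_redact("users", "name", []))          # annotated entity
print(should_redact("app_users", None, [".*user.*"]))  # global pattern only
print(should_redact("orders", None, ["^email$"]))  # neither applies
```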
Testing Your Configuration
- Create test datasets with redaction annotations
- Submit a privacy request through the admin UI
- Download the generated DSR package
- Verify redaction by checking that:
  - Redacted names show as dataset_1, collection_2, field_3, etc.
  - Non-redacted names remain unchanged
  - Position-based numbering follows order of appearance
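The numbering check can be automated once the names have been copied out of the report. The sketch below assumes you have already collected, by hand or with your own parser, the distinct sibling names in the order they appear; it does not read the DSR package itself, whose layout is not described here. The helper name is hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Shape of a redacted name as described in this document
REDACTED_NAME = re.compile(r"(dataset|collection|field)_(\d+)")


def redaction_indices_in_order(names: List[str]) -> bool:
    """Check that, per kind, redacted indices increase in order of appearance."""
    last_seen: Dict[str, int] = defaultdict(int)
    for name in names:
        match = REDACTED_NAME.fullmatch(name)
        if match is None:
            continue  # not a redacted name; nothing to check
        kind, index = match.group(1), int(match.group(2))
        if index <= last_seen[kind]:
            return False
        last_seen[kind] = index
    return True


print(redaction_indices_in_order(["dataset_1", "users", "collection_2", "field_1", "field_3"]))
```

Run it once per parent (for example, once for the fields of a single collection), since numbering may restart under each parent.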
Common Use Cases
1. Protect Sensitive Data Sources
# Redact datasets containing PII
- fides_key: customer_pii_database
  fides_meta:
    redact: name
# Or use global patterns
sensitive_.*   # Redacts any dataset starting with "sensitive_"
.*_pii         # Redacts any dataset ending with "_pii"
2. Mask Specific Field Types
# Redact email fields specifically
- name: email
  fides_meta:
    redact: name
# Or use global pattern
^email$     # Redacts exact field name "email"
.*email.*   # Redacts any field containing "email"
3. Hide Internal System Names
# Redact internal collection names
- name: internal_analytics
  fides_meta:
    redact: name
# Or use global pattern
internal_.*   # Redacts collections starting with "internal_"
Best Practices
Pattern Design
- Be specific: Use precise patterns to avoid unintended redaction
- Test thoroughly: Verify patterns work as expected before deploying
- Document patterns: Comment your patterns for future maintainers
Performance Considerations
- Limit patterns: Maximum 100 patterns per system
- Use simple regex: Complex patterns may impact processing time
- Entity-specific preferred: More efficient than broad regex patterns
Security
- Regular review: Periodically audit redaction patterns
- Principle of least surprise: Only redact what's necessary
- Backup original names: Redacted names are preserved in metadata
Troubleshooting
Pattern Not Working
- Check regex syntax: Use the admin UI validation
- Verify pattern matching: Test with sample names
- Check permissions: Ensure proper OAuth scopes
- Review logs: Check Fides logs for redaction errors
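To test patterns against sample names outside the admin UI, a small script can list which patterns hit which names. This is a rough sketch with a hypothetical helper name; it assumes full-match semantics, which agrees with the example comments on this page but may differ from how Fides applies patterns.

```python
import re
from typing import Dict, List


def matching_patterns(patterns: List[str], sample_names: List[str]) -> Dict[str, List[str]]:
    """For each sample name, list the patterns that fully match it."""
    compiled = [(p, re.compile(p)) for p in patterns]
    return {name: [p for p, rx in compiled if rx.fullmatch(name)] for name in sample_names}


report = matching_patterns(
    ["sensitive_.*", ".*_private", "^email$", "customer.*", ".*user.*"],
    ["sensitive_notes", "keys_private", "email", "email_address",
     "customer_orders", "app_users", "orders"],
)
for name, hits in report.items():
    print(f"{name}: {hits or 'no match'}")
```

The same output also helps with over-redaction: a name that lists a pattern you did not expect points directly at the pattern to narrow.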
Unexpected Redaction
- Too broad patterns: Narrow regex patterns if over-redacting
- Pattern precedence: Entity-specific overrides global patterns
- Test in isolation: Remove other patterns to isolate issues
Performance Issues
- Reduce patterns: Fewer, more specific patterns perform better
- Use entity-specific: More efficient than complex regex
- Monitor processing time: Large datasets may take longer to process