Data Mapping: Monitoring Data Sources
Introduction
Fides Detection & Discovery (D&D) tracks the updates to your organization’s data architecture. When a data source is updated, Fides will report the added, removed, or changed fields as well as suggest data categorizations for those fields.
This tutorial walks through the entire data pipeline, starting from configuration of the dataset, moving through promoting and tagging fields, and finally showing the resulting dataset.
Glossary of Terms
Detection & Discovery (D&D) introduces a few new concepts that are important to understand before you begin:
- Integrations host the connection details to your data store and the configuration for data classification parameters.
- Monitors determine the scope and schedule of detection and discovery tasks.
- Scans are the tasks that Fides executes to detect new or updated data.
- Staged resources are the database assets that a Scan detects before they are promoted to a Dataset.
- Monitored resources are the staged resources that have been selected for promotion to a Dataset and will be classified by the Fides classifier.
- The Fides classifier is a collection of models that assigns data categories to fields.
Connecting to a Data Store
Navigate to the Integrations tab. Click Add Integration, select the integration type, and provide connection details.

For some data stores, Fides requires a database administrator to set up a Fides service account within the data store with the necessary permissions. This allows Fides to run SELECT, UPDATE, or DELETE queries according to the monitor and privacy request requirements. For instructions on how to configure the service user, click the Details button next to the integration during setup.

After providing the connection details, click the Test connection button to confirm that the integration is configured correctly. Once the connection succeeds, click Add Monitor within the integration. The monitor controls the scheduling of Scans. Each Scan that Fides executes checks for any new child assets of the monitored resources.

Data Detection
After a Scan completes, staged resources appear in the Data detection tab under Detection & Discovery. On this page, decide for each asset whether it should be monitored for user data: click Monitor if it should, or Ignore if it should not.
You can change this decision at any time in the Monitored and Unmonitored tabs by selecting the available action (Ignore or Monitor).
Data Schema Updates
Monitors track changes in data schemas. Additions, such as a new column, appear with a green, upward arrow. Deletions, such as a dropped column, appear with a red, downward arrow. Other changes appear as a blue dot.
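The three kinds of change can be pictured as a comparison between two snapshots of a table's columns. The following is a minimal sketch, not Fides's implementation: the function name diff_schema and the {column_name: column_type} snapshot format are assumptions made for illustration only.

```python
from typing import Dict, List


def diff_schema(previous: Dict[str, str], current: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two hypothetical {column_name: column_type} snapshots of one table."""
    # Columns present now but not before (shown as additions).
    added = sorted(current.keys() - previous.keys())
    # Columns present before but not now (shown as deletions).
    removed = sorted(previous.keys() - current.keys())
    # Columns present in both snapshots whose type differs (other changes).
    changed = sorted(
        name
        for name in previous.keys() & current.keys()
        if previous[name] != current[name]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Illustrative snapshots; these column names and types are made up.
previous_snapshot = {"id": "integer", "email": "varchar", "os_version": "varchar"}
current_snapshot = {"id": "bigint", "email": "varchar", "phone": "varchar"}

schema_changes = diff_schema(previous_snapshot, current_snapshot)
print(schema_changes)
```

In this sketch, phone would correspond to an addition, os_version to a deletion, and id to an "other" change.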

Data Discovery
After choosing to Monitor a schema, staged resources are promoted to the Data Discovery tab. This is where users review the data category tags assigned by the Fides classifier, making necessary adjustments to each field that was promoted. Click through the table rows to see how classifications are made during the automated discovery phase.
To update a data category tag, click the automatically assigned data category, then search for the correct one. After you update a data category, the UI sets the Status to Reviewed.
When all the categories are reviewed and properly assigned, click Confirm all to commit the schema and classifications to a Fides dataset, which is viewable in the Manage datasets tab.
Classification
From the data schema and table view, use the Reclassify button to re-run classification and update the assigned data categories. Re-running classification is useful only:
- before discovery results have been committed to a dataset
- after updating a monitor configuration
Monitors can share configurations for regex annotation parameters. When a field matches a regex pattern, the corresponding data category is applied and the machine learning classification process is skipped for that field. This is especially useful when your data follows similar naming conventions across assets.
For example, providing the regex mapping:
.*os_version -> user.device
classifies all field labels that contain os_version as user.device.
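The regex-first behavior described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Fides's code: categorize, REGEX_MAPPING, and placeholder_classifier are hypothetical names, and the sketch assumes "contains" semantics by using re.search.

```python
import re
from typing import Callable, Dict

# Hypothetical mapping in the spirit of the example above: pattern -> data category.
REGEX_MAPPING: Dict[str, str] = {r".*os_version": "user.device"}


def placeholder_classifier(field_name: str) -> str:
    """Stand-in for the machine learning classifier; it always returns a dummy label."""
    return "needs_ml_classification"


def categorize(
    field_name: str,
    regex_mapping: Dict[str, str],
    fallback: Callable[[str], str],
) -> str:
    """Return the category of the first matching pattern, else defer to the fallback."""
    for pattern, category in regex_mapping.items():
        # Assumption: a pattern matches if it is found anywhere in the field name.
        if re.search(pattern, field_name):
            return category  # the fallback classifier is skipped for this field
    return fallback(field_name)


print(categorize("device_os_version", REGEX_MAPPING, placeholder_classifier))
print(categorize("email", REGEX_MAPPING, placeholder_classifier))
```

Here device_os_version is tagged directly from the mapping, while email falls through to the placeholder classifier.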
Datasets
After you click Confirm all on the Data Discovery page, a new Dataset is created and is viewable in the Manage datasets tab.
To read more about datasets, see Datasets.
Updating Annotations Directly on a Dataset
When you update a field’s annotation directly in a dataset, that annotation overrides any updates made in the monitor. In other words, changes you make directly on the dataset have the highest priority when assigning data category tags.
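This precedence rule can be expressed as a small function. The sketch below only illustrates the rule; effective_category and its parameters are hypothetical names, not part of Fides.

```python
from typing import Optional


def effective_category(
    dataset_annotation: Optional[str],
    monitor_category: Optional[str],
) -> Optional[str]:
    """Pick the category for a field: a direct dataset annotation always wins."""
    if dataset_annotation is not None:
        return dataset_annotation
    # No direct annotation on the dataset, so use whatever the monitor assigned.
    return monitor_category


# A field annotated directly on the dataset keeps that annotation.
print(effective_category("user.contact.email", "user.device"))
# A field without a direct annotation uses the monitor's category.
print(effective_category(None, "user.device"))
```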