Glossary

Retrieval-Augmented Generation (RAG)

A pattern that retrieves relevant documents from a knowledge base at query time and provides them to an LLM as context, grounding the response in source data. RAG keeps proprietary or personal data out of the model weights while still allowing the model to reason over it.

RAG is a deployment pattern that pairs a generative model with a retrieval system. At query time, the user's question is used to find relevant documents in a knowledge base, typically one indexed as embeddings in a vector database. Those documents are then provided to the model as context, alongside the original question, before the model generates its answer.
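The query-time flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory `KNOWLEDGE_BASE` list stands in for a vector database, and all names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Place the retrieved documents alongside the original question."""
    joined = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}"
    )


# Hypothetical knowledge base; a real system would use a vector database.
KNOWLEDGE_BASE = [
    "Refunds are issued within 14 days of a return request.",
    "Support hours are 9am to 5pm on weekdays.",
    "Customer data is retained for 24 months after account closure.",
]

question = "How long is customer data retained?"
top_docs = retrieve(question, KNOWLEDGE_BASE, k=1)
prompt = build_prompt(question, top_docs)  # this string is what the LLM would receive
```

Only the retrieved document is placed in the prompt; the rest of the knowledge base never reaches the model for this query.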

The pattern matters for privacy and governance because it keeps proprietary or personal data out of the model weights. The model itself remains general-purpose; the company-specific or sensitive content lives in the retrieval store and is supplied only when needed. That separation makes RAG far easier to govern than fine-tuning: data can be added, updated, or deleted from the retrieval store without retraining the model, and the retrieval store can enforce its own access controls.
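As a rough illustration of why the retrieval store is easier to govern, the sketch below models a store where a document can be deleted in one call and where each search is filtered by the caller's role. The `RetrievalStore` class, its methods, the role names, and the sample documents are all hypothetical; a real vector database would expose its own API for deletion and metadata filtering.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_roles: frozenset  # roles permitted to retrieve this document


@dataclass
class RetrievalStore:
    """Hypothetical in-memory store with per-document access control."""
    _docs: dict = field(default_factory=dict)

    def upsert(self, doc: Document) -> None:
        # Adding or updating content needs no model retraining.
        self._docs[doc.doc_id] = doc

    def delete(self, doc_id: str) -> bool:
        # Deletion takes effect on the very next query.
        return self._docs.pop(doc_id, None) is not None

    def search(self, term: str, role: str) -> list:
        # Simple substring match stands in for similarity search.
        return [
            d.doc_id
            for d in self._docs.values()
            if role in d.allowed_roles and term.lower() in d.text.lower()
        ]


store = RetrievalStore()
store.upsert(Document("hr-1", "Salary bands for each job level", frozenset({"hr"})))
store.upsert(Document("kb-1", "Salary payment dates each month", frozenset({"hr", "staff"})))

staff_hits = store.search("salary", role="staff")    # access control hides hr-1
hr_hits_before = store.search("salary", role="hr")
deleted = store.delete("hr-1")                       # e.g. to honor an erasure request
hr_hits_after = store.search("salary", role="hr")
```

Because the sensitive content lives in the store rather than in the model weights, removing `hr-1` is a single store operation; a fine-tuned model would have no equivalent of `delete`.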

But RAG is not a get-out-of-jail-free card. The retrieved documents still pass through the model's context window, so they become part of the prompt the model processes, and the inference provider may log them, depending on its terms. Data classification of the retrieval store, per-user control over which sources can be retrieved, redaction of sensitive fields before retrieval, and clear data-flow contracts with the model provider are all still necessary.
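One of these controls, redaction of sensitive fields, might look like the sketch below. It is only an illustration: the two regular expressions (email addresses and US-style Social Security numbers) are assumptions chosen for the example and are far from an exhaustive definition of sensitive data, and the function names are hypothetical. Here redaction is applied to retrieved text before it is placed in the prompt; the same function could instead run when documents are ingested into the store.

```python
import re

# Illustrative patterns only; real deployments need broader, tested detection.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text: str) -> str:
    """Replace matched sensitive values with fixed placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return US_SSN.sub("[SSN]", text)


def prepare_context(retrieved_docs: list) -> list:
    """Redact every retrieved document before it reaches the model's prompt."""
    return [redact(doc) for doc in retrieved_docs]


retrieved = ["Contact jane.doe@example.com, SSN 123-45-6789."]
safe_context = prepare_context(retrieved)
```

With this step in place, whatever the inference provider logs contains the placeholders rather than the original values.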