Glossary

Generative AI

AI systems that produce new content — text, images, audio, video, code — rather than only classifying or predicting from existing data. Generative systems raise novel data protection questions because training data, prompts, and outputs may all contain personal data.

Generative AI refers to any AI system that produces new content (text, images, audio, video, or code) rather than only classifying or predicting from existing data. Examples include large language models, diffusion models for images, and multimodal models that span text, audio, and video. The defining property is generativity: the output is newly synthesized, not retrieved from existing content.

The compliance picture is more complex than for predictive AI. Predictive models classify inputs against learned categories; generative models synthesize outputs that may or may not faithfully represent reality. This creates risks largely absent from predictive systems: confident hallucinations, fabricated statements attributed to real individuals, copyright entanglement with training data, and the use of personal data in prompts to generate further personal data at scale.

Regulators have responded. The EU AI Act introduces specific obligations on generative model providers, including transparency about training data, disclosure that content is AI-generated, and technical measures against unlawful content. Existing data-protection law applies on top: handling prompts that contain personal data is still processing, outputs about identified or identifiable individuals are still personal data, and whether data-subject rights requests can be enforced against the trained model itself remains an open and contested legal question.