The Data Privacy Vocabulary (DPV) is a W3C Community Group specification that provides a structured, machine-readable way to describe personal data processing activities, privacy policies, and regulatory compliance. Born from the need to operationalize the European Union's General Data Protection Regulation (GDPR) and similar privacy laws, DPV has grown into a comprehensive ontology covering data protection concepts across more than 40 jurisdictions.
Background
DPV originated in 2019 as part of the SPECIAL H2020 Project, which received funding from the European Union's Horizon 2020 research and innovation programme. The W3C Data Privacy Vocabularies and Controls Community Group (DPVCG) was established to develop and maintain the vocabulary collaboratively. The initial publication, "Creating a Vocabulary for Data Privacy," laid out the foundational design principles. Since then, the vocabulary has undergone continuous development, reaching version 2.0 in 2024 with a major expansion of coverage, and version 2.3 in February 2026. Continued development is funded through the RECITALS project under the EU Horizon programme.
Purpose & Scope
DPV is designed for organizations, developers, regulators, and researchers who need to represent privacy and data protection information in a structured format. It covers:
- Purposes of processing -- why personal data is being collected and used
- Legal bases -- the lawful grounds for processing (consent, legitimate interest, etc.)
- Personal data categories -- types of data being processed
- Technical and organizational measures -- safeguards applied to data
- Data subjects and their rights -- who is affected and what rights apply
- Risks and impacts -- assessment of processing risks
The vocabulary is intentionally modular. The core DPV provides general-purpose concepts, while extensions address specific domains and regulations.
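As a rough illustration of how the categories above combine, the following sketch builds a JSON-LD description of a single processing activity using only the Python standard library. It is a minimal example under stated assumptions, not an official DPV pattern: the activity IRI is hypothetical, the `pd` namespace IRI is assumed, and the term names (`dpv:Process`, `dpv:hasPurpose`, `dpv:ServiceProvision`, `pd:EmailAddress`, and so on) are assumed to match DPV 2.x and should be checked against the published specification.

```python
import json

# The core namespace is the one given in this article.
DPV = "https://w3id.org/dpv#"
# Assumption: the Personal Data (PD) extension lives under this IRI.
PD = "https://w3id.org/dpv/pd#"


def describe_processing(activity_iri: str) -> dict:
    """Return a JSON-LD dict describing one processing activity.

    All dpv:/pd: term names are assumptions to verify against the spec.
    """
    return {
        "@context": {"dpv": DPV, "pd": PD},
        "@id": activity_iri,
        "@type": "dpv:Process",
        # Why the data is processed
        "dpv:hasPurpose": {"@id": "dpv:ServiceProvision"},
        # Lawful ground for the processing
        "dpv:hasLegalBasis": {"@id": "dpv:Consent"},
        # Category of personal data, taken from the PD extension
        "dpv:hasPersonalData": {"@id": "pd:EmailAddress"},
        # Safeguard applied to the data
        "dpv:hasTechnicalOrganisationalMeasure": {"@id": "dpv:Encryption"},
        # Who is affected
        "dpv:hasDataSubject": {"@id": "dpv:Customer"},
    }


# Hypothetical activity IRI, used only for illustration.
record = describe_processing("https://example.com/processing/newsletter")
print(json.dumps(record, indent=2))
```

The core concepts and the PD category sit side by side under separate prefixes, which mirrors the modular split between the core vocabulary and its extensions.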
Key Extensions
| Extension | Purpose |
|---|---|
| PD | Personal Data categories |
| LOC | Locations and jurisdictions |
| TECH | Technologies used in processing |
| AI | AI-specific technologies and concepts |
| JUST | Justifications for processing |
| RISK | Risk assessment and management |
| SECTOR | Sector-specific concepts (education, finance, health, etc.) |
| LEGAL | Jurisdiction-specific legal frameworks (EU, India, US, etc.) |
The legal extensions are particularly notable, providing detailed mappings for the GDPR, the EU Data Governance Act (DGA), the EU AI Act, the European Health Data Space (EHDS), and the EU Charter of Fundamental Rights.
Serializations & Technical Formats
DPV is expressed using RDFS and SKOS semantics by default, with an OWL2 profile available for applications requiring formal reasoning. The namespace URI is https://w3id.org/dpv#, and all persistent identifiers use the w3id.org redirect service. Serializations are available in RDF/XML, Turtle, and JSON-LD.
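To make the role of the namespace concrete, the sketch below expands a prefixed name such as `dpv:Consent` into its full IRI and writes a tiny Turtle document by hand. It uses only the standard library and is meant as an illustration; a real application would more likely rely on an RDF library. The helper names `expand` and `to_turtle`, the subject IRI, and the DPV term names are hypothetical or assumed, not taken from the specification.

```python
# Prefix map; the dpv namespace IRI is the one stated in this article.
PREFIXES = {"dpv": "https://w3id.org/dpv#"}


def expand(curie: str, prefixes: dict = PREFIXES) -> str:
    """Expand a prefixed name like 'dpv:Consent' into a full IRI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local


def to_turtle(subject_iri: str, properties: list, prefixes: dict = PREFIXES) -> str:
    """Serialize (predicate, object) pairs about one subject as Turtle.

    Predicates and objects are expected to be prefixed names (or 'a').
    """
    lines = [f"@prefix {p}: <{iri}> ." for p, iri in prefixes.items()]
    lines.append("")
    body = " ;\n    ".join(f"{pred} {obj}" for pred, obj in properties)
    lines.append(f"<{subject_iri}> {body} .")
    return "\n".join(lines)


full_iri = expand("dpv:Consent")

# Hypothetical subject IRI; the dpv: terms are assumptions to verify.
turtle_doc = to_turtle(
    "https://example.com/processing/newsletter",
    [("a", "dpv:Process"), ("dpv:hasPurpose", "dpv:ServiceProvision")],
)
print(full_iri)
print(turtle_doc)
```

Because every term resolves to an IRI under w3id.org, the same description can be written in any of the listed serializations without changing its meaning.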
Governance & Maintenance
The vocabulary is maintained by the W3C DPVCG, which anyone can join through the W3C community group process. Development happens publicly on GitHub, with regular meetings documented through published minutes. The contribution guide and use-cases repository help new participants understand the project's direction. Releases follow semantic versioning; the latest is v2.3, published in February 2026.
All work is released under the W3C Software and Document License (2023 version).
Notable Implementations
DPV is used in privacy compliance tooling, consent management platforms, and regulatory technology (RegTech) applications. It has been adopted for documenting GDPR compliance, particularly for Data Protection Impact Assessments (DPIAs) and Records of Processing Activities (ROPAs). The DPVCG guide on implementing ISO/IEC TS 27560:2023 Consent Records and Receipts demonstrates its applicability to international standards beyond EU regulations.
Related Standards
DPV intersects with other standards in the privacy and semantic web space. It is designed to work alongside general-purpose vocabularies such as Schema.org and Dublin Core, which handle resource description, while supplying the specialized privacy concepts those standards lack.