The Data Quality Vocabulary (DQV) is a W3C vocabulary that provides an RDF framework for describing, assessing, and communicating data quality information. Published as a W3C Working Group Note in December 2016, DQV emerged from the Data on the Web Best Practices Working Group's effort to improve the discoverability, usability, and trustworthiness of data published on the web.
Background
As open data publication grew throughout the 2010s, consumers faced a persistent challenge: determining whether a dataset was fit for their intended purpose. Quality information was often absent, inconsistent, or expressed in ad hoc ways that defied comparison. The W3C's Data on the Web Best Practices Working Group addressed this by developing DQV as a companion vocabulary to DCAT (Data Catalog Vocabulary), providing a standardized way to attach quality metadata to datasets and their distributions.
DQV builds on earlier data quality research, particularly the ISO/IEC 25012 data quality model and established dimensions such as accuracy, completeness, consistency, and timeliness. Rather than prescribing specific quality metrics, DQV provides a meta-level framework that can accommodate any quality assessment methodology.
Purpose & Scope
DQV defines classes and properties for expressing quality information at multiple levels of granularity. Its core concepts include quality dimensions (measurable aspects like accuracy or completeness), quality categories (groupings of related dimensions), quality metrics (formal definitions of how dimensions are measured), and quality measurements (actual observed values for a metric applied to a dataset).
The vocabulary also supports quality certificates issued by third parties, quality annotations such as ratings and user feedback, and quality policies, which are policies or agreements chiefly governed by data quality concerns. This layered approach enables both automated quality assessment pipelines and human-mediated quality reporting.
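To make the idea of an automated assessment pipeline concrete, the following is a minimal sketch, not a reference implementation. It assumes a hypothetical completeness metric (the fraction of non-missing cells) and hypothetical `example.org` IRIs; only the `dqv:` terms come from the vocabulary itself. The function names `completeness` and `measurement_turtle` are invented for illustration.

```python
# Hypothetical sample records; None marks a missing value.
ROWS = [
    {"id": 1, "name": "Alice", "email": "alice@example.org"},
    {"id": 2, "name": "Bob", "email": None},
    {"id": 3, "name": None, "email": None},
    {"id": 4, "name": "Dana", "email": "dana@example.org"},
]


def completeness(rows):
    """Return the fraction of cells that are not None (an assumed metric)."""
    cells = [value for row in rows for value in row.values()]
    if not cells:
        raise ValueError("cannot measure an empty dataset")
    return sum(value is not None for value in cells) / len(cells)


def measurement_turtle(dataset_iri, metric_iri, value):
    """Render one dqv:QualityMeasurement as a Turtle snippet."""
    return (
        "@prefix dqv: <http://www.w3.org/ns/dqv#> .\n"
        "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
        "\n"
        f"<{dataset_iri}> dqv:hasQualityMeasurement [\n"
        "    a dqv:QualityMeasurement ;\n"
        f"    dqv:isMeasurementOf <{metric_iri}> ;\n"
        f"    dqv:computedOn <{dataset_iri}> ;\n"
        f'    dqv:value "{value:.2f}"^^xsd:double\n'
        "] .\n"
    )


if __name__ == "__main__":
    score = completeness(ROWS)
    # Both IRIs below are placeholders, not real resources.
    print(measurement_turtle(
        "http://example.org/dataset/customers",
        "http://example.org/metrics/nonNullRatio",
        score,
    ))
```

The sketch states both `dqv:hasQualityMeasurement` and `dqv:computedOn` for readability; since the two properties are inverses, a publisher could reasonably emit only one of them.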
Key Concepts
| Class / Property | Purpose |
|---|---|
| dqv:QualityMeasurement | A specific measured value for a quality metric |
| dqv:Metric | A formal definition of how a quality dimension is assessed |
| dqv:Dimension | A measurable aspect of quality (e.g., completeness) |
| dqv:Category | A grouping of related quality dimensions |
| dqv:QualityCertificate | A statement from an external body attesting quality |
| dqv:QualityAnnotation | Human-provided quality feedback on a dataset |
| dqv:hasQualityMeasurement | Links a dataset to its quality measurements |
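The way these terms chain together can be sketched with a tiny in-memory triple set. This is an illustrative simplification rather than a real RDF store: literals are kept as plain strings, every `example.org` IRI is hypothetical, and the helpers `objects` and `quality_report` are invented names. The linking properties `dqv:isMeasurementOf`, `dqv:inDimension`, `dqv:inCategory`, and `dqv:value` are DQV terms that do not appear in the table above.

```python
DQV = "http://www.w3.org/ns/dqv#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace for the sample data

# Each triple is (subject, predicate, object); literals are plain strings here.
TRIPLES = {
    (EX + "intrinsic", RDF_TYPE, DQV + "Category"),
    (EX + "completeness", RDF_TYPE, DQV + "Dimension"),
    (EX + "completeness", DQV + "inCategory", EX + "intrinsic"),
    (EX + "nonNullRatio", RDF_TYPE, DQV + "Metric"),
    (EX + "nonNullRatio", DQV + "inDimension", EX + "completeness"),
    (EX + "measurement1", RDF_TYPE, DQV + "QualityMeasurement"),
    (EX + "measurement1", DQV + "isMeasurementOf", EX + "nonNullRatio"),
    (EX + "measurement1", DQV + "value", "0.75"),
    (EX + "dataset", DQV + "hasQualityMeasurement", EX + "measurement1"),
}


def objects(subject, predicate):
    """Return all objects for a subject/predicate pair, sorted for stability."""
    return sorted(o for s, p, o in TRIPLES if s == subject and p == predicate)


def quality_report(dataset):
    """Walk dataset -> measurement -> metric -> dimension -> category."""
    report = []
    for measurement in objects(dataset, DQV + "hasQualityMeasurement"):
        for metric in objects(measurement, DQV + "isMeasurementOf"):
            for dimension in objects(metric, DQV + "inDimension"):
                for category in objects(dimension, DQV + "inCategory"):
                    for value in objects(measurement, DQV + "value"):
                        report.append((category, dimension, metric, value))
    return report


if __name__ == "__main__":
    for row in quality_report(EX + "dataset"):
        print(row)
```

A production system would more likely load the data into an RDF library and express this walk as a SPARQL query; the nested loops only make the chain of links explicit.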
Governance & Maintenance
DQV was published by the W3C Data on the Web Best Practices Working Group, which also produced the Data on the Web Best Practices Recommendation and the Dataset Usage Vocabulary. As a W3C Working Group Note rather than a Recommendation, DQV represents community consensus but has not gone through the full W3C Recommendation track. The vocabulary's namespace is hosted at http://www.w3.org/ns/dqv#.
Notable Implementations
DQV has been adopted in European open data initiatives, particularly in connection with DCAT-AP (the European DCAT Application Profile). Data portals that implement DCAT-AP can use DQV to express quality metadata alongside their catalog entries. The vocabulary is also referenced in research data management contexts, where quality assessment of scientific datasets is a growing concern.
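As a rough illustration of quality metadata sitting alongside a catalog entry, the sketch below builds a JSON-LD description of a `dcat:Dataset` carrying one DQV measurement. The shape is an assumption made for illustration and is not taken from the DCAT-AP specification; the function `catalog_entry` and all `example.org` IRIs are hypothetical.

```python
import json

# Prefix-to-namespace map used as the JSON-LD context.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "dqv": "http://www.w3.org/ns/dqv#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def catalog_entry(dataset_iri, title, metric_iri, value):
    """Build a JSON-LD dict for a dataset with one quality measurement."""
    return {
        "@context": CONTEXT,
        "@id": dataset_iri,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dqv:hasQualityMeasurement": {
            "@type": "dqv:QualityMeasurement",
            "dqv:isMeasurementOf": {"@id": metric_iri},
            # Typed literal: the value is kept as a string with an xsd:double type.
            "dqv:value": {"@value": str(value), "@type": "xsd:double"},
        },
    }


if __name__ == "__main__":
    entry = catalog_entry(
        "http://example.org/dataset/customers",       # placeholder IRI
        "Customer register",                          # placeholder title
        "http://example.org/metrics/nonNullRatio",    # placeholder metric
        0.75,
    )
    print(json.dumps(entry, indent=2))
```

A portal following a specific application profile would need to check that profile for which properties are mandatory or recommended before adopting a structure like this.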
Related Standards
- DCAT -- The Data Catalog Vocabulary, which DQV extends with quality metadata capabilities
- Data on the Web Best Practices -- The parent W3C Recommendation that motivates DQV's design