Data Quality Vocabulary

DQV

A W3C vocabulary for expressing data quality information as RDF. DQV defines a framework of quality dimensions, categories, and metrics that can be applied to any dataset. It supports the attachment of quality measurements, certificates, and annotations to datasets and distributions, enabling both automated assessment and human-readable quality reporting within the broader Data on the Web Best Practices ecosystem.

Overview

The Data Quality Vocabulary (DQV) provides an RDF framework for describing, assessing, and communicating data quality information. Published as a W3C Working Group Note in December 2016, DQV emerged from the Data on the Web Best Practices Working Group's effort to improve the discoverability, usability, and trustworthiness of data published on the web.

Background

As open data publication grew throughout the 2010s, consumers faced a persistent challenge: determining whether a dataset was fit for their intended purpose. Quality information was often absent, inconsistent, or expressed in ad hoc ways that defied comparison. The W3C's Data on the Web Best Practices Working Group addressed this by developing DQV as a companion vocabulary to DCAT (Data Catalog Vocabulary), providing a standardized way to attach quality metadata to datasets and their distributions.

DQV builds on earlier data quality research, particularly the ISO/IEC 25012 data quality model and established dimensions such as accuracy, completeness, consistency, and timeliness. Rather than prescribing specific quality metrics, DQV provides a meta-level framework in which publishers define their own dimensions and metrics, so it can accommodate any quality assessment methodology.

Purpose & Scope

DQV defines classes and properties for expressing quality information at multiple levels of granularity. Its core concepts include quality dimensions (measurable aspects like accuracy or completeness), quality categories (groupings of related dimensions), quality metrics (formal definitions of how dimensions are measured), and quality measurements (actual observed values for a metric applied to a dataset).
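The chain from category to dimension to metric to measurement can be sketched as a handful of RDF triples. The following is a minimal illustration using only the Python standard library, not an official implementation: the `dqv:` terms are taken from the vocabulary, while every `http://example.org/` IRI, the helper function names, and the value 0.93 are assumptions invented for the example.

```python
# Minimal sketch: DQV's core concepts as (subject, predicate, object) triples,
# serialized as N-Triples. All example.org IRIs and the value are hypothetical.

DQV = "http://www.w3.org/ns/dqv#"
DCAT = "http://www.w3.org/ns/dcat#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"
EX = "http://example.org/"  # hypothetical namespace for illustration


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value, datatype):
    """Render a typed literal such as "0.93"^^<xsd:double>."""
    return f'"{value}"^^<{datatype}>'


def build_measurement_graph(dataset, category, dimension, metric, measurement, value):
    """Return triples linking a dataset to one quality measurement."""
    return [
        (iri(category), iri(RDF_TYPE), iri(DQV + "Category")),
        (iri(dimension), iri(RDF_TYPE), iri(DQV + "Dimension")),
        (iri(dimension), iri(DQV + "inCategory"), iri(category)),
        (iri(metric), iri(RDF_TYPE), iri(DQV + "Metric")),
        (iri(metric), iri(DQV + "inDimension"), iri(dimension)),
        (iri(metric), iri(DQV + "expectedDataType"), iri(XSD_DOUBLE)),
        (iri(measurement), iri(RDF_TYPE), iri(DQV + "QualityMeasurement")),
        (iri(measurement), iri(DQV + "isMeasurementOf"), iri(metric)),
        (iri(measurement), iri(DQV + "computedOn"), iri(dataset)),
        (iri(measurement), iri(DQV + "value"), literal(value, XSD_DOUBLE)),
        (iri(dataset), iri(RDF_TYPE), iri(DCAT + "Dataset")),
        (iri(dataset), iri(DQV + "hasQualityMeasurement"), iri(measurement)),
    ]


def to_ntriples(triples):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


triples = build_measurement_graph(
    dataset=EX + "dataset/air-quality",
    category=EX + "category/intrinsic",
    dimension=EX + "dimension/completeness",
    metric=EX + "metric/populated-cells-ratio",
    measurement=EX + "measurement/1",
    value=0.93,
)
print(to_ntriples(triples))
```

Keeping the metric definition separate from the measurement is the point of the design: the same metric can be reused across many datasets, and measurements for different datasets stay comparable because they refer to one shared definition.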

The vocabulary also supports quality annotations, which include quality certificates issued by third parties and feedback supplied by users, as well as quality policies: policies or agreements that are chiefly governed by data quality concerns. This layered approach enables both automated quality assessment pipelines and human-mediated quality reporting.
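Annotations in DQV are modeled on the W3C Web Annotation vocabulary (`oa:`), with `dqv:QualityCertificate` and `dqv:UserQualityFeedback` as specializations of `dqv:QualityAnnotation`. The sketch below shows roughly what attaching a certificate to a dataset might look like as triples. It uses only the standard library; the function name and all `http://example.org/` IRIs are hypothetical.

```python
# Minimal sketch: a DQV quality annotation expressed as N-Triples.
# The example.org IRIs are placeholders, not real resources.

DQV = "http://www.w3.org/ns/dqv#"
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Annotation classes defined by DQV; the last two specialize the first.
ANNOTATION_KINDS = {"QualityAnnotation", "QualityCertificate", "UserQualityFeedback"}


def quality_annotation(annotation, kind, target, body):
    """Return N-Triples lines attaching a quality annotation to a target."""
    if kind not in ANNOTATION_KINDS:
        raise ValueError(f"unknown annotation kind: {kind}")
    triples = [
        (annotation, RDF_TYPE, DQV + kind),
        (annotation, OA + "hasTarget", target),   # the dataset or distribution
        (annotation, OA + "hasBody", body),       # the certificate or feedback itself
        (annotation, OA + "motivatedBy", DQV + "qualityAssessment"),
        (target, DQV + "hasQualityAnnotation", annotation),
    ]
    return [f"<{s}> <{p}> <{o}> ." for s, p, o in triples]


lines = quality_annotation(
    annotation="http://example.org/annotation/cert-1",
    kind="QualityCertificate",
    target="http://example.org/dataset/air-quality",
    body="http://example.org/certificates/gold-badge",
)
print("\n".join(lines))
```

User feedback follows the same shape with `kind="UserQualityFeedback"`; only the class and the body resource change.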

Key Concepts

Class / Property            Purpose
dqv:QualityMeasurement      A specific measured value for a quality metric
dqv:Metric                  A formal definition of how a quality dimension is assessed
dqv:Dimension               A measurable aspect of quality (e.g., completeness)
dqv:Category                A grouping of related quality dimensions
dqv:QualityCertificate      A statement from an external body attesting quality
dqv:QualityAnnotation       Human-provided quality feedback on a dataset
dqv:hasQualityMeasurement   Links a dataset to its quality measurements

Governance & Maintenance

DQV was published by the W3C Data on the Web Best Practices Working Group, which also produced the Data on the Web Best Practices Recommendation and the Dataset Usage Vocabulary. As a W3C Working Group Note rather than a Recommendation, DQV reflects the working group's consensus but has not gone through the full W3C Recommendation track. The vocabulary's namespace is http://www.w3.org/ns/dqv#.
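In practice, terms written with the `dqv:` prefix are shorthand for full IRIs under that namespace. As a small illustration, assuming a hypothetical helper named `expand` and a hand-written prefix table:

```python
# Minimal sketch: expanding prefixed names such as dqv:Metric to full IRIs.
# The helper name and prefix table are illustrative assumptions.

PREFIXES = {
    "dqv": "http://www.w3.org/ns/dqv#",
    "dcat": "http://www.w3.org/ns/dcat#",
}


def expand(prefixed_name):
    """Turn 'prefix:local' into a full IRI using the PREFIXES table."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"cannot expand: {prefixed_name!r}")
    return PREFIXES[prefix] + local


print(expand("dqv:Metric"))
print(expand("dqv:hasQualityMeasurement"))
```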

Notable Implementations

DQV has been adopted in European open data initiatives, particularly in connection with DCAT-AP (the European DCAT Application Profile). Data portals that implement DCAT-AP can use DQV to express quality metadata alongside their catalog entries. The vocabulary is also referenced in research data management contexts, where quality assessment of scientific datasets is a growing concern.

Related Standards

  • DCAT -- The Data Catalog Vocabulary, which DQV extends with quality metadata capabilities
  • Data on the Web Best Practices -- The parent W3C Recommendation that motivates DQV's design

Further Reading