The RDF Data Cube Vocabulary is a W3C Recommendation for publishing multi-dimensional statistical data on the web using RDF. It enables government agencies, research institutions, and open data portals to express datasets organized along dimensions such as time periods, geographic regions, and measurement indicators in a form that can be queried with SPARQL and integrated with other linked data resources. The vocabulary is widely used for statistical linked data, particularly among European statistical publishers.
Background
Statistical organizations have long struggled with interoperability. Data published by one agency in CSV or proprietary formats could not easily be combined with data from another. The Statistical Data and Metadata Exchange (SDMX) initiative, sponsored by the Bank for International Settlements (BIS), the European Central Bank (ECB), Eurostat, the IMF, the OECD, the United Nations, and the World Bank, established an information model for exchanging statistical data. The RDF Data Cube Vocabulary brings this model into the linked data world by providing an RDF representation aligned with the SDMX information model. The vocabulary was standardized by the W3C Government Linked Data Working Group and published as a W3C Recommendation on 16 January 2014.
Purpose & Scope
The vocabulary provides a way to represent datasets as collections of observations organized along defined dimensions. Each observation records a measured value at a specific point in a multi-dimensional space. The vocabulary is intentionally abstract: it does not prescribe specific dimensions or measures, but provides the structural framework into which domain-specific concepts are plugged.
Core concepts include DataSet, Observation, DimensionProperty, MeasureProperty, AttributeProperty, DataStructureDefinition, ComponentSpecification, and Slice. A DataStructureDefinition declares which dimensions and measures a dataset uses, while each Observation links to specific dimension values and carries one or more measured values.
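The relationship between a structure definition, a dataset, and an observation can be sketched as a handful of RDF triples. The following is a minimal illustration in plain Python rather than a reference implementation: the `qb:` terms come from the vocabulary, while the `http://example.org/ns#` namespace, the resource names (`dsd`, `dataset`, `obs1`, `refPeriod`, `refArea`, `population`), and the values are invented for the example. Component specifications are given IRIs here for simplicity; published data often uses blank nodes instead.

```python
from typing import List, NamedTuple, Tuple, Union

QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"
EX = "http://example.org/ns#"  # hypothetical namespace, for illustration only


class Literal(NamedTuple):
    """A typed literal: lexical form plus datatype IRI."""
    lexical: str
    datatype: str


Term = Union[str, Literal]          # a plain str stands for an IRI
Triple = Tuple[str, str, Term]


def build_cube() -> List[Triple]:
    """Return triples for one structure, one dataset, and one observation."""
    dsd, dataset, obs = EX + "dsd", EX + "dataset", EX + "obs1"
    return [
        # Structure: the dimensions and measure the dataset uses.
        (dsd, RDF_TYPE, QB + "DataStructureDefinition"),
        (dsd, QB + "component", EX + "compPeriod"),
        (EX + "compPeriod", QB + "dimension", EX + "refPeriod"),
        (dsd, QB + "component", EX + "compArea"),
        (EX + "compArea", QB + "dimension", EX + "refArea"),
        (dsd, QB + "component", EX + "compPopulation"),
        (EX + "compPopulation", QB + "measure", EX + "population"),
        # Dataset: linked to its structure.
        (dataset, RDF_TYPE, QB + "DataSet"),
        (dataset, QB + "structure", dsd),
        # Observation: one value per dimension plus the measured value
        # (all values below are invented).
        (obs, RDF_TYPE, QB + "Observation"),
        (obs, QB + "dataSet", dataset),
        (obs, EX + "refPeriod", Literal("2020", XSD + "gYear")),
        (obs, EX + "refArea", EX + "area-A"),
        (obs, EX + "population", Literal("1000", XSD + "integer")),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize to N-Triples; assumes lexical forms need no escaping."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            obj = f'"{o.lexical}"^^<{o.datatype}>'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples(build_cube()))
```

In practice an RDF library would handle serialization and escaping; the hand-rolled serializer is used here only to keep the sketch dependency-free.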
Key Concepts
| Concept | Role |
|---|---|
| qb:DataSet | A collection of observations sharing the same structure |
| qb:Observation | A single data point within a dataset |
| qb:DimensionProperty | A component identifying a position in the cube (e.g., time, region) |
| qb:MeasureProperty | A component recording the observed value |
| qb:DataStructureDefinition | Declares the components (dimensions, measures, attributes) of a dataset |
| qb:Slice | A subset of observations sharing fixed values on some dimensions |

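
Two of these ideas lend themselves to a small illustration: a structure definition implies that each observation should carry a value for every declared dimension, and a slice is what remains when some dimensions are held fixed. The sketch below models observations as plain Python dictionaries rather than RDF, so it shows the logic only. The dimension names, the helper functions `missing_dimensions` and `select_slice`, and all values are hypothetical.

```python
from typing import Dict, List

# Dimension names declared by a hypothetical data structure definition.
DIMENSIONS: List[str] = ["refPeriod", "refArea"]

# Invented observations; the last one deliberately lacks a refPeriod value.
OBSERVATIONS: List[Dict[str, object]] = [
    {"refPeriod": "2019", "refArea": "area-A", "population": 100},
    {"refPeriod": "2020", "refArea": "area-A", "population": 110},
    {"refPeriod": "2020", "refArea": "area-B", "population": 200},
    {"refArea": "area-B", "population": 190},
]


def missing_dimensions(observation: Dict[str, object],
                       dimensions: List[str]) -> List[str]:
    """Return the declared dimensions that the observation has no value for."""
    return [dim for dim in dimensions if dim not in observation]


def select_slice(observations: List[Dict[str, object]],
                 fixed: Dict[str, object]) -> List[Dict[str, object]]:
    """Return the observations whose values match every fixed dimension."""
    return [
        obs for obs in observations
        if all(obs.get(dim) == value for dim, value in fixed.items())
    ]


if __name__ == "__main__":
    for index, obs in enumerate(OBSERVATIONS):
        print(index, missing_dimensions(obs, DIMENSIONS))
    print(select_slice(OBSERVATIONS, {"refArea": "area-A"}))
```

Fixing `refArea` leaves `refPeriod` free to vary, which corresponds to a time series for one region, a common use of slices.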
Governance & Maintenance
The vocabulary was developed by the W3C Government Linked Data Working Group and published as a W3C Recommendation. As with other W3C Recommendations, changes follow the W3C Process. The specification has been stable since its 2014 publication.
Notable Implementations
Eurostat publishes significant portions of its statistical data using the Data Cube vocabulary. The UK Office for National Statistics, the Scottish Government, and the Irish Central Statistics Office have all used Data Cube for publishing open linked data. The vocabulary is also used within the European Data Portal and various SDMX-to-RDF conversion pipelines.
Related Standards
- SDMX -- The statistical data exchange standard whose information model the Data Cube vocabulary implements in RDF
- DCAT -- Often used alongside Data Cube for dataset-level cataloging metadata
- VoID -- Used for describing linksets and statistics about RDF datasets