RDF Data Cube Vocabulary

A W3C Recommendation for representing statistical and multi-dimensional data in RDF, aligned with the Statistical Data and Metadata Exchange (SDMX) information model. The vocabulary provides a standard way to publish data cubes — datasets organized along multiple dimensions such as time, geography, and measurement type — as linked data, enabling integration with other RDF datasets and SPARQL querying. It is widely used by government statistical offices, Eurostat, and open data portals.

Overview

The RDF Data Cube Vocabulary is a W3C Recommendation for publishing multi-dimensional statistical data on the web using RDF. It enables government agencies, research institutions, and open data portals to express datasets organized along dimensions such as time periods, geographic regions, and measurement indicators in a form that can be queried with SPARQL and integrated with other linked data resources. The vocabulary is widely used for statistical linked data, particularly in Europe.

Background

Statistical organizations have long struggled with interoperability: data published by one agency in CSV or a proprietary format could not easily be combined with data from another. The Statistical Data and Metadata Exchange (SDMX) initiative, sponsored by the Bank for International Settlements, the European Central Bank, Eurostat, the IMF, the OECD, the UN, and the World Bank, established an information model for exchanging statistical data. The RDF Data Cube Vocabulary brings this model into the linked data world by providing an RDF representation aligned with it. The vocabulary was first developed outside W3C, then taken up and extended by the W3C Government Linked Data Working Group, which published it as a W3C Recommendation on 16 January 2014.

Purpose & Scope

The vocabulary provides a way to represent datasets as collections of observations organized along defined dimensions. Each observation records a measured value at a specific point in a multi-dimensional space. The vocabulary is intentionally abstract: it does not prescribe specific dimensions or measures, but provides the structural framework into which domain-specific concepts are plugged.
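As a rough illustration of an observation as a point in a multi-dimensional space, the sketch below writes one observation out as Turtle. The `qb:` namespace, `qb:Observation`, and `qb:dataSet` come from the vocabulary; the `ex:` namespace, the dimension and measure names, and the values are invented for this example, and the helper function is hypothetical rather than part of any library.

```python
# qb: is the Data Cube namespace; ex: is a made-up namespace for this example.
PREFIXES = {
    "qb": "http://purl.org/linked-data/cube#",
    "ex": "http://example.org/ns#",
}


def observation_to_turtle(obs_id, dataset, dimensions, measures):
    """Serialize one observation as a Turtle snippet.

    dimensions maps a dimension property to the local name of its value;
    measures maps a measure property to a numeric value. At least one
    measure is assumed, so that the statement ends with a full stop.
    """
    lines = [f"@prefix {p}: <{iri}> ." for p, iri in PREFIXES.items()]
    lines.append("")
    lines.append(f"ex:{obs_id} a qb:Observation ;")
    lines.append(f"    qb:dataSet ex:{dataset} ;")
    # Dimension values locate the observation; measure values are what was observed.
    pairs = [(prop, f"ex:{value}") for prop, value in dimensions.items()]
    pairs += [(prop, str(value)) for prop, value in measures.items()]
    for i, (prop, value) in enumerate(pairs):
        end = "." if i == len(pairs) - 1 else ";"
        lines.append(f"    ex:{prop} {value} {end}")
    return "\n".join(lines)


turtle = observation_to_turtle(
    "obs1",
    "lifeExpectancy",
    dimensions={"refPeriod": "period2020", "refArea": "regionNorth"},
    measures={"lifeExpectancyYears": 81.3},
)
print(turtle)
```

The two dimension values fix the observation's position in the cube, and the measure carries the value recorded there. A real publication would more likely reuse shared dimension and measure properties than a private `ex:` namespace.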

Core concepts include DataSet, Observation, DimensionProperty, MeasureProperty, AttributeProperty, DataStructureDefinition, ComponentSpecification, and Slice. A DataStructureDefinition declares which dimensions and measures a dataset uses, while each Observation links to specific dimension values and carries one or more measured values.
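The relationship between a DataStructureDefinition and its observations can be sketched in plain Python. This is a simplified model loosely inspired by the specification's integrity constraints, not a substitute for them: the class, the function, and the component names (`refPeriod`, `refArea`, `lifeExpectancyYears`) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataStructureDefinition:
    """Simplified stand-in for a qb:DataStructureDefinition."""
    dimensions: set
    measures: set
    attributes: set = field(default_factory=set)


def check_observation(dsd, observation):
    """Return a list of problems; an empty list means the observation fits.

    observation is a dict mapping component names to values.
    """
    problems = []
    # Every declared dimension needs a value, otherwise the observation
    # has no well-defined position in the cube.
    for dim in sorted(dsd.dimensions):
        if dim not in observation:
            problems.append(f"missing dimension: {dim}")
    # This sketch only requires at least one declared measure to be present.
    if not any(measure in observation for measure in dsd.measures):
        problems.append("no measure value present")
    return problems


dsd = DataStructureDefinition(
    dimensions={"refPeriod", "refArea"},
    measures={"lifeExpectancyYears"},
)

complete = {"refPeriod": "2020", "refArea": "north", "lifeExpectancyYears": 81.3}
incomplete = {"refPeriod": "2020", "lifeExpectancyYears": 81.3}

print(check_observation(dsd, complete))
print(check_observation(dsd, incomplete))
```

Declaring the structure once and checking each observation against it is what lets consumers treat a dataset as a regular cube instead of an arbitrary bag of triples.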

Key Concepts

Concept                      Role
qb:DataSet                   A collection of observations sharing the same structure
qb:Observation               A single data point within a dataset
qb:DimensionProperty         A component identifying a position in the cube (e.g., time, region)
qb:MeasureProperty           A component recording the observed value
qb:DataStructureDefinition   Declares the components (dimensions, measures, attributes) of a dataset
qb:Slice                     A subset of observations sharing fixed values on some dimensions
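To make the idea of a slice concrete, the following sketch groups observations by the values they share on a chosen set of dimensions. It is an in-memory analogy only: the function name, the dictionary layout, and the sample figures are invented for this example, and it does not produce `qb:Slice` resources.

```python
from collections import defaultdict


def slice_observations(observations, fixed_dimensions):
    """Group observations by their values on the fixed dimensions.

    Each resulting group plays the role of one slice: its members agree on
    every fixed dimension and vary only along the remaining ones.
    """
    slices = defaultdict(list)
    for obs in observations:
        key = tuple(obs[dim] for dim in fixed_dimensions)
        slices[key].append(obs)
    return dict(slices)


# Illustrative data, not real statistics.
observations = [
    {"refPeriod": "2019", "refArea": "north", "value": 80.9},
    {"refPeriod": "2020", "refArea": "north", "value": 81.3},
    {"refPeriod": "2019", "refArea": "south", "value": 79.4},
    {"refPeriod": "2020", "refArea": "south", "value": 79.8},
]

# Fixing the area leaves one time series per region.
by_area = slice_observations(observations, ["refArea"])
for key, members in by_area.items():
    print(key, [m["value"] for m in members])
```

Fixing the area and letting the period vary yields one time series per region, a common reason for publishing slices alongside the full cube.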

Governance & Maintenance

The vocabulary was developed by the W3C Government Linked Data Working Group and published as a W3C Recommendation. As with other W3C Recommendations, changes follow the W3C Process. The specification has been stable since its 2014 publication.

Notable Implementations

Eurostat publishes significant portions of its statistical data using the Data Cube vocabulary. The UK Office for National Statistics, the Scottish Government, and the Irish Central Statistics Office have all used Data Cube for publishing open linked data. The vocabulary is also used within the European Data Portal and various SDMX-to-RDF conversion pipelines.

Related Standards

  • SDMX -- The statistical data exchange standard whose information model the Data Cube vocabulary implements in RDF
  • DCAT -- Often used alongside Data Cube for dataset-level cataloging metadata
  • VoID -- Used for describing linksets and statistics about RDF datasets

Further Reading