DCAT (Data Catalog Vocabulary) is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. Its current version, DCAT 3, became a W3C Recommendation on 22 August 2024. DCAT lets publishers describe datasets and data services in catalogs using a standard model, which supports federated search, metadata aggregation, and dataset discovery across organizational boundaries. It underpins much of the world's open data infrastructure, particularly the European Union's open data portal ecosystem.
Background
The original DCAT vocabulary was developed at the Digital Enterprise Research Institute (DERI) in the context of government data catalogs such as data.gov and data.gov.uk. It was then refined by the W3C eGov Interest Group and standardized as DCAT 1 in January 2014 by the Government Linked Data (GLD) Working Group. DCAT 2, published in February 2020 by the Dataset Exchange Working Group, addressed shortcomings identified through community experience, notably by adding a class for data services and improving support for identifiers, quality information, and citation. DCAT 3, the current version, covers further use cases, including dataset series, versioning, and inverse properties.
Purpose & Scope
DCAT provides RDF classes and properties to allow datasets and data services to be described and included in a catalog. Use of a standard model facilitates:
- Increased discoverability of datasets and data services
- Federated search for datasets across catalogs at multiple sites
- Aggregation of metadata from multiple catalogs; the aggregated metadata can also serve as a manifest file in digital preservation
DCAT makes no assumptions about the serialization formats of the data being described. It distinguishes between the abstract dataset and its different distributions, accommodating data in any format from spreadsheets and XML to RDF and specialized scientific formats.
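The split between an abstract dataset and its distributions can be illustrated with a minimal sketch in Python, using only the standard library to build a JSON-LD-style description. The example.org IRIs, the title, and the helper name media_types are hypothetical; only the dcat: and dcterms: terms come from the vocabularies.

```python
import json

# One abstract dataset with two distributions (CSV and JSON).
# All example.org IRIs are placeholders, not real resources.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dcterms": "http://purl.org/dc/terms/",
    },
    "@id": "https://example.org/dataset/air-quality",
    "@type": "dcat:Dataset",
    "dcterms:title": "Air quality measurements",
    "dcat:distribution": [
        {
            "@id": "https://example.org/dataset/air-quality/csv",
            "@type": "dcat:Distribution",
            "dcat:downloadURL": {"@id": "https://example.org/files/air-quality.csv"},
            "dcat:mediaType": {
                "@id": "https://www.iana.org/assignments/media-types/text/csv"
            },
        },
        {
            "@id": "https://example.org/dataset/air-quality/json",
            "@type": "dcat:Distribution",
            "dcat:downloadURL": {"@id": "https://example.org/files/air-quality.json"},
            "dcat:mediaType": {
                "@id": "https://www.iana.org/assignments/media-types/application/json"
            },
        },
    ],
}


def media_types(ds):
    """Return the media type of each distribution, e.g. 'text/csv'."""
    return [
        dist["dcat:mediaType"]["@id"].rsplit("media-types/", 1)[-1]
        for dist in ds["dcat:distribution"]
    ]


# Serialize the description; the dataset itself stays format-neutral.
serialized = json.dumps(dataset, indent=2)
```

The dataset node carries the descriptive metadata once, while each distribution records only what is specific to one concrete form of the data.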
Key Classes
DCAT 3 is organized around seven main classes:
| Class | Description |
|---|---|
| dcat:Catalog | A dataset in which each item is a metadata record describing some resource |
| dcat:Resource | Parent class of Dataset, DataService, and Catalog (not used directly) |
| dcat:Dataset | A collection of data published or curated by a single agent |
| dcat:Distribution | An accessible form of a dataset such as a downloadable file |
| dcat:DataService | A collection of operations (API) providing access to datasets |
| dcat:DatasetSeries | A collection of separately published datasets sharing common characteristics |
| dcat:CatalogRecord | A metadata record describing the registration of a resource in a catalog |
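How these classes relate can be sketched as a small lookup in Python. The subclass relations below follow the DCAT 3 class hierarchy as understood here (Catalog and DatasetSeries specialize Dataset, while Dataset and DataService specialize Resource); the function names are hypothetical and the sketch is not a substitute for the RDF schema itself.

```python
# Subclass relations among the main DCAT 3 classes (child -> parent).
# dcat:Distribution and dcat:CatalogRecord are deliberately absent:
# they are not subclasses of dcat:Resource.
SUBCLASS_OF = {
    "dcat:Catalog": "dcat:Dataset",
    "dcat:DatasetSeries": "dcat:Dataset",
    "dcat:Dataset": "dcat:Resource",
    "dcat:DataService": "dcat:Resource",
}


def ancestors(cls):
    """Return the chain of superclasses of cls, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_cataloged_resource(cls):
    """True if cls is dcat:Resource or one of its subclasses."""
    return cls == "dcat:Resource" or "dcat:Resource" in ancestors(cls)
```

Under these assumptions, a catalog is itself a dataset and therefore a resource, which is why one catalog can be listed inside another.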
Versioning & Compatibility
DCAT 3 supersedes DCAT 2 but does not make it obsolete. It maintains backward compatibility -- existing DCAT 2 deployments that do not use DCAT 3 features (versioning, dataset series, inverse properties) remain conformant without changes. Key additions in DCAT 3 include the spdx:checksum property, versioning properties (dcat:version, dcat:previousVersion, dcat:hasCurrentVersion), and the dcat:DatasetSeries class.
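As an illustration of the versioning properties, the following Python sketch walks a version chain from the current version back to the first. The example.org IRIs, the version strings, and the function name version_history are hypothetical; the records are plain dictionaries, not a real RDF graph.

```python
BASE = "https://example.org/dataset/budget"  # placeholder IRI

# Each version points to its predecessor with dcat:previousVersion.
records = {
    BASE + "/v1": {"dcat:version": "1.0"},
    BASE + "/v2": {"dcat:version": "2.0", "dcat:previousVersion": BASE + "/v1"},
    BASE + "/v3": {"dcat:version": "3.0", "dcat:previousVersion": BASE + "/v2"},
}

# The unversioned resource points to the latest version.
abstract_dataset = {"@id": BASE, "dcat:hasCurrentVersion": BASE + "/v3"}


def version_history(records, current_iri):
    """Follow dcat:previousVersion links, newest version first."""
    history = []
    seen = set()  # guards against accidental cycles in the metadata
    iri = current_iri
    while iri is not None and iri not in seen:
        seen.add(iri)
        history.append(records[iri]["dcat:version"])
        iri = records[iri].get("dcat:previousVersion")
    return history


history = version_history(records, abstract_dataset["dcat:hasCurrentVersion"])
```

A consumer that only understands DCAT 2 can ignore these properties and still read the rest of each record.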
External Vocabularies
DCAT incorporates terms from several established vocabularies where stable terms with appropriate meanings exist, including Dublin Core (dcterms), FOAF, PROV-O, SKOS, OWL, ODRL, SPDX, OWL-TIME, and vCard. It defines a minimal set of its own classes and properties. The namespace for DCAT terms is http://www.w3.org/ns/dcat# with the suggested prefix dcat.
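A short Python sketch shows how prefixed names such as dcat:Dataset expand to full IRIs. The prefix labels other than dcat are common conventions rather than requirements, and the function name expand is hypothetical.

```python
# Namespace IRIs for DCAT and the vocabularies it reuses.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "prov": "http://www.w3.org/ns/prov#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "odrl": "http://www.w3.org/ns/odrl/2/",
    "spdx": "http://spdx.org/rdf/terms#",
    "time": "http://www.w3.org/2006/time#",
    "vcard": "http://www.w3.org/2006/vcard/ns#",
}


def expand(prefixed_name):
    """Expand a prefixed name like 'dcat:Dataset' to a full IRI."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local
```

For example, expand("dcat:Dataset") yields http://www.w3.org/ns/dcat#Dataset, which is the identifier that RDF tools actually compare.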
Governance & Maintenance
DCAT is developed and maintained by the W3C Dataset Exchange Working Group (DXWG). The DCAT 3 editors are Riccardo Albertoni (CNR, Italy), David Browning, Simon Cox, Alejandra Gonzalez Beltran (STFC, UK), Andrea Perego, and Peter Winstanley. Former editors include Fadi Maali (DERI) and John Erickson (RPI). The specification source, issues, and discussion are hosted on the W3C GitHub repository (w3c/dxwg).
Notable Implementations
DCAT is deployed extensively in government open data portals. The European Data Portal and national portals across EU member states implement DCAT-AP, the European application profile. CKAN, a widely used open source data catalog platform, supports DCAT through the ckanext-dcat extension. The US data.gov, Australia's data.gov.au, and similar national portals adopt DCAT or DCAT-based profiles. Research data management platforms including DataCite and re3data also align with DCAT. The Healthcare and Life Sciences Community Profile and GeoDCAT-AP for geospatial data are notable domain-specific profiles.
Related Standards
- Dublin Core -- DCAT relies heavily on Dublin Core terms for basic descriptive metadata
- Schema.org -- Complementary vocabulary for structured data on the Web; crosswalks exist between DCAT and Schema.org's Dataset type
- VoID -- Can be used with DCAT to describe statistics about RDF datasets, as noted in the DCAT specification itself
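To make the idea of a crosswalk concrete, the following Python sketch translates a flat DCAT-style record into Schema.org terms. The mapping is a partial, illustrative subset based on commonly cited correspondences, not a complete or authoritative alignment; the function name to_schema_org and the sample record are hypothetical.

```python
# Partial DCAT -> Schema.org crosswalk (illustrative subset only).
CROSSWALK = {
    "dcat:Catalog": "schema:DataCatalog",
    "dcat:Dataset": "schema:Dataset",
    "dcat:Distribution": "schema:DataDownload",
    "dcterms:title": "schema:name",
    "dcterms:description": "schema:description",
    "dcat:keyword": "schema:keywords",
    "dcat:distribution": "schema:distribution",
    "dcat:downloadURL": "schema:contentUrl",
}


def to_schema_org(record):
    """Rename mapped keys and the @type value; leave unmapped terms as-is."""
    out = {}
    for key, value in record.items():
        if key == "@type":
            out[key] = CROSSWALK.get(value, value)
        else:
            out[CROSSWALK.get(key, key)] = value
    return out


sample = {
    "@type": "dcat:Dataset",
    "dcterms:title": "Air quality measurements",
    "dcat:keyword": ["air", "pollution"],
}
converted = to_schema_org(sample)
```

A real crosswalk also has to handle nested nodes and terms with no direct equivalent, which this flat sketch deliberately leaves out.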