Name: CSV on the Web
Creator: World Wide Web Consortium
License: W3C-20150513
Keywords: web, government

Overview

CSV on the Web (CSVW) is a suite of W3C Recommendations that adds structured metadata to CSV, one of the most widely used data formats. CSV files appear everywhere, from government open data portals to scientific repositories, yet they carry no inherent schema, no type information, and no standardized way to describe their structure. CSVW fills this gap by defining a metadata vocabulary and processing model for tabular data on the Web.

Background

The CSV on the Web Working Group was chartered by W3C in 2013 in response to the widespread use of CSV for publishing open data, particularly by governments. Editors Jeni Tennison (Open Data Institute) and Gregg Kellogg (Kellogg Associates), along with authors Rufus Pollock (Open Knowledge) and Ivan Herman (W3C), developed four specifications published simultaneously as W3C Recommendations on 17 December 2015.

Purpose and Scope

CSVW addresses several fundamental problems with CSV data on the web:

Ambiguity: CSV files lack a formal schema; column semantics, datatypes, and table relationships are undefined
Discoverability: No standard mechanism exists to associate metadata with a CSV file
Interoperability: Different tools make different assumptions about delimiters, encodings, null values, and line endings
Integration: CSV data cannot participate in the linked data ecosystem without transformation
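
The ambiguity and interoperability problems can be reproduced with nothing more than Python's standard csv module. The following is a minimal sketch; the sample data, column names, and the read_rows helper are invented for illustration.

```python
import csv
import io

# Hypothetical sample: the same record published with two different dialects.
COMMA_CSV = "country,population,updated\nAD,77006,2015-12-17\n"
SEMICOLON_CSV = "country;population;updated\nAD;77006;2015-12-17\n"


def read_rows(text, delimiter=","):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# Ambiguity: every cell comes back as a string; nothing in the file says
# that "population" is an integer or that "updated" is a date.
rows = read_rows(COMMA_CSV)
print(rows[0])

# Interoperability: a reader that assumes commas sees a single wide column
# when the publisher actually used semicolons.
misread = read_rows(SEMICOLON_CSV)
print(list(misread[0].keys()))
```

Reading the second sample correctly requires knowing the delimiter in advance, which is the kind of out-of-band knowledge that CSVW metadata is meant to make explicit.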

The Four Specifications

Model for Tabular Data and Metadata on the Web: defines the abstract data model for tables, columns, rows, cells, and their annotations
Metadata Vocabulary for Tabular Data: provides a JSON-LD vocabulary for describing CSV structure, including datatypes, foreign keys, transformations, and dialect settings
Generating JSON from Tabular Data on the Web (CSV2JSON): defines standard conversion from annotated CSV to JSON
Generating RDF from Tabular Data on the Web (CSV2RDF): defines standard conversion from annotated CSV to RDF
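
To make the abstract model of tables, columns, rows, and cells more concrete, here is a rough sketch using Python dataclasses. The class names, field names, defaults, and the build_table helper are illustrative assumptions and cover only a small part of what the model specification describes.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Column:
    number: int                 # 1-based position in the table
    name: Optional[str] = None  # assumption: taken from the header row
    datatype: str = "string"    # assumption: default used by this sketch


@dataclass
class Cell:
    column: Column
    string_value: str           # raw text as it appears in the file
    value: Any = None           # interpreted value, if one has been computed


@dataclass
class Row:
    number: int                 # 1-based position among the data rows
    cells: List[Cell] = field(default_factory=list)


@dataclass
class Table:
    url: str
    columns: List[Column] = field(default_factory=list)
    rows: List[Row] = field(default_factory=list)


def build_table(url, header, records):
    """Assemble a Table from a header list and lists of raw cell strings."""
    columns = [Column(number=i, name=h) for i, h in enumerate(header, start=1)]
    table = Table(url=url, columns=columns)
    for n, record in enumerate(records, start=1):
        cells = [Cell(column=c, string_value=s) for c, s in zip(columns, record)]
        table.rows.append(Row(number=n, cells=cells))
    return table


table = build_table(
    "http://example.org/countries.csv",  # placeholder URL
    ["country", "population"],
    [["AD", "77006"], ["AE", "9346129"]],
)
print(len(table.columns), len(table.rows))
```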

How It Works

Publishers place a JSON-LD metadata document alongside their CSV files, or point to it with an HTTP Link header or a well-known URI. The metadata describes table structure, column names, datatypes, default values, null values, foreign key relationships, and transformation templates. Consumers can then process the CSV with full awareness of its structure and semantics.
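
The publisher and consumer roles described above can be sketched in a few lines of Python. The property names (@context, url, tableSchema, columns, name, titles, datatype, null) come from the metadata vocabulary; the file name, column names, sample data, and the apply_metadata helper are assumptions made for this example. The helper handles only three datatypes and is not the normative processing model.

```python
import csv
import io
import json
from datetime import date

# Metadata document, written as a Python dict for illustration.
METADATA = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "countries.csv",  # hypothetical file name
    "tableSchema": {
        "columns": [
            {"name": "country", "titles": "country", "datatype": "string"},
            {"name": "population", "titles": "population",
             "datatype": "integer", "null": "n/a"},
            {"name": "updated", "titles": "updated", "datatype": "date"},
        ]
    },
}

# Hypothetical CSV content that the metadata describes.
CSV_TEXT = "country,population,updated\nAD,77006,2015-12-17\nAQ,n/a,2015-12-17\n"

# Simplified converters for this sketch only; the specifications define
# many more datatypes and format options.
CONVERTERS = {
    "string": str,
    "integer": int,
    "date": date.fromisoformat,
}


def apply_metadata(csv_text, metadata):
    """Return rows as dicts, converting each cell per its column description."""
    columns = metadata["tableSchema"]["columns"]
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row; the metadata supplies column names
    rows = []
    for record in reader:
        row = {}
        for column, raw in zip(columns, record):
            if raw == column.get("null", ""):
                row[column["name"]] = None  # cell matches the null marker
            else:
                row[column["name"]] = CONVERTERS[column["datatype"]](raw)
        rows.append(row)
    return rows


print(json.dumps(METADATA, indent=2))  # what a publisher would save as JSON
typed_rows = apply_metadata(CSV_TEXT, METADATA)
print(typed_rows)
```

With the metadata in hand, the consumer gets integers, dates, and explicit nulls instead of undifferentiated strings.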

Serialization and Namespace

Namespace: http://www.w3.org/ns/csvw#
Metadata documents are expressed in JSON-LD
A W3C primer provides introductory guidance for publishers
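
Terms in the vocabulary are identified by IRIs formed by appending the term to the namespace above. A small sketch of that expansion follows; the expand helper is hypothetical, and Table and tableSchema are used here as example term names.

```python
CSVW_NS = "http://www.w3.org/ns/csvw#"


def expand(term, prefix="csvw"):
    """Expand 'csvw:Term' or a bare 'Term' into a full IRI in the CSVW namespace."""
    if term.startswith(prefix + ":"):
        local = term.split(":", 1)[1]
    else:
        local = term
    return CSVW_NS + local


print(expand("csvw:Table"))
print(expand("tableSchema"))
```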

Governance and Maintenance

Developed by the W3C CSV on the Web Working Group. The specifications are accompanied by a test suite and an implementation report demonstrating interoperability across implementations. Source code and issues are tracked on GitHub.

Notable Implementations

CSVW metadata is used by government open data platforms and data publishing tools. Libraries exist in multiple programming languages for parsing CSVW metadata, validating CSV files against it, and performing the defined JSON and RDF conversions.
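
As an illustration of what validating a CSV file against metadata involves, here is a deliberately simplified sketch. It is not how any particular library works: the validate helper is hypothetical, it compares the header against column names rather than titles, and it checks only two datatypes plus the required property.

```python
import csv
import io
import re

# Hypothetical column descriptions using the vocabulary's "name",
# "datatype", and "required" properties.
COLUMNS = [
    {"name": "country", "datatype": "string", "required": True},
    {"name": "population", "datatype": "integer"},
]

# Simplified lexical checks for this sketch only.
CHECKS = {
    "string": lambda s: True,
    "integer": lambda s: re.fullmatch(r"[+-]?[0-9]+", s) is not None,
}


def validate(csv_text, columns):
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    expected = [c["name"] for c in columns]
    if header != expected:
        errors.append(f"header {header} does not match {expected}")
    for row_number, record in enumerate(reader, start=1):
        for column, raw in zip(columns, record):
            if raw == "":
                # Empty cells are treated as null; only required columns object.
                if column.get("required", False):
                    errors.append(f"row {row_number}: {column['name']} is required")
                continue
            if not CHECKS[column["datatype"]](raw):
                errors.append(
                    f"row {row_number}: {column['name']}={raw!r} "
                    f"is not a valid {column['datatype']}"
                )
    return errors


print(validate("country,population\nAD,77006\n", COLUMNS))
print(validate("country,population\n,abc\n", COLUMNS))
```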

Related Standards

JSON-LD (json-ld): The format used for CSVW metadata documents
DCAT (dcat): Often used alongside CSVW for dataset catalog metadata
