The Web Annotation Data Model is a W3C Recommendation that provides a standardized, interoperable framework for expressing annotations on web resources. Annotating is one of the most pervasive activities on the web, from comments on photos and product reviews to scholarly marginalia and machine-generated tagging. The specification establishes a common model so that annotations can be shared and reused across platforms and tools. It is rich enough for complex requirements while remaining simple for common use cases.
Background
The specification emerged from the Open Annotation Community Group's earlier work. The W3C Web Annotation Working Group, chartered in 2014, refined and standardized the model with editors Robert Sanderson (J. Paul Getty Trust), Paolo Ciccarese (Massachusetts General Hospital), and Benjamin Young (John Wiley & Sons). Published as a W3C Recommendation on 23 February 2017, it supersedes the Open Annotation Data Model community specification.
Purpose and Scope
The model addresses a fundamental need: a standard way to associate content (a body) with a resource or part of a resource (a target), together with metadata about who created the annotation, when, and why. Simple use cases such as attaching a text comment to a web page are straightforward to express. Complex requirements such as selecting a phrase within a PDF, a region of an image, or a segment of a video are accommodated through a rich selector mechanism. The model covers both human and machine-generated annotations.
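As an illustration of the body, target, and metadata described above, the following is a minimal sketch of a text comment on a web page, built as a plain Python dict. The helper name `make_comment` and all example.org / example.com IRIs are hypothetical; the property names (`body`, `target`, `creator`, `created`, `motivation`) are those used by the model's JSON-LD serialization.

```python
import json


def make_comment(anno_iri: str, target_iri: str, text: str,
                 creator_iri: str, created: str) -> dict:
    """Build a minimal comment annotation as a plain dict (illustrative helper)."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_iri,
        "type": "Annotation",
        "motivation": "commenting",  # why the annotation was made
        "creator": creator_iri,      # who made it
        "created": created,          # when it was made
        "body": {                    # the content being associated
            "type": "TextualBody",
            "value": text,
            "format": "text/plain",
        },
        "target": target_iri,        # the resource the body is about
    }


# Hypothetical IRIs, used only for illustration.
anno = make_comment(
    "http://example.org/anno1",
    "http://example.com/page1",
    "A helpful overview.",
    "http://example.org/user1",
    "2017-02-23T12:00:00Z",
)
print(json.dumps(anno, indent=2))
```

Here the target is given as a bare IRI, which is enough when the whole page is being annotated; targeting part of a resource needs the selector mechanism.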
Core Model
The Web Annotation Data Model is built on a small set of principles:
- An Annotation is a rooted directed graph relating bodies and targets
- An Annotation has zero or more Bodies (the annotation content)
- An Annotation has one or more Targets (the resource being annotated)
- The body is typically "about" the target
- Motivations describe intent (assessing, bookmarking, classifying, commenting, describing, editing, highlighting, identifying, linking, moderating, questioning, replying, tagging)
- Bodies can be external resources, embedded text, or a Choice between alternatives
A SpecificResource allows targeting a particular segment of a resource using selectors, with optional state, style, and scope constraints.
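The principles above can be sketched as a small structural check. This is an illustrative helper, not a conformance validator: the function names are hypothetical, and since communities may define further motivations, an unrecognized one should be read as a warning rather than an error. It assumes the JSON form in which a property may hold either a single value or an array.

```python
# The thirteen motivation names defined by the Web Annotation vocabulary.
MOTIVATIONS = {
    "assessing", "bookmarking", "classifying", "commenting", "describing",
    "editing", "highlighting", "identifying", "linking", "moderating",
    "questioning", "replying", "tagging",
}


def as_list(value):
    """Normalize a property that may hold one value or an array into a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def check_annotation(anno: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if "Annotation" not in as_list(anno.get("type")):
        problems.append("type must include 'Annotation'")
    targets = as_list(anno.get("target"))
    if not targets:  # one or more targets are required; bodies are optional
        problems.append("at least one target is required")
    for motivation in as_list(anno.get("motivation")):
        if motivation not in MOTIVATIONS:
            problems.append(f"unrecognized motivation: {motivation}")
    for target in targets:
        is_specific = isinstance(target, dict) and target.get("type") == "SpecificResource"
        if is_specific and "source" not in target:
            problems.append("SpecificResource needs a source")
    return problems


# A highlight has no body at all; its target is a SpecificResource
# that narrows the source page to a quoted phrase.
highlight = {
    "type": "Annotation",
    "motivation": "highlighting",
    "target": {
        "type": "SpecificResource",
        "source": "http://example.com/page1",
        "selector": {"type": "TextQuoteSelector", "exact": "annotation"},
    },
}
print(check_annotation(highlight))
print(check_annotation({"type": "Annotation", "body": "http://example.org/note1"}))
```

The first call reports no problems even though the annotation has no body; the second reports the missing target.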
Selectors
The specification defines multiple selector types for precise content targeting:
| Selector | Purpose |
|---|---|
| Fragment Selector | Reuses the URI fragment syntax already defined for the resource's media type |
| CSS Selector | Selects DOM elements via CSS |
| XPath Selector | Selects XML/HTML nodes |
| Text Quote Selector | Selects text by exact quote with surrounding context |
| Text Position Selector | Selects text by character offset |
| Data Position Selector | Selects binary data by byte offset |
| SVG Selector | Selects regions using SVG shapes |
| Range Selector | Selects a span between two other selectors |
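Selectors can be chained through refinement (the refinedBy property): each refining selector applies to the selection made by the one before it, giving increasingly precise targeting.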
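To make the text selectors and refinement concrete, here is a minimal sketch that resolves a Text Position Selector or a Text Quote Selector against a plain string and follows `refinedBy` chains. The function name `resolve` is hypothetical. The sketch assumes the text has already been extracted and normalized, and that Python's code-point string indexing is an acceptable way to count characters; the other selector types are not handled.

```python
from typing import Optional, Tuple


def resolve(selector: dict, text: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets into text, or None if nothing matches."""
    kind = selector.get("type")
    span = None
    if kind == "TextPositionSelector":
        start, end = selector["start"], selector["end"]
        if 0 <= start <= end <= len(text):
            span = (start, end)
    elif kind == "TextQuoteSelector":
        exact = selector["exact"]
        prefix = selector.get("prefix", "")
        suffix = selector.get("suffix", "")
        pos = text.find(exact)
        while pos != -1:
            end = pos + len(exact)
            # Accept the first occurrence whose surrounding context matches.
            if text[:pos].endswith(prefix) and text[end:].startswith(suffix):
                span = (pos, end)
                break
            pos = text.find(exact, pos + 1)
    if span is None:
        return None
    refinement = selector.get("refinedBy")
    if refinement is None:
        return span
    # A refining selector is evaluated against the outer selection only,
    # so its offsets are shifted back into the full text afterwards.
    inner = resolve(refinement, text[span[0]:span[1]])
    if inner is None:
        return None
    return (span[0] + inner[0], span[0] + inner[1])


text = "The quick brown fox jumps over the lazy dog. The lazy cat sleeps."

# The prefix and suffix disambiguate between the two occurrences of "lazy".
quote = {"type": "TextQuoteSelector", "exact": "lazy",
         "prefix": "The ", "suffix": " cat"}

# Select the second sentence by position, then refine to a word inside it.
chained = {
    "type": "TextPositionSelector",
    "start": text.index("The lazy cat"),
    "end": len(text),
    "refinedBy": {"type": "TextQuoteSelector", "exact": "cat"},
}

for selector in (quote, chained):
    start, end = resolve(selector, text)
    print(repr(text[start:end]))
```

Returning offsets rather than the matched substring keeps the result usable for highlighting, and makes the shift applied during refinement explicit.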
Serialization
The canonical serialization is JSON-LD, using the context at http://www.w3.org/ns/anno.jsonld with media type application/ld+json;profile="http://www.w3.org/ns/anno.jsonld". While built on Linked Data fundamentals, the design explicitly allows efficient non-graph-based implementations.
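A non-graph-based consumer can treat the serialization as ordinary JSON. The sketch below, using only the standard `json` module, checks that the document declares the annotation context and normalizes `body` and `target` to lists. The names `load_annotation`, `ANNO_CONTEXT`, and `ANNO_MEDIA_TYPE` are hypothetical, and the context IRI is assumed to be the http form that appears in the media type's profile parameter.

```python
import json

ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"
# Media type a client or server could use in Accept / Content-Type headers.
ANNO_MEDIA_TYPE = 'application/ld+json;profile="http://www.w3.org/ns/anno.jsonld"'


def load_annotation(json_text: str) -> dict:
    """Parse an annotation as plain JSON, without any RDF or JSON-LD tooling."""
    data = json.loads(json_text)
    contexts = data.get("@context")
    if not isinstance(contexts, list):
        contexts = [contexts]
    if ANNO_CONTEXT not in contexts:
        raise ValueError("document does not declare the Web Annotation context")
    # A property may hold a single value or an array; use lists throughout
    # so that later code does not need to branch on the shape.
    for key in ("body", "target"):
        value = data.get(key)
        if value is None:
            data[key] = []
        elif not isinstance(value, list):
            data[key] = [value]
    return data


doc = json.dumps({
    "@context": ANNO_CONTEXT,
    "id": "http://example.org/anno1",  # hypothetical IRI
    "type": "Annotation",
    "target": "http://example.com/page1",
})
loaded = load_annotation(doc)
print(loaded["target"], loaded["body"])
```

This trades generality for speed: it relies on the document following the shapes produced by the annotation context rather than processing arbitrary JSON-LD.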
Companion Specifications
The model is one of a suite of three W3C Recommendations. Its two companions are:
- Web Annotation Vocabulary defines the RDF terms
- Web Annotation Protocol defines a RESTful API for annotation management, including containers that expose their contents as Annotation Collections split into Annotation Pages
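As a sketch of how a client might consume Annotation Collections and Annotation Pages, the generator below walks a collection's pages in order by following `first` and `next`. The function `iter_annotations` and the injected `fetch` callable are hypothetical; `fetch` stands in for an HTTP GET and is backed here by an in-memory dict so that the example runs without a server. All IRIs are placeholders.

```python
from typing import Callable, Iterator


def iter_annotations(collection: dict,
                     fetch: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every item of an AnnotationCollection, page by page."""
    page = collection.get("first")
    while page is not None:
        if isinstance(page, str):  # a page may be referenced by IRI
            page = fetch(page)
        yield from page.get("items", [])
        page = page.get("next")    # absent on the last page


# In-memory stand-in for pages that would normally be retrieved over HTTP.
pages = {
    "http://example.org/page2": {
        "id": "http://example.org/page2",
        "type": "AnnotationPage",
        "items": [{"id": "http://example.org/anno3", "type": "Annotation"}],
    },
}

collection = {
    "id": "http://example.org/collection1",
    "type": "AnnotationCollection",
    "total": 3,
    "first": {  # the first page is embedded here rather than referenced
        "id": "http://example.org/page1",
        "type": "AnnotationPage",
        "next": "http://example.org/page2",
        "items": [
            {"id": "http://example.org/anno1", "type": "Annotation"},
            {"id": "http://example.org/anno2", "type": "Annotation"},
        ],
    },
}

ids = [anno["id"] for anno in iter_annotations(collection, pages.__getitem__)]
print(ids)
```

Passing the fetch function in as a parameter keeps the traversal logic independent of any particular HTTP client.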
Governance and Maintenance
The specification was developed by the W3C Web Annotation Working Group. The namespace URI is http://www.w3.org/ns/oa#. The specification is stable, with errata tracked on the W3C website.
Notable Implementations
The model is implemented by annotation platforms including Hypothesis, the IIIF ecosystem, Recogito, and various scholarly annotation tools. It provides the interoperability foundation for open annotation on the web.
Related Standards
- JSON-LD (json-ld): The canonical serialization format