The Oxford Common File Layout (OCFL) is a specification for organizing digital content on storage systems in a way that is transparent, predictable, and robust over the long term. Unlike many repository standards that focus on metadata description or network exchange, OCFL addresses the fundamental question of how digital objects and their versions should be arranged on disk, making it a foundational building block for digital preservation infrastructure.
Background
OCFL originated from the digital preservation community's recognition that the internal storage layouts of repository systems were often opaque and tightly coupled to specific software. If the software became unavailable or unsupported, recovering content from the storage layer could be difficult or impossible. The specification was developed by a community of digital preservation practitioners, with initial work associated with the University of Oxford's Bodleian Libraries (hence the name).
Version 1.0 was released on July 7, 2020, after extensive community consultation. Version 1.1 followed on October 7, 2022, with an editorial update (v1.1.1) in November 2024. Community listening sessions for a potential version 2 were announced in August 2023.
Purpose and Scope
OCFL defines the structure of storage roots and objects within those roots. Its design goals are:
- Completeness -- a repository can be rebuilt entirely from the files on storage
- Parsability -- both humans and machines can understand the layout without special software
- Robustness -- resilience against errors, corruption, and storage migration
- Versioning -- objects maintain their full version history
- Storage diversity -- works on conventional filesystems and cloud object stores
The specification does not prescribe descriptive metadata formats, content serializations, or network protocols. It focuses exclusively on the physical arrangement of files and the inventory manifests that describe them.
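The completeness and parsability goals can be illustrated with a small sketch that rediscovers objects by walking a storage root, using nothing but the files on disk. It assumes object roots are recognizable by a marker file whose name starts with `0=ocfl_object_`, following the conformance declaration OCFL places in each object root. The function names and the demo directory layout are hypothetical, and a real client would also validate each object.

```python
import os
import tempfile

# Assumption: an object root is marked by a file named "0=ocfl_object_<version>".
OBJECT_MARKER_PREFIX = "0=ocfl_object_"


def find_object_roots(storage_root):
    """Return sorted, storage-root-relative paths of directories that look like object roots."""
    roots = []
    for dirpath, dirnames, filenames in os.walk(storage_root):
        if any(name.startswith(OBJECT_MARKER_PREFIX) for name in filenames):
            rel = os.path.relpath(dirpath, storage_root)
            roots.append(rel.replace(os.sep, "/"))
            # Objects do not nest, so stop descending once a root is found.
            dirnames[:] = []
    return sorted(roots)


def build_demo_root(base):
    """Create a tiny, hypothetical storage root holding two bare object roots."""
    for rel in ("ab/obj1", "cd/obj2"):
        obj_dir = os.path.join(base, *rel.split("/"))
        os.makedirs(obj_dir)
        with open(os.path.join(obj_dir, "0=ocfl_object_1.1"), "w") as f:
            f.write("ocfl_object_1.1\n")
    return base


if __name__ == "__main__":
    demo = build_demo_root(tempfile.mkdtemp())
    for root in find_object_roots(demo):
        print(root)
```

Because discovery relies only on marker files, the same walk works regardless of which software originally wrote the objects.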
Key Concepts
| Concept | Description |
|---|---|
| OCFL Object | A versioned set of files with an inventory manifest |
| OCFL Storage Root | A directory (or object store prefix) containing OCFL objects |
| Inventory | A JSON file listing all versions, files, and their digests |
| Version Directory | A numbered directory (v1, v2, ...) holding the content newly added in that version |
| Content Addressing | Files are identified by their digest, so identical content is stored once and referenced from every version that uses it |
| Fixity | Digest-based integrity checking is built into every object |
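How these concepts fit together can be shown with a minimal sketch of an inventory and a function that resolves a version's logical file paths to the stored content paths. The inventory below is hand-written for illustration: the object id, file names, and shortened digests are hypothetical placeholders, and the helper `resolve_version` is not part of any OCFL library.

```python
# A hypothetical inventory for a two-version object.
# Digests are shortened placeholders; real inventories carry full hex digests.
INVENTORY = {
    "id": "urn:example:object-1",
    "type": "https://ocfl.io/1.1/spec/#inventory",
    "digestAlgorithm": "sha512",
    "head": "v2",
    # manifest: digest -> content paths actually stored on disk
    "manifest": {
        "aaa111": ["v1/content/report.txt"],
        "bbb222": ["v2/content/notes.txt"],
    },
    # versions: per-version state, digest -> logical paths visible in that version
    "versions": {
        "v1": {
            "created": "2020-01-01T00:00:00Z",
            "state": {"aaa111": ["report.txt"]},
        },
        "v2": {
            "created": "2020-02-01T00:00:00Z",
            # report.txt is unchanged and copy.txt has identical bytes, so both
            # logical paths point at the single file stored under v1.
            "state": {
                "aaa111": ["report.txt", "copy.txt"],
                "bbb222": ["notes.txt"],
            },
        },
    },
}


def resolve_version(inventory, version):
    """Map each logical path in a version to the content path that stores its bytes."""
    resolved = {}
    state = inventory["versions"][version]["state"]
    for digest, logical_paths in state.items():
        # Any content path listed for the digest holds the same bytes; take the first.
        content_path = inventory["manifest"][digest][0]
        for logical_path in logical_paths:
            resolved[logical_path] = content_path
    return resolved


if __name__ == "__main__":
    for logical, stored in sorted(resolve_version(INVENTORY, "v2").items()):
        print(logical, "->", stored)
```

In this sketch version v2 exposes three logical files while only two files exist on disk, which is the deduplication effect that content addressing provides.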
Serializations and Technical Formats
OCFL uses JSON for its inventory files (inventory.json in each object root). The specification itself is published as HTML. There are no RDF or XML serializations; the standard operates at the filesystem level rather than the metadata interchange level.
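Because the inventory is plain JSON, a fixity check needs little more than a JSON parser and a hashing library. The following is a minimal sketch, not a validator: it assumes the `digestAlgorithm` value is a name that Python's `hashlib` accepts (such as `sha512` or `sha256`), it checks only the files listed in the root inventory's manifest, and the helper `build_demo_object` writes a hypothetical one-file object purely so the example can run.

```python
import hashlib
import json
import os
import tempfile


def verify_object(object_root):
    """Return sorted content paths that are missing or whose digest differs from the manifest."""
    with open(os.path.join(object_root, "inventory.json"), encoding="utf-8") as f:
        inventory = json.load(f)
    algorithm = inventory["digestAlgorithm"]
    failures = []
    for expected, content_paths in inventory["manifest"].items():
        for content_path in content_paths:
            full_path = os.path.join(object_root, *content_path.split("/"))
            if not os.path.isfile(full_path):
                failures.append(content_path)
                continue
            hasher = hashlib.new(algorithm)
            with open(full_path, "rb") as f:
                # Read in chunks so large files do not need to fit in memory.
                for chunk in iter(lambda: f.read(65536), b""):
                    hasher.update(chunk)
            # Compare hex digests case-insensitively.
            if hasher.hexdigest() != expected.lower():
                failures.append(content_path)
    return sorted(failures)


def build_demo_object(object_root, payload=b"hello ocfl\n"):
    """Write a hypothetical one-file object, just enough for verify_object to run."""
    content_dir = os.path.join(object_root, "v1", "content")
    os.makedirs(content_dir)
    with open(os.path.join(content_dir, "hello.txt"), "wb") as f:
        f.write(payload)
    digest = hashlib.sha512(payload).hexdigest()
    inventory = {
        "id": "urn:example:demo",
        "type": "https://ocfl.io/1.1/spec/#inventory",
        "digestAlgorithm": "sha512",
        "head": "v1",
        "manifest": {digest: ["v1/content/hello.txt"]},
        "versions": {
            "v1": {
                "created": "2020-01-01T00:00:00Z",
                "state": {digest: ["hello.txt"]},
            }
        },
    }
    with open(os.path.join(object_root, "inventory.json"), "w", encoding="utf-8") as f:
        json.dump(inventory, f, indent=2)


if __name__ == "__main__":
    root = os.path.join(tempfile.mkdtemp(), "demo-object")
    build_demo_object(root)
    print("failures:", verify_object(root))
```

A production tool would go further, for example by checking the inventory's own digest and the per-version inventories, which this sketch leaves out.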
Governance and Maintenance
OCFL is maintained by its editorial team through a community-driven process. Development is coordinated on GitHub, with issues and pull requests tracked openly. The broader community participates through a Google Group, a Slack channel (#ocfl in the code4lib workspace), and regular community meetings. Citable copies of the specification are archived on Zenodo.
The project also maintains a community extensions mechanism that allows implementers to define additional behaviors (such as storage layout mappings) without modifying the core specification.
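As an illustration of what a storage layout mapping does, the sketch below derives an object's directory from a digest of its identifier, in the spirit of the hashed n-tuple layout published among the community extensions. The function name, parameter names, and default values here are assumptions chosen for the example; the extension's own document is the authority on its exact configuration and behavior.

```python
import hashlib


def hashed_ntuple_path(object_id, tuple_size=3, number_of_tuples=3, algorithm="sha256"):
    """Derive a storage-root-relative directory for an object from a digest of its id."""
    digest = hashlib.new(algorithm, object_id.encode("utf-8")).hexdigest()
    # Split the leading characters of the digest into fixed-size directory names.
    tuples = [
        digest[i * tuple_size:(i + 1) * tuple_size]
        for i in range(number_of_tuples)
    ]
    # Use the full digest as the final directory name so the path stays unique.
    return "/".join(tuples + [digest])


if __name__ == "__main__":
    print(hashed_ntuple_path("object-01"))
```

Hashing the identifier spreads objects evenly across directories and sidesteps characters in identifiers that are awkward in file paths, at the cost of paths that are not human-readable on their own.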
Notable Implementations
- Fedora 6 -- the Fedora digital repository platform uses OCFL as its persistence layer
- Arkisto -- an Australian research data commons project built on OCFL storage
- Various institutional repositories -- multiple universities and cultural heritage organizations have adopted OCFL for digital preservation storage
- Client libraries and tools -- implementations exist in several languages, including Java, Python, Go, and Ruby, and are listed on the OCFL Implementations page
Related Standards
OCFL complements, rather than competes with, most metadata and exchange standards. It provides the storage layer beneath systems that use standards such as Dublin Core or PREMIS for metadata and BagIt for packaging.