
Preservation Metadata: Implementation Strategies

PREMIS

By LC

The international standard for metadata to support the preservation of digital objects and ensure their long-term usability. PREMIS defines a core set of preservation metadata elements organized around intellectual entities, objects, events, agents, and rights. Developed originally by OCLC and RLG, the standard consists of a Data Dictionary, an XML schema, and an OWL ontology, and is widely implemented in digital preservation systems and tools worldwide.

Overview

PREMIS (Preservation Metadata: Implementation Strategies) is the international standard for metadata that supports the preservation of digital objects and their long-term usability. It was developed by an international team of experts and is maintained through the Library of Congress. PREMIS is widely adopted in the digital preservation community and is supported by numerous commercial and open-source tools and repository systems.

Background

PREMIS originated from a working group convened jointly by OCLC (Online Computer Library Center) and RLG (Research Libraries Group) in 2003. The group was charged with creating a practical, implementable set of core preservation metadata elements applicable across the digital preservation community. The resulting PREMIS Data Dictionary version 1.0 was published in 2005.

The standard has undergone several revisions: version 2.0 in 2008 introduced structural refinements and new semantic units, followed by incremental updates through version 2.3. The current version, PREMIS 3.0, was released in 2015 and brought significant changes including simplification of the data model and alignment with linked data principles. An OWL ontology for PREMIS, now at version 3, enables the standard's use in semantic web and RDF-based environments.

The PREMIS Editorial Committee, coordinated through the Library of Congress, is responsible for ongoing maintenance, revisions, and community engagement. The PREMIS Implementors' Group (PIG) provides a forum for practitioners to share implementation experiences.

Purpose and Scope

PREMIS defines the metadata needed to ensure that digital objects remain accessible, authentic, and usable over time. It addresses fundamental preservation questions: What is this object? Where did it come from? What has happened to it? What is needed to render it? Who has rights over it?

The data model is organized around five core entities:

Entity               Purpose
Intellectual Entity  A coherent set of content reasonably treated as a unit (e.g., a book, a photograph)
Object               A discrete unit of information in digital form (file, bitstream, or representation)
Event                An action that affects an object (ingestion, migration, validation, etc.)
Agent                A person, organization, or software involved in preservation events
Rights               Permissions and restrictions governing use and preservation actions

Each entity has associated semantic units (metadata elements) that describe it. PREMIS is intentionally implementation-neutral — it defines what metadata to capture, not how to encode it, though official XML schemas and an OWL ontology are provided.
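As a rough illustration of how the entities relate, the sketch below models them as plain Python dataclasses. It is not an official PREMIS binding: the class names, field names, and the events_for helper are hypothetical and chosen only for this example. Each entity carries an identifier, and events point at objects and agents through those identifiers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Identifier:
    """A typed identifier, e.g. type="local", value="obj-001"."""
    type: str
    value: str


@dataclass
class IntellectualEntity:
    identifier: Identifier
    description: str  # e.g. "a book" or "a photograph"


@dataclass
class PremisObject:
    identifier: Identifier
    category: str  # "file", "bitstream", or "representation"


@dataclass
class Agent:
    identifier: Identifier
    name: str
    agent_type: str  # "person", "organization", or "software"


@dataclass
class Event:
    identifier: Identifier
    event_type: str  # e.g. "ingestion", "migration", "validation"
    date_time: str
    linked_objects: List[Identifier] = field(default_factory=list)
    linked_agents: List[Identifier] = field(default_factory=list)


@dataclass
class RightsStatement:
    identifier: Identifier
    basis: str  # e.g. "copyright" or "license" (illustrative values)
    linked_objects: List[Identifier] = field(default_factory=list)


def events_for(obj: PremisObject, events: List[Event]) -> List[Event]:
    """Return the events that reference the given object, in input order."""
    return [e for e in events if obj.identifier in e.linked_objects]


# Hypothetical sample data.
obj = PremisObject(Identifier("local", "obj-001"), "file")
other = Identifier("local", "obj-002")
events = [
    Event(Identifier("local", "evt-1"), "ingestion", "2015-06-01T10:00:00Z", [obj.identifier]),
    Event(Identifier("local", "evt-2"), "migration", "2016-01-15T09:30:00Z", [other]),
    Event(Identifier("local", "evt-3"), "fixity check", "2017-03-20T12:00:00Z", [obj.identifier]),
]
history = events_for(obj, events)
```

Filtering events by linked object identifier is one simple way to answer the question "What has happened to it?" for a given object.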

Serializations and Technical Formats

PREMIS is published in several technical forms:

  • XML Schema — the primary implementation format, with an official XSD maintained at the Library of Congress
  • OWL Ontology — version 3 enables RDF-based linked data implementations
  • Controlled vocabularies — preservation event types and other value sets are published at id.loc.gov
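To make the XML form concrete, here is a minimal sketch that builds a single PREMIS event with Python's standard xml.etree.ElementTree module. The namespace URI and element names follow the PREMIS 3 XML schema as commonly used, but the output has not been validated against the official XSD. The build_event helper, the identifier types, and the sample values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# PREMIS 3 XML namespace; registering it gives readable "premis:" prefixes.
PREMIS_NS = "http://www.loc.gov/premis/v3"
ET.register_namespace("premis", PREMIS_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the PREMIS namespace."""
    return f"{{{PREMIS_NS}}}{tag}"


def build_event(event_id: str, event_type: str, date_time: str,
                outcome: str, object_id: str) -> ET.Element:
    """Build one <premis:event> element (hypothetical helper)."""
    event = ET.Element(q("event"))

    ident = ET.SubElement(event, q("eventIdentifier"))
    ET.SubElement(ident, q("eventIdentifierType")).text = "local"
    ET.SubElement(ident, q("eventIdentifierValue")).text = event_id

    ET.SubElement(event, q("eventType")).text = event_type
    ET.SubElement(event, q("eventDateTime")).text = date_time

    info = ET.SubElement(event, q("eventOutcomeInformation"))
    ET.SubElement(info, q("eventOutcome")).text = outcome

    # Link the event to the object it acted on.
    link = ET.SubElement(event, q("linkingObjectIdentifier"))
    ET.SubElement(link, q("linkingObjectIdentifierType")).text = "local"
    ET.SubElement(link, q("linkingObjectIdentifierValue")).text = object_id
    return event


# Hypothetical sample values.
event = build_event("evt-1", "ingestion", "2015-06-01T10:00:00Z",
                    "success", "obj-001")
xml_text = ET.tostring(event, encoding="unicode")
```

In practice the eventType value would normally be drawn from the controlled vocabularies published at id.loc.gov rather than typed freely.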

PREMIS is commonly used in conjunction with METS (Metadata Encoding and Transmission Standard) for packaging preservation metadata within digital repository systems.
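The following sketch shows one common way this pairing looks in practice: a PREMIS event wrapped inside a METS administrative metadata section (amdSec, digiprovMD, mdWrap, xmlData). It assumes the usual METS namespace and the PREMIS:EVENT value for MDTYPE. The wrap_event_in_mets helper and the ID value are hypothetical, and the result is only a fragment, not a complete or schema-valid METS document.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "http://www.loc.gov/premis/v3"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("premis", PREMIS_NS)


def mets(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def premis(tag: str) -> str:
    """Qualify a tag name with the PREMIS namespace."""
    return f"{{{PREMIS_NS}}}{tag}"


def wrap_event_in_mets(premis_event: ET.Element, md_id: str) -> ET.Element:
    """Embed a PREMIS event in mets/amdSec/digiprovMD/mdWrap/xmlData."""
    root = ET.Element(mets("mets"))
    amd_sec = ET.SubElement(root, mets("amdSec"))
    digiprov = ET.SubElement(amd_sec, mets("digiprovMD"), {"ID": md_id})
    md_wrap = ET.SubElement(digiprov, mets("mdWrap"), {"MDTYPE": "PREMIS:EVENT"})
    xml_data = ET.SubElement(md_wrap, mets("xmlData"))
    xml_data.append(premis_event)
    return root


# A deliberately tiny PREMIS event, just enough to show the wrapping.
sample_event = ET.Element(premis("event"))
ET.SubElement(sample_event, premis("eventType")).text = "ingestion"

mets_doc = wrap_event_in_mets(sample_event, "digiprovMD_1")
mets_text = ET.tostring(mets_doc, encoding="unicode")
```

Under the same assumptions, object metadata would typically go in a techMD section and rights metadata in a rightsMD section, each with its own MDTYPE value.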

Governance and Maintenance

The PREMIS maintenance activity is housed at the Library of Congress, specifically within the Network Development and MARC Standards Office. The PREMIS Editorial Committee coordinates revisions and oversees the standard's evolution. A formal revision process governs changes to both the Data Dictionary and the XML schema. Community input is gathered through the PREMIS Implementors' Group and regular sessions at conferences such as iPRES.

Notable Implementations

PREMIS is embedded in major digital preservation systems including:

  • Archivematica — open-source digital preservation system that generates PREMIS metadata throughout the preservation workflow
  • Rosetta (Ex Libris) — commercial digital preservation platform with native PREMIS support
  • LOCKSS and CLOCKSS — distributed preservation networks that use PREMIS for event tracking
  • Fedora Repository — widely used digital repository platform with PREMIS integration
  • DSpace — open-source repository software that can export PREMIS metadata

The PREMIS Implementation Fairs, held regularly from 2009 to 2023, documented dozens of implementations across memory institutions worldwide.

Related Standards

  • METS (Metadata Encoding and Transmission Standard) — frequently used as a container for PREMIS metadata in repository systems
  • Dublin Core — general descriptive metadata often used alongside PREMIS for discovery
  • OAIS (Open Archival Information System) — the reference model for digital preservation that PREMIS helps operationalize
  • BagIt — a packaging format sometimes used with PREMIS for content transfer

Further Reading