MARCXML: MARC 21 XML Schema

MARCXML

By LC

An XML schema framework developed by the Library of Congress Network Development and MARC Standards Office for working with MARC 21 data in XML environments. MARCXML provides a lossless, flexible representation of the complete MARC record structure in XML, preserving all indicators, subfield codes, and control fields. The framework includes schemas, XSLT stylesheets for conversion to and from MODS, Dublin Core, and other formats, as well as Java-based conversion tools.

Overview

MARCXML is the XML representation of MARC 21 records developed by the Library of Congress. It provides a standardized way to express the full content of MARC records in XML, enabling interoperability with modern web technologies while preserving the complete fidelity of the original MARC data structure.

Background

In the early 2000s, as XML became the dominant data interchange format on the web, the Library of Congress Network Development and MARC Standards Office recognized the need for a standard way to represent MARC 21 data in XML. Earlier efforts had produced SGML DTDs for MARC in the mid-1990s, but these were extremely large because they mapped every individual MARC data element to a separate element. MARCXML took a different path: its slim schema mirrors the generic MARC record structure (leader, control fields, data fields with indicators, and subfields) rather than enumerating every possible tag and subfield code.

Purpose & Scope

MARCXML serves as a bridge between the traditional MARC binary format (ISO 2709) and XML-based systems. The framework is designed to be flexible and extensible, allowing institutions to work with MARC data in ways specific to their needs. Key use cases include data exchange between library systems, transformation pipelines (e.g., converting MARC to MODS or Dublin Core), and long-term preservation of bibliographic data in an open, text-based format.
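To make the bridge between the two formats concrete, the sketch below reads one ISO 2709 record (24-character leader, 12-character directory entries, then field data) and rebuilds it as a MARCXML element tree. It is a minimal illustration, not the Library of Congress toolkit: it assumes the record is UTF-8 encoded and skips MARC-8 handling and error checking. The function names and the helper build_demo_record, which fabricates a tiny record so the example is self-contained, are hypothetical.

```python
import xml.etree.ElementTree as ET

# ISO 2709 delimiters: field terminator, record terminator, subfield delimiter.
FT, RT, SF = b"\x1e", b"\x1d", b"\x1f"
NS = "http://www.loc.gov/MARC21/slim"  # MARCXML namespace URI


def build_demo_record() -> bytes:
    """Hypothetical helper: assemble a tiny ISO 2709 record for the demo."""
    fields = [("001", b"demo0001"), ("245", b"10" + SF + b"aAn example title")]
    directory, data = b"", b""
    for tag, body in fields:
        body += FT
        # Directory entry: 3-char tag, 4-digit length, 5-digit start offset.
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    base = 24 + len(directory) + 1      # leader + directory + its terminator
    total = base + len(data) + 1        # plus the record terminator
    leader = f"{total:05d}nam a22{base:05d} a 4500"
    return leader.encode("ascii") + directory + FT + data + RT


def iso2709_to_marcxml(raw: bytes) -> ET.Element:
    """Convert one UTF-8 ISO 2709 record into a MARCXML <record> element."""
    leader = raw[:24].decode("ascii")
    base = int(leader[12:17])           # base address of data
    directory = raw[24:base - 1]        # exclude the directory's terminator
    record = ET.Element(f"{{{NS}}}record")
    ET.SubElement(record, f"{{{NS}}}leader").text = leader
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        # Slice the field body and drop its trailing field terminator.
        body = raw[base + start: base + start + length - 1]
        if tag.startswith("00"):
            cf = ET.SubElement(record, f"{{{NS}}}controlfield", {"tag": tag})
            cf.text = body.decode("utf-8")
        else:
            parts = body.split(SF)
            inds = parts[0].decode("utf-8")
            df = ET.SubElement(
                record, f"{{{NS}}}datafield",
                {"tag": tag, "ind1": inds[0], "ind2": inds[1]},
            )
            for part in parts[1:]:
                text = part.decode("utf-8")
                sf = ET.SubElement(df, f"{{{NS}}}subfield", {"code": text[0]})
                sf.text = text[1:]
    return record


if __name__ == "__main__":
    xml_record = iso2709_to_marcxml(build_demo_record())
    print(ET.tostring(xml_record, encoding="unicode"))
```

Production converters also need to handle MARC-8 character data and malformed directories, which this sketch leaves out.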

The schema itself is intentionally generic: it represents the structure of a MARC record without constraining which tags or subfield codes are valid. Validation of MARC content rules is handled separately through additional stylesheets.
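The separation between structure and content rules can be illustrated with a short sketch. The record below is well-formed MARCXML, yet it carries a subfield code and a tag that a content-level check might question. The Library of Congress framework performs such checks with XSLT stylesheets; the Python here is only a stand-in, and the rule table ALLOWED_SUBFIELDS is a deliberately incomplete assumption made up for this example, not the MARC 21 definition of field 245.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Illustrative, deliberately incomplete rule table (an assumption for this
# sketch; real MARC 21 content rules are far more extensive).
ALLOWED_SUBFIELDS = {"245": {"a", "b", "c"}}

# Hypothetical sample record, made up for illustration.
RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title</subfield>
    <subfield code="z">A code this rule table does not list</subfield>
  </datafield>
  <datafield tag="999" ind1=" " ind2=" ">
    <subfield code="a">Local data</subfield>
  </datafield>
</record>"""


def content_warnings(record):
    """Return content-level warnings for a structurally parsed record."""
    warnings = []
    for df in record.findall("marc:datafield", NS):
        tag = df.get("tag")
        allowed = ALLOWED_SUBFIELDS.get(tag)
        if allowed is None:
            warnings.append(f"{tag}: tag not covered by rule table")
            continue
        for sf in df.findall("marc:subfield", NS):
            if sf.get("code") not in allowed:
                warnings.append(f"{tag}: unexpected subfield ${sf.get('code')}")
    return warnings


if __name__ == "__main__":
    # Parsing succeeds because the structure is fine; only the separate
    # content check reports anything.
    for warning in content_warnings(ET.fromstring(RECORD)):
        print(warning)
```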

Key Components

Component                   Purpose
MARC21slim.xsd              Core XML Schema for MARCXML records
Conversion stylesheets      XSLT transforms to/from MODS, Dublin Core, OAI MARC, ONIX
MARCXML Toolkit             Java tools for MARC-to-XML conversion with full character set support
Validation stylesheets      XSLT-based MARC bibliographic validation
HTML display stylesheets    Tagged view and English-labeled view for browser rendering

Technical Architecture

A MARCXML document represents one or more MARC records using a small set of XML elements: <collection> (an optional wrapper for multiple records), <record>, <leader>, <controlfield>, <datafield>, and <subfield>. Each <controlfield> carries a tag attribute, and each <datafield> carries tag, ind1, and ind2 attributes corresponding to the MARC tag number and indicator values. Subfields are identified by a code attribute. This structure preserves round-trip fidelity with the ISO 2709 binary format.
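The element structure described above can be read with any namespace-aware XML parser. The following sketch uses Python's standard library to pull the leader, control fields, and data fields out of a small document. The sample record and the function name read_records are invented for illustration; the namespace URI is the one used by the MARC21slim schema.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://www.loc.gov/MARC21/slim"
MARC_NS = {"marc": NS_URI}

# Hypothetical sample document, made up for illustration.
SAMPLE = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <leader>00000nam a2200000 a 4500</leader>
    <controlfield tag="001">demo0001</controlfield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">An example title /</subfield>
      <subfield code="c">A. N. Author.</subfield>
    </datafield>
  </record>
</collection>"""


def read_records(xml_text):
    """Parse MARCXML text into a list of plain Python dictionaries."""
    root = ET.fromstring(xml_text)
    # The root may be a single <record> or a <collection> of records.
    if root.tag == f"{{{NS_URI}}}record":
        records = [root]
    else:
        records = root.findall("marc:record", MARC_NS)
    parsed = []
    for rec in records:
        parsed.append({
            "leader": rec.findtext("marc:leader", namespaces=MARC_NS),
            # Lists of tuples keep repeated fields and their order.
            "control": [
                (cf.get("tag"), cf.text)
                for cf in rec.findall("marc:controlfield", MARC_NS)
            ],
            "data": [
                (
                    df.get("tag"), df.get("ind1"), df.get("ind2"),
                    [(sf.get("code"), sf.text)
                     for sf in df.findall("marc:subfield", MARC_NS)],
                )
                for df in rec.findall("marc:datafield", MARC_NS)
            ],
        })
    return parsed


if __name__ == "__main__":
    for record in read_records(SAMPLE):
        print(record["leader"])
        for tag, ind1, ind2, subfields in record["data"]:
            print(tag, ind1, ind2, subfields)
```

Fields are kept as ordered lists rather than dictionaries keyed by tag because MARC fields and subfields can repeat, and their order matters for round-tripping.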

The framework includes an extensive library of XSLT stylesheets maintained by the Library of Congress for transforming MARCXML to other metadata formats, notably MODS (multiple versions from 3.0 through 3.7) and Dublin Core (in RDF, OAI, and SRW encodings). Reverse transformations from these formats back to MARCXML are also provided.
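To give a feel for what such a transformation does, here is a toy crosswalk from MARCXML to a few Dublin Core elements. It is not the Library of Congress stylesheet and does not use XSLT. The CROSSWALK table is a small illustrative subset chosen for this sketch (title from 245, creator from 100, publisher and date from 260, subject from 650), and the sample record and function name are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# (MARC tag, subfield codes, Dublin Core element): an illustrative subset,
# assumed for this sketch rather than taken from the official stylesheets.
CROSSWALK = [
    ("245", "ab", "title"),
    ("100", "a", "creator"),
    ("260", "b", "publisher"),
    ("260", "c", "date"),
    ("650", "a", "subject"),
]

# Hypothetical sample record, made up for illustration.
RECORD = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Author, A. N.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title :</subfield>
    <subfield code="b">a subtitle /</subfield>
    <subfield code="c">A. N. Author.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Cataloging.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Metadata.</subfield>
  </datafield>
</record>"""


def to_dublin_core(record):
    """Map a MARCXML <record> element to a dict of Dublin Core value lists."""
    dc = {}
    for tag, codes, element in CROSSWALK:
        for df in record.findall(f"marc:datafield[@tag='{tag}']", NS):
            # Join the selected subfields of one field into a single value.
            parts = [
                sf.text.strip()
                for sf in df.findall("marc:subfield", NS)
                if sf.get("code") in codes and sf.text
            ]
            if parts:
                dc.setdefault(element, []).append(" ".join(parts))
    return dc


if __name__ == "__main__":
    for element, values in to_dublin_core(ET.fromstring(RECORD)).items():
        print(element, values)
```

The mapping is lossy by design: indicators, most subfields, and all control fields are dropped, which is why MARCXML rather than Dublin Core is used when full fidelity is required.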

Governance & Maintenance

MARCXML is maintained by the Network Development and MARC Standards Office at the Library of Congress. Updates to the schema track changes in the MARC 21 formats. The page was last updated on February 2, 2022.

Notable Implementations

MARCXML is widely used across the library community as an interchange format. It serves as the primary XML representation for MARC data in systems such as Ex Libris Alma, OCLC WorldCat, and numerous institutional repository and digital library platforms. The format is central to metadata transformation pipelines, particularly the MARC-to-MODS and MARC-to-BIBFRAME conversion workflows maintained by the Library of Congress.

Related Standards

  • MARC 21 — the underlying record format that MARCXML encodes
  • MODS — a derivative XML schema for which extensive MARCXML-to-MODS conversion stylesheets exist
  • MADS — the authority counterpart to MODS, also linked through MARCXML conversions
  • BIBFRAME — the Library of Congress successor initiative to MARC, with conversion tools operating on MARCXML as an intermediate format
