The Access to Biological Collection Data (ABCD) Schema is a comprehensive XML standard developed by TDWG for exchanging data about biological specimens and observations. Where Darwin Core favors simplicity and flat record structures, ABCD provides a deeply structured, highly detailed schema capable of representing the full complexity of natural history collection databases, making it particularly well-suited for European biodiversity networks and institutions with rich, atomized data.
Background
ABCD emerged from the TDWG Access to Biological Collections Data Task Group, which recognized that existing standards did not adequately capture the diversity and complexity of biological collection data across institutions. The standard was ratified by TDWG on September 16, 2005. Development was led by Walter G. Berendsohn at the Botanic Garden and Botanical Museum Berlin-Dahlem, along with contributors from major natural history institutions across Europe and the Americas. The standard was designed from the outset to be compatible with several existing data standards and to support both atomized and free-text data in parallel structures.
Purpose & Scope
ABCD provides an XML schema for accessing and exchanging primary biodiversity data -- records of specimens and observations held in biological collections. The schema is intentionally comprehensive, supporting:
- Detailed specimen metadata including preparation, storage, and acquisition history
- Taxonomic identifications with full nomenclatural detail
- Gathering events with precise spatial and temporal information
- Multimedia references and associations
- Collection and unit-level metadata
- Measurement and fact data associated with specimens
The schema's depth makes it suitable for institutions that need to exchange richly structured data without first reducing it to a simpler, flatter form.
Key Components
ABCD XML Schema (v2.06) -- The normative XSD defining all elements and their relationships. Version 2.06 is the current production version, deployed in networks such as BioCASe and GBIF.
ABCD Primer -- An introductory document that walks readers through the principles and structure of ABCD, with examples and references to the normative schema.
ABCD-EFG Extension -- An extension for Earth sciences (palaeontology, mineralogy, geology), adding terms specific to geological specimens.
The schema uses a hierarchical structure where a DataSet contains Units (individual specimens or observations), each of which can carry detailed Identifications, Gathering information, and associated data.
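The nesting described above can be made concrete with a short Python sketch that walks a trimmed ABCD-style document using only the standard library. The sample record, its identifier values, and the helper name summarize_units are hypothetical. The fragment omits the contact and metadata blocks that a real DataSet carries, so it is meant to show the shape of the hierarchy and would not pass XSD validation. The namespace URI is assumed to be the ABCD 2.06 one and should be checked against the XSD in use.

```python
import xml.etree.ElementTree as ET

# Assumed ABCD 2.06 namespace; verify against the XSD you validate with.
NS = {"abcd": "http://www.tdwg.org/schemas/abcd/2.06"}

# Path from an Identification container down to the full name string.
NAME_PATH = (
    "abcd:Identifications/abcd:Identification/abcd:Result/"
    "abcd:TaxonIdentified/abcd:ScientificName/abcd:FullScientificNameString"
)

# Hypothetical, trimmed fragment: DataSets > DataSet > Units > Unit.
SAMPLE = """\
<abcd:DataSets xmlns:abcd="http://www.tdwg.org/schemas/abcd/2.06">
  <abcd:DataSet>
    <abcd:Units>
      <abcd:Unit>
        <abcd:SourceInstitutionID>EXAMPLE-INST</abcd:SourceInstitutionID>
        <abcd:SourceID>Herbarium</abcd:SourceID>
        <abcd:UnitID>H-000123</abcd:UnitID>
        <abcd:Identifications>
          <abcd:Identification>
            <abcd:Result>
              <abcd:TaxonIdentified>
                <abcd:ScientificName>
                  <abcd:FullScientificNameString>Bellis perennis L.</abcd:FullScientificNameString>
                </abcd:ScientificName>
              </abcd:TaxonIdentified>
            </abcd:Result>
          </abcd:Identification>
        </abcd:Identifications>
        <abcd:Gathering>
          <abcd:Country>
            <abcd:Name>Germany</abcd:Name>
          </abcd:Country>
        </abcd:Gathering>
      </abcd:Unit>
    </abcd:Units>
  </abcd:DataSet>
</abcd:DataSets>
"""


def summarize_units(xml_text):
    """Return one small dict per Unit, following the DataSet > Units > Unit nesting."""
    root = ET.fromstring(xml_text)
    summaries = []
    for dataset in root.findall("abcd:DataSet", NS):
        for unit in dataset.findall("abcd:Units/abcd:Unit", NS):
            # A Unit may carry several Identifications; collect every name.
            names = [el.text for el in unit.findall(NAME_PATH, NS)]
            summaries.append(
                {
                    "unit_id": unit.findtext("abcd:UnitID", namespaces=NS),
                    "names": names,
                    "country": unit.findtext(
                        "abcd:Gathering/abcd:Country/abcd:Name", namespaces=NS
                    ),
                }
            )
    return summaries


for summary in summarize_units(SAMPLE):
    print(summary)
```

Each level of the path expressions corresponds to one level of the schema hierarchy, which is why even a single scientific name sits several elements deep inside its Unit.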
Serializations & Technical Formats
ABCD is natively an XML schema, with the XSD as the normative format. Data exchange occurs as XML documents validated against the ABCD XSD. The schema is designed to work with wrapper protocols such as BioCASe and TAPIR for network-based data access.
Governance & Maintenance
ABCD is maintained as a TDWG standard. The original task group produced the ratified versions, and ongoing development occurs through the tdwg/abcd GitHub repository. Changes follow TDWG's standards process, which requires community review and formal approval. The standard is licensed under CC-BY 4.0 as part of TDWG's open standards policy.
Notable Implementations
ABCD is a foundational standard for the BioCASe (Biological Collection Access Service for Europe) network, which connects natural history collections across European institutions. GBIF supports ABCD as an alternative to Darwin Core for data publishing, and many European institutions use ABCD natively because it can represent their detailed collection data. The ABCD-EFG extension is used by geological and palaeontological collections through the GeoCASe portal.
Related Standards
ABCD and Darwin Core are the two principal TDWG standards for primary biodiversity data. They address the same domain from different design philosophies: ABCD is comprehensive and hierarchical, while Darwin Core is flat and minimal. Many data networks support both, and mappings between the two exist. The BioCASe and TAPIR protocols provide network access layers that can serve ABCD-formatted data.
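To illustrate the difference in design philosophy, the following Python sketch flattens one hierarchical ABCD Unit into a single flat record keyed by Darwin Core term names. The element-to-term pairing is an illustrative assumption and not an official crosswalk, and the sample Unit and the function name unit_to_flat_record are hypothetical. It also shows where flattening loses information: a Unit can hold several Identifications, while the flat record keeps only one name, chosen here by the PreferredFlag element.

```python
import xml.etree.ElementTree as ET

# Assumed ABCD 2.06 namespace; verify against the XSD you validate with.
ABCD_NS = {"abcd": "http://www.tdwg.org/schemas/abcd/2.06"}

NAME_PATH = (
    "abcd:Result/abcd:TaxonIdentified/"
    "abcd:ScientificName/abcd:FullScientificNameString"
)

# Hypothetical Unit with two identifications; the second is flagged as preferred.
UNIT_XML = """\
<abcd:Unit xmlns:abcd="http://www.tdwg.org/schemas/abcd/2.06">
  <abcd:SourceInstitutionID>EXAMPLE-INST</abcd:SourceInstitutionID>
  <abcd:SourceID>Herbarium</abcd:SourceID>
  <abcd:UnitID>H-000123</abcd:UnitID>
  <abcd:Identifications>
    <abcd:Identification>
      <abcd:Result><abcd:TaxonIdentified><abcd:ScientificName>
        <abcd:FullScientificNameString>Bellis sp.</abcd:FullScientificNameString>
      </abcd:ScientificName></abcd:TaxonIdentified></abcd:Result>
      <abcd:PreferredFlag>false</abcd:PreferredFlag>
    </abcd:Identification>
    <abcd:Identification>
      <abcd:Result><abcd:TaxonIdentified><abcd:ScientificName>
        <abcd:FullScientificNameString>Bellis perennis L.</abcd:FullScientificNameString>
      </abcd:ScientificName></abcd:TaxonIdentified></abcd:Result>
      <abcd:PreferredFlag>true</abcd:PreferredFlag>
    </abcd:Identification>
  </abcd:Identifications>
  <abcd:Gathering>
    <abcd:DateTime>
      <abcd:ISODateTimeBegin>2004-05-17</abcd:ISODateTimeBegin>
    </abcd:DateTime>
    <abcd:Country><abcd:Name>Germany</abcd:Name></abcd:Country>
  </abcd:Gathering>
</abcd:Unit>
"""


def unit_to_flat_record(unit):
    """Flatten one ABCD Unit element into a dict keyed by Darwin Core term names.

    The pairing of elements to terms is an illustrative assumption.
    """
    idents = unit.findall("abcd:Identifications/abcd:Identification", ABCD_NS)
    preferred = [
        ident
        for ident in idents
        if ident.findtext("abcd:PreferredFlag", default="", namespaces=ABCD_NS)
        .strip()
        .lower()
        == "true"
    ]
    # Prefer a flagged identification, else fall back to the first one, else none.
    candidates = preferred or idents
    chosen = candidates[0] if candidates else None

    def text(path):
        return unit.findtext(path, namespaces=ABCD_NS)

    return {
        "institutionCode": text("abcd:SourceInstitutionID"),
        "collectionCode": text("abcd:SourceID"),
        "catalogNumber": text("abcd:UnitID"),
        "scientificName": (
            chosen.findtext(NAME_PATH, namespaces=ABCD_NS)
            if chosen is not None
            else None
        ),
        "country": text("abcd:Gathering/abcd:Country/abcd:Name"),
        "eventDate": text("abcd:Gathering/abcd:DateTime/abcd:ISODateTimeBegin"),
    }


record = unit_to_flat_record(ET.fromstring(UNIT_XML))
print(record)
```

The non-preferred identification is dropped in the flat record, which is the kind of detail the hierarchical form retains and a single flat row cannot.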