Name: Ecological Metadata Language
Creator: National Center for Ecological Analysis and Synthesis
License: GPL-2.0-or-later
Keywords: biodiversity, scientific-data

Overview

The Ecological Metadata Language (EML) is one of the most widely adopted metadata standards in the environmental and ecological sciences. Developed to address the need for comprehensive, machine-readable documentation of research datasets, EML provides a modular XML-based schema that captures everything from basic bibliographic information to detailed descriptions of spatial coverage, taxonomic scope, research methods, and data table structures. Its adoption spans thousands of data repositories worldwide, making it a cornerstone of open data practices in ecology and earth science.

Background

EML originated in 1997 at the National Center for Ecological Analysis and Synthesis (NCEAS) at the University of California, Santa Barbara. Its creation was motivated by a report from the Ecological Society of America's Committee on the Future of Long-Term Ecological Data and by foundational work on ecological metadata by William Michener and colleagues. Version 1.0 was used internally at NCEAS, with subsequent internal releases (1.2, 1.3, 1.4) closely following the committee's recommendations.

With version 2.0, EML transitioned to a community-maintained, open specification. Significant improvements in the 2.x series drew on practical experience at NCEAS and extensive feedback from the Long Term Ecological Research (LTER) Network's information managers. Version 2.1 introduced internationalization support. Version 2.2.0, released in 2019, added semantic annotations, data paper support, structured funding information, and dataset licensing, reflecting the growing influence of FAIR data principles and open science.

Purpose & Scope

EML is designed for documenting any research data relevant to observational disciplines, with a primary focus on ecology, earth science, and environmental science. It serves researchers, data managers, repository operators, and software developers who need to describe datasets in a structured, interoperable way.

The standard addresses several layers of data documentation:

Resource identification: titles, creators, keywords, citations, and DOIs
Coverage: geographic, temporal, and taxonomic extents
Methods and protocols: detailed descriptions of research methodologies
Data structure: entity types, attributes, measurement scales, and constraints for data tables, spatial rasters, spatial vectors, and other data formats
Semantic annotations: formal links to ontology terms using RDF-compatible annotations
Project context: funding sources, project descriptions, and personnel
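
As a rough illustration of the first two layers (resource identification and coverage), the sketch below assembles a small EML-style document with Python's standard library. The helper names (add_text, build_minimal_eml), the packageId, the "example" system value, and all sample values are hypothetical. The element names follow the EML 2.2.0 schema as commonly documented, but the output is not validated here, so treat it as a starting point rather than a conforming record.

```python
import xml.etree.ElementTree as ET

# Namespace for EML 2.2.0; in this sketch only the root element is qualified.
EML_NS = "https://eml.ecoinformatics.org/eml-2.2.0"
ET.register_namespace("eml", EML_NS)


def add_text(parent, tag, text):
    """Append a child element holding a text value and return it."""
    child = ET.SubElement(parent, tag)
    child.text = str(text)
    return child


def build_minimal_eml(package_id, title, surname, site, bounds, begin, end):
    """Assemble a small EML-like tree (hypothetical helper, not schema-validated)."""
    root = ET.Element(f"{{{EML_NS}}}eml",
                      {"packageId": package_id, "system": "example"})
    dataset = ET.SubElement(root, "dataset")

    # Resource identification: title and creator.
    add_text(dataset, "title", title)
    creator = ET.SubElement(dataset, "creator")
    add_text(ET.SubElement(creator, "individualName"), "surName", surname)

    # Coverage: geographic extent as a bounding box.
    coverage = ET.SubElement(dataset, "coverage")
    geo = ET.SubElement(coverage, "geographicCoverage")
    add_text(geo, "geographicDescription", site)
    box = ET.SubElement(geo, "boundingCoordinates")
    west, east, north, south = bounds
    add_text(box, "westBoundingCoordinate", west)
    add_text(box, "eastBoundingCoordinate", east)
    add_text(box, "northBoundingCoordinate", north)
    add_text(box, "southBoundingCoordinate", south)

    # Coverage: temporal extent as a date range.
    temporal = ET.SubElement(coverage, "temporalCoverage")
    dates = ET.SubElement(temporal, "rangeOfDates")
    add_text(ET.SubElement(dates, "beginDate"), "calendarDate", begin)
    add_text(ET.SubElement(dates, "endDate"), "calendarDate", end)

    # Contact: here simply the same person as the creator.
    contact = ET.SubElement(dataset, "contact")
    add_text(ET.SubElement(contact, "individualName"), "surName", surname)
    return root


if __name__ == "__main__":
    doc = build_minimal_eml(
        "example.1.1", "Lake temperature survey", "Doe",
        "Hypothetical study lake", (-120.0, -119.5, 35.0, 34.5),
        "2020-01-01", "2020-12-31",
    )
    print(ET.tostring(doc, encoding="unicode"))
```

The remaining layers (methods, data structure, annotations, project context) would hang off the same dataset element in a fuller record.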

Key Modules

EML is organized into a modular architecture with specialized sub-schemas:

Module	Purpose
eml	Root container module
eml-resource	Base bibliographic information
eml-dataset	Dataset-specific metadata
eml-literature	Citation information
eml-software	Software-specific metadata
eml-protocol	Research protocol descriptions
eml-dataTable	Tabular data structure
eml-attribute	Variable/column-level descriptions
eml-coverage	Geographic, temporal, taxonomic extents
eml-methods	Research methodology
eml-project	Research project context
eml-physical	File format and distribution
eml-party	People and organizations
eml-semantics	Semantic annotation support
eml-spatialRaster	Gridded geospatial data
eml-spatialVector	Vector geospatial data

Serializations & Technical Formats

EML is defined entirely in XML Schema (XSD). Documents are validated against the schema using standard XML validation tools, and an EML-specific validity parser enforces additional content reference constraints beyond what XSD alone can express. The canonical namespace for version 2.2.0 is https://eml.ecoinformatics.org/eml-2.2.0.
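
Before running full schema validation, a tool might first confirm that a document declares an EML namespace and read the version from it. The sketch below does only that, using Python's standard library, which has no XSD validator. The function name eml_version and the sample document are hypothetical, and the pattern assumes the version appears at the end of the namespace as in the URI above.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree reports a qualified root tag as "{namespace}eml".
# Assumption: the namespace ends in "eml-<major>.<minor>.<patch>".
ROOT_TAG = re.compile(r"^\{(.+/eml-(\d+\.\d+\.\d+))\}eml$")


def eml_version(xml_text):
    """Return the version embedded in the root namespace, or None."""
    root = ET.fromstring(xml_text)
    match = ROOT_TAG.match(root.tag)
    return match.group(2) if match else None


SAMPLE = """<eml:eml xmlns:eml="https://eml.ecoinformatics.org/eml-2.2.0"
                     packageId="example.1.1" system="example">
  <dataset><title>Hypothetical dataset</title></dataset>
</eml:eml>"""

if __name__ == "__main__":
    print(eml_version(SAMPLE))
```

This is a pre-check only. It says nothing about whether the document is schema-valid or satisfies the additional reference constraints.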

EML documents can be authored using text editors, XML-specific tools like Oxygen, scripting libraries such as the R EML package, or web-based metadata editors like MetacatUI. The schema supports content references between elements, allowing complex data packages to be described without redundancy.
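
To make the idea of content references concrete, the sketch below shows a contact that points at an existing creator through a references element instead of repeating the person's details, together with a simplified integrity check. The check assumes two rules: id values should be unique, and each references element should name an existing id. The actual EML validity rules are more detailed, and the function name check_references and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_references(xml_text):
    """Return a list of problems found in id / references usage."""
    root = ET.fromstring(xml_text)
    seen, problems = set(), []

    # First pass: collect ids and flag duplicates.
    for elem in root.iter():
        elem_id = elem.get("id")
        if elem_id is not None:
            if elem_id in seen:
                problems.append(f"duplicate id: {elem_id}")
            seen.add(elem_id)

    # Second pass: every <references> value should match a collected id.
    for ref in root.iter("references"):
        target = (ref.text or "").strip()
        if target not in seen:
            problems.append(f"dangling reference: {target}")
    return problems


GOOD = """<dataset>
  <creator id="person.1"><individualName><surName>Doe</surName></individualName></creator>
  <contact><references>person.1</references></contact>
</dataset>"""

BAD = """<dataset>
  <creator id="person.1"><individualName><surName>Doe</surName></individualName></creator>
  <contact><references>person.2</references></contact>
</dataset>"""

if __name__ == "__main__":
    print(check_references(GOOD))
    print(check_references(BAD))
```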

Governance & Maintenance

EML is maintained by a community of voluntary project members coordinated through NCEAS. Decisions are made by consensus among current project maintainers. Development occurs in feature branches on GitHub, with contributions accepted via pull requests. Discussion takes place on a dedicated Slack channel and through the GitHub issue tracker.

The specification is versioned using a major.minor.patch scheme, with backward compatibility maintained within major versions. The project is funded through NCEAS, with support from the University of California, Santa Barbara, the State of California, and multiple National Science Foundation grants.
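
One possible reading of this versioning policy, sketched in Python: a document written for an earlier release is expected to remain usable with a later release that shares the same major version. The function names are hypothetical and the rule is an interpretation of the sentence above, not a statement of how EML tools behave. Because the namespace quoted earlier embeds the version number, a real tool would likely need more than this comparison.

```python
def parse_version(text):
    """Split a 'major.minor.patch' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def same_major_and_not_newer(document_version, schema_version):
    """True when both share a major version and the document is not newer."""
    doc = parse_version(document_version)
    schema = parse_version(schema_version)
    return doc[0] == schema[0] and doc <= schema


if __name__ == "__main__":
    print(same_major_and_not_newer("2.1.0", "2.2.0"))
    print(same_major_and_not_newer("1.4.0", "2.2.0"))
```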

Notable Implementations

EML is the primary metadata standard for the Knowledge Network for Biocomplexity (KNB) data repository, the DataONE federation, and the LTER Network Information System. It is used by the Environmental Data Initiative (EDI) and by numerous individual research groups and field stations. The Arctic Data Center, the National Ecological Observatory Network (NEON), and many biodiversity data platforms also use EML for dataset documentation.

Tools supporting EML include the R EML package (part of rOpenSci), the Metacat data management system, and MetacatUI, a web-based metadata editor.

Related Standards

EML exists within a broader ecosystem of scientific metadata standards. It is complementary to Darwin Core (for biodiversity occurrence data), ISO 19115 (for geospatial metadata), and DataCite (for dataset citation). EML's semantic annotation features in version 2.2.0 allow bridging to linked data standards and ontologies such as the Semantic Web for Earth and Environmental Terminology (SWEET) and the Environment Ontology (ENVO).
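
A semantic annotation can be read as a subject-predicate-object statement: the annotated element is the subject, and the annotation supplies a property URI and a value URI. The sketch below extracts such triples from a fragment with Python's standard library. The element names (annotation, propertyURI, valueURI) follow the EML 2.2.0 annotation structure as commonly documented. The example.org URIs, the attribute id, and the function name annotation_triples are placeholders, not real ontology terms or an official API.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<attribute id="att.1">
  <attributeName>temp_c</attributeName>
  <annotation>
    <propertyURI label="contains measurements of type">http://example.org/property/containsMeasurementsOfType</propertyURI>
    <valueURI label="water temperature">http://example.org/term/WaterTemperature</valueURI>
  </annotation>
</attribute>"""


def annotation_triples(xml_text):
    """Return (subject id, property URI, value URI) for each annotation."""
    root = ET.fromstring(xml_text)
    triples = []
    for parent in root.iter():
        subject = parent.get("id")
        # Only direct children count: the subject is the enclosing element.
        for annotation in parent.findall("annotation"):
            prop = annotation.findtext("propertyURI")
            value = annotation.findtext("valueURI")
            if subject and prop and value:
                triples.append((subject, prop.strip(), value.strip()))
    return triples


if __name__ == "__main__":
    for triple in annotation_triples(SAMPLE):
        print(triple)
```

In a real record the value URI would point at a term in an ontology such as ENVO or SWEET, which is what lets the annotation bridge to linked data.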

Resources & Links

Documentation
Repository: GitHub
Change History: Release Notes
Community / Forum: NCEAS Slack (#eml)
Other: DOI

Related Standards

Darwin Core (DwC): Biodiversity Information Standards (TDWG), element set
DataCite Metadata Schema: DataCite, element set
Climate and Forecast Metadata Conventions (CF Conventions): CF Conventions Community, specification
