Name: Lexicon Model for Ontologies (OntoLex-Lemon)
Creator: W3C Ontology-Lexica Community Group
License: W3C-20150513
Keywords: web, linguistics

Overview

OntoLex-Lemon is a W3C Community Group specification that provides an RDF model for representing lexicographic and linguistic data in relation to ontologies. It supports machine-readable lexicons in which words, their forms, meanings, and relationships are expressed as linked data, connecting natural language processing resources with the Semantic Web. Wikidata's lexicographic data is based on the model, as are numerous linguistic linked data resources, many of them developed in European projects.

Background

The need for a standard way to connect lexical information with ontologies emerged as the Semantic Web matured. Ontologies define concepts and relationships, but they do not inherently capture the linguistic expressions used to denote those concepts across different languages. The original lemon (Lexicon Model for Ontologies) was developed in the Monnet project, a European research initiative focused on multilingual access to structured data. In 2011, the W3C Ontology-Lexica Community Group was formed to develop an improved and standardized version. The result, OntoLex-Lemon, was published as a W3C Community Group Final Report in May 2016.

Purpose & Scope

OntoLex-Lemon provides a core model and several extension modules:

Module	Purpose
ontolex (core)	Lexical entries, forms, senses, and their links to ontology concepts
synsem	Syntactic frames and the syntax-semantics interface
decomp	Decomposition of compound words and multi-word expressions into their components
vartrans	Lexical variations and translations between languages
lime	Linguistic metadata for describing lexical resources

The core model centers on three key classes: LexicalEntry (a word, multi-word expression, or affix), Form (a specific morphological realization with a written or phonetic representation), and LexicalSense (the meaning of a lexical entry in relation to an ontology concept). A LexicalEntry has at least one Form, of which at most one is its canonical form (the lemma), and may have any number of LexicalSense instances. Each LexicalSense belongs to exactly one LexicalEntry and references exactly one concept in an ontology.
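
The relationship between the three core classes can be illustrated with a small sketch. The Python below uses only the standard library to emit N-Triples for a single entry. The ontolex namespace and the property names (canonicalForm, writtenRep, sense, reference) come from the core module; the lexicon namespace, the "#form" and "#sense" suffixes, and the example ontology concept are hypothetical and chosen purely for illustration.

```python
from urllib.parse import quote

ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/lexicon/"  # hypothetical lexicon namespace


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text, lang):
    """Build a language-tagged literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def build_entry(lemma, lang, concept_iri):
    """Return triples for one LexicalEntry with one Form and one LexicalSense."""
    entry = EX + quote(lemma)   # percent-encode so the IRI stays valid
    form = entry + "#form"      # hypothetical naming convention
    sense = entry + "#sense"    # hypothetical naming convention
    return [
        (iri(entry), iri(RDF_TYPE), iri(ONTOLEX + "LexicalEntry")),
        (iri(entry), iri(ONTOLEX + "canonicalForm"), iri(form)),
        (iri(form), iri(RDF_TYPE), iri(ONTOLEX + "Form")),
        (iri(form), iri(ONTOLEX + "writtenRep"), literal(lemma, lang)),
        (iri(entry), iri(ONTOLEX + "sense"), iri(sense)),
        (iri(sense), iri(RDF_TYPE), iri(ONTOLEX + "LexicalSense")),
        (iri(sense), iri(ONTOLEX + "reference"), iri(concept_iri)),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples as N-Triples lines."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    # The concept IRI is a placeholder for a class in some ontology.
    print(to_ntriples(build_entry("cat", "en", "http://example.org/ontology/Cat")))
```

In practice an RDF library would handle serialization and validation; plain strings are used here only to keep the sketch free of dependencies.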

Governance & Maintenance

OntoLex-Lemon is maintained by the W3C Ontology-Lexica Community Group, which continues active development of extension modules. The FrAC (Frequency, Attestation, and Corpus) module is a more recent addition. The community group holds regular meetings and publishes updates through the W3C community group process.

Notable Implementations

The most prominent deployment of OntoLex-Lemon is in Wikidata, which has based its lexicographic data on the model since 2018. Wikidata now contains more than a million lexemes covering hundreds of languages, and its RDF export describes them with OntoLex-Lemon vocabulary. The model is also used by DBnary (a multilingual lexical resource extracted from Wiktionary), by many datasets in the LLOD (Linguistic Linked Open Data) cloud, and by European linguistic infrastructure projects such as ELEXIS (European Lexicographic Infrastructure).
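
As a rough illustration of how this vocabulary surfaces in Wikidata, the sketch below builds a request URL for the public Wikidata Query Service that lists a few English lexemes. It assumes that Wikidata's RDF export types lexemes as ontolex:LexicalEntry and exposes lemmas through wikibase:lemma; the function name and the LIMIT value are illustrative choices. The code only constructs the URL and does not send the request.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Q1860 is the Wikidata item for the English language.
LEXEME_QUERY = """
PREFIX ontolex: <http://www.w3.org/ns/lemon/ontolex#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wd: <http://www.wikidata.org/entity/>

SELECT ?lexeme ?lemma WHERE {
  ?lexeme a ontolex:LexicalEntry ;
          dct:language wd:Q1860 ;
          wikibase:lemma ?lemma .
}
LIMIT 10
"""


def build_request_url(query):
    """Return a GET URL that asks the query service for JSON results."""
    params = {"query": query.strip(), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_request_url(LEXEME_QUERY))
```

The resulting URL can be fetched with any HTTP client; the query service asks automated clients to send a descriptive User-Agent header.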

Related Standards

SKOS -- Used for simpler thesaurus-style vocabularies; OntoLex-Lemon provides richer lexical modeling
LexInfo -- An ontology of linguistic categories that extends OntoLex-Lemon with detailed morphological and syntactic properties
