
MPEG-7 Multimedia Content Description Interface

MPEG-7

An ISO/IEC standard (15938) that provides a comprehensive set of tools for describing multimedia content. MPEG-7 defines Descriptors (D), Description Schemes (DS), and a Description Definition Language (DDL) based on XML Schema that together enable efficient searching, indexing, and filtering of, and access to, audio, visual, and multimedia content. The standard consists of 13 parts covering systems, visual and audio descriptors, multimedia description schemes, reference software, conformance testing, profiles, schema definition, query format, and compact descriptors for visual search. Unlike MPEG-1, MPEG-2, and MPEG-4, MPEG-7 standardizes metadata about content rather than the encoding of the content itself.

Overview

MPEG-7, formally standardized as ISO/IEC 15938, is the multimedia content description standard developed by the Moving Picture Experts Group. While its predecessor standards MPEG-1, MPEG-2, and MPEG-4 deal with encoding and representation of audio-visual content, MPEG-7 focuses exclusively on describing multimedia content -- standardizing metadata about content rather than the content itself. This makes it possible to efficiently search, index, filter, and browse multimedia resources regardless of their storage or streaming format.

Background

Work on MPEG-7 began in 1996 within ISO/IEC JTC 1/SC 29/WG 11 (the MPEG working group). It was motivated by the growing volume of multimedia content on the internet and in broadcast archives, which created an urgent need for standardized description beyond simple text-based metadata. The first parts of the standard were published in 2002. The standard was designed to complement earlier MPEG standards: an MPEG-7 description can be attached to any multimedia content, including analog material, though the object-based representation defined in MPEG-4 is particularly well suited to MPEG-7 description.

Purpose & Scope

MPEG-7 provides a comprehensive set of tools for describing multimedia content at varying levels of abstraction. Descriptions can apply to any type of multimedia including still pictures, graphics, 3D models, audio, speech, video, and compositions of these elements. The standard addresses both automatically extracted low-level features (color distribution, texture, shape, timbre) and human-assigned semantic descriptions (who, what, where, when).

Application domains include digital libraries and image/video catalogues, broadcast media selection, multimedia editing and authoring, security and surveillance, e-commerce product search, cultural heritage collections, educational applications, and biomedical imaging.

Structure & Key Components

MPEG-7 consists of 13 parts, each covering a distinct aspect of the framework:

Part  Designation          Title                                  First published
1     ISO/IEC 15938-1      Systems                                2002
2     ISO/IEC 15938-2      Description Definition Language        2002
3     ISO/IEC 15938-3      Visual                                 2002
4     ISO/IEC 15938-4      Audio                                  2002
5     ISO/IEC 15938-5      Multimedia Description Schemes         2003
6     ISO/IEC 15938-6      Reference Software                     2003
7     ISO/IEC 15938-7      Conformance Testing                    2003
8     ISO/IEC TR 15938-8   Extraction and Use of Descriptions     2002
9     ISO/IEC 15938-9      Profiles and Levels                    2005
10    ISO/IEC 15938-10     Schema Definition                      2005
11    ISO/IEC TR 15938-11  Profile Schemas                        2005
12    ISO/IEC 15938-12     Query Format                           2008
13    ISO/IEC 15938-13     Compact Descriptors for Visual Search  2015

Technical Approach

MPEG-7 uses three core tools:

  • Descriptors (D) -- Represent features of multimedia content, defined syntactically and semantically. A single object may be described by several descriptors.
  • Description Schemes (DS) -- Specify the structure and semantics of relationships between descriptors and other description schemes.
  • Description Definition Language (DDL) -- An XML Schema-based language for defining structural relations between descriptors, enabling creation and modification of description schemes.
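To make the relationship between these tools more concrete, the following Python sketch builds a very small MPEG-7-style XML description using only the standard library. It is an illustration under stated assumptions, not a schema-validated document: the namespace and the element names (Mpeg7, Description, MultimediaContent, Video, CreationInformation, Creation, Title) follow the commonly published MPEG-7 schema, while the function name build_description and the sample title are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespaces as commonly used by MPEG-7 descriptions (assumed here, not validated).
MPEG7_NS = "urn:mpeg:mpeg7:schema:2001"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def q(name: str) -> str:
    """Return the namespace-qualified tag name for an MPEG-7 element."""
    return f"{{{MPEG7_NS}}}{name}"


def build_description(title: str) -> str:
    """Build a minimal, illustrative description of a video and return it as XML text."""
    ET.register_namespace("", MPEG7_NS)
    ET.register_namespace("xsi", XSI_NS)

    root = ET.Element(q("Mpeg7"))
    # The xsi:type attributes select which description scheme applies.
    description = ET.SubElement(
        root, q("Description"), {f"{{{XSI_NS}}}type": "ContentEntityType"}
    )
    content = ET.SubElement(
        description, q("MultimediaContent"), {f"{{{XSI_NS}}}type": "VideoType"}
    )
    video = ET.SubElement(content, q("Video"))
    creation_info = ET.SubElement(video, q("CreationInformation"))
    creation = ET.SubElement(creation_info, q("Creation"))
    # A human-assigned, semantic piece of metadata: the title of the work.
    ET.SubElement(creation, q("Title")).text = title

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_description("Harbour at dawn"))
```

In this sketch the nested elements play the role of description schemes, and the leaf Title element carries an individual piece of descriptive data; a real description would be checked against the schema defined through the DDL.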

The Systems part (Part 1) defines both a textual format (TeM) and a binary format (BiM) for MPEG-7 descriptions. MPEG-7 descriptions are independent of the content they describe but can be multiplexed with it and synchronized via timecode.
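As a rough illustration of how a textual description stays separate from the content yet points at it in time, the Python sketch below parses a simplified XML fragment and reads out the media reference and start time. The fragment is deliberately incomplete (it omits the xsi:type attributes a valid description would need), the URI is a placeholder, and the helper names time_point_to_seconds and read_reference are hypothetical. The time parser assumes only the plain Thh:mm:ss form of a media time point.

```python
import xml.etree.ElementTree as ET
from typing import Tuple

NS = {"m": "urn:mpeg:mpeg7:schema:2001"}

# Simplified, hand-written fragment for illustration; not schema-valid.
SAMPLE = """<Mpeg7 xmlns="urn:mpeg:mpeg7:schema:2001">
  <Description>
    <MultimediaContent>
      <Video>
        <MediaLocator>
          <MediaUri>http://example.com/clip.mp4</MediaUri>
        </MediaLocator>
        <MediaTime>
          <MediaTimePoint>T00:01:30</MediaTimePoint>
          <MediaDuration>PT45S</MediaDuration>
        </MediaTime>
      </Video>
    </MultimediaContent>
  </Description>
</Mpeg7>"""


def time_point_to_seconds(time_point: str) -> int:
    """Convert a 'Thh:mm:ss' time point to seconds (assumes that exact form)."""
    hours, minutes, seconds = (int(part) for part in time_point.lstrip("T").split(":"))
    return hours * 3600 + minutes * 60 + seconds


def read_reference(xml_text: str) -> Tuple[str, int]:
    """Return the referenced media URI and the start offset in seconds."""
    root = ET.fromstring(xml_text)
    uri = root.findtext(".//m:MediaUri", namespaces=NS)
    start = root.findtext(".//m:MediaTimePoint", namespaces=NS)
    if uri is None or start is None:
        raise ValueError("description lacks a media locator or time point")
    return uri, time_point_to_seconds(start)


if __name__ == "__main__":
    print(read_reference(SAMPLE))
```

The description here lives entirely outside the media file: the only links between the two are the locator and the time reference, which is what allows the same description to accompany content in any storage or streaming format.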

Limitations

MPEG-7 is defined in XML Schema, so its descriptions are semi-structured data: machine-readable, but not machine-interpretable in a semantic sense. Low-level features alone cannot capture high-level video semantics -- a limitation known as the "Semantic Gap." Attempts to bridge this gap by mapping MPEG-7 to OWL (including MPEG-7Ontos, COMM, and SWIntO) have had limited success because color distributions and other low-level features are inherently insufficient for representing visual meaning.
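A toy example can show why a color distribution by itself says little about meaning. The Python sketch below computes a coarse quantized color histogram for two tiny synthetic "images" that contain the same pixels in different arrangements. The histogram function is a generic illustration and is not one of the normative MPEG-7 color descriptors; the names color_histogram, horizon, and checker, and the pixel values, are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def color_histogram(pixels: Iterable[Pixel], levels: int = 4) -> Counter:
    """Count pixels per quantized RGB bin, using `levels` bins per channel."""
    step = 256 // levels
    return Counter((r // step, g // step, b // step) for r, g, b in pixels)


SKY: Pixel = (90, 160, 230)
SAND: Pixel = (220, 190, 140)

# Two 4x4 images stored as flat row-major pixel lists.
horizon = [SKY] * 8 + [SAND] * 8   # blue band above a sand-colored band
checker = [SKY, SAND] * 8          # the same 16 pixels, interleaved

# The arrangements differ, yet the low-level color feature is identical.
same_histogram = color_histogram(horizon) == color_histogram(checker)

if __name__ == "__main__":
    print(color_histogram(horizon))
    print(same_histogram)
```

Both images yield the same histogram even though a viewer would describe them very differently, which is the kind of mismatch between extracted features and perceived content that the term "Semantic Gap" refers to.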

Governance & Maintenance

MPEG-7 is maintained by ISO/IEC JTC 1/SC 29. The standard remains open to extension, with continued work particularly on compact descriptors for visual search and video analysis. The most recently published part listed here (Part 13, Compact Descriptors for Visual Search) dates from 2015.

Notable Implementations

MPEG-7 descriptors are used in broadcast media archives for content-based retrieval, in surveillance systems for event detection, and in academic research on multimedia information retrieval. The CDVS extension (Part 13) has found application in mobile visual search and video analytics. The standard has influenced the design of multimedia databases and digital asset management systems in broadcasting organizations worldwide.

Related Standards

  • MPEG-21 -- Digital item declaration and adaptation framework that builds on MPEG-7 descriptions
  • MPEG-4 Part 11 -- Scene description and application engine, complementary to MPEG-7 content description
  • Material Exchange Format (MXF) -- SMPTE container format for professional media, sometimes used alongside MPEG-7

Further Reading