MPEG-7, formally standardized as ISO/IEC 15938, is the multimedia content description standard developed by the Moving Picture Experts Group. While its predecessor standards MPEG-1, MPEG-2, and MPEG-4 deal with encoding and representation of audio-visual content, MPEG-7 focuses exclusively on describing multimedia content -- standardizing metadata about content rather than the content itself. This makes it possible to efficiently search, index, filter, and browse multimedia resources regardless of their storage or streaming format.
Background
Work on MPEG-7 began in 1996 within ISO/IEC JTC 1/SC 29/WG 11 (the MPEG working group). It was motivated by the growing volume of multimedia content on the internet and in broadcast archives, which created a need for standardized description beyond simple text-based metadata. The first parts of the standard were published in 2002. The standard was designed to complement earlier MPEG standards: an MPEG-7 description can be attached to any multimedia content, including analog material, though the object-based representation defined in MPEG-4 is particularly well suited to MPEG-7 categorization.
Purpose & Scope
MPEG-7 provides a comprehensive set of tools for describing multimedia content at varying levels of abstraction. Descriptions can apply to any type of multimedia including still pictures, graphics, 3D models, audio, speech, video, and compositions of these elements. The standard addresses both automatically extracted low-level features (color distribution, texture, shape, timbre) and human-assigned semantic descriptions (who, what, where, when).
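To make the idea of an automatically extracted low-level feature concrete, the sketch below computes a coarse color histogram and compares two histograms with an L1 distance. It is only an illustration: the function names, the 4-bins-per-channel quantization, and the toy pixel data are assumptions, and the code does not implement any of the normative MPEG-7 color descriptors.

```python
from collections import Counter


def coarse_color_histogram(pixels, bins_per_channel=4):
    """Return a normalized histogram over quantized RGB colors.

    pixels: iterable of (r, g, b) tuples with 8-bit channel values.
    Assumes bins_per_channel divides 256 evenly.
    Illustrative only; not a normative MPEG-7 descriptor.
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {color_bin: n / total for color_bin, n in counts.items()}


def l1_distance(h1, h2):
    """Sum of absolute differences between two sparse histograms."""
    keys = set(h1) | set(h2)
    return sum(abs(h1.get(k, 0.0) - h2.get(k, 0.0)) for k in keys)


# Two tiny hypothetical "images": mostly red vs. mostly blue.
reddish = [(250, 10, 10)] * 3 + [(10, 10, 250)]
bluish = [(10, 10, 250)] * 3 + [(250, 10, 10)]

h_red = coarse_color_histogram(reddish)
h_blue = coarse_color_histogram(bluish)

print(h_red)                       # histogram of the reddish image
print(l1_distance(h_red, h_blue))  # larger value means less similar
```

A retrieval system could rank stored items by such a distance to a query item; a human-assigned semantic description (who, what, where, when) would have to be supplied separately.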
Application domains include digital libraries and image/video catalogues, broadcast media selection, multimedia editing and authoring, security and surveillance, e-commerce product search, cultural heritage collections, educational applications, and biomedical imaging.
Structure & Key Components
MPEG-7 is organized into parts, each covering a distinct aspect of the framework. The first 13 are:
| Part | ISO/IEC Number | Title | First Published |
|---|---|---|---|
| 1 | 15938-1 | Systems | 2002 |
| 2 | 15938-2 | Description Definition Language | 2002 |
| 3 | 15938-3 | Visual | 2002 |
| 4 | 15938-4 | Audio | 2002 |
| 5 | 15938-5 | Multimedia Description Schemes | 2003 |
| 6 | 15938-6 | Reference Software | 2003 |
| 7 | 15938-7 | Conformance Testing | 2003 |
| 8 | TR 15938-8 | Extraction and Use of Descriptions | 2002 |
| 9 | 15938-9 | Profiles and Levels | 2005 |
| 10 | 15938-10 | Schema Definition | 2005 |
| 11 | TR 15938-11 | Profile Schemas | 2005 |
| 12 | 15938-12 | Query Format | 2008 |
| 13 | 15938-13 | Compact Descriptors for Visual Search | 2015 |
Technical Approach
MPEG-7 uses three core tools:
- Descriptors (D) -- Represent features of multimedia content, defined syntactically and semantically. A single object may be described by several descriptors.
- Description Schemes (DS) -- Specify the structure and semantics of relationships between descriptors and other description schemes.
- Description Definition Language (DDL) -- An XML Schema-based language for defining structural relations between descriptors, enabling creation and modification of description schemes.
The Systems part (Part 1) defines both a textual format (TeM) and a binary format (BiM) for MPEG-7 descriptions. MPEG-7 descriptions are independent of the content they describe but can be multiplexed with it and synchronized via timecode.
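As a rough illustration of the textual (XML) form, the sketch below parses a small MPEG-7-style description with Python's standard library and reads the time information that ties the description to the content. The XML fragment is simplified and has not been validated against the MPEG-7 schema; the time values and the helper name `read_media_time` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace used by MPEG-7 description documents.
MPEG7_NS = "urn:mpeg:mpeg7:schema:2001"

# Simplified, hand-written fragment (illustrative, not schema-validated).
DESCRIPTION = """\
<Mpeg7 xmlns="urn:mpeg:mpeg7:schema:2001"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Description xsi:type="ContentEntityType">
    <MultimediaContent xsi:type="VideoType">
      <Video>
        <MediaTime>
          <MediaTimePoint>T00:00:05</MediaTimePoint>
          <MediaDuration>PT1M30S</MediaDuration>
        </MediaTime>
      </Video>
    </MultimediaContent>
  </Description>
</Mpeg7>
"""


def read_media_time(xml_text):
    """Return (start, duration) strings for the first Video, or None."""
    root = ET.fromstring(xml_text)
    ns = {"m": MPEG7_NS}
    media_time = root.find(".//m:Video/m:MediaTime", ns)
    if media_time is None:
        return None
    start = media_time.findtext("m:MediaTimePoint", namespaces=ns)
    duration = media_time.findtext("m:MediaDuration", namespaces=ns)
    return start, duration


print(read_media_time(DESCRIPTION))
```

The description lives in its own document and refers to the video only through time values, which reflects the separation between description and content noted above.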
Limitations
MPEG-7 descriptions were originally defined in XML Schema, which yields semi-structured data that is machine-readable but not machine-interpretable: software can parse and validate the values but cannot infer what they mean. Low-level features alone cannot capture high-level video semantics, a limitation known as the "Semantic Gap." Attempts to bridge this gap by mapping MPEG-7 to OWL (including MPEG-7Ontos, COMM, and SWIntO) have had limited success because color distributions and other low-level features are inherently insufficient for representing visual meaning.
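A toy example can show why a color distribution underdetermines meaning. In the sketch below, a hypothetical 16-pixel "landscape" (sky above grass) and a rearranged copy of the same pixels produce identical color histograms, even though only one of them depicts a recognizable scene. The pixel values and names are assumptions for illustration.

```python
from collections import Counter


def color_histogram(pixels):
    """Count how often each exact RGB color occurs."""
    return Counter(pixels)


# Hypothetical image, flattened row by row: top half sky, bottom half grass.
SKY, GRASS = (80, 160, 240), (40, 160, 60)
landscape = [SKY] * 8 + [GRASS] * 8

# Same pixels, interleaved so that sky and grass alternate.
scrambled = [p for pair in zip(landscape[:8], landscape[8:]) for p in pair]

same_histogram = color_histogram(landscape) == color_histogram(scrambled)
same_image = landscape == scrambled

print(same_histogram)  # True: the color distribution is unchanged
print(same_image)      # False: the spatial arrangement differs
```

Any feature that depends only on the color distribution treats these two images as equivalent, which is one small instance of the gap described above.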
Governance & Maintenance
MPEG-7 is maintained by ISO/IEC JTC 1/SC 29. The standard remains under active maintenance, with extension work continuing particularly in compact descriptors for visual search and video analysis. Part 13 (Compact Descriptors for Visual Search) was first published in 2015.
Notable Implementations
MPEG-7 descriptors are used in broadcast media archives for content-based retrieval, in surveillance systems for event detection, and in academic research on multimedia information retrieval. The CDVS extension (Part 13) has found application in mobile visual search and video analytics. The standard has influenced the design of multimedia databases and digital asset management systems in broadcasting organizations worldwide.
Related Standards
- MPEG-21 -- Digital item declaration and adaptation framework that builds on MPEG-7 descriptions
- MPEG-4 Part 11 -- Scene description and application engine, complementary to MPEG-7 content description
- Material Exchange Format (MXF) -- SMPTE container format for professional media, sometimes used alongside MPEG-7