NewsML-G2 is the principal open standard for exchanging news content across the global media industry. Published by the International Press Telecommunications Council (IPTC), it provides an XML-based framework for packaging and transmitting text, photographs, graphics, audio, video, event data, and sports statistics. News agencies, broadcasters, aggregators, and publishers worldwide use it as the basis of automated news workflows.
Background
The IPTC has a long history of developing interchange standards for the news industry, beginning with IPTC 7901 for text transmission in the 1970s and continuing through NewsML 1.x in the early 2000s. NewsML-G2 was introduced in 2008 as a second-generation replacement, built on the IPTC News Architecture (NAR) framework. The "G2" designation reflects this generational leap, bringing a unified data model that could accommodate not only traditional news content but also structured event and sports data. The standard has been continuously refined, with version updates reflecting evolving industry requirements around trust, credibility, and multimedia workflows.
Purpose and Scope
NewsML-G2 addresses the full lifecycle of news exchange. It defines several distinct item types, each serving a specific role in the news supply chain:
| Item Type | Purpose |
|---|---|
| News Item | Wraps a single piece of content: text, photo, graphic, audio, or video |
| Package Item | Structures combinations of multiple news items into a coherent package |
| Concept Item | Describes a single concept used in controlled vocabularies |
| Knowledge Item | Exchanges an entire controlled vocabulary as a single file |
| Planning Item | Conveys editorial planning and coverage intentions to customers |
| News Message | Acts as a transport wrapper for transmitting items by any electronic means |
The standard makes extensive use of controlled vocabularies and taxonomies, particularly IPTC Media Topics, to describe and categorize content in a machine-readable manner compatible with Semantic Web principles.
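
As a minimal sketch of how a receiving system might tell the item types in the table above apart, the following Python snippet inspects the root element of a document. It assumes that each item type corresponds to a root element named in lower camel case (`newsItem`, `packageItem`, and so on) in the IPTC News Architecture namespace. The `classify` function and the sample document are illustrative, not part of any IPTC tool.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for NewsML-G2 documents (IPTC News Architecture).
NAR_NS = "http://iptc.org/std/nar/2006-10-01/"

# Root element local names mapped to the item types listed in the table.
ITEM_TYPES = {
    "newsItem": "News Item",
    "packageItem": "Package Item",
    "conceptItem": "Concept Item",
    "knowledgeItem": "Knowledge Item",
    "planningItem": "Planning Item",
    "newsMessage": "News Message",
}


def classify(xml_text: str) -> str:
    """Return the item type name for a document, based on its root element."""
    root = ET.fromstring(xml_text)
    # ElementTree renders qualified names as "{namespace}local".
    namespace, _, local = root.tag.rpartition("}")
    if namespace.lstrip("{") != NAR_NS:
        raise ValueError(f"not in the NewsML-G2 namespace: {root.tag}")
    try:
        return ITEM_TYPES[local]
    except KeyError:
        raise ValueError(f"unrecognised root element: {local}") from None


# Hypothetical, heavily abbreviated document used only for illustration.
sample = f'<packageItem xmlns="{NAR_NS}" guid="urn:example:pkg-1" version="1"/>'
print(classify(sample))
```

A real ingest pipeline would normally validate the document against the IPTC schema before dispatching on the item type; this sketch skips validation to stay short.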
Sub-specifications
NewsML-G2 serves as an umbrella that encompasses two important domain-specific extensions:
EventsML-G2 provides a structured format for conveying event information within a news industry context. It supports use cases ranging from receiving event details from organizers, to publishing event listings, to sharing planned news coverage through editorial daybooks.
SportsML-G2 is described by the IPTC as the only open, global XML standard for the interchange of sports data. It supports scores, schedules, standings, and statistics across a wide variety of sports competitions through a common framework with plug-in modules for specific sports.
Technical Format
NewsML-G2 is an XML-based standard. Documents conform to XML Schema definitions published by the IPTC. The metadata model is designed to be compact while supporting rich descriptions of content provenance, rights, editorial workflow status, and subject categorization. It achieves this through qualified codes (QCodes): short values of the form `alias:code` that reference entries in controlled vocabularies, most commonly those maintained by the IPTC.
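
The QCode mechanism can be sketched in a few lines of Python. The example below assumes that a QCode is written as `alias:code`, that the item declares each alias in a catalog of `scheme` elements, and that the full concept URI is formed by appending the code to the scheme URI. The sample document is a simplified, hand-written fragment that has not been validated against the IPTC schema, and the helper names `load_catalog` and `resolve_qcode` are hypothetical.

```python
import xml.etree.ElementTree as ET

NAR_NS = "http://iptc.org/std/nar/2006-10-01/"
NS = {"nar": NAR_NS}

# Simplified fragment for illustration; required elements are omitted.
SAMPLE = f"""
<newsItem xmlns="{NAR_NS}" guid="urn:example:item-1" version="1">
  <catalog>
    <scheme alias="medtop" uri="http://cv.iptc.org/newscodes/mediatopic/"/>
    <scheme alias="ninat" uri="http://cv.iptc.org/newscodes/ninature/"/>
  </catalog>
  <itemMeta>
    <itemClass qcode="ninat:text"/>
  </itemMeta>
  <contentMeta>
    <subject qcode="medtop:15000000"/>
  </contentMeta>
</newsItem>
"""


def load_catalog(root):
    """Map each scheme alias declared in the item to its scheme URI."""
    return {
        scheme.get("alias"): scheme.get("uri")
        for scheme in root.findall("nar:catalog/nar:scheme", NS)
    }


def resolve_qcode(qcode, catalog):
    """Expand 'alias:code' to a full URI by appending the code to the scheme URI."""
    alias, sep, code = qcode.partition(":")
    if not sep or alias not in catalog:
        raise ValueError(f"cannot resolve QCode: {qcode!r}")
    return catalog[alias] + code


root = ET.fromstring(SAMPLE.strip())
catalog = load_catalog(root)

# Walk the document and expand every qcode attribute that is found.
for element in root.iter():
    qcode = element.get("qcode")
    if qcode:
        local_name = element.tag.rpartition("}")[2]
        print(local_name, "->", resolve_qcode(qcode, catalog))
```

In practice, items often point to a shared external catalog rather than declaring schemes inline, so a fuller implementation would also fetch and cache the referenced catalog files.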
Governance and Maintenance
The IPTC, a consortium of the world's major news agencies, news publishers, and news industry technology vendors, maintains and governs NewsML-G2. Development occurs through IPTC working groups, with periodic releases that undergo member review. The standard is freely available. The IPTC also provides supporting tools, including a NewsML-G2 Generator for creating compliant documents and an open-source Python library (python-newsmlg2), which has reached version 1.0 and is intended for production use.
Notable Implementations
Major wire services, including Agence France-Presse (AFP), Deutsche Presse-Agentur (dpa), and Thomson Reuters, use NewsML-G2 for content distribution. National and regional news agencies across Europe, Asia, and the Americas have adopted it as well. The standard also underpins content exchange within news aggregation platforms and editorial systems, where its standardized metadata enables automated routing, categorization, and archiving of news content at scale.
Related Standards
- ninjs (News in JSON) -- a JSON-based alternative from the IPTC for web-native news exchange
- NITF (News Industry Text Format) -- the IPTC's earlier XML format focused specifically on text articles
- rNews -- the IPTC's RDFa/Microdata vocabulary for embedding news metadata in HTML
- IPTC NewsCodes -- controlled vocabularies referenced extensively within NewsML-G2 metadata
- NewsML 1.x -- the first-generation predecessor that NewsML-G2 replaced