Schema.org is the dominant vocabulary for embedding structured data in web pages, email messages, and other digital content. Launched on June 2, 2011 as a joint initiative by Bing, Google, and Yahoo — with Yandex joining in November 2011 — it provides a shared set of types and properties that search engines and applications consume to deliver rich, semantically aware experiences. With adoption across over 45 million web domains as of 2024, Schema.org is one of the most widely deployed metadata vocabularies in existence.
Background
Before Schema.org, each major search engine promoted its own preferred structured data format, creating fragmentation for webmasters. The June 2011 announcement unified these efforts under a single vocabulary that all participating engines would recognize. Much of the initial vocabulary was inspired by earlier formats including microformats, FOAF, and OpenCyc. In 2012, the GoodRelations e-commerce ontology was integrated into Schema.org, substantially expanding its commercial vocabulary. Public discussion has largely taken place on the W3C public vocabularies mailing list since the project's inception.
Google began supporting JSON-LD in 2015 and by September 2017 recommended it for structured data whenever possible, making it the preferred encoding over Microdata. Early uptake was limited (a 2016 survey found only 17% adoption among US marketing agencies), but deployment has since grown to more than 45 million domains as of 2024. The popularity of Schema.org has also spawned derivative specifications, such as the Croissant metadata format for machine-learning datasets.
Purpose & Scope
Schema.org defines a hierarchy of over 800 types (as of February 2025) and thousands of associated properties for describing entities and the relationships between them. Its primary purpose is to make web content machine-readable so that search engines, email clients, virtual assistants, and other automated systems can extract and present structured information.
The vocabulary covers a wide range of domains including creative works, commerce, events, organizations, people, places, medical and health content, and actions. It is designed to be broadly useful rather than deeply specialized, though its extension mechanism allows domain-specific communities to add detailed types without modifying the core vocabulary.
Key Types
Schema.org organizes its vocabulary into a type hierarchy rooted in Thing. Commonly used types include:
| Type | Description |
|---|---|
| Thing | The most generic type; every other type is a subtype |
| CreativeWork | Articles, books, movies, photographs, software, datasets |
| Organization | Companies, educational institutions, NGOs, governments |
| Person | A person, alive, dead, undead, or fictional |
| Place | Entities with a fixed physical location |
| Event | An event happening at a certain time and location |
| Product | A product offered for sale |
| Action | An action performed by an agent |
| LocalBusiness | A business with a physical location |
| Review | A review of an item |
Each type carries a set of properties inherited through the hierarchy, so all types share the base properties of Thing (name, description, url, image, etc.).
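The inheritance rule can be illustrated with a small sketch. The parent links below follow the published hierarchy (LocalBusiness sits under both Organization and Place), but the property lists are a tiny hand-picked subset, and `PARENTS`, `OWN_PROPERTIES`, and `all_properties` are hypothetical names rather than part of any Schema.org tooling.

```python
# Hand-picked subset of the Schema.org hierarchy, for illustration only.
PARENTS = {
    "Thing": [],
    "Organization": ["Thing"],
    "Place": ["Thing"],
    "LocalBusiness": ["Organization", "Place"],
}

# A few properties declared directly on each type (far from exhaustive).
OWN_PROPERTIES = {
    "Thing": ["name", "description", "url", "image"],
    "Organization": ["legalName", "founder"],
    "Place": ["address", "geo"],
    "LocalBusiness": ["openingHours", "priceRange"],
}


def all_properties(type_name):
    """Return the type's own properties plus everything inherited from ancestors."""
    props = set(OWN_PROPERTIES[type_name])
    for parent in PARENTS[type_name]:
        props |= all_properties(parent)
    return props


local_business_props = all_properties("LocalBusiness")
print(sorted(local_business_props))
```

Because LocalBusiness has two parents, it picks up `geo` from Place and `legalName` from Organization in addition to the base properties of Thing.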
Serializations & Technical Formats
Schema.org data can be expressed in three primary encodings:
- JSON-LD — the currently recommended format, embedded as a `<script type="application/ld+json">` block in HTML. Clean separation between markup and structured data.
- RDFa — attributes added directly to HTML elements. Widely supported but more verbose than JSON-LD.
- Microdata — HTML5-native attributes (`itemscope`, `itemtype`, `itemprop`). The original recommended format at Schema.org's launch; now largely superseded by JSON-LD in practice.
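As a rough illustration of the JSON-LD encoding, the sketch below builds a `Person` description as a Python dictionary and wraps it in the script element used for embedding. The person's details and the helper name `to_jsonld_script` are made up for the example.

```python
import json


def to_jsonld_script(data):
    """Serialize a dict as JSON-LD inside an HTML script element."""
    # Escape "</" so that a string value cannot close the script element early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Hypothetical example entity; the values are placeholders.
person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "name": "Jane Doe",
    "jobTitle": "Research Librarian",
    "url": "https://example.com/jane",
}

snippet = to_jsonld_script(person)
print(snippet)
```

The resulting string can be placed anywhere in the page's HTML; nothing in the visible markup has to change, which is the separation that makes JSON-LD convenient to generate from templates or a CMS.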
The Schema.org namespace URI is https://schema.org/. All type and property URIs are formed by appending the term name to this base (e.g., https://schema.org/Person).
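The URI rule amounts to simple concatenation, sketched below. `term_uri` is a hypothetical helper, and it does not check that the term actually exists in the vocabulary.

```python
SCHEMA_ORG_BASE = "https://schema.org/"


def term_uri(term):
    """Form the full URI for a Schema.org type or property name."""
    return SCHEMA_ORG_BASE + term


print(term_uri("Person"))    # https://schema.org/Person
print(term_uri("jobTitle"))  # https://schema.org/jobTitle
```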
Governance & Maintenance
Schema.org is governed through the W3C Schema.org Community Group, an open forum in which anyone can participate. Development happens publicly in the GitHub repository at github.com/schemaorg/schemaorg, and proposals for new types and properties go through community review before being accepted. Releases are numbered; the current version is 29.4, released December 8, 2025. The vocabulary is licensed under Creative Commons Attribution-ShareAlike 3.0.
Validation Tools
Multiple validators are available for testing Schema.org markup:
- Schema.org Markup Validator — the official validator at validator.schema.org
- Google Rich Results Test — tests eligibility for enhanced search result display
- Bing Markup Validator — Microsoft's validation tool
- Yandex Microformat Validator — Yandex's structured data checker
- Google Search Console — provides reports on unparsable structured data
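Before submitting a page to one of the validators above, a quick local pre-check can catch unparsable blocks. The sketch below uses only Python's standard library; it does not call any of the listed tools and checks far less than they do. The class name, function name, and sample page are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def precheck(html):
    """Return a list of problems found in the page's JSON-LD blocks."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    problems = []
    for index, block in enumerate(extractor.blocks):
        try:
            data = json.loads(block)
        except json.JSONDecodeError as error:
            problems.append(f"block {index}: unparsable JSON ({error.msg})")
            continue
        if not isinstance(data, dict):
            continue  # arrays and other shapes are not inspected in this sketch
        for key in ("@context", "@type"):
            if key not in data:
                problems.append(f"block {index}: missing {key}")
    return problems


page = """
<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Event", "name": "Launch"}</script>
<script type="application/ld+json">{"name": "No type here"}</script>
<script type="application/ld+json">{not valid json}</script>
</head></html>
"""
problems = precheck(page)
print(problems)
```

A check like this only confirms that each block parses and names a context and type. Whether the properties are valid for the type, or whether the page qualifies for rich results, still has to be verified with the dedicated validators.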
Notable Implementations
Schema.org markup is consumed by all major search engines to power knowledge panels, rich snippets, carousels, event listings, recipe cards, product comparisons, and other enhanced search results. Beyond search, it is used by email clients (Gmail reads Schema.org markup to offer actionable emails), social media platforms, voice assistants, and content management systems. WordPress, Drupal, Shopify, and WooCommerce generate Schema.org markup automatically or via plugins. Pinterest reads Schema.org markup to populate Rich Pins. Google Dataset Search indexes datasets described with Schema.org's Dataset type.
Related Standards
- JSON-LD — the preferred encoding, itself a W3C Recommendation
- RDFa — one of the three encoding formats supported by Schema.org
- Dublin Core — an earlier metadata vocabulary; Schema.org's `CreativeWork` properties overlap with several Dublin Core elements
- Open Graph Protocol — Facebook's metadata vocabulary for social sharing
- Microformats — an earlier approach to embedded structured data that preceded Schema.org
- GoodRelations — e-commerce ontology integrated into Schema.org in 2012