Contextual Query Language (CQL) is a formal query language designed to be human-readable while providing the precision needed for structured information retrieval. It serves as the standard query language for the Search/Retrieve via URL (SRU) protocol and is widely used in library and information retrieval systems. CQL bridges the gap between simple keyword search and the complex query syntax of Z39.50, offering a syntax that is both expressive and approachable.
Background
CQL emerged from the Z39.50 community in the early 2000s. The Z39.50 protocol, while powerful, used a complex query model (Type-1/RPN queries) that was difficult for developers and end users to work with directly. As the library community developed SRU as a web-friendly successor to Z39.50, CQL was created as a query language that could be transmitted as a simple URL parameter while still supporting sophisticated search operations. Originally named Common Query Language, it was renamed to Contextual Query Language to better reflect its architecture of extensible context sets. The Library of Congress maintains the specification.
Purpose & Scope
CQL provides a standardized way to express search queries across diverse information retrieval systems. It supports:
- Simple keyword searches
- Field-specific searches using index names
- Boolean operators (AND, OR, NOT)
- Proximity and adjacency searching
- Relation modifiers for precise matching
- Sorting of results
- Extensibility through context sets
CQL is designed for use in federated search environments where queries must be sent to multiple heterogeneous systems using a common syntax.
Key Elements / Properties
CQL queries are built from the following syntactic components:
| Component | Example | Description |
|---|---|---|
| Search term | dinosaur | Simple keyword search |
| Index | dc.title = dinosaur | Search within a specific field |
| Relation | dc.date > 2000 | Comparison operators |
| Boolean | cat AND dog | Combining search clauses |
| Proximity | cat prox/distance=3 dog | Terms within a specified distance |
| Sorting | sortBy title/ascending | Result ordering |
| Context set | dc.title | Namespace prefix for indexes |
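As an illustration of how these components compose, the following is a minimal Python sketch that assembles search clauses and joins them with a boolean operator. The helper names (quote_term, clause, combine) are hypothetical and not part of any CQL library, and the quoting rule is a simplification: it quotes any term that contains whitespace, parentheses, relation characters, a slash, or a double quote, or that is a reserved word.

```python
import re

# Characters that force a term to be quoted in this sketch: whitespace,
# parentheses, relation characters, the modifier slash, and the double quote.
_NEEDS_QUOTES = re.compile(r'[\s()=<>/"]')

# Words that would be read as operators if left unquoted.
_RESERVED = {"and", "or", "not", "prox", "sortby"}


def quote_term(term: str) -> str:
    """Return the term unchanged if it is a safe bare word, else double-quoted."""
    if term and not _NEEDS_QUOTES.search(term) and term.lower() not in _RESERVED:
        return term
    # Assumption: backslash escapes both itself and the double quote.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def clause(index: str, relation: str, term: str) -> str:
    """Build a search clause such as: dc.title = dinosaur"""
    return f"{index} {relation} {quote_term(term)}"


def combine(operator: str, *clauses: str) -> str:
    """Join clauses with a boolean operator, parenthesizing each clause."""
    return f" {operator} ".join(f"({c})" for c in clauses)


query = combine(
    "AND",
    clause("dc.title", "=", "complete dinosaur"),
    clause("dc.date", ">", "2000"),
)
print(query)  # (dc.title = "complete dinosaur") AND (dc.date > 2000)
```

Parenthesizing every clause is more verbose than strictly necessary, but it avoids relying on operator precedence when clauses are nested.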
Serializations & Technical Formats
CQL queries are plain text strings, typically passed as URL parameters in SRU requests. The specification includes a formal grammar in BNF notation. CQL can also be represented as XCQL, an XML serialization that makes the parse tree explicit for machine processing.
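Because a CQL query contains spaces, quotes, and relation characters, it has to be percent-encoded before it can travel as a URL parameter. The sketch below shows one way to do this with Python's standard library. The endpoint https://sru.example.org/catalog is a placeholder, and the parameter set (version, operation, query, maximumRecords) assumes an SRU 1.2 style searchRetrieve request; a real server's explain record should be consulted for what it actually supports.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_sru_url(base_url: str, cql_query: str, maximum_records: int = 10) -> str:
    """Return a searchRetrieve URL with the CQL query percent-encoded."""
    params = {
        "version": "1.2",               # assumed protocol version
        "operation": "searchRetrieve",
        "query": cql_query,             # raw CQL; urlencode handles escaping
        "maximumRecords": str(maximum_records),
    }
    return f"{base_url}?{urlencode(params)}"


cql = 'dc.title = "complete dinosaur" and dc.date > 2000'
url = build_sru_url("https://sru.example.org/catalog", cql)  # placeholder host
print(url)

# Decoding the query string recovers the original CQL text unchanged.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == cql)
```

Passing the raw query to urlencode, rather than concatenating strings by hand, keeps characters such as = and > inside the query from being confused with URL syntax.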
Governance & Maintenance
CQL is maintained by the Library of Congress as part of the SRU/SRW suite of standards. A version of CQL has also been standardized through OASIS as part of the searchRetrieve Version 1.0 specification. Changes are coordinated through the Library of Congress and the broader SRU maintenance community.
Notable Implementations
SRU servers worldwide accept CQL queries, including those operated by the Library of Congress, the British Library, OCLC (for example, the WorldCat SRU service), and many national and academic libraries. Library union catalog systems and federated search middleware such as MetaLib have also supported CQL. Open-source SRU/CQL implementations exist in Java (CQL-Java), Python, and other languages.
Related Standards
- SRU (Search/Retrieve via URL) -- The protocol that uses CQL as its query language. CQL and SRU are closely coupled.
- Z39.50 -- The predecessor protocol whose query model CQL was designed to simplify and modernize.
- XCQL -- The XML representation of CQL query parse trees.
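To make the idea of an explicit parse tree concrete, the following Python sketch builds an XCQL-like tree for the query dc.title = dinosaur and dc.date > 2000. The element names (triple, boolean, leftOperand, rightOperand, searchClause, index, relation, value, term) reflect a common reading of XCQL and should be checked against the published schema; the XML namespace and any modifiers are omitted for brevity, and the helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET


def search_clause(index: str, relation: str, term: str) -> ET.Element:
    """Build one leaf of the tree: index, relation, and term."""
    node = ET.Element("searchClause")
    ET.SubElement(node, "index").text = index
    rel = ET.SubElement(node, "relation")
    ET.SubElement(rel, "value").text = relation
    ET.SubElement(node, "term").text = term
    return node


def triple(operator: str, left: ET.Element, right: ET.Element) -> ET.Element:
    """Combine two subtrees under a boolean operator."""
    node = ET.Element("triple")
    boolean = ET.SubElement(node, "boolean")
    ET.SubElement(boolean, "value").text = operator
    ET.SubElement(node, "leftOperand").append(left)
    ET.SubElement(node, "rightOperand").append(right)
    return node


tree = triple(
    "and",
    search_clause("dc.title", "=", "dinosaur"),
    search_clause("dc.date", ">", "2000"),
)
ET.indent(tree)  # pretty-printing; requires Python 3.9 or later
print(ET.tostring(tree, encoding="unicode"))
```

The nesting of operands under the boolean node is what removes any ambiguity about precedence that the flat text form might leave to the parser.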