Metadata Filters

Metadata filter expressions are attached to queries, or more formally, to their corpus keys. A filter expression restricts the search to the part of the corpus that matches the expression. In both form and function, filter expressions are a simpler version of the search condition of a WHERE clause in ANSI SQL (see §7.6).

A filter expression operates on the metadata attached to documents that are indexed in Vectara. Because you can associate this metadata with either the entire document or specific parts within it, the scope must be explicitly specified for every metadata reference in the expression. Valid scopes are doc. and part., for document-level and part-level metadata, respectively.

To learn more about setting up filterable metadata, review the filter attribute section of the corpus creation documentation.

The following filter expression selects customer reviews in German with a rating higher than 3 stars. Note that while there is a single rating for the entire document, the detected language is set at the part level.

doc.rating > 3.0 and part.lang = 'deu'

The lang metadata tag used in this example is detected and set automatically by the platform at indexing time. It's set at the part level for accuracy, because a single document may contain content in multiple languages.
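As a rough illustration of the scoping rule, the sketch below assembles the expression above from its pieces in Python. The helpers scoped and literal are hypothetical names introduced here, not part of any Vectara client library, and the sketch only builds the filter string; it does not send a query. Rejecting values that contain a single quote is a conservative assumption, since quote escaping is not covered on this page.

```python
def scoped(scope: str, name: str) -> str:
    """Prefix a metadata field with its scope (hypothetical helper)."""
    # Only the two scopes named in this page are accepted.
    if scope not in ("doc", "part"):
        raise ValueError(f"unknown scope: {scope!r}")
    return f"{scope}.{name}"


def literal(value) -> str:
    """Render a Python value as a filter literal (hypothetical helper)."""
    if isinstance(value, bool):
        raise TypeError("boolean literals are not handled in this sketch")
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        # Assumption: escaping rules are unknown here, so refuse quotes outright.
        if "'" in value:
            raise ValueError("single quotes are not supported in this sketch")
        return f"'{value}'"
    raise TypeError(f"unsupported literal type: {type(value).__name__}")


# Document-level rating combined with part-level language.
filter_expr = (
    f"{scoped('doc', 'rating')} > {literal(3.0)}"
    f" and {scoped('part', 'lang')} = {literal('deu')}"
)
print(filter_expr)
```

Forcing every field reference through scoped makes a missing or misspelled scope fail early on the client side, before the expression reaches the platform.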

More complicated expressions are possible, such as the one below, which checks for documents with a publication date in 2021.

1609459200 < doc.pub_epoch and doc.pub_epoch < 1640995200

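Here, pub_epoch stores the date in epoch time (seconds since 1970-01-01 00:00:00 UTC). The two bounds are the epoch values for midnight UTC on January 1, 2021 and January 1, 2022.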
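The epoch bounds in a range filter like this do not need to be computed by hand. The following is a minimal sketch, assuming the field holds seconds since the Unix epoch in UTC; year_filter is a hypothetical helper name, and it reproduces the strict comparisons used in the expression above.

```python
from datetime import datetime, timezone


def year_filter(field: str, year: int) -> str:
    """Build a range filter covering one calendar year (hypothetical helper)."""
    # Midnight UTC on January 1 of the given year and of the following year.
    start = int(datetime(year, 1, 1, tzinfo=timezone.utc).timestamp())
    end = int(datetime(year + 1, 1, 1, tzinfo=timezone.utc).timestamp())
    # Same shape as the expression above: strict bounds on both sides.
    return f"{start} < {field} and {field} < {end}"


expr_2021 = year_filter("doc.pub_epoch", 2021)
print(expr_2021)
```

Because both comparisons are strict, a document stamped at exactly midnight UTC on January 1 falls outside the range.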

You can find a full list of supported syntax on the Functions and Operators page.