Metadata Filters
Metadata filter expressions are attached to queries, or more formally, to their
corpus keys. These filter expressions serve to restrict the search to only the
part of the corpus that matches the expression. In both form and function,
they are a simpler version of a WHERE clause's search condition in ANSI SQL
(see §7.6).
A filter expression operates on the metadata attached to documents that are
indexed in Vectara. Because you can associate this
metadata with either the entire document or specific parts within it, the
scope must be explicitly specified for every metadata reference in the
expression. Valid scopes are doc. and part., for document-level and
part-level metadata, respectively.
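The scoping rule can be illustrated locally with a small sketch. This is not how the platform resolves references internally; the resolve helper and the sample metadata are hypothetical and only show that every reference must name its scope.

```python
def resolve(reference, doc_meta, part_meta):
    """Look up a scoped metadata reference such as 'doc.rating' or 'part.lang'."""
    scope, sep, name = reference.partition(".")
    # A reference without an explicit doc. or part. scope is rejected.
    if not sep or scope not in ("doc", "part"):
        raise ValueError(
            f"metadata reference needs a doc. or part. scope: {reference!r}"
        )
    source = doc_meta if scope == "doc" else part_meta
    return source.get(name)


# Hypothetical metadata for one document and one of its parts.
doc_meta = {"rating": 4.5}
part_meta = {"lang": "deu"}

rating = resolve("doc.rating", doc_meta, part_meta)  # read from the document
lang = resolve("part.lang", doc_meta, part_meta)     # read from the part
```

Calling resolve("rating", doc_meta, part_meta) raises ValueError in this sketch, mirroring the requirement that the scope is always spelled out.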
To learn more about setting up filterable metadata, review the filter attribute section of the corpus creation documentation.
The following filter expression selects customer reviews in German with better than a 3-star rating. Note that while there is a single rating for the entire document, the detected language is set at the part level.
doc.rating > 3.0 and part.lang = 'deu'
The lang metadata tag used in this example is detected and set automatically
by the platform at indexing time. It's set at the part level for accuracy,
because a single document may contain content in multiple languages.
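As a rough sketch of how such an expression might travel with a query, the snippet below builds a JSON request body that carries the filter on the corpus key. The build_query helper is hypothetical, and the field names (query, numResults, corpusKey, customerId, corpusId, metadataFilter) are assumptions about the request shape; check them against the API reference before relying on them. Nothing is sent over the network.

```python
import json


def build_query(text, customer_id, corpus_id, metadata_filter):
    """Assemble a query body with the filter attached to the corpus key."""
    return {
        "query": [
            {
                "query": text,
                "numResults": 10,
                "corpusKey": [
                    {
                        # Assumed field names; verify against the API reference.
                        "customerId": customer_id,
                        "corpusId": corpus_id,
                        "metadataFilter": metadata_filter,
                    }
                ],
            }
        ]
    }


# Placeholder IDs; the filter is the expression from the example above.
payload = build_query(
    "Wie ist die Akkulaufzeit?",
    customer_id=123,
    corpus_id=1,
    metadata_filter="doc.rating > 3.0 and part.lang = 'deu'",
)
body = json.dumps(payload, ensure_ascii=False)
```

Keeping the filter as a plain string on the corpus key reflects the point made at the top of this page: the expression belongs to the corpus key rather than to the query text.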
More complicated expressions are possible, such as the one below, which checks for documents with a publication date in 2021.
1609459200 < doc.pub_epoch and doc.pub_epoch < 1640995200
Here, pub_epoch stores the publication date as Unix epoch time, in seconds.
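The two constants are the Unix timestamps for midnight on 1 January 2021 and 1 January 2022. A minimal sketch for deriving them, assuming the dates are meant in UTC (the year_filter helper is hypothetical):

```python
from datetime import datetime, timezone


def year_filter(field, year):
    """Build a filter expression matching epoch values inside the given year."""
    # Midnight UTC at the start of the year and of the following year.
    start = int(datetime(year, 1, 1, tzinfo=timezone.utc).timestamp())
    end = int(datetime(year + 1, 1, 1, tzinfo=timezone.utc).timestamp())
    # Strict comparisons on both sides, mirroring the expression above.
    return f"{start} < {field} and {field} < {end}"


expr = year_filter("doc.pub_epoch", 2021)
# expr == "1609459200 < doc.pub_epoch and doc.pub_epoch < 1640995200"
```

Because the lower bound is strict, a document stamped exactly at midnight UTC on 1 January 2021 falls outside the range.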
You can find a full list of supported syntax on the Functions and Operators page.