Version: 2.0

Fuzzy matching

The tech preview of Fuzzy Metadata Search combines exact filtering with approximate matching. This approach is useful because metadata often contains inconsistencies, such as typos in titles, categories, or keywords.

Fuzzy search operates in two main steps:

  1. Exact filtering: A metadata_filter is first applied to narrow results based on attributes like doc.status = 'Active'.
  2. Fuzzy matching: On the remaining documents, fuzzy matching handles common typos and missing characters automatically. The results are then ranked by a relevance score that you can tune using field weighting. For example, you can give title a higher weight than category.

The final result is a ranked list that helps users find what they mean, even if they did not type the metadata value exactly.
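The two steps above can be sketched locally in Python. This is only an illustration of the filter-then-rank idea, not the service's implementation: the sample documents, the `fuzzy_search` helper, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory documents; field names are illustrative only.
DOCS = [
    {"id": "d1", "status": "Active", "title": "Service Level Agreement"},
    {"id": "d2", "status": "Archived", "title": "Service Level Agreement 2019"},
    {"id": "d3", "status": "Active", "title": "Software License"},
]

def similarity(a: str, b: str) -> float:
    """Approximate string similarity in [0, 1] (a stand-in for the real matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def fuzzy_search(docs, status, query, field="title"):
    # Step 1: exact filtering, analogous to doc.status = 'Active'.
    candidates = [d for d in docs if d["status"] == status]
    # Step 2: fuzzy matching on the remaining documents, ranked by score.
    scored = [(similarity(query, d[field]), d["id"]) for d in candidates]
    return sorted(scored, reverse=True)

results = fuzzy_search(DOCS, "Active", "Servce Levl Agrement")
for score, doc_id in results:
    print(f"{doc_id}: {score:.2f}")
```

The archived document never reaches the fuzzy step, even though its title is a close match, because the exact filter removes it first.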

tip

Use document-level metadata when you want unique documents. Use part-level metadata when you need to surface matching sections within documents.

caution

Because the fuzzy metadata search feature is a tech preview, it may introduce breaking changes.

Common uses

  • Finding the correct "Service Level Agreement" even if you type "Servce Levl Agrement."
  • Searching for "software license" returns both "software license" and "software licensing" documents.
  • Searching for error-prone product IDs or SKUs still retrieves the right part, even when the query is missing a digit.
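To get a feel for how approximate matching behaves on inputs like these, the snippet below uses `difflib.get_close_matches` from the Python standard library. It is a rough local analogy under assumed data: the candidate lists, the `suggest` helper, and the 0.6 cutoff are hypothetical and say nothing about the thresholds the service uses.

```python
from difflib import get_close_matches

# Hypothetical metadata values for illustration.
TITLES = ["software license", "software licensing", "hardware warranty"]
SKUS = ["SKU-104532", "SKU-998877"]

def suggest(query, candidates, cutoff=0.6):
    """Return candidates that approximately match the query, best first."""
    # Compare case-insensitively, but return the original spelling.
    lowered = {c.lower(): c for c in candidates}
    hits = get_close_matches(query.lower(), list(lowered), n=5, cutoff=cutoff)
    return [lowered[h] for h in hits]

title_hits = suggest("software license", TITLES)  # exact and near-exact titles
sku_hits = suggest("SKU-10453", SKUS)             # query is missing a digit
print(title_hits)
print(sku_hits)
```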

Field weighting strategy

Adjust field weights to control search relevance:

  • Higher weights (2.0-3.0): Critical fields like title or primary identifier
  • Medium weights (1.0-1.5): Important supporting fields
  • Lower weights (0.5-1.0): Additional context fields
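One way to picture what field weights do is a weighted average of per-field similarity scores. The sketch below assumes that combination rule purely for illustration; the documentation does not state the actual scoring formula, and the field names, weights, and `weighted_score` helper are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical weights following the tiers above.
WEIGHTS = {"title": 2.5, "category": 1.0, "keywords": 0.5}

def similarity(a: str, b: str) -> float:
    """Approximate string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def weighted_score(query: str, doc: dict, weights: dict) -> float:
    """Weighted average of per-field similarities (assumed rule, result in [0, 1])."""
    total = sum(weights.values())
    weighted = sum(w * similarity(query, doc.get(f, "")) for f, w in weights.items())
    return weighted / total

# Same values, swapped between a high-weight and a low-weight field.
doc_a = {"title": "Software License", "category": "legal", "keywords": "terms"}
doc_b = {"title": "Terms", "category": "legal", "keywords": "software license"}

score_a = weighted_score("software license", doc_a, WEIGHTS)
score_b = weighted_score("software license", doc_b, WEIGHTS)
print(f"title match: {score_a:.2f}, keywords match: {score_b:.2f}")
```

Because title carries the largest weight, the document whose title matches the query scores higher than the one that matches only in keywords.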

Example weighting strategy

STRATEGIC FIELD WEIGHTING

Code example with json syntax.

Example request (document level)

DOCUMENT-LEVEL REQUEST

Code example with json syntax.

Example response

DOCUMENT-LEVEL RESPONSE

Code example with json syntax.

Filter syntax

metadata_filter uses Vectara’s metadata filter expression syntax. Prefix every field with its scope: doc. (document-level) or part. (part-level).

Supported operators

  • Arithmetic: + - * / %
  • Comparisons: < <= > >= = == != <>
  • Null tests: IS NULL, IS NOT NULL
  • Membership: IN (...)
  • Logical: NOT, AND, OR

Examples

  • doc.status = 'Active'
  • doc.pageCount > 10
  • doc.publish_date >= '2025-08-01'
  • doc.category IN ('contract', 'policy')
  • doc.status = 'Active' AND part.clause_type = 'Liability'

The filter language does not support SQL LIKE. Use fuzzy queries to handle approximate text.
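When filter expressions are assembled in application code, a few small helpers can keep the scope prefix and quoting consistent. The helpers below (`field`, `literal`, `compare`, `is_in`, `all_of`) are hypothetical and only build strings in the shape of the examples above. They do not call any API, and because string escaping rules are not covered here, values containing a single quote are rejected rather than escaped.

```python
COMPARISONS = {"<", "<=", ">", ">=", "=", "==", "!=", "<>"}

def field(scope: str, name: str) -> str:
    """Prefix a field name with its scope: doc or part."""
    if scope not in ("doc", "part"):
        raise ValueError(f"unknown scope: {scope!r}")
    return f"{scope}.{name}"

def literal(value) -> str:
    """Render a string or number as a filter literal."""
    if isinstance(value, str):
        if "'" in value:
            raise ValueError("single quotes in values are not handled by this sketch")
        return f"'{value}'"
    return str(value)

def compare(scope: str, name: str, op: str, value) -> str:
    if op not in COMPARISONS:
        raise ValueError(f"unsupported comparison: {op!r}")
    return f"{field(scope, name)} {op} {literal(value)}"

def is_in(scope: str, name: str, values) -> str:
    return f"{field(scope, name)} IN ({', '.join(literal(v) for v in values)})"

def all_of(*clauses: str) -> str:
    return " AND ".join(clauses)

metadata_filter = all_of(
    compare("doc", "status", "=", "Active"),
    compare("part", "clause_type", "=", "Liability"),
)
print(metadata_filter)
```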

WEIGHTED MULTI‑FIELD SEARCH

Code example with json syntax.

Exact filtering plus fuzzy ranking

EXACT FILTERING PLUS FUZZY RANKING

Code example with json syntax.

PART‑LEVEL SEARCH

Code example with json syntax.