Version: 2.0

Fuzzy Metadata Search

Metadata is rarely uniform across different document sources. Titles, categories, and headings can vary and change over time. When users only know part of a value, strict equality filters miss relevant items.

The tech preview of Fuzzy Metadata Search combines exact pre‑filtering with fuzzy, weighted matching across specific metadata fields. First, narrow the candidate set with exact filters on fields such as status, region, or date. Then rank what remains with field‑aware fuzzy matching, so users find what they mean and not just what they type.

  • Supports document-level and part-level metadata searches.
  • Returns relevance‑scored results with pagination (limit, offset, total_count).
  • Lets you weight fields (title^2.0, category^1.0) to tune ranking.
  • Works alongside existing metadata filters for access control and faceted narrowing.
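As a rough illustration of how limit, offset, and total_count relate, the sketch below pages through a locally scored list. It is not the service's implementation: the paginate helper and the "results" key are hypothetical names chosen for this example, and only limit, offset, and total_count come from this page.

```python
from typing import Any

def paginate(scored: list[dict[str, Any]], limit: int = 10, offset: int = 0) -> dict[str, Any]:
    """Return one page of relevance-scored results plus the total match count."""
    if limit < 1 or offset < 0:
        raise ValueError("limit must be >= 1 and offset must be >= 0")
    # Highest score first; sorted() is stable, so ties keep their input order.
    ranked = sorted(scored, key=lambda r: r["score"], reverse=True)
    return {
        "results": ranked[offset:offset + limit],  # hypothetical key name
        "limit": limit,
        "offset": offset,
        "total_count": len(ranked),  # counts all matches, not just this page
    }

# Five made-up hits with descending scores.
hits = [{"id": f"doc-{i}", "score": round(1.0 - i * 0.1, 1)} for i in range(5)]
page = paginate(hits, limit=2, offset=2)
print([r["id"] for r in page["results"]], page["total_count"])
```

A client would typically keep requesting pages, increasing offset by limit, until offset reaches total_count.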
Tip: Use document-level metadata when you want unique documents. Use part-level metadata when you need to surface matching sections within documents.

Caution: Fuzzy Metadata Search is a tech preview, so breaking changes are possible.

How fuzzy search works

  1. Applies the exact metadata_filter to narrow the candidate set
  2. Performs fuzzy matching on the remaining documents
  3. Applies fuzzy matching automatically to every field query
  4. Handles common typos, character transpositions, and missing characters
  5. Uses field weights to adjust the final relevance score
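The flow described in this list can be sketched locally. The example below is only an approximation under stated assumptions: it uses Python's difflib.SequenceMatcher as a stand-in similarity measure (the service's actual matching algorithm is not documented here), and the search and similarity helpers and the sample documents are hypothetical.

```python
from difflib import SequenceMatcher

def similarity(query: str, value: str) -> float:
    """Case-insensitive similarity in [0, 1] that tolerates typos and missing characters."""
    return SequenceMatcher(None, query.lower(), value.lower()).ratio()

def search(docs, exact, fuzzy):
    """exact: {field: required value}; fuzzy: {field: (query text, weight)}."""
    # Step 1: the exact filter removes non-matching documents before any scoring.
    candidates = [d for d in docs if all(d.get(k) == v for k, v in exact.items())]
    # Step 2: score the survivors with a weighted average of per-field similarity.
    total_weight = sum(weight for _, weight in fuzzy.values())
    scored = []
    for d in candidates:
        score = sum(
            weight * similarity(text, str(d.get(name, "")))
            for name, (text, weight) in fuzzy.items()
        ) / total_weight
        scored.append((d["id"], round(score, 3)))
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Made-up documents for illustration only.
docs = [
    {"id": "a", "status": "Active", "title": "Employment Agreement"},
    {"id": "b", "status": "Archived", "title": "Employment Agreement"},
    {"id": "c", "status": "Active", "title": "Privacy Policy"},
]
ranked = search(
    docs,
    exact={"status": "Active"},
    fuzzy={"title": ("Employmnt Agreemnt", 2.0)},  # query with missing characters
)
print(ranked)
```

Document "b" never reaches the fuzzy stage because it fails the exact filter, even though its title matches as well as document "a".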

Field weighting strategy

Adjust field weights to control search relevance:

  • Higher weights (2.0-3.0): Critical fields like title or primary identifier
  • Medium weights (1.0-1.5): Important supporting fields
  • Lower weights (0.5-1.0): Additional context fields

Example Weighting Strategy

STRATEGIC FIELD WEIGHTING
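To make the effect of weights concrete, here is a small sketch that parses the field^weight notation used on this page (title^2.0, category^1.0) and combines per-field similarity scores. The weighted average is an assumption made for illustration; this page does not specify the exact formula the service uses, and the helper names and scores are hypothetical.

```python
def parse_weighted_field(spec: str) -> tuple[str, float]:
    """Split 'title^2.0' into ('title', 2.0); a bare field name gets weight 1.0."""
    name, sep, weight = spec.partition("^")
    return name, (float(weight) if sep else 1.0)

def combine(per_field: dict[str, float], specs: list[str]) -> float:
    """Weighted average of per-field similarity scores (assumed formula)."""
    weights = dict(parse_weighted_field(s) for s in specs)
    total = sum(weights.values())
    return sum(w * per_field.get(name, 0.0) for name, w in weights.items()) / total

# Made-up per-field similarities for two documents.
strong_title = {"title": 0.9, "category": 0.3}
strong_category = {"title": 0.3, "category": 0.9}

weighted = ["title^2.0", "category^1.0"]
equal = ["title", "category"]

print(round(combine(strong_title, weighted), 2), round(combine(strong_category, weighted), 2))
print(round(combine(strong_title, equal), 2), round(combine(strong_category, equal), 2))
```

With equal weights the two documents tie. Weighting title at 2.0 ranks the document with the stronger title match first, which is the behavior the higher weight band is meant to produce.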

Example request (document level)

DOCUMENT-LEVEL REQUEST
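The original request example is not reproduced here. As a stand-in, the sketch below assembles a request body from the parameters this page mentions (metadata_filter, limit, offset, per-field weights, and document or part level). The key names "level", "queries", "field", "query", and "weight" are assumptions made for illustration, as is the build_request helper; check the API reference for the actual schema and endpoint before using it.

```python
import json

def build_request(level, queries, metadata_filter=None, limit=10, offset=0):
    """Assemble a JSON-serializable request body (key names are assumed, not confirmed)."""
    if level not in ("document", "part"):
        raise ValueError("level must be 'document' or 'part'")
    body = {
        "level": level,  # assumed key: document-level or part-level search
        "queries": [     # assumed shape: one entry per fuzzy field query
            {"field": field, "query": text, "weight": weight}
            for field, text, weight in queries
        ],
        "limit": limit,
        "offset": offset,
    }
    if metadata_filter:
        # Exact pre-filter, written in the filter expression syntax.
        body["metadata_filter"] = metadata_filter
    return body

body = build_request(
    "document",
    queries=[("title", "employmnt agreemnt", 2.0), ("category", "contract", 1.0)],
    metadata_filter="doc.status = 'Active'",
    limit=5,
)
print(json.dumps(body, indent=2))
```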

Example response

DOCUMENT-LEVEL RESPONSE

Filter syntax

metadata_filter uses Vectara’s metadata filter expression syntax. Prefix every field with its scope: doc. (document-level) or part. (part-level).

Supported operators

  • Arithmetic: + - * / %
  • Comparisons: < <= > >= = == != <>
  • Null tests: IS NULL, IS NOT NULL
  • Membership: IN (...)
  • Logical: NOT, AND, OR

Examples

  • doc.status = 'Active'
  • doc.pageCount > 10
  • doc.publish_date >= '2025-08-01'
  • doc.category IN ('contract', 'policy')
  • doc.status = 'Active' AND part.clause_type = 'Liability'

The filter language does not support SQL LIKE. Use fuzzy queries to handle approximate text.
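Filter strings are easy to get wrong when assembled by hand, so a small builder can help. The sketch below is a hypothetical helper, not part of any SDK: it enforces the doc. or part. scope prefix, limits comparisons to the operators listed above, and reproduces several of the example expressions. It rejects string values that contain a single quote, because this page does not describe an escaping rule.

```python
COMPARISONS = {"<", "<=", ">", ">=", "=", "==", "!=", "<>"}

def field(scope: str, name: str) -> str:
    """Prefix a metadata field with its scope: 'doc' or 'part'."""
    if scope not in ("doc", "part"):
        raise ValueError("scope must be 'doc' or 'part'")
    return f"{scope}.{name}"

def literal(value) -> str:
    """Quote strings with single quotes; render numbers as-is."""
    if isinstance(value, str):
        if "'" in value:
            raise ValueError("quote escaping is not covered by this sketch")
        return f"'{value}'"
    return str(value)

def compare(scope: str, name: str, op: str, value) -> str:
    if op not in COMPARISONS:
        raise ValueError(f"unsupported comparison operator: {op}")
    return f"{field(scope, name)} {op} {literal(value)}"

def is_in(scope: str, name: str, values) -> str:
    return f"{field(scope, name)} IN ({', '.join(literal(v) for v in values)})"

def all_of(*clauses: str) -> str:
    return " AND ".join(clauses)

print(compare("doc", "pageCount", ">", 10))
print(is_in("doc", "category", ["contract", "policy"]))
print(all_of(compare("doc", "status", "=", "Active"),
             compare("part", "clause_type", "=", "Liability")))
```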

WEIGHTED MULTI‑FIELD SEARCH

Exact filtering plus fuzzy ranking

EXACT FILTERING PLUS FUZZY RANKING

PART‑LEVEL SEARCH