Fuzzy matching
The tech preview of Fuzzy Metadata Search combines exact filtering with approximate matching. This approach is useful because metadata is often inconsistent: titles, categories, and keywords can all contain typos.
Fuzzy search operates in two main steps:
- Exact filtering: A metadata_filter is first applied to narrow results based on attributes like doc.status = 'Active'.
- Fuzzy matching: On the remaining documents, fuzzy matching handles common typos and missing characters automatically. These results are then ranked by a relevance score that you can tune using field weighting. This means you can give title a higher weight than category.
The final result is a ranked list that helps users find what they mean, even if they did not type the metadata value exactly.
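The two steps can be illustrated locally with a small sketch. This is not how Vectara computes relevance; it uses Python's difflib.SequenceMatcher as a stand-in similarity measure, and the documents, field names, and weights are made up for illustration.

```python
from difflib import SequenceMatcher

# Toy in-memory corpus; ids, fields, and values are hypothetical.
DOCS = [
    {"id": "d1", "status": "Active", "title": "Service Level Agreement", "category": "contract"},
    {"id": "d2", "status": "Archived", "title": "Service Level Agreement", "category": "contract"},
    {"id": "d3", "status": "Active", "title": "Software License", "category": "policy"},
]

def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] that tolerates typos and dropped characters."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def fuzzy_search(docs, status, queries):
    """queries is a list of (field, text, weight) tuples."""
    # Step 1: exact filtering, analogous to doc.status = 'Active'.
    candidates = [d for d in docs if d["status"] == status]
    # Step 2: weighted fuzzy score over the documents that survived the filter.
    total_weight = sum(weight for _, _, weight in queries)
    scored = []
    for d in candidates:
        score = sum(w * similarity(text, d[field]) for field, text, w in queries) / total_weight
        scored.append((d["id"], score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

results = fuzzy_search(
    DOCS,
    status="Active",
    queries=[("title", "Servce Levl Agrement", 2.0), ("category", "contract", 1.0)],
)
print(results)
```

The archived document never reaches the scoring step, and the misspelled title query still ranks the active agreement first because the title field carries twice the weight of the category field.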
Use document-level metadata when you want unique documents. Use part-level metadata when you need to surface matching sections within documents.
Because Fuzzy Metadata Search is a tech preview, it may introduce breaking changes.
Common uses
- Finding the correct "Service Level Agreement" even if you type "Servce Levl Agrement."
- Searching for "software license" returns both "software license" and "software licensing" documents.
- Searching for error-prone product IDs or SKUs, so that users can still retrieve a part by ID despite a missing digit.
Field weighting strategy
Adjust field weights to control search relevance:
- Higher weights (2.0-3.0): Critical fields like title or primary identifier
- Medium weights (1.0-1.5): Important supporting fields
- Lower weights (0.5-1.0): Additional context fields
Example weighting strategy
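As a minimal sketch of the tiers above, the snippet below maps fields to weights and turns them into per-field query entries. The field names (title, category, keywords) and the entry keys (field, query, weight) are assumptions for illustration; check the API reference for the exact request schema.

```python
import json

# Hypothetical fields and weights, one per tier described above.
FIELD_WEIGHTS = {
    "title": 3.0,     # critical field
    "category": 1.5,  # important supporting field
    "keywords": 0.5,  # additional context
}

def weighted_queries(text: str, weights: dict[str, float]) -> list[dict]:
    """Build one query entry per field, highest weight first."""
    ordered = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    # Entry keys are assumed, not confirmed by this page.
    return [{"field": field, "query": text, "weight": weight} for field, weight in ordered]

queries = weighted_queries("software license", FIELD_WEIGHTS)
print(json.dumps(queries, indent=2))
```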
Example request (document level)
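A document-level request body might be assembled as follows. This is a sketch under stated assumptions: the key names (level, queries, field, query, weight, limit) and the value "document" are not confirmed by this page, and the tech preview may change them.

```python
import json

def document_level_request(field: str, text: str, limit: int = 10) -> dict:
    """Return a request body that asks for one result per matching document."""
    # All key names here are assumed for illustration.
    return {
        "level": "document",
        "queries": [{"field": field, "query": text, "weight": 1.0}],
        "limit": limit,
    }

body = document_level_request("title", "Servce Levl Agrement")
print(json.dumps(body, indent=2))
```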
Example response
Filter syntax
metadata_filter uses Vectara’s metadata filter expression syntax. Prefix every field with its scope: doc. (document-level) or part. (part-level).
Supported operators
- Arithmetic: + - * / %
- Comparisons: < <= > >= = == != <>
- Null tests: IS NULL, IS NOT NULL
- Membership: IN (...)
- Logical: NOT, AND, OR
Examples
- doc.status = 'Active'
- doc.pageCount > 10
- doc.publish_date >= '2025-08-01'
- doc.category IN ('contract', 'policy')
- doc.status = 'Active' AND part.clause_type = 'Liability'
The filter language does not support SQL LIKE. Use fuzzy queries to handle approximate text.
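Filter expressions like the examples above can be assembled programmatically. The helpers below are hypothetical and only do string building; they enforce the doc. or part. scope prefix and refuse values that contain a single quote, because this page does not describe how quotes are escaped.

```python
def check_scope(field: str) -> str:
    """Require the doc. or part. prefix described above."""
    if not field.startswith(("doc.", "part.")):
        raise ValueError(f"field {field!r} must start with 'doc.' or 'part.'")
    return field

def quote(value: str) -> str:
    """Wrap a string literal in single quotes; reject embedded quotes rather than guess at escaping."""
    if "'" in value:
        raise ValueError("values containing single quotes are not handled by this sketch")
    return f"'{value}'"

def eq(field: str, value: str) -> str:
    return f"{check_scope(field)} = {quote(value)}"

def is_in(field: str, values: list[str]) -> str:
    joined = ", ".join(quote(v) for v in values)
    return f"{check_scope(field)} IN ({joined})"

def all_of(*clauses: str) -> str:
    return " AND ".join(clauses)

status_filter = eq("doc.status", "Active")
category_filter = is_in("doc.category", ["contract", "policy"])
combined = all_of(status_filter, eq("part.clause_type", "Liability"))
print(combined)
```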
Weighted multi‑field search
Exact filtering plus fuzzy ranking
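Combining an exact filter with a fuzzy query could look like the sketch below. The request keys (level, metadata_filter, queries, field, query, weight) are assumed, and the LIKE check is a crude whitespace-token guard added only to mirror the restriction that the filter language does not support SQL LIKE.

```python
import json

def build_request(metadata_filter: str, queries: list[dict], level: str = "document") -> dict:
    """Pair an exact metadata filter with fuzzy queries in one request body."""
    # Crude guard: flags LIKE as a standalone token, including inside quoted values.
    if "LIKE" in metadata_filter.upper().split():
        raise ValueError("LIKE is not supported; use a fuzzy query for approximate text")
    # Key names are assumed for illustration.
    return {"level": level, "metadata_filter": metadata_filter, "queries": queries}

body = build_request(
    "doc.status = 'Active' AND doc.category IN ('contract', 'policy')",
    [{"field": "title", "query": "Servce Levl Agrement", "weight": 2.0}],
)
print(json.dumps(body, indent=2))
```

The filter decides which documents are eligible; the fuzzy query only orders the documents that pass it.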
Part‑level search
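For part-level search, a request body might target part-scoped metadata so that matching sections are returned rather than whole documents. As before, this is a sketch: the key names, the value "part", and the section_title field are assumptions; only the part. prefix and the clause_type filter come from the examples on this page.

```python
import json

# Hypothetical part-level request; key names and field names are assumed.
part_request = {
    "level": "part",
    "metadata_filter": "doc.status = 'Active' AND part.clause_type = 'Liability'",
    "queries": [
        {"field": "section_title", "query": "Limtation of Liabilty", "weight": 2.0},
    ],
    "limit": 5,
}

print(json.dumps(part_request, indent=2))
```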