Version: 2.0

Limits and cutoffs

After reranking is applied, you can use the cutoff and limit parameters to control the final result set.

Search cutoffs

The cutoff property of the reranker specifies the minimum score that a search result must have to be included after reranking.

Setting this value lets you control which results are considered relevant enough to return, filtering out results that do not meet the desired level of relevance. For example, when you set the cutoff to 0.5, only results with a score of 0.5 or higher are considered.

This cutoff is applied per reranking stage. In a chain of rerankers, each reranker can have its own cutoff value, potentially further reducing the number of results at each stage. If both limit and cutoff are specified, the cutoff is applied first.
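
The ordering described above can be sketched in a few lines of Python. This is an illustrative model of a single reranking stage, not Vectara's implementation; the `Result` type, the `apply_stage` helper, and the sample scores are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Result:
    doc_id: str
    score: float


def apply_stage(results: List[Result],
                cutoff: Optional[float] = None,
                limit: Optional[int] = None) -> List[Result]:
    """Model one reranking stage: cutoff first, then limit."""
    # Rank by score, highest first.
    ranked = sorted(results, key=lambda r: r.score, reverse=True)
    # Cutoff: keep results whose score is at or above the threshold.
    if cutoff is not None:
        ranked = [r for r in ranked if r.score >= cutoff]
    # Limit: keep the top N of whatever survived the cutoff.
    if limit is not None:
        ranked = ranked[:limit]
    return ranked


results = [Result("a", 0.91), Result("b", 0.62), Result("c", 0.50), Result("d", 0.31)]
kept = apply_stage(results, cutoff=0.5, limit=2)
print([r.doc_id for r in kept])
```

With a cutoff of 0.5 and a limit of 2, result `d` is removed by the cutoff and result `c` is removed by the limit.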

caution

Search cutoffs are most effective when used with neural rerankers like the Vectara Multilingual reranker (Slingshot), which provides normalized scores between 0.0 and 1.0. If you use hybrid search methods that involve BM25, scores may be unbounded, making cutoff values less predictable.

Search limits

The limit property gives you finer control over the number of results returned. The limit is applied at each reranking stage, for example in chain reranking, and it caps the output of the reranker, not its input. Each stage returns its top N results, where N is the value specified by the limit.

Suppose you want to limit the output of a reranker, whether it runs alone or as one stage in a chain. For example, you want to process blog posts and ignore non-blog posts. You can set up a user-defined function (UDF) that keeps the score for blog posts and returns a null score for non-blog content:

if (get('$.document_metadata.category') == 'blog') get('$.score') else null

This would remove non-blog posts from the results. Then you can set a limit of 10 to get only the top 10 blog post results.
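
To make the effect of the null score concrete, the following Python sketch mimics the UDF above on local data. It only models the behavior described here (a null score removes the result, and the limit keeps the top N); the `blog_udf` and `rerank_with_udf` helpers and the sample documents are hypothetical.

```python
from typing import Dict, List, Optional


def blog_udf(result: Dict) -> Optional[float]:
    """Python analogue of the UDF: keep the score for blogs, else None."""
    if result["document_metadata"].get("category") == "blog":
        return result["score"]
    return None


def rerank_with_udf(results: List[Dict], limit: int) -> List[Dict]:
    rescored = []
    for r in results:
        new_score = blog_udf(r)
        if new_score is not None:  # a null score drops the result
            rescored.append({**r, "score": new_score})
    rescored.sort(key=lambda r: r["score"], reverse=True)
    return rescored[:limit]  # the limit caps this stage's output


results = [
    {"id": "post-1", "score": 0.8, "document_metadata": {"category": "blog"}},
    {"id": "faq-1", "score": 0.9, "document_metadata": {"category": "faq"}},
    {"id": "post-2", "score": 0.6, "document_metadata": {"category": "blog"}},
]
top = rerank_with_udf(results, limit=10)
print([r["id"] for r in top])
```

The FAQ entry is dropped even though it has the highest original score, because the UDF returns no score for it.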

Combine cutoffs and limits

Using both cutoffs and limits in a chain allows for more refined control over query results.

For example, a UDF reranker can filter out non-blog content and limit its output to 10 results. Those 10 results then go to the Vectara Multilingual reranker, which removes results with a score below 0.5 and returns the top 3 results from the remaining set.
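
A request body for such a chain might look like the sketch below, built as a Python dictionary and serialized to JSON. The field names (`type`, `rerankers`, `user_function`, `reranker_name`, `cutoff`, `limit`), the `Rerank_Multilingual_v1` reranker name, and the query text are assumptions made for illustration; check them against the Vectara API reference before use.

```python
import json

# Hypothetical request body: the field names below are assumptions, not
# confirmed by this page.
blog_udf = "if (get('$.document_metadata.category') == 'blog') get('$.score') else null"

request_body = {
    "query": "example query",
    "search": {
        "reranker": {
            "type": "chain",
            "rerankers": [
                # Stage 1: the UDF drops non-blog results, then keeps the top 10.
                {"type": "userfn", "user_function": blog_udf, "limit": 10},
                # Stage 2: the neural reranker applies the 0.5 cutoff, then keeps the top 3.
                {
                    "type": "customer_reranker",
                    "reranker_name": "Rerank_Multilingual_v1",
                    "cutoff": 0.5,
                    "limit": 3,
                },
            ],
        }
    },
}

print(json.dumps(request_body, indent=2))
```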

Improve summarization

You can also improve LLM summarization by using cutoffs and limits. For example, set a high cutoff to filter out low-scoring results so that only the remaining results are sent for summarization, which can improve the quality of the generated summary.

For example, a chain can use both Slingshot and a user-defined function to send only highly relevant and recent documents for summarization:

  1. The first stage in the chain filters out documents with scores lower than 0.75 and limits the results to 10.
  2. The next stage prioritizes documents based on their publish_ts value, which represents the publication timestamp.
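
The two stages can be modeled locally as follows. This Python sketch rests on assumptions: it treats `publish_ts` as a numeric timestamp stored in document metadata and assumes the second stage ranks newer documents first. The helper names and sample data are hypothetical.

```python
from typing import Dict, List


def relevance_stage(results: List[Dict], cutoff: float, limit: int) -> List[Dict]:
    """Stage 1: drop scores below the cutoff, then keep the top `limit`."""
    kept = [r for r in results if r["score"] >= cutoff]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept[:limit]


def recency_stage(results: List[Dict]) -> List[Dict]:
    """Stage 2: rescore by publish_ts so newer documents rank first."""
    rescored = [{**r, "score": r["document_metadata"]["publish_ts"]} for r in results]
    rescored.sort(key=lambda r: r["score"], reverse=True)
    return rescored


results = [
    {"id": "old-strong", "score": 0.95, "document_metadata": {"publish_ts": 1600000000}},
    {"id": "new-strong", "score": 0.80, "document_metadata": {"publish_ts": 1700000000}},
    {"id": "new-weak", "score": 0.40, "document_metadata": {"publish_ts": 1710000000}},
]
for_summary = recency_stage(relevance_stage(results, cutoff=0.75, limit=10))
print([r["id"] for r in for_summary])
```

The most recent document never reaches the second stage because its relevance score is below the 0.75 cutoff, so only relevant documents compete on recency.
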
tip

You can also enable reranking in the Vectara console by navigating to the Query tab of a corpus and selecting Retrieval. Use this to explore and experiment with the API.