Version: 2.0

User Defined Function reranker

Our out-of-the-box rerankers are effective for general use cases, but some use cases require fine-grained control over how search results are ordered, for example bubbling recently added documents to the top or limiting search results to a specific geolocation. This granular control plays a crucial role in Generative AI experiences: customizing how search results are ranked lets you influence which information Large Language Models (LLMs) prioritize. Boosting certain results to the top guides the LLM to weigh that information more heavily, biasing the generated response.

The User Defined Function reranker lets you score each result with a UserFn expression — a small, purpose-built expression language with if, arithmetic, math and time functions, and a get() accessor for document-level metadata, part-level metadata, or scores generated from request-level metadata. To use this reranker, set the type to userfn in a query and put your expression in the user_function field. You can also stack rerankers with the chain reranker when multiple dimensions of relevance matter.
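As a minimal sketch of that configuration, the reranker settings could be assembled in Python as shown here. Only the `type` and `user_function` fields come from the text above; where this object sits inside the full query request is not shown, and the expression is a trivial one that returns the incoming score unchanged.

```python
import json

# Reranker settings as described above: type "userfn" plus a UserFn expression.
# This expression is a no-op that returns the score the result already has.
reranker = {
    "type": "userfn",
    "user_function": "get('$.score')",
}

# Serialize to JSON, the format the query body would use.
payload = json.dumps(reranker, indent=2)
print(payload)
```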

With the flexibility to modify scores based on metadata, conditions, and custom logic, enterprises can craft highly tailored search experiences that meet specific business needs. This reranker enables a wide range of use cases:

  • Recency bias: Prioritize the most recent results in cases where answers based on older data are less relevant than newer data. Examples include news and current events searches, stock market queries, and recruitment searches.
  • Location bias: Prioritize results closer to the user's location. Examples include local business searches, real estate listings, and event queries.
  • E-commerce bias: Prioritize promotional and sponsored merchandise during sales promotions and new product launches.

How the reranker uses UserFn

For each result in the set, the reranker:

  1. Sets the reranker context to that result.
  2. Evaluates your user_function expression.
  3. Uses the returned number as the result's new score, or drops the result if the expression returned null.
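The three steps above can be modeled in a few lines of Python. This is a sketch of the described behavior, not the service's implementation: the result shape (a dict with a `score` key) is hypothetical, a plain Python callable stands in for the UserFn expression, `None` stands in for null, and the final descending sort is an assumption about how the new scores are used for ordering.

```python
from typing import Callable, Optional


def apply_userfn(
    results: list[dict],
    user_function: Callable[[dict], Optional[float]],
) -> list[dict]:
    """Rescore each result; drop any result whose function returns None."""
    rescored = []
    for result in results:
        # Steps 1-2: evaluate the function with this result as its context.
        new_score = user_function(result)
        # Step 3: None (null) drops the result, a number becomes its score.
        if new_score is None:
            continue
        rescored.append({**result, "score": new_score})
    # Assumption: results are ordered by new score, highest first.
    return sorted(rescored, key=lambda r: r["score"], reverse=True)


results = [
    {"id": "a", "score": 0.9},
    {"id": "b", "score": 0.2},
    {"id": "c", "score": 0.6},
]
# Hypothetical scoring rule: drop weak results, double the rest.
ranked = apply_userfn(
    results, lambda r: None if r["score"] < 0.5 else r["score"] * 2
)
print([r["id"] for r in ranked])
```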

The full syntax — types, operators, if, get(), time and math functions — lives in the UserFn language reference. This page focuses on what's specific to the reranker: the per-result context, null-as-drop, and worked examples.

Reranker context

When the expression runs, get() reads from a single search result. The available paths are:

PER-RESULT CONTEXT SCHEMA

Code example with json syntax.

$.score is the score that Vectara has calculated up to this point in the retrieval chain — if userfn is the first reranker, it's the retrieval score; if it follows another reranker, it's whatever that reranker emitted.

Reading the result

GET() EXAMPLES AGAINST THE RERANKER CONTEXT

Code example with sql syntax.

See the get() reference for the full behavior — scalars only, null for missing paths, and the optional default-value form.
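To illustrate those three properties, here is a small Python model of a `get()`-style accessor over a nested dict. It is a sketch under assumptions: the context field names (`document_metadata`, `part_metadata`) are hypothetical, and returning the default for a non-scalar value is a guess at what "scalars only" means in practice. The language reference remains the authority.

```python
from typing import Any


def get(context: dict, path: str, default: Any = None) -> Any:
    """Resolve a '$.a.b' style path; return `default` when it cannot be read."""
    node: Any = context
    for key in path.removeprefix("$.").split("."):
        # Missing path: fall back to the default (None models null).
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    # Assumption: objects and arrays are not readable, only scalars are.
    if isinstance(node, (dict, list)):
        return default
    return node


# Hypothetical per-result context.
context = {
    "score": 0.73,
    "document_metadata": {"lang": "fra"},
    "part_metadata": {"section": 2},
}

print(get(context, "$.score"))
print(get(context, "$.document_metadata.price"))
print(get(context, "$.document_metadata.price", 0))
```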

Null score handling (drop a result)

UserFn's null is a first-class value. The reranker treats null specially: a result whose expression returns null is dropped from the set entirely, before limits are applied and before the next reranker in a chain runs.

note

Returning null to drop a result is a reranker-specific behavior. Step transitions and pipeline verification require boolean and treat null differently — see the language reference.
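The order of operations matters here: because dropped results leave the set before the limit is applied, the limit is filled from the survivors. The Python model below sketches this with a hypothetical result shape and a plain callable in place of the UserFn expression. The `type` field and the stage function are illustrative names, not part of the API.

```python
def rerank_stage(results, user_function, limit=None):
    """Rescore, drop None results, sort descending, and only then truncate."""
    kept = []
    for result in results:
        new_score = user_function(result)
        if new_score is not None:  # None models a null return value
            kept.append({**result, "score": new_score})
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept if limit is None else kept[:limit]


results = [
    {"id": "a", "score": 0.9, "type": "faq"},
    {"id": "b", "score": 0.8, "type": "blog"},
    {"id": "c", "score": 0.7, "type": "blog"},
    {"id": "d", "score": 0.6, "type": "blog"},
]


def keep_blogs(result):
    # Hypothetical metadata filter: anything that is not a blog is dropped.
    return result["score"] if result["type"] == "blog" else None


top_two = rerank_stage(results, keep_blogs, limit=2)
print([r["id"] for r in top_two])
```

If the limit were applied first, the top two would be `a` and `b`, and only `b` would survive the filter. Dropping first leaves two blog results.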

Filter results below a score threshold:

DROP LOW-SCORE RESULTS

Code example with sql syntax.

Filter by metadata:

KEEP ONLY BLOG RESULTS

Code example with sql syntax.

Combining with a chain

In this example, the UDF filters out results with scores below 0.5 and limits the output to 100 results. The MMR reranker then processes the survivors by applying a diversity bias and further limits the output to 50.
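The flow described above can be modeled in Python as a list of stages, each consuming the previous stage's output. This is only a sketch: the stage functions are hypothetical, a callable stands in for the UserFn expression, and the second stage is a plain truncation standing in for MMR (a real MMR stage would also rescore for diversity).

```python
def userfn_stage(user_function, limit):
    """Build a stage that rescores, drops None, sorts, then truncates."""
    def run(results):
        kept = []
        for result in results:
            new_score = user_function(result)
            if new_score is not None:
                kept.append({**result, "score": new_score})
        kept.sort(key=lambda r: r["score"], reverse=True)
        return kept[:limit]
    return run


def truncate_stage(limit):
    """Stand-in for the MMR stage: keeps order and applies only the limit."""
    def run(results):
        return results[:limit]
    return run


def run_chain(results, stages):
    """Feed each stage the output of the stage before it."""
    for stage in stages:
        results = stage(results)
    return results


# 300 synthetic results with scores from 0.0 up to just under 1.0.
results = [{"id": i, "score": i / 300} for i in range(300)]


def drop_below_half(result):
    return None if result["score"] < 0.5 else result["score"]


after_filter = userfn_stage(drop_below_half, limit=100)(results)
final = run_chain(
    results,
    [userfn_stage(drop_below_half, limit=100), truncate_stage(limit=50)],
)
print(len(after_filter), len(final))
```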

UDF FILTER, THEN MMR DIVERSITY

Code example with json syntax.

Example document with nuanced metadata

The examples below score over this product document, which has metadata for customer_review_stars, units_in_stock, promoted, and similar e-commerce fields.

EXAMPLE PRODUCT DOCUMENT

Code example with json syntax.

Worked examples

Combine recency, popularity, and promotion

COMBINED-SIGNAL UDF

Code example with json syntax.

This expression layers three signals on top of the base score:

  • Recency: log10(publish_ts) adds a slow-growing recency boost so newer content rises in queries about latest gadgets or new electronics.
  • Popularity: log(2, customer_review_stars) (1–5 stars, log-scaled) adds a gentler boost for well-reviewed items.
  • Promoted content: adds the boolean promoted value, surfacing paid or sponsored items.

Some examples include get('$.score') more than once — multiplying it amplifies the base score. Use smaller multipliers (1.3) for subtle nudges and larger ones (1.5+) when you want a signal to dominate.
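To get a feel for how much each signal contributes, the Python sketch below reproduces the arithmetic of the three terms. It assumes the signals are added to the base score and that `publish_ts` is a Unix timestamp in seconds. Neither detail is confirmed by the text above, and the function name is illustrative.

```python
import math


def combined_score(score, publish_ts, customer_review_stars, promoted):
    """Base score plus recency, popularity, and promotion terms."""
    recency = math.log10(publish_ts)                 # log10(publish_ts)
    popularity = math.log2(customer_review_stars)    # log(2, stars), stars >= 1
    promotion = 1.0 if promoted else 0.0             # boolean treated as 0 or 1
    return score + recency + popularity + promotion


ONE_YEAR = 365 * 24 * 3600  # seconds
newer_ts = 1_700_000_000    # assumed epoch seconds
older_ts = newer_ts - ONE_YEAR

promoted_item = combined_score(0.8, newer_ts, 4, True)
plain_item = combined_score(0.8, newer_ts, 4, False)
recency_gap = math.log10(newer_ts) - math.log10(older_ts)

print(promoted_item - plain_item)
print(recency_gap)
```

Under these assumptions, a one-year age difference moves the recency term by less than 0.01, while promotion adds a full point and four stars add two. It is worth checking the relative size of each term against your own data before relying on a combined expression.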

Sort on metadata

If you only want to sort on price and ignore the relevance score, return the price as the new score. Items with no price fall to the bottom via the default value:

SORT BY PRICE

Code example with sql syntax.

Push out-of-stock items to the bottom

OUT-OF-STOCK PENALTY

Code example with sql syntax.

Boosting scores

Boosting raises or lowers the influence of a signal with a multiplier. How aggressive to be depends on your use case:

  • 1.3–1.4: subtle nudge.
  • 1.5: strong preference, e.g. premium content.
  • 1.6+: items you almost always want at the top.

Boost from metadata

PER-DOCUMENT BOOST FIELD

Code example with sql syntax.

Boost on customer rating

ADD A NORMALIZED REVIEW BOOST

Code example with sql syntax.

Dividing by 10 keeps a 0–10 review score from dominating the base relevance. For a 0–5 scale, divide by 5.
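The effect of the divisor is easy to check numerically. The Python sketch below assumes the normalized review is added to the base score; the function and parameter names are illustrative.

```python
def with_review_boost(score, review, scale=10.0):
    """Add a review signal normalized to the 0-1 range by its scale."""
    return score + review / scale


# On a 0-10 scale the boost never exceeds 1.0.
max_boost_ten = with_review_boost(0.0, 10)
# On a 0-5 scale, dividing by 5 gives the same ceiling.
max_boost_five = with_review_boost(0.0, 5, scale=5.0)
# Without normalization, the raw review would swamp a 0.7 base score.
unnormalized = with_review_boost(0.7, 8, scale=1.0)

print(max_boost_ten, max_boost_five, unnormalized)
```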

Boost by content type

50% BOOST FOR TECH SPECS

Code example with sql syntax.

Boost by language

60% BOOST FOR FRENCH CONTENT

Code example with sql syntax.

Lower the multiplier if French shouldn't dominate; raise it past 1.7 if you want French content nearly always on top.
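A quick way to choose a multiplier is to test it against a pair of scores you care about. The Python sketch below uses hypothetical scores and a hypothetical `lang` field to show when a boosted French result overtakes a stronger non-French one.

```python
def boost_if(result, field, value, multiplier):
    """Multiply the score when a metadata field matches, else leave it alone."""
    if result.get(field) == value:
        return result["score"] * multiplier
    return result["score"]


french_doc = {"score": 0.6, "lang": "fra"}   # hypothetical metadata field
english_doc = {"score": 0.8, "lang": "eng"}

# A 1.2 multiplier lifts the French result but leaves it below 0.8.
mild = boost_if(french_doc, "lang", "fra", 1.2)
# A 1.6 multiplier pushes it past the stronger English result.
strong = boost_if(french_doc, "lang", "fra", 1.6)
# Non-matching results keep their original score.
untouched = boost_if(english_doc, "lang", "fra", 1.6)

print(mild, strong, untouched)
```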

Surface low-rated documents

Sometimes you want to investigate problem areas — boost low-rated support feedback so it floats to the top instead of being buried:

BOOST NEGATIVE REVIEWS

Code example with sql syntax.

The default 5 in get(..., 5) means documents missing the field aren't accidentally boosted.
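The role of the default can be seen in a short Python model. The `rating` field name, the threshold of 2, and the 1.5 multiplier are illustrative assumptions. The point is only that a missing rating is read as 5 and therefore never qualifies for the boost.

```python
def surface_low_rated(result, threshold=2, multiplier=1.5):
    """Boost results whose rating is at or below the threshold."""
    # A missing rating defaults to 5, so it is treated as a good review.
    rating = result.get("rating", 5)
    if rating <= threshold:
        return result["score"] * multiplier
    return result["score"]


low_rated = surface_low_rated({"score": 0.4, "rating": 1})
well_rated = surface_low_rated({"score": 0.4, "rating": 5})
no_rating = surface_low_rated({"score": 0.4})

print(low_rated, well_rated, no_rating)
```

Had the default been 0 instead, every document without a rating would have been boosted alongside the genuinely low-rated ones.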
