List generation presets
GET /v2/generation_presets
Organizations often struggle to fine-tune query responses and maintain consistency across different use cases. Vectara creates and maintains predefined generation presets, which provide a flexible and powerful way to apply generation parameters. Each preset includes a complete Velocity template for the prompt along with other generation parameters, and is typically associated with a single LLM.
The List Generation Presets API lets you view the generation presets available for query requests. A generation preset groups the properties that configure generation for a request, giving you fine-tuned control over query responses.
This includes the prompt_template, the Large Language Model (LLM), and other generation settings such as max_tokens and temperature. Users specify a generation preset in their query or chat requests using the generation_preset_name field.
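As a minimal sketch, a query request body that selects a preset by name could be built as follows. The query text, search limit, and preset name here are illustrative assumptions, not values prescribed by this document:

```python
import json

# Illustrative query request body; generation_preset_name must match the
# "name" of a preset returned by GET /v2/generation_presets.
query_request = {
    "query": "How do I configure generation presets?",  # placeholder query
    "search": {
        "limit": 10,  # placeholder search parameter
    },
    "generation": {
        # Selects the generation preset to use for this request.
        "generation_preset_name": "vectara-summary-ext-24-05-med-omni",
    },
}

print(json.dumps(query_request, indent=2))
```

Because the preset bundles the prompt template, LLM, and sampling settings, the request itself stays short; switching presets only changes this one field.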
Generation presets object
The generation_presets object contains the name, description, llm_name, prompt_template, and other fields that make up the preset.
If your account has access to a preset, enabled is set to true. A preset can also be marked as the default.
Example list generation presets response
{
"generation_presets": [
{
"name": "vectara-summary-ext-24-05-med-omni",
"description": "Generate summary with controllable citations. Uses GPT-4o with 2,048 max tokens",
"llm_name": "gpt-4o",
"prompt_template": "[\n {\"role\": \"system\", \"content\": \"Follow these detailed step-by-step.",
"max_used_search_results": 25,
"max_tokens": 2048,
"temperature": 0,
"frequency_penalty": 0,
"presence_penalty": 0,
"enabled": true,
"default": false
}
// More presets appear here
]
}
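A client can decode a response like the example above and filter on the enabled field to find the presets available to the account. This sketch works on a trimmed copy of the example response rather than calling the API:

```python
import json

# Trimmed copy of the List Generation Presets response shown above.
response_body = """
{
  "generation_presets": [
    {
      "name": "vectara-summary-ext-24-05-med-omni",
      "llm_name": "gpt-4o",
      "max_tokens": 2048,
      "temperature": 0,
      "enabled": true,
      "default": false
    }
  ]
}
"""

presets = json.loads(response_body)["generation_presets"]

# Keep only the presets this account can use in query requests.
enabled_names = [p["name"] for p in presets if p["enabled"]]
print(enabled_names)  # -> ['vectara-summary-ext-24-05-med-omni']
```

Any name in `enabled_names` is valid for the generation_preset_name field of a query or chat request.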
Request
Responses
- 200: List of generation presets.
- 403: Permissions do not allow listing generation presets.