Version: 2.0

Generation Presets

The Vectara SDK's Generation Presets module lets enterprise developers enhance Retrieval Augmented Generation (RAG) for queries and chats using preconfigured LLM settings. Presets such as Mockingbird 2.0 and vectara-summary-ext-24-05-med-omni bundle these settings so you can apply them by name. This guide shows how to apply presets to business needs such as personalized customer support and data-driven insights. You will learn:

How to apply presets to tailor query and chat responses
Techniques for customizing presets with filters and parameters
Strategies for handling async generation with error management

For more details on generation presets, see the Generation Presets documentation.

Prerequisites

This guide assumes you have a corpus called my-docs. If you haven't created a corpus yet, follow the Quick Start guide to set up your first corpus.

Using Generation Presets

Available presets include:

mockingbird-2.0: High accuracy and factuality for enterprise summaries.
vectara-summary-ext-24-05-med-omni: GPT-4o-based preset for advanced summarization and conversational responses.

Example 1: Financial summary with Mockingbird 2.0

Generate a tailored financial summary using Mockingbird 2.0 with precise metadata filtering.

generation_preset_name="mockingbird-2.0"
- High accuracy, up-to-date factuality, and balanced tone in enterprise summaries.
corpora=[{"corpus_key": "finance_docs"}]
- Limits search to your company’s trusted finance corpus.
metadata_filter="doc_region = 'EU' AND doc_quarter = 'Q1-2024' AND doc_industry = 'banking'"
- Ensures only documents relevant to European banks in Q1 2024 are used—critical for regulatory and reporting accuracy.
max_used_search_results=10
- Maximizes context coverage but stays concise for summarization.
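
The parameters above can be gathered into a single request. This is a minimal sketch, assuming the request is expressed as a plain dictionary; the helper name `build_financial_summary_request`, the query text, and the nesting under `search` and `generation` keys are illustrative assumptions rather than confirmed SDK signatures.

```python
from typing import Any, Dict


def build_financial_summary_request(query: str) -> Dict[str, Any]:
    """Assemble the Example 1 settings into one request dictionary.

    The nesting under "search" and "generation" is an assumption made for
    illustration; check the SDK reference for the exact argument layout.
    """
    return {
        "query": query,
        "search": {
            # Limit retrieval to the trusted finance corpus.
            "corpora": [{"corpus_key": "finance_docs"}],
            # Only EU banking documents from Q1 2024.
            "metadata_filter": (
                "doc_region = 'EU' AND doc_quarter = 'Q1-2024' "
                "AND doc_industry = 'banking'"
            ),
        },
        "generation": {
            "generation_preset_name": "mockingbird-2.0",
            "max_used_search_results": 10,
        },
    }


request = build_financial_summary_request(
    "Summarize Q1 2024 results for European banks."  # hypothetical query text
)
```

Keeping the settings in one dictionary makes them easy to log and review before the request is sent through the query call.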

Example 2: Support chat with GPT-4o

Deliver concise support responses with the vectara-summary-ext-24-05-med-omni preset. It is based on GPT-4o, which suits responsive, conversational support and broad knowledge coverage.

generation_preset_name="vectara-summary-ext-24-05-med-omni"
- Advanced summarization, up-to-date knowledge, and conversational tone.
corpora=[{"corpus_key": "support_kb"}]
- Focuses the model only on vetted support knowledge base articles.
metadata_filter="doc_platform = 'mobile' AND doc_issue_type = 'auth_failure'"
- Surfaces only mobile-specific authentication problems—preventing noise from irrelevant issues.
max_response_characters=500
- Enforces concise, agent-ready responses (ideal for chatbots or customer-facing UIs).
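
A support chat turn with these settings might look like the sketch below. The method name `client.chats.create` comes from the Tips in this guide, but the keyword layout passed to it, the helper names, and the sample message are assumptions for illustration only.

```python
from typing import Any, Dict


def build_support_chat_request(message: str) -> Dict[str, Any]:
    """Assemble the Example 2 settings; the nesting is an assumed layout."""
    return {
        "query": message,
        "search": {
            # Restrict retrieval to vetted support knowledge base articles.
            "corpora": [{"corpus_key": "support_kb"}],
            "metadata_filter": (
                "doc_platform = 'mobile' AND doc_issue_type = 'auth_failure'"
            ),
        },
        "generation": {
            "generation_preset_name": "vectara-summary-ext-24-05-med-omni",
            # Cap the answer length for chat-style UIs.
            "max_response_characters": 500,
        },
    }


def send_support_chat(client: Any, message: str) -> Any:
    """Send one chat turn.

    Assumes client.chats.create accepts these keyword arguments; verify
    against the SDK reference before relying on this shape.
    """
    return client.chats.create(**build_support_chat_request(message))
```

Separating the request builder from the send step lets you unit-test the filter and preset settings without making a network call.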

Error Handling

400 Bad Request: Invalid parameters (for example, max_tokens < 0).
- Resolution: Validate all parameters against their constraints.
403 Forbidden: Insufficient permissions.
- Resolution: Use a Query or Index API Key with appropriate access.
408 Request Timeout: Request exceeds timeout limit.
- Resolution: Increase request_timeout or optimize the query.
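
The status codes above can be paired with a pre-flight check so that obviously invalid parameters never reach the API. This is a rough sketch: the helper names are hypothetical, and only the `max_tokens < 0` rule is taken from this guide.

```python
from typing import Any, Dict, List

# Resolutions copied from the list above, keyed by HTTP status code.
RESOLUTIONS: Dict[int, str] = {
    400: "Validate all parameters against their constraints.",
    403: "Use a Query or Index API Key with appropriate access.",
    408: "Increase request_timeout or optimize the query.",
}


def find_invalid_parameters(model_parameters: Dict[str, Any]) -> List[str]:
    """Return names of parameters that would likely trigger a 400 response.

    Only the max_tokens < 0 case comes from the guide; extend as needed.
    """
    problems: List[str] = []
    max_tokens = model_parameters.get("max_tokens")
    if max_tokens is not None and max_tokens < 0:
        problems.append("max_tokens")
    return problems


def suggest_resolution(status_code: int) -> str:
    """Map a status code to the documented resolution, with a fallback."""
    return RESOLUTIONS.get(
        status_code, "Unhandled status; inspect the response body."
    )
```

Calling `find_invalid_parameters` before sending a request, and `suggest_resolution` when a request fails, keeps the handling logic in one place.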

Tips

Use generation_preset_name in client.corpora.query or client.chats.create to apply presets like mockingbird-2.0 or vectara-summary-ext-24-05-med-omni.
Customize with model_parameters to override preset settings (e.g., temperature, max_tokens).
Explore the Vectara Prompt Engine for prompt template details.
Contact feedback@vectara.com or use the Vectara Console to create new presets.
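
To illustrate the model_parameters tip, the sketch below merges overrides into a set of generation settings without mutating the original. The helper `with_model_parameters` is hypothetical, the values 0.2 and 512 are arbitrary examples, and placing `model_parameters` alongside `generation_preset_name` is an assumption about the request layout.

```python
from typing import Any, Dict


def with_model_parameters(
    generation: Dict[str, Any], **overrides: Any
) -> Dict[str, Any]:
    """Return a copy of the generation settings with overrides merged in."""
    merged = dict(generation)
    merged["model_parameters"] = {
        **generation.get("model_parameters", {}),
        **overrides,  # later values win over any existing ones
    }
    return merged


generation = {"generation_preset_name": "mockingbird-2.0"}
# Illustrative values only; pick settings that suit your workload.
tuned = with_model_parameters(generation, temperature=0.2, max_tokens=512)
```

Returning a copy leaves the base preset settings reusable across requests that need different overrides.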
