Bring Your Own LLM
Organizations often need to integrate multiple Large Language Models (LLMs) from different providers to optimize cost, performance, or compliance. Vectara's Bring Your Own LLM (BYO-LLM) capability integrates third-party LLMs into Vectara's AI stack, supporting OpenAI-compatible models, the OpenAI Responses API for reasoning models, and Google Cloud Vertex AI.
By configuring LLMs with the Create LLM API, you control which model Vectara uses to generate summaries, answers, and content. You use your preferred LLM infrastructure while remaining fully compatible with Vectara's RAG workflows.
For example, models like GPT-5, Claude Sonnet, and Claude Opus excel at generating code and technical content as part of text responses. In your applications, you could use these models to generate code within Vectara responses, while calling multimodal models' image generation capabilities through separate API calls.
Define a custom LLM configuration
The integration relies on defining a custom LLM configuration with the Create LLM endpoint. Vectara supports three LLM types:
Supported LLM Types
Type | Description | Use For |
---|---|---|
openai-compatible | OpenAI-style APIs | OpenAI, Anthropic Claude, Azure OpenAI |
openai-responses | OpenAI Responses API | Reasoning models (o1, o3) |
vertex-ai | Google Cloud Vertex AI | Gemini models |
After you set the type, continue with the remaining configuration fields:
Configuration Fields
Field | Description |
---|---|
type | One of the following: openai-compatible, openai-responses, or vertex-ai |
name | User-defined label for the LLM (referenced in queries) |
description | (Optional) Metadata or notes about the model |
model | Specific model to use (for example, gpt-4, claude-3.5-sonnet, or gemini-2.5-flash) |
uri | The API endpoint URL |
auth | Authentication configuration (varies by type) |
headers | (Optional) Additional HTTP headers for the API |
test_model_parameters | (Optional) Test parameters to validate the configuration |
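The fields above can be checked on the client before calling the Create LLM endpoint. The following is a minimal sketch, not part of Vectara's SDK: `build_llm_config` is a hypothetical helper, and treating every field not marked "(Optional)" as required is an assumption drawn from the table. The `auth` shape in the usage lines is also an assumption.

```python
from typing import Any

# The three type values listed in the Supported LLM Types table.
SUPPORTED_TYPES = {"openai-compatible", "openai-responses", "vertex-ai"}

# Assumption: fields the table does not mark "(Optional)" are required.
REQUIRED_FIELDS = ("type", "name", "model", "uri", "auth")


def build_llm_config(**fields: Any) -> dict[str, Any]:
    """Hypothetical helper: run basic checks and return the config dict."""
    missing = [f for f in REQUIRED_FIELDS if not fields.get(f)]
    if missing:
        raise ValueError(f"missing required fields: {', '.join(missing)}")
    if fields["type"] not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported type: {fields['type']!r}")
    # Drop optional fields left as None so they are not sent as null.
    return {k: v for k, v in fields.items() if v is not None}


config = build_llm_config(
    type="openai-compatible",
    name="my-gpt-4",
    model="gpt-4",
    uri="https://api.openai.com/v1/chat/completions",
    auth={"type": "bearer", "token": "YOUR_API_KEY"},  # assumed auth shape
    description=None,  # optional, omitted from the result
)
print(sorted(config))
```

Validating locally only catches obvious mistakes, such as a misspelled type. The API remains the authority on what it accepts.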
Add custom LLM examples
Here are some examples for Anthropic, OpenAI, and Google LLMs.
Add Anthropic Claude 3.7 Sonnet
Request Body
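As an illustration, a request body for Claude 3.7 Sonnet might be assembled as follows. This is a sketch under stated assumptions: the field names come from the Configuration Fields table, but the model ID, the Anthropic endpoint URI, the `bearer` auth shape, and the `max_tokens` test parameter are assumptions to verify against the Anthropic and Vectara API references.

```python
import json

# Assumed values: model ID, URI, auth shape, and test parameter are
# illustrative and not confirmed by this page.
claude_llm = {
    "type": "openai-compatible",
    "name": "claude-3-7-sonnet",
    "description": "Anthropic Claude 3.7 Sonnet via an OpenAI-style API",
    "model": "claude-3-7-sonnet-20250219",
    "uri": "https://api.anthropic.com/v1/chat/completions",
    "auth": {"type": "bearer", "token": "YOUR_ANTHROPIC_API_KEY"},
    "test_model_parameters": {"max_tokens": 256},
}

# Serialize to the JSON text that would be sent as the request body.
request_body = json.dumps(claude_llm, indent=2)
print(request_body)
```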
cURL Example
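The same request can be sketched in Python with only the standard library. The endpoint URL `https://api.vectara.io/v2/llms` and the `x-api-key` header are assumptions here, and `make_create_llm_request` is a hypothetical helper. The function builds the request without sending it, so it can be inspected first.

```python
import json
import urllib.request

# Assumed endpoint for the Create LLM API; confirm in the API reference.
VECTARA_LLMS_URL = "https://api.vectara.io/v2/llms"


def make_create_llm_request(api_key: str, config: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the LLM config."""
    return urllib.request.Request(
        VECTARA_LLMS_URL,
        data=json.dumps(config).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "x-api-key": api_key,  # assumed authentication header
        },
        method="POST",
    )


req = make_create_llm_request(
    "YOUR_VECTARA_API_KEY",
    {
        "type": "openai-compatible",
        "name": "claude-3-7-sonnet",
        "model": "claude-3-7-sonnet-20250219",  # assumed model ID
        "uri": "https://api.anthropic.com/v1/chat/completions",  # assumed URI
        "auth": {"type": "bearer", "token": "YOUR_ANTHROPIC_API_KEY"},
    },
)
print(req.get_method(), req.full_url)

# To send the request, uncomment:
# with urllib.request.urlopen(req) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```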
Successful Response
Add OpenAI GPT-4o
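A GPT-4o configuration might look like the sketch below. The field names follow the Configuration Fields table. The `bearer` auth shape is an assumption, and `redact` is a hypothetical helper that keeps the token out of logs.

```python
import json

# Assumed values: the auth shape is illustrative; the URI is OpenAI's
# chat completions endpoint.
gpt4o_llm = {
    "type": "openai-compatible",
    "name": "gpt-4o",
    "description": "OpenAI GPT-4o",
    "model": "gpt-4o",
    "uri": "https://api.openai.com/v1/chat/completions",
    "auth": {"type": "bearer", "token": "YOUR_OPENAI_API_KEY"},
}


def redact(config: dict) -> dict:
    """Return a copy of the config with the auth token masked."""
    return {**config, "auth": {**config["auth"], "token": "***"}}


# Print the masked copy; the original dict is left unchanged.
print(json.dumps(redact(gpt4o_llm), indent=2))
```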
Add Google Gemini (Vertex AI)
Using API Key Authentication
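A Vertex AI configuration with API key authentication could be sketched as follows. The `auth` structure (`"type": "api_key"` with an `api_key` field) is an assumption, and the URI is left as a placeholder because this page does not give its expected form.

```python
import json

# Assumed auth shape; the URI is a placeholder, not a real endpoint.
gemini_api_key_llm = {
    "type": "vertex-ai",
    "name": "gemini-2.5-flash",
    "description": "Google Gemini 2.5 Flash on Vertex AI",
    "model": "gemini-2.5-flash",
    "uri": "YOUR_VERTEX_AI_ENDPOINT_URI",
    "auth": {"type": "api_key", "api_key": "YOUR_GOOGLE_API_KEY"},
}

print(json.dumps(gemini_api_key_llm, indent=2))
```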
Using Service Account Authentication
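With a service account, the credential is typically the JSON key downloaded from Google Cloud. The sketch below assumes the key is passed as a JSON string in a hypothetical `key_json` field under `"type": "service_account"`. Both names, and the `vertex_service_account_config` helper, are assumptions to check against the Create LLM API reference.

```python
import json


def vertex_service_account_config(key: dict, uri: str) -> dict:
    """Hypothetical helper: embed a service account key in the LLM config."""
    return {
        "type": "vertex-ai",
        "name": "gemini-2.5-flash-sa",
        "model": "gemini-2.5-flash",
        "uri": uri,
        # Assumed auth shape: the key file contents as a JSON string.
        "auth": {"type": "service_account", "key_json": json.dumps(key)},
    }


# Placeholder standing in for a downloaded key file.
# In practice: key = json.load(open("service-account.json"))
placeholder_key = {
    "type": "service_account",
    "project_id": "YOUR_PROJECT_ID",
    "client_email": "YOUR_SERVICE_ACCOUNT_EMAIL",
    "private_key": "YOUR_PRIVATE_KEY",
}

sa_config = vertex_service_account_config(
    placeholder_key, "YOUR_VERTEX_AI_ENDPOINT_URI"
)
print(sa_config["auth"]["type"])
```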