Version: 2.0

Bring your own LLM

Organizations often need to integrate multiple large language models (LLMs) from different providers to optimize cost, performance, or compliance. Vectara's Bring Your Own LLM (BYO-LLM) capability integrates third-party LLMs into Vectara's AI stack. It supports OpenAI-compatible models, the OpenAI Responses API for reasoning models, and Google Cloud Vertex AI.

Configuring an LLM with the Create LLM API lets you choose which model Vectara uses to generate summaries, answers, and content, so you can use your preferred LLM infrastructure while remaining fully compatible with Vectara's RAG workflows.

For example, models like GPT-5, Claude Sonnet, and Claude Opus excel at generating code and technical content as part of a text response. In your application, you could use these models to generate code within Vectara responses, and call a multimodal model's image generation capabilities separately through its own API.

Define a custom LLM configuration

To integrate a model, define a custom LLM configuration with the Create LLM endpoint. Vectara supports three LLM types:

Supported LLM Types

Type	Description	Use For
`openai-compatible`	OpenAI-style APIs	OpenAI, Anthropic Claude, Azure OpenAI
`openai-responses`	OpenAI Responses API	Reasoning models (o1, o3)
`vertex-ai`	Google Cloud Vertex AI	Gemini models

After you choose the type, set the remaining configuration fields:

Configuration Fields

Field	Description
`type`	One of the following: `openai-compatible`, `openai-responses`, or `vertex-ai`
`name`	User-defined label for the LLM (referenced in queries)
`description`	(Optional) Metadata or notes about the model
`model`	Specific model version, such as `gpt-4`, `claude-3.5-sonnet`, or `gemini-2.5-flash`
`uri`	The API endpoint URL
`auth`	Authentication configuration (varies by type)
`headers`	(Optional) Additional HTTP headers for the API
`test_model_parameters`	(Optional) Test parameters to validate the configuration
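
The fields above can be assembled into a request body before it is sent to the Create LLM endpoint. The following is a minimal sketch, not Vectara's official client: the helper name `build_llm_config`, the placeholder `uri`, and the shape of the `auth` object are assumptions made for illustration, so check the Create LLM API reference for the exact schema of each type.

```python
import json

# The three type values listed in the Supported LLM Types table.
SUPPORTED_TYPES = {"openai-compatible", "openai-responses", "vertex-ai"}


def build_llm_config(llm_type, name, model, uri, auth,
                     description=None, headers=None,
                     test_model_parameters=None):
    """Assemble a request body from the documented configuration fields.

    Hypothetical helper: it only validates `type` and drops unset
    optional fields; it does not call any API.
    """
    if llm_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported LLM type: {llm_type!r}")

    config = {
        "type": llm_type,
        "name": name,
        "model": model,
        "uri": uri,
        "auth": auth,
    }
    optional = {
        "description": description,
        "headers": headers,
        "test_model_parameters": test_model_parameters,
    }
    # Include optional fields only when the caller supplied them.
    config.update({k: v for k, v in optional.items() if v is not None})
    return config


config = build_llm_config(
    llm_type="openai-compatible",
    name="my-claude",
    model="claude-3.5-sonnet",
    # Placeholder URL: replace with your provider's endpoint.
    uri="https://example.invalid/v1/chat/completions",
    # Assumed auth shape: the real structure varies by type.
    auth={"type": "bearer", "token": "YOUR_PROVIDER_API_KEY"},
    description="Claude through an OpenAI-compatible endpoint",
)

print(json.dumps(config, indent=2))
```

Validating `type` locally catches a mistyped value before any request is made, and omitting unset optional fields keeps the body limited to what you actually configured.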

Add custom LLM examples

Here are some examples for Anthropic, OpenAI, and Google LLMs.