Version: 2.0

Create an LLM

POST /v2/llms

Supported API Key Types: Index Service, Personal

Integrate external Large Language Models (LLMs) into Vectara for Retrieval Augmented Generation (RAG) and chat. Connect OpenAI API-compatible models from providers like Anthropic, Azure, Google, or custom-hosted endpoints. Once created, reference your custom LLM by name in query generation parameters.

  • Connect external LLMs using OpenAI-compatible API format
  • Configure multiple LLM providers for different use cases
  • Override Vectara's built-in LLMs with your own models
  • Use custom models for RAG, chat, and document summarization
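
As a rough illustration of the call itself, the sketch below builds (but does not send) a `POST /v2/llms` request using only the Python standard library. The base URL `https://api.vectara.io` and the `x-api-key` header name are assumptions not stated in this section, and `build_create_llm_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumptions (not stated in this section): the API base URL and the
# name of the header that carries the API key.
VECTARA_BASE_URL = "https://api.vectara.io"
API_KEY_HEADER = "x-api-key"


def build_create_llm_request(api_key: str, llm_config: dict) -> urllib.request.Request:
    """Build, without sending, a POST /v2/llms request for one LLM config."""
    body = json.dumps(llm_config).encode("utf-8")
    return urllib.request.Request(
        url=f"{VECTARA_BASE_URL}/v2/llms",
        data=body,
        method="POST",
        headers={
            API_KEY_HEADER: api_key,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


# Placeholder values only; substitute a real key and configuration.
request = build_create_llm_request(
    "YOUR-API-KEY",
    {"type": "openai-compatible", "name": "my-custom-llm"},
)
```

Passing `request` to `urllib.request.urlopen` would send it. Building the request separately makes it easy to inspect the URL, headers, and body first.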

Example providers:

OpenAI

Type: openai-compatible

Models: GPT-4o, GPT-5

Auth: Bearer token

{
  "type": "openai-compatible",
  "name": "my-gpt5",
  "model": "gpt-5",
  "uri": "https://api.openai.com/v1/chat/completions",
  "auth": {
    "type": "bearer",
    "token": "sk-..."
  }
}
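
To avoid hardcoding the token, the same request body can be assembled in code. This is a minimal sketch: `openai_compatible_config` is a hypothetical helper, and `OPENAI_API_KEY` is an assumed environment variable name, not something Vectara requires.

```python
import os


def openai_compatible_config(name: str, model: str, uri: str, token: str) -> dict:
    """Return a request body shaped like the bearer-token example above."""
    return {
        "type": "openai-compatible",
        "name": name,
        "model": model,
        "uri": uri,
        "auth": {"type": "bearer", "token": token},
    }


config = openai_compatible_config(
    name="my-gpt5",
    model="gpt-5",
    uri="https://api.openai.com/v1/chat/completions",
    # Assumed variable name; falls back to the placeholder from the example.
    token=os.environ.get("OPENAI_API_KEY", "sk-..."),
)
```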

OpenAI Responses API

Type: openai-responses

Models: o1-preview, o1-mini, o3-mini (reasoning models)

Auth: Bearer token

Note: For reasoning models that don't support streaming

{
  "type": "openai-responses",
  "name": "my-o1",
  "model": "o1-preview",
  "uri": "https://api.openai.com/v1/chat/completions",
  "auth": {
    "type": "bearer",
    "token": "sk-..."
  }
}

Anthropic Claude

Type: openai-compatible

Models: claude-4-opus, claude-4-5-haiku, claude-4-5-sonnet

Auth: Bearer token with header

{
  "type": "openai-compatible",
  "name": "my-claude",
  "model": "claude-sonnet-4-5-20250929",
  "uri": "https://api.anthropic.com/v1/messages",
  "auth": {
    "type": "bearer",
    "token": "sk-ant-..."
  },
  "headers": {
    "anthropic-version": "2023-06-01"
  }
}


### Azure OpenAI
**Type:** `openai-compatible`
**Models:** GPT-3.5, GPT-4 (Azure-deployed versions)
**Auth:** Custom header (api-key)
```json
{
  "type": "openai-compatible",
  "name": "my-azure-gpt4",
  "model": "gpt-4",
  "uri": "https://YOUR-RESOURCE.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT/chat/completions?api-version=2024-02-15-preview",
  "auth": {
    "type": "header",
    "header": "api-key",
    "value": "your-azure-key"
  }
}
```
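
The Azure URI packs the resource name, the deployment name, and the API version into one string, which is easy to mistype. A small sketch that assembles it from its parts, following the URL pattern in the example above (`azure_chat_completions_uri` is a hypothetical helper name):

```python
from urllib.parse import quote, urlencode


def azure_chat_completions_uri(resource: str, deployment: str, api_version: str) -> str:
    """Assemble an Azure OpenAI chat completions URI from its parts."""
    # Percent-encode the deployment name so it is safe as a path segment.
    path = f"/openai/deployments/{quote(deployment, safe='')}/chat/completions"
    query = urlencode({"api-version": api_version})
    return f"https://{resource}.openai.azure.com{path}?{query}"


azure_config = {
    "type": "openai-compatible",
    "name": "my-azure-gpt4",
    "model": "gpt-4",
    "uri": azure_chat_completions_uri(
        "YOUR-RESOURCE", "YOUR-DEPLOYMENT", "2024-02-15-preview"
    ),
    # Azure expects the key in an api-key header rather than a bearer token.
    "auth": {"type": "header", "header": "api-key", "value": "your-azure-key"},
}
```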

### Google Vertex AI (Gemini)
**Type:** `vertex-ai`
**Models:** gemini-2.5-pro, gemini-2.5-flash
**Auth:** Service account or API key
```json
{
  "type": "vertex-ai",
  "name": "my-gemini",
  "model": "gemini-1.5-pro",
  "uri": "https://us-central1-aiplatform.googleapis.com/v1/projects/YOUR-PROJECT/locations/us-central1/publishers/google/models/gemini-1.5-pro:generateContent",
  "auth": {
    "type": "service_account",
    "key_json": "{...service account JSON...}"
  }
}
```
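
In the example above, `key_json` is a string rather than a nested object, so the service account key most likely needs to be serialized before it is embedded. The sketch below makes that assumption explicit and also derives the URI from the project, location, and model. `vertex_ai_config` is a hypothetical helper, and the service account dictionary is a shortened placeholder, not a usable key.

```python
import json


def vertex_ai_config(name: str, model: str, project: str, location: str,
                     service_account: dict) -> dict:
    """Return a vertex-ai request body shaped like the example above."""
    uri = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{location}/publishers/google/models/{model}:generateContent"
    )
    return {
        "type": "vertex-ai",
        "name": name,
        "model": model,
        "uri": uri,
        "auth": {
            "type": "service_account",
            # Assumption: key_json carries the key file contents as a JSON string.
            "key_json": json.dumps(service_account),
        },
    }


# Placeholder only; a real key file downloaded from Google Cloud has more fields.
placeholder_key = {"type": "service_account", "project_id": "YOUR-PROJECT"}
gemini_config = vertex_ai_config(
    "my-gemini", "gemini-1.5-pro", "YOUR-PROJECT", "us-central1", placeholder_key
)
```

Deriving the URI from `model` keeps the model name in the path and the `model` field from drifting apart.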

Custom OpenAI-Compatible

Type: openai-compatible

Models: Any self-hosted or custom LLM, such as OpenRouter.

Auth: Bearer or custom header

{
  "type": "openai-compatible",
  "name": "my-custom-llm",
  "model": "llama-3-70b",
  "uri": "https://my-llm-endpoint.com/v1/chat/completions",
  "auth": {
    "type": "bearer",
    "token": "custom-token"
  }
}
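
Because every provider example shares the same top-level shape, a quick local check can catch typos before a request is sent. The field names and auth types below are inferred only from the examples in this section; the API may accept more options or enforce different rules, so treat this as a sketch rather than the real schema. `find_config_problems` is a hypothetical helper.

```python
# Inferred from the examples in this section, not from a published schema.
REQUIRED_FIELDS = ("type", "name", "model", "uri", "auth")
AUTH_FIELDS = {
    "bearer": ("token",),
    "header": ("header", "value"),
    "service_account": ("key_json",),
}


def find_config_problems(config: dict) -> list:
    """Return human-readable problems; an empty list means none were found."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in config]
    if "auth" not in config:
        return problems
    auth = config["auth"]
    if not isinstance(auth, dict):
        problems.append("auth must be an object")
        return problems
    auth_type = auth.get("type")
    if auth_type not in AUTH_FIELDS:
        problems.append(f"unrecognized auth type: {auth_type!r}")
    else:
        problems.extend(
            f"missing auth field: {name}"
            for name in AUTH_FIELDS[auth_type]
            if name not in auth
        )
    return problems


custom_config = {
    "type": "openai-compatible",
    "name": "my-custom-llm",
    "model": "llama-3-70b",
    "uri": "https://my-llm-endpoint.com/v1/chat/completions",
    "auth": {"type": "bearer", "token": "custom-token"},
}
problems = find_config_problems(custom_config)
```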

Responses

The LLM has been created