Version: 2.0

REST APIs

While gRPC provides low latency and excellent scalability, REST APIs provide a traditional integration path for web-based applications. Vectara's REST API 2.0 offers a more intuitive API design that follows RESTful principles and makes it easier to get started.

Deprecation Notice

REST API v1 was deprecated and fully retired on August 16, 2025. Migrate to API v2 as soon as possible.

caution

Review the REST API 1.0 to 2.0 migration guide, which highlights important differences between the Vectara REST API v1 and REST API v2.

API formatting guidelines

You can find all of our APIs at https://api.vectara.io/v2/<api-endpoint>. The API endpoints are outlined in the subsections of this API Reference section and follow a standard RESTful structure.

  • The current version is v2.
  • api-endpoint follows a hierarchical structure, using lowercase and hyphens. For example, /corpora/:corpus_key/documents.
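
As a minimal sketch of the URL convention above, the following Python helper joins path segments onto the v2 base URL. The helper name `endpoint_url` and the corpus key `my-corpus` are hypothetical and used only for illustration.

```python
from urllib.parse import quote

# Base URL taken from the formatting guidelines above.
BASE_URL = "https://api.vectara.io/v2"


def endpoint_url(*segments: str) -> str:
    """Join path segments onto the v2 base URL, percent-encoding each one."""
    encoded = [quote(str(segment), safe="") for segment in segments]
    return BASE_URL + "/" + "/".join(encoded)


# "my-corpus" is a placeholder corpus key, not a real corpus.
print(endpoint_url("corpora", "my-corpus", "documents"))
```

Encoding each segment separately keeps a corpus key or document ID that contains reserved characters from changing the shape of the path.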

API authentication

All Vectara API requests must be authenticated. Indexing and Search APIs can be authenticated with API Keys. A Personal API Key enables most Admin actions, such as creating and deleting corpora, but deleting accounts and accessing billing data require OAuth 2.0.
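
The two credential types could be attached to a request roughly as follows. This is a sketch under assumptions: the `x-api-key` header name is not stated on this page and should be checked against the OpenAPI specification, and the helper `auth_headers` is hypothetical. The `Authorization: Bearer` form is the standard way to present an OAuth 2.0 access token.

```python
from typing import Optional


def auth_headers(api_key: Optional[str] = None,
                 oauth_token: Optional[str] = None) -> dict:
    """Return HTTP headers for exactly one credential type."""
    if (api_key is None) == (oauth_token is None):
        raise ValueError("pass exactly one of api_key or oauth_token")
    if api_key is not None:
        # Assumption: API keys are sent in an "x-api-key" header.
        return {"x-api-key": api_key}
    # OAuth 2.0 access tokens use the standard Bearer scheme.
    return {"Authorization": f"Bearer {oauth_token}"}


# Placeholder credential; substitute your own key.
print(auth_headers(api_key="YOUR_API_KEY"))
```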

API Reference and OpenAPI specifications

You can find up-to-date OpenAPI specifications at https://docs.vectara.com/vectara-oas-2.yaml. These specifications provide a comprehensive overview of the available endpoints, request/response formats, and authentication requirements.

You can use these with tools of your choosing like Insomnia or Postman.

  1. Download the OpenAPI YAML file.
  2. Import the file into Insomnia or Postman.
  3. Start making API calls directly from the tool.

Want to try the REST APIs live in your browser? Head over to our API Reference and make real-time API calls from your browser.

List of Vectara REST 2.0 endpoints

Vectara provides the following REST 2.0 endpoints:

Request timeouts

By default, a request takes as long as it needs to complete. For most of the APIs, you can set a maximum duration by specifying the Request-Timeout or Request-Timeout-Millis parameter in the HTTP headers. Request-Timeout is specified in seconds; use Request-Timeout-Millis when you need a more granular timeout. Both parameters are best-effort: once the specified time elapses, Vectara attempts to terminate the request as soon as possible.
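
A small sketch of how a client might produce these headers is shown below. The helper `timeout_headers` is hypothetical, and the assumption that the value is a plain integer rendered as text should be verified against the OpenAPI specification. Sending only one of the two headers is a choice made in this sketch to keep intent unambiguous, not a documented requirement.

```python
from typing import Optional


def timeout_headers(seconds: Optional[int] = None,
                    millis: Optional[int] = None) -> dict:
    """Build a Request-Timeout or Request-Timeout-Millis header."""
    if seconds is not None and millis is not None:
        raise ValueError("choose seconds or millis, not both")
    value = seconds if seconds is not None else millis
    if value is None:
        return {}  # no header: the request runs as long as it needs
    if value <= 0:
        raise ValueError("timeout must be positive")
    name = "Request-Timeout" if seconds is not None else "Request-Timeout-Millis"
    # Assumption: the header value is an integer rendered as text.
    return {name: str(value)}


print(timeout_headers(seconds=30))
print(timeout_headers(millis=2500))
```

Because the server-side limit is best-effort, a client that needs a hard bound would typically also set its own socket or HTTP client timeout.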

Queries

The following endpoints help you with queries:

  • Query API: Perform searches across one or more corpora using advanced filtering, pagination, and summarization options.
  • Simple Corpus Query API: Execute lightweight searches on a single corpus.
  • Advanced Corpus Query API: Perform advanced queries on a specific corpus with additional filtering and customization options.
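
To illustrate the lightweight single-corpus search, the sketch below builds (but does not send) a GET request with Python's standard library. Several details are assumptions not confirmed on this page and should be checked against the OpenAPI specification: the `/corpora/:corpus_key/query` path, the `query` and `limit` parameter names, and the `x-api-key` header. The function name `simple_corpus_query` and the corpus key are hypothetical.

```python
import urllib.request
from urllib.parse import quote, urlencode

BASE_URL = "https://api.vectara.io/v2"


def simple_corpus_query(corpus_key: str, text: str, api_key: str,
                        limit: int = 10) -> urllib.request.Request:
    """Build (but do not send) a GET request that searches one corpus."""
    # Assumption: path and parameter names; verify against the OpenAPI spec.
    params = urlencode({"query": text, "limit": limit})
    url = f"{BASE_URL}/corpora/{quote(corpus_key, safe='')}/query?{params}"
    return urllib.request.Request(
        url,
        headers={"x-api-key": api_key, "Accept": "application/json"},
        method="GET",
    )


# Placeholder corpus key and credential.
req = simple_corpus_query("my-corpus", "What is RAG?", api_key="YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen(req, timeout=10)` would send it; the sketch stops short of that so it can be inspected without network access or a valid key.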

Query histories

The following endpoints help you with query histories:

Corpora

The following endpoints enable you to programmatically manipulate corpora and perform many operations such as viewing corpus consumption, size, associated API keys, and more:

Index and upload documents

The following endpoints help you index, upload files, and manage documents:

Document Processing

The following endpoints provide specialized document processing capabilities:

Table Extractors

The following endpoints help you extract and process tabular data from documents:

Chats

The following endpoints provide a streamlined solution for integrating chatbot functionalities into domain-specific applications and websites using Retrieval Augmented Generation (RAG):

Agents

The following endpoints help you build and manage AI agents with custom instructions and tool integrations:

Agent Sessions

Agent Events

Agent Instructions

Agent Tools and Tool Servers

Encoders, rerankers, and large language models (LLMs)

The following endpoints help you manage encoders, rerankers, and LLMs:

  • Create Encoder API: Create a custom encoder for document embedding and vectorization.
  • List Encoders API: Get a list of available encoders for document embedding.
  • List Rerankers API: Get a list of available rerankers for improving search result ranking.
  • Create LLM API: Add a custom large language model configuration for query and chat endpoints.
  • Get LLM API: Retrieve details about a specific large language model.
  • List LLMs API: Get a list of available large language models for query and chat endpoints.
  • Delete LLM API: Remove a custom large language model configuration.
  • List Generation Presets API: Get a list of available generation presets for configuring prompt templates and model parameters.

LLM Chat Completions

The following endpoint provides OpenAI-compatible chat completions:

Hallucination Detection and Correction

The following endpoints help you detect and correct hallucinations in AI-generated content:

Jobs

The following endpoints help you manage background jobs:

  • Get Job API: Retrieve details about a specific background job.
  • List Jobs API: Get a list of all background jobs for the account.

Users

The following endpoints help you manage users on your account:

API keys

The following endpoints help you manage the lifecycle and security of API keys:

Application Clients

The following endpoints help you manage OAuth 2.0 application clients on your account:

note

Not all REST API endpoints have long-form documentation in this section. For example, information about Get Usage Metrics is available only in the interactive API Reference.