Version: 2.0

Understanding Vectara

Vectara is an API-first, headless agentic platform. The end-user application — the UI, the brand, the workflow — sits on top and calls Vectara over REST. Underneath, the platform handles the AI heavy lifting: agent orchestration, retrieval, generation, factual grading, governance, and observability.

There are two ways to get that application built and running, both on the same platform: your team builds it with the API, Vectara Skills, and a coding agent, or Vectara Managed Agents delivers it turnkey. See The application layer for both options.

This section is the conceptual map

Use this section to orient yourself. It explains what the parts of the platform are, how they fit together, and the trade-offs that shape the design. For the implementation reference — configuration fields, API schemas, tuning playbooks — each topic links to its canonical guide in /docs/agents/, /docs/search-and-retrieval/, /docs/pipelines/, and the REST API reference. Read this section first if you're evaluating Vectara or orienting a new engineer; reach for the canonical guides once you're building.

The platform is production-grade out of the box. HHEM grades every answer in under 50ms. Boomerang leads multilingual retrieval at 76.2% on XQuAD-R cross-lingual. Slingshot reranking is chainable. The agent runtime is a declarative, step-gated, auditable state machine. SOC 2 Type II, HIPAA on request. SaaS, VPC, on-prem, or air-gapped — your choice.

Want to feel it first?

The fastest way to understand what a Vectara agent does is to drive one yourself. Open the Agent Playground, paste an API key and an agent key, and watch session metadata, step transitions, tool calls, and structured outputs stream in real time. See the playground walkthrough for setup details.

The three layers

Every Vectara deployment has the same three layers. Knowing which layer owns what keeps integration decisions clear.

| Layer | Owner | Responsibility |
| --- | --- | --- |
| End user | The user | Sees only your branded UI. Has no concept of an "agent". |
| The application | You | A thin layer of code (UI, business logic, identity) that calls Vectara over REST. The platform does the AI heavy lifting underneath, so this layer stays small. Built by your team in hours with the API, Vectara Skills, and a coding agent, or delivered turnkey by Vectara Managed Agents; you own it either way. |
| Vectara platform | Vectara | Headless. Runs your declared agents over sessions. Calls tools, queries corpora, generates with your chosen LLM, grades with HHEM, streams events back. |

The end user never sees Vectara. The application is the only thing they touch. See The application layer for the two ways to build and operate it.

The platform stack

Read top-down. Clients call the interfaces. Agents orchestrate tools and the LLM gateway. Retrieval queries the corpora that pipelines populate. The foundation enforces isolation and compliance. None of these layers are custom-built per customer.

| Layer | What it does |
| --- | --- |
| Interfaces | REST API for developers, Vectara Skills for coding agents like Claude Code, Admin Console for operators. |
| Agent runtime | Stepped state machines, sub-agent delegation, structured-output gating, cross-session approvals. |
| Tools | 35+ built-in tools (search, write, SQL, code, image), Python Lambdas, MCP clients, web_get with OAuth. |
| LLM gateway | Anthropic, OpenAI, Gemini, on-prem models, BYO LLM. Velocity prompts. Hallucination Corrector. |
| Retrieval engine | Hybrid BM25 + dense retrieval, Slingshot reranker (chain, MMR, UDF), metadata filters, citations. |
| Corpora & ingestion | Boomerang embeddings, SmartChunk, pipelines and connectors. Knowledge, memory, and state in one primitive. |
| Foundation | Tenant isolation, IdP / SSO, RBAC by corpus, audit and traces, SOC 2 Type II, HIPAA, KMS-managed encryption. |

For a layer-by-layer walkthrough of what each component does, what is configurable, and how it connects to the rest, see The platform stack.
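
To make the retrieval engine row more concrete, here is a minimal, generic sketch of hybrid BM25 + dense scoring followed by an MMR rerank. It is not Vectara's implementation or API: every function name, the `alpha` and `lam` weights, and the min-max normalization are assumptions chosen for illustration, and the dense vectors stand in for embeddings that a model such as Boomerang would produce.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score tokenized docs against a tokenized query with the textbook BM25 formula."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    df = Counter(term for d in docs for term in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            s += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(d) / avg_len))
        scores.append(s)
    return scores

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def min_max(xs):
    """Rescale scores to [0, 1] so lexical and dense scores are comparable."""
    lo, hi = min(xs), max(xs)
    return [0.0] * len(xs) if hi == lo else [(x - lo) / (hi - lo) for x in xs]

def hybrid_scores(lexical, dense, alpha=0.5):
    """Blend normalized lexical and dense scores; alpha is a hypothetical weight."""
    lex, den = min_max(lexical), min_max(dense)
    return [alpha * l + (1 - alpha) * d for l, d in zip(lex, den)]

def mmr(relevance, vectors, k, lam=0.7):
    """Maximal marginal relevance: pick k items, trading relevance against redundancy."""
    selected = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            redundancy = max((cosine(vectors[i], vectors[j]) for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lam` close to 1 the rerank behaves like a plain relevance sort; lowering it pushes near-duplicate passages down the list, which is the usual reason to chain an MMR stage after a relevance reranker.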

Concepts — what the platform is and how it runs:

  • The application layer — custom-built application vs. Vectara Managed Agents. Who builds and operates it.
  • The platform stack — each layer of the platform, what it does, and what you control.
  • Agent anatomy — the parts of a Vectara agent and how they snap together.
  • Request lifecycle — what happens between one user message and one streamed answer.

What an agent can do — the four capabilities:

  • Knowledge — what the agent knows. RAG pipeline (SmartChunk, Boomerang, hybrid search, Slingshot, citations).
  • Context & memory — what the agent remembers. Per-user memory and tool-result scratchpad on the same corpus primitive.
  • Workflows — what the agent does. Stepped state machines, conditional routing, sub-agents, cross-session approvals.
  • Tools & connectors — how the agent reaches out. Built-in tools, Python Lambdas, MCP, web_get, Slack connectors. New tools go live in minutes rather than waiting on a platform release.
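
The stepped state machines and structured-output gating listed under Workflows can be pictured with a small, generic sketch. This is not Vectara's agent schema or runtime: `Step`, `run_agent`, `route_on`, `routes`, and the audit tuple format are all hypothetical names invented here to illustrate the idea that a step only hands off when its output contains the fields the step declares.

```python
from dataclasses import dataclass, field

@dataclass
class Step:
    name: str
    required_fields: tuple                      # gate: keys the structured output must contain
    route_on: str = ""                          # output field whose value picks the next step
    routes: dict = field(default_factory=dict)  # field value -> next step name

def run_agent(steps, start, produce_output, max_transitions=20):
    """Walk the declared steps, recording every gate decision in an audit trail.

    produce_output is a stand-in for whatever generates a step's structured
    output (an LLM call in a real system); here it is any callable that
    takes a step name and returns a dict.
    """
    audit = []
    current = start
    for _ in range(max_transitions):            # guard against cyclic routing
        if current is None:
            break
        step = steps[current]
        output = produce_output(step.name)
        missing = [k for k in step.required_fields if k not in output]
        if missing:
            # Gate failed: the step may not transition, and the reason is recorded.
            audit.append((step.name, "blocked", missing))
            break
        audit.append((step.name, "passed", output))
        # Conditional routing: a step with no matching route is terminal.
        current = step.routes.get(output.get(step.route_on))
    return audit

# Hypothetical three-step workflow: triage routes to either refund or answer.
steps = {
    "triage": Step("triage", ("intent",), "intent", {"refund": "refund", "question": "answer"}),
    "refund": Step("refund", ("order_id", "amount")),
    "answer": Step("answer", ("text",)),
}
```

Because the steps and their gates are declared as data rather than buried in prompt text, the audit trail can state exactly which step ran, what it produced, and why a transition was refused.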

Where Vectara sits in the market:

  • Vectara vs other solutions — side-by-side comparison against product-first vendors and specialized tooling. Why the platform-first shape compounds.

Ready to build? In Getting Started you'll find:

  • Build with coding agents — scaffold connector UIs, dashboards, and stepped agents in 30 minutes with Claude Code, Cursor, or Codex.
  • Try the playground — drive any Vectara agent live and watch its events stream in.
  • Agents quickstart — create your first agent in the Console in a few minutes.