Version: 2.0

Agents

This guide covers the Vectara Python SDK for creating and managing AI agents with conversational capabilities, tool use, and multi-turn context. Agents combine RAG search with LLM reasoning to provide intelligent, context-aware responses grounded in your corpus data.

Prerequisites

This guide assumes you have a corpus called my-docs with indexed documents. If you haven't created a corpus yet, follow the Quick Start guide to set up your first corpus and add some documents.

Current limitation

Agents currently support a single step: first_step is the only step the agent executes. Multi-step workflows will be supported in a future update.

Create an agent

CREATE AN AGENT WITH CORPORA SEARCH

Create an agent configured with a corpora search tool that can answer questions using your indexed documents. The agent combines semantic search with LLM generation to provide grounded, accurate responses.

The agents.create method corresponds to the HTTP POST /v2/agents endpoint. For more details, see the Create Agent API reference.

Key Parameters:

name: Human-readable agent identifier
tool_configurations: Map of tools the agent can use (see Available tools below)
model: LLM configuration, with name (required), parameters, and retry_configuration
first_step: Defines instructions, output parser, and allowed tools
key (optional): Custom agent key — auto-generated if omitted
description (optional): Purpose description
metadata (optional): Arbitrary key-value pairs for tracking
compaction (optional): Automatic context management for long conversations
enabled (optional): Whether the agent is active (default: true)

Returns:

key: Unique agent identifier (e.g., agt_support_agent_a1b2)
name, description, enabled, created_at, updated_at
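
As a rough illustration of how the parameters above fit together, the sketch below assembles a request body as a plain dictionary without calling the SDK or the network. The top-level keys come from the parameter list; the nested layout of first_step, the tool entry, and the model name "example-model" are assumptions made for illustration, so check the Create Agent API reference for the real shapes.

```python
from typing import Any, Optional


def build_agent_request(
    name: str,
    model_name: str,
    instructions: str,
    tool_configurations: dict[str, Any],
    description: Optional[str] = None,
    metadata: Optional[dict[str, Any]] = None,
    enabled: bool = True,
) -> dict[str, Any]:
    """Assemble a create-agent body from the top-level keys listed above."""
    body: dict[str, Any] = {
        "name": name,
        "tool_configurations": tool_configurations,
        # The guide states that model.name is required.
        "model": {"name": model_name},
        # ASSUMPTION: the inner layout of first_step is illustrative only.
        "first_step": {"instructions": instructions},
        "enabled": enabled,
    }
    # Optional fields are left out entirely when not provided.
    if description is not None:
        body["description"] = description
    if metadata is not None:
        body["metadata"] = metadata
    return body


request_body = build_agent_request(
    name="Support Agent",
    model_name="example-model",  # hypothetical model name
    instructions="Answer using the my-docs corpus. NEVER make up information.",
    # ASSUMPTION: the "type" and "corpora" field names are illustrative.
    tool_configurations={
        "docs_search": {
            "type": "corpora_search",
            "corpora": [{"corpus_key": "my-docs"}],
        }
    },
)
```

A dictionary like this could then be passed to whatever client you use for POST /v2/agents; the key field is omitted here so that it would be auto-generated.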

Available tools

AGENT WITH MULTIPLE TOOLS

Agents can be configured with multiple tool types. Each tool is a key-value pair in the tool_configurations map:

Tool Type	Description	Use Case
`corpora_search`	Search indexed document collections	RAG over your knowledge base
`web_search`	Search the internet (Tavily provider)	Real-time web information
`web_get`	Fetch content from a URL	Read specific web pages
`lambda`	Run custom Python functions	Custom business logic
`sub_agent`	Delegate to another agent	Multi-agent workflows
`mcp`	Model Context Protocol tools	External tool servers
`dynamic_vectara`	Custom Vectara tools	Dynamic tool configuration
`artifact_create`	Create artifacts from text or data	Store outputs for later use
`artifact_read`	Read stored artifacts	Access session outputs
`artifact_grep`	Search artifact content	Find info in artifacts
`image_read`	Process and analyze images	Vision capabilities
`document_conversion`	Transform document formats	File processing
`get_document_text`	Extract text from documents	Content extraction
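
One way to catch typos before sending a configuration is to compare each entry against the tool types in the table above. The sketch below assumes every entry in tool_configurations carries its tool type under a "type" field; that field name and the helper unknown_tool_types are hypothetical and not part of the SDK.

```python
# Tool types copied from the table above.
KNOWN_TOOL_TYPES = {
    "corpora_search", "web_search", "web_get", "lambda", "sub_agent",
    "mcp", "dynamic_vectara", "artifact_create", "artifact_read",
    "artifact_grep", "image_read", "document_conversion",
    "get_document_text",
}


def unknown_tool_types(tool_configurations: dict) -> list:
    """Return the names of entries whose type is not in the table.

    ASSUMPTION: each entry stores its tool type under a "type" key.
    """
    return sorted(
        name
        for name, config in tool_configurations.items()
        if config.get("type") not in KNOWN_TOOL_TYPES
    )


tools = {
    "docs_search": {"type": "corpora_search"},
    "web": {"type": "web_search"},
    "reports": {"type": "sql_query"},  # not a documented tool type
}
problems = unknown_tool_types(tools)
```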

Instructions and templates

DYNAMIC INSTRUCTIONS WITH VELOCITY TEMPLATES

Agent instructions use Apache Velocity templates with access to runtime variables:

Variable	Description
`$agent.name`	Agent display name
`$agent.key`	Agent unique key
`$agent.metadata`	Agent metadata map
`$session.key`	Current session key
`$session.metadata`	Session metadata map
`$currentDate`	Current date in ISO 8601 format
`$tools`	List of available tools (each with `name` and `description`)

Tips for effective instructions:

Use CAPS for emphasis on critical behaviors
Include negative prompts ("NEVER make up information") to prevent unwanted behaviors
Reference $tools so the agent knows what capabilities it has
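
The tips above might translate into an instruction template like the one sketched below. The template uses standard Velocity syntax (#foreach ... #end) with the variables from the table; the wording of the instructions and the helper referenced_variables are illustrative only. The helper does a plain substring check and does not render the template.

```python
# Variables documented in the table above.
DOCUMENTED_VARIABLES = [
    "$agent.name", "$agent.key", "$agent.metadata",
    "$session.key", "$session.metadata", "$currentDate", "$tools",
]

# Illustrative instructions; the wording is not taken from the SDK.
INSTRUCTIONS = """You are $agent.name. Today's date is $currentDate.
You can use these tools:
#foreach($tool in $tools)
- $tool.name: $tool.description
#end
NEVER make up information. If the tools return nothing relevant, say so."""


def referenced_variables(template: str) -> list:
    """List the documented variables that appear in the template text."""
    return [var for var in DOCUMENTED_VARIABLES if var in template]


used = referenced_variables(INSTRUCTIONS)
```

A check like this can help confirm that a template actually references $tools before the agent is created.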

Create a session and send messages

INTERACT WITH AN AGENT

Create a session to start a conversation with the agent, then send messages and receive responses. Each session maintains its own conversation context.

The agent_events.create method corresponds to the HTTP POST /v2/agents/{agent_key}/sessions/{session_key}/events endpoint.
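
If you need the raw endpoint path, for logging or for calling the REST API directly, it can be built from the two keys. This is a minimal sketch using only the standard library; events_path is a hypothetical helper, and the session key shown is made up.

```python
from urllib.parse import quote


def events_path(agent_key: str, session_key: str) -> str:
    """Build the events endpoint path, percent-encoding both keys."""
    return (
        f"/v2/agents/{quote(agent_key, safe='')}"
        f"/sessions/{quote(session_key, safe='')}/events"
    )


# The session key below is a made-up placeholder.
path = events_path("agt_support_agent_a1b2", "example_session")
```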

Event Types in Response:

Event Type	Description
`input_message`	The user's original message
`agent_output`	The agent's text response
`tool_input`	Parameters sent to a tool
`tool_output`	Results returned from a tool
`thinking`	Agent's internal reasoning
`structured_output`	Structured data output from the agent
`skill_load`	Skill activation events
`step_transition`	Step change events
`step_transition_limit_exceeded`	Maximum step transitions reached
`compaction`	Context compaction event
`context_limit_exceeded`	Context window limit reached
`session_interrupted`	Session was interrupted
`artifact_upload`	Artifact uploaded to session
`image_read`	Image processing event
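
Because a response can mix several event types, it is often useful to pull out only the agent's text. The sketch below assumes events have been converted to dictionaries with "type" and "content" keys; both the dictionary shape and the "content" field name are assumptions, and the SDK may return typed objects instead.

```python
def agent_replies(events: list) -> list:
    """Collect the text of every agent_output event, in order.

    ASSUMPTION: each event is a dict with "type" and "content" keys.
    """
    return [
        event.get("content", "")
        for event in events
        if event.get("type") == "agent_output"
    ]


# Hand-written sample events, not real API output.
sample_events = [
    {"type": "input_message", "content": "How do I reset my password?"},
    {"type": "tool_input", "content": "query: reset password"},
    {"type": "tool_output", "content": "3 matching passages"},
    {"type": "agent_output", "content": "Open Settings and choose Reset."},
]
replies = agent_replies(sample_events)
```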

Multi-turn conversation

MULTI-TURN AGENT CONVERSATION

Build natural multi-turn conversations where the agent maintains context across exchanges. Each message builds on the previous conversation history without requiring explicit context management.

Conversation Flow:

Initial Question: Establishes the topic and context
Follow-up Questions: Reference previous answers naturally
Automatic Context: The session maintains full conversation history

List and manage agents

LIST AND MANAGE AGENTS

Manage the full lifecycle of agents: list, inspect, update, and delete them. Updates are partial, so any field you omit keeps its current value.

The agents.list method corresponds to the HTTP GET /v2/agents endpoint. For more details, see the List Agents API reference.

Updatable Properties:

name, description, enabled, metadata
tool_configurations (add, remove, or modify tools)
model (change LLM model or parameters)
first_step (update instructions, output parser, allowed tools)
compaction (context management settings)
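
Since omitted fields are preserved, an update only needs to carry the fields that change. The sketch below builds such a partial payload and rejects names that are not in the list above; build_update is a hypothetical helper, not an SDK function.

```python
# Property names copied from the list above.
UPDATABLE_PROPERTIES = {
    "name", "description", "enabled", "metadata",
    "tool_configurations", "model", "first_step", "compaction",
}


def build_update(**changes) -> dict:
    """Return a payload holding only the changed, updatable fields."""
    unknown = sorted(set(changes) - UPDATABLE_PROPERTIES)
    if unknown:
        raise ValueError(f"Not updatable: {', '.join(unknown)}")
    return dict(changes)


# Disable an agent and change its description; other fields stay as-is.
update_body = build_update(enabled=False, description="Paused for review")
```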

Session management

MANAGING AGENT SESSIONS

Manage agent sessions for tracking conversations, branching dialogues, and organizing multi-user interactions.

Session Operations:

Create: Start a new conversation, optionally with metadata
List: Monitor active sessions for an agent
Get: Inspect session details and metadata
Update: Modify description or metadata
Fork: Branch a conversation by copying events to a new session
Delete: End and clean up a session

Agent identity

AGENT IDENTITY AND PERMISSIONS

Each agent has a service account identity that controls what resources it can access. In auto mode, Vectara manages permissions automatically. In manual mode, you explicitly control roles.

See the Agent Identity API reference.

Identity Fields:

mode: "auto" (managed by Vectara) or "manual" (user-controlled)
client_id: OAuth2 client identifier for the agent's service account
api_roles: Customer-level permissions (e.g., corpus_viewer, agent_user)
corpus_roles: Per-corpus role assignments with corpus_key and role
agent_roles: Per-agent role assignments with agent_key and role
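
A manual-mode identity can be sanity-checked locally before it is submitted. The sketch below assumes the identity is represented as a dictionary whose keys match the field names above; the role value "example_role" is a placeholder rather than a real Vectara role, and identity_problems is a hypothetical helper.

```python
def identity_problems(identity: dict) -> list:
    """Return human-readable problems found in an identity dict."""
    problems = []
    if identity.get("mode") not in ("auto", "manual"):
        problems.append("mode must be 'auto' or 'manual'")
    # Each role assignment needs the target key plus a role.
    for field, key_name in (
        ("corpus_roles", "corpus_key"),
        ("agent_roles", "agent_key"),
    ):
        for entry in identity.get(field, []):
            if not {key_name, "role"} <= set(entry):
                problems.append(f"{field} entries need {key_name} and role")
    return problems


identity = {
    "mode": "manual",
    "api_roles": ["agent_user"],
    # "example_role" is a placeholder, not a documented role name.
    "corpus_roles": [{"corpus_key": "my-docs", "role": "example_role"}],
}
issues = identity_problems(identity)
```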

Context compaction

CONFIGURE AUTOMATIC CONTEXT COMPACTION

For long-running conversations, context compaction automatically summarizes older messages to keep the context window manageable. This prevents token limit errors and maintains conversation quality over many turns.

Best practices

One agent per use case: Create dedicated agents for different domains (support, sales, etc.)
Use Velocity templates: Dynamic instructions with $agent.name, $currentDate, and $tools make agents more flexible
Use CAPS for critical instructions: "NEVER make up information" is clearer than "don't make up information"
Include negative prompts: Tell the agent what NOT to do to prevent common issues
Reuse agents instead of recreating them: Create an agent once and reuse it across sessions to minimize API calls
Enable compaction for long chats: Prevents context window overflow in multi-turn conversations
Session metadata for tracking: Attach user IDs and channel info for analytics
Clean up sessions: Delete sessions after conversations end

Error handling

400 Bad Request: Check agent configuration — common issues:
- Missing first_step or first_step_name
- Empty corpora list in tool configuration
- Invalid model name
403 Forbidden: Verify API key has agent permissions (agent_user, agent_developer, or agent_administrator role)
404 Not Found: Ensure agent or session key is correct
500 Internal Server Error: Transient — the SDK retries automatically with exponential backoff. Override via request_options={"max_retries": N}
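
The status codes above can be folded into a small lookup that application code consults when a request fails. This sketch does not depend on the SDK's exception classes, whose names are not covered in this guide; it works from the numeric status code alone, and hint_for and is_transient are hypothetical helpers.

```python
# Hints paraphrased from the list above.
STATUS_HINTS = {
    400: "Check the agent configuration (first_step, corpora list, model name).",
    403: "Verify that the API key has an agent role.",
    404: "Check that the agent or session key is correct.",
    500: "Transient server error; retrying may succeed.",
}

# Only 500 is described above as transient.
TRANSIENT_STATUSES = {500}


def hint_for(status_code: int) -> str:
    """Return a troubleshooting hint for an HTTP status code."""
    return STATUS_HINTS.get(status_code, "Unexpected status; see the API reference.")


def is_transient(status_code: int) -> bool:
    """Report whether a retry is worth attempting for this status."""
    return status_code in TRANSIENT_STATUSES


advice = hint_for(403)
```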

Next steps

Once you are comfortable with the basics of agents, explore these topics:

Tools: Explore custom tools and lambda tools
Instructions: Fine-tune behavior with instruction templates
Sessions: Learn more about session management
Chats: Compare with simpler chat sessions for basic RAG
API Reference: See the full Agents API reference
