Agents
This guide covers the Vectara Python SDK for creating and managing AI agents with conversational capabilities, tool use, and multi-turn context. Agents combine RAG search with LLM reasoning to provide intelligent, context-aware responses grounded in your corpus data.
This guide assumes you have a corpus called my-docs with indexed documents. If you haven't
created a corpus yet, follow the Quick Start guide to set
up your first corpus and add some documents.
Only a single step is supported — the first_step is the only step the agent will execute.
Multi-step workflows will be supported in future updates.
Create an agent
Create an agent configured with a corpora search tool that can answer questions using your indexed documents. The agent combines semantic search with LLM generation to provide grounded, accurate responses.
The agents.create method corresponds to the HTTP POST /v2/agents endpoint.
For more details, see the
Create Agent API reference.
Key Parameters:
- name: Human-readable agent identifier
- tool_configurations: Map of tools the agent can use (see Available tools below)
- model: LLM model; includes name (required), parameters, and retry_configuration
- first_step: Defines instructions, output parser, and allowed tools
- key (optional): Custom agent key; auto-generated if omitted
- description (optional): Purpose description
- metadata (optional): Arbitrary key-value pairs for tracking
- compaction (optional): Automatic context management for long conversations
- enabled (optional): Whether the agent is active (default: true)
Returns:
- key: Unique agent identifier (e.g., agt_support_agent_a1b2)
- name, description, enabled, created_at, updated_at
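The parameters listed above can be assembled into a request body before it is handed to agents.create. The following is a minimal sketch that builds that body as a plain dictionary and does not call the SDK. The helper name build_agent_request, the tool name docs_search, the model name, and the nested keys inside first_step and the tool configuration (type, corpora, corpus_key, instructions, allowed_tools) are assumptions for illustration; the Create Agent API reference has the exact schema.

```python
import json


def build_agent_request(name, corpus_key, model_name, instructions,
                        description=None, metadata=None, enabled=True):
    """Assemble a request body for POST /v2/agents as a plain dict.

    Top-level field names come from the Key Parameters list; the nested
    shapes are assumptions and may differ from the real schema.
    """
    if not name:
        raise ValueError("name is required")
    if not corpus_key:
        raise ValueError("a corpus key is required")

    body = {
        "name": name,
        # Map of tool name -> tool configuration (nested keys are assumed).
        "tool_configurations": {
            "docs_search": {
                "type": "corpora_search",
                "corpora": [{"corpus_key": corpus_key}],
            }
        },
        "model": {"name": model_name},
        # Only a single step is supported, so everything lives in first_step.
        "first_step": {
            "instructions": instructions,
            "allowed_tools": ["docs_search"],
        },
        "enabled": enabled,
    }
    # Optional fields are left out entirely when not provided.
    if description is not None:
        body["description"] = description
    if metadata is not None:
        body["metadata"] = metadata
    return body


request_body = build_agent_request(
    name="Support Agent",
    corpus_key="my-docs",
    model_name="example-model",  # hypothetical model name
    instructions="Answer questions using the my-docs corpus.",
    metadata={"team": "support"},
)
print(json.dumps(request_body, indent=2))
```

Building the body separately makes it easy to inspect or log the configuration before creating the agent.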
Available tools
Agents can be configured with multiple tool types. Each tool is a key-value
pair in the tool_configurations map:
| Tool Type | Description | Use Case |
|---|---|---|
| corpora_search | Search indexed document collections | RAG over your knowledge base |
| web_search | Search the internet (Tavily provider) | Real-time web information |
| web_get | Fetch content from a URL | Read specific web pages |
| lambda | Run custom Python functions | Custom business logic |
| sub_agent | Delegate to another agent | Multi-agent workflows |
| mcp | Model Context Protocol tools | External tool servers |
| dynamic_vectara | Custom Vectara tools | Dynamic tool configuration |
| artifact_create | Create artifacts from text or data | Store outputs for later use |
| artifact_read | Read stored artifacts | Access session outputs |
| artifact_grep | Search artifact content | Find info in artifacts |
| image_read | Process and analyze images | Vision capabilities |
| document_conversion | Transform document formats | File processing |
| get_document_text | Extract text from documents | Content extraction |
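A tool_configurations map with several tools can be checked locally before it is sent. The sketch below validates each entry against the tool types in the table; the type and corpora keys, the tool names, and the helper check_tool_configurations are assumptions for illustration, not part of the SDK.

```python
# Tool types taken from the Available tools table.
KNOWN_TOOL_TYPES = {
    "corpora_search", "web_search", "web_get", "lambda", "sub_agent", "mcp",
    "dynamic_vectara", "artifact_create", "artifact_read", "artifact_grep",
    "image_read", "document_conversion", "get_document_text",
}


def check_tool_configurations(tool_configurations):
    """Return a list of problems found in a tool_configurations map."""
    problems = []
    for tool_name, config in tool_configurations.items():
        tool_type = config.get("type")  # assumed discriminator key
        if tool_type not in KNOWN_TOOL_TYPES:
            problems.append(f"{tool_name}: unknown tool type {tool_type!r}")
        # An empty corpora list is a documented cause of 400 errors.
        elif tool_type == "corpora_search" and not config.get("corpora"):
            problems.append(f"{tool_name}: corpora list is empty")
    return problems


tools = {
    "docs_search": {"type": "corpora_search",
                    "corpora": [{"corpus_key": "my-docs"}]},
    "internet": {"type": "web_search"},
    "page_reader": {"type": "web_get"},
    "broken": {"type": "corpora_search", "corpora": []},
}
print(check_tool_configurations(tools))
```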
Instructions and templates
Agent instructions use Apache Velocity templates with access to runtime variables:
| Variable | Description |
|---|---|
| $agent.name | Agent display name |
| $agent.key | Agent unique key |
| $agent.metadata | Agent metadata map |
| $session.key | Current session key |
| $session.metadata | Session metadata map |
| $currentDate | Current date in ISO 8601 format |
| $tools | List of available tools (each with name and description) |
Tips for effective instructions:
- Use CAPS for emphasis on critical behaviors
- Include negative prompts ("NEVER make up information") to prevent unwanted behaviors
- Reference $tools so the agent knows what capabilities it has
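As an illustration of these tips, the sketch below defines an instruction template that uses the documented variables and a Velocity #foreach loop over $tools, plus a small local check that flags variables not listed in the table. The template wording and the helper unknown_variables are hypothetical. The check is a simple regular-expression lint, not a Velocity parser, and the real rendering is done by Vectara at run time.

```python
import re

# Root variables documented in the table above.
KNOWN_ROOTS = {"agent", "session", "currentDate", "tools"}

INSTRUCTIONS = """\
You are $agent.name, a support assistant. Today's date is $currentDate.
You have access to the following tools:
#foreach($tool in $tools)
- $tool.name: $tool.description
#end
NEVER make up information. If the tools return nothing relevant, say so.
"""


def unknown_variables(template):
    """Return root variable names used in the template that are not documented."""
    # Loop variables declared by #foreach($x in ...) are allowed as well.
    loop_vars = set(re.findall(r"#foreach\(\s*\$(\w+)\s+in\b", template))
    roots = set(re.findall(r"\$(\w+)", template))
    return sorted(roots - KNOWN_ROOTS - loop_vars)


print(unknown_variables(INSTRUCTIONS))
print(unknown_variables("Hello $user.name, today is $currentDate."))
```

Running a check like this before creating or updating an agent can catch a misspelled variable name early.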
Create a session and send messages
Create a session to start a conversation with the agent, then send messages and receive responses. Each session maintains its own conversation context.
The agent_events.create method corresponds to the HTTP POST
/v2/agents/{agent_key}/sessions/{session_key}/events endpoint.
Event Types in Response:
| Event Type | Description |
|---|---|
| input_message | The user's original message |
| agent_output | The agent's text response |
| tool_input | Parameters sent to a tool |
| tool_output | Results returned from a tool |
| thinking | Agent's internal reasoning |
| structured_output | Structured data output from the agent |
| skill_load | Skill activation events |
| step_transition | Step change events |
| step_transition_limit_exceeded | Maximum step transitions reached |
| compaction | Context compaction event |
| context_limit_exceeded | Context window limit reached |
| session_interrupted | Session was interrupted |
| artifact_upload | Artifact uploaded to session |
| image_read | Image processing event |
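A response usually mixes several of these event types, so client code typically filters for the ones it cares about. The sketch below assumes each event is a dictionary with type and content keys; that shape and the sample data are made up for illustration, and the objects returned by the SDK may expose these fields differently.

```python
from collections import Counter

# Made-up sample events; the real response shape may differ.
sample_events = [
    {"type": "input_message", "content": "What is our refund policy?"},
    {"type": "thinking", "content": "I should search the docs."},
    {"type": "tool_input", "content": {"query": "refund policy"}},
    {"type": "tool_output", "content": {"results": 3}},
    {"type": "agent_output", "content": "Refunds are available within 30 days."},
]


def agent_text(events):
    """Join the text of all agent_output events, in order."""
    return "\n".join(e["content"] for e in events if e["type"] == "agent_output")


def event_counts(events):
    """Count how many events of each type the response contains."""
    return Counter(e["type"] for e in events)


print(agent_text(sample_events))
print(event_counts(sample_events))
```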
Multi-turn conversation
Build natural multi-turn conversations where the agent maintains context across exchanges. Each message builds on the previous conversation history without requiring explicit context management.
Conversation Flow:
- Initial Question: Establishes the topic and context
- Follow-up Questions: Reference previous answers naturally
- Automatic Context: The session maintains full conversation history
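The flow above amounts to sending every message through the same session. The sketch below shows that loop with the sending function injected as a parameter, so it runs without network access. FakeSession is a hypothetical stand-in; in real code send_message would wrap the SDK call that posts an event to one session.

```python
def run_conversation(send_message, questions):
    """Send questions in order through one session and collect the replies.

    send_message is any callable that takes a message string and returns
    the agent's reply as a string.
    """
    transcript = []
    for question in questions:
        reply = send_message(question)
        transcript.append((question, reply))
    return transcript


class FakeSession:
    """Stand-in for a real session: it remembers every message it has seen."""

    def __init__(self):
        self.history = []

    def send(self, message):
        self.history.append(message)
        turn = len(self.history)
        return f"reply #{turn} (context: {turn - 1} earlier turns)"


session = FakeSession()
transcript = run_conversation(session.send, [
    "What plans do you offer?",        # initial question
    "Which one includes support?",     # follow-up that relies on context
])
for question, reply in transcript:
    print(question, "->", reply)
```

Because the history lives in the session, the caller only passes the new message on each turn.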
List and manage agents
Manage the full lifecycle of agents: list, inspect, update, and delete. Fields omitted from an update request keep their current values.
The agents.list method corresponds to the HTTP GET /v2/agents endpoint.
For more details, see the List Agents API reference.
Updatable Properties:
- name, description, enabled, metadata
- tool_configurations (add, remove, or modify tools)
- model (change LLM model or parameters)
- first_step (update instructions, output parser, allowed tools)
- compaction (context management settings)
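Because omitted fields are preserved, an update only needs to carry the properties that change. The sketch below builds such a partial update body and rejects fields that are not in the list above. The helper build_update is hypothetical, and the resulting dictionary would still need to be passed to the SDK's update call.

```python
# Properties listed as updatable above.
UPDATABLE_FIELDS = {
    "name", "description", "enabled", "metadata",
    "tool_configurations", "model", "first_step", "compaction",
}


def build_update(**changes):
    """Return a partial update body containing only the given fields."""
    unknown = set(changes) - UPDATABLE_FIELDS
    if unknown:
        raise ValueError(f"not updatable: {sorted(unknown)}")
    return dict(changes)


# Disable an agent and retag it; every other property is left untouched.
update_body = build_update(enabled=False, metadata={"status": "retired"})
print(update_body)
```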
Session management
Manage agent sessions for tracking conversations, branching dialogues, and organizing multi-user interactions.
Session Operations:
- Create: Start a new conversation, optionally with metadata
- List: Monitor active sessions for an agent
- Get: Inspect session details and metadata
- Update: Modify description or metadata
- Fork: Branch a conversation by copying events to a new session
- Delete: End and clean up a session
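The fork operation is the least obvious of these, so here is a small in-memory model of its described behavior: the events so far are copied into a new session, after which the two sessions evolve independently. SessionModel is a hypothetical illustration, not an SDK class, and whether a real fork also copies metadata is not covered by this sketch.

```python
import copy


class SessionModel:
    """Tiny in-memory model of a session: a key plus an ordered event list."""

    def __init__(self, key, events=None):
        self.key = key
        self.events = list(events or [])

    def add_event(self, event):
        self.events.append(event)

    def fork(self, new_key):
        """Copy the events so far into a new, independent session."""
        return SessionModel(new_key, copy.deepcopy(self.events))


original = SessionModel("session-a")
original.add_event({"type": "input_message", "content": "Compare plans A and B."})

branch = original.fork("session-b")
branch.add_event({"type": "input_message", "content": "Now focus only on plan B."})

print(len(original.events), len(branch.events))
```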
Agent identity
Each agent has a service account identity that controls what resources it can access. In auto mode, Vectara manages permissions automatically. In manual mode, you explicitly control roles.
See the Agent Identity API reference.
Identity Fields:
- mode: "auto" (managed by Vectara) or "manual" (user-controlled)
- client_id: OAuth2 client identifier for the agent's service account
- api_roles: Customer-level permissions (e.g., corpus_viewer, agent_user)
- corpus_roles: Per-corpus role assignments with corpus_key and role
- agent_roles: Per-agent role assignments with agent_key and role
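For manual mode, the identity fields above could be assembled as follows. This is a sketch of the data shape only. The helper manual_identity is hypothetical, the role value example_role is a placeholder rather than a real role name, and the exact nesting should be confirmed against the Agent Identity API reference.

```python
def manual_identity(api_roles=(), corpus_roles=None, agent_roles=None):
    """Build a manual-mode identity block.

    corpus_roles and agent_roles are dicts mapping a key to a role name.
    """
    return {
        "mode": "manual",
        "api_roles": list(api_roles),
        "corpus_roles": [
            {"corpus_key": key, "role": role}
            for key, role in (corpus_roles or {}).items()
        ],
        "agent_roles": [
            {"agent_key": key, "role": role}
            for key, role in (agent_roles or {}).items()
        ],
    }


identity = manual_identity(
    api_roles=["corpus_viewer", "agent_user"],
    corpus_roles={"my-docs": "example_role"},  # placeholder role name
)
print(identity)
```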
Context compaction
For long-running conversations, context compaction automatically summarizes older messages to keep the context window manageable. This prevents token limit errors and maintains conversation quality over many turns.
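To make the idea concrete, here is a conceptual sketch of compaction: once the history exceeds a token budget, older messages are replaced by a single summary and the most recent ones are kept verbatim. This is not Vectara's implementation or its configuration format. The word-count token estimate, the parameter names, and the summarize callback are all assumptions chosen to keep the example self-contained.

```python
def compact(messages, max_tokens, keep_recent, summarize):
    """Replace older messages with one summary when the history is too long.

    messages is a list of strings; summarize is a callable that turns a
    list of older messages into a single summary string.
    """
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")

    # Crude token estimate: whitespace-separated words.
    total = sum(len(m.split()) for m in messages)
    if total <= max_tokens or len(messages) <= keep_recent:
        return list(messages)

    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(older)] + recent


history = [f"message number {i} here" for i in range(1, 6)]  # 4 words each
compacted = compact(
    history,
    max_tokens=10,
    keep_recent=2,
    summarize=lambda older: f"[summary of {len(older)} messages]",
)
print(compacted)
```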
Best practices
- One agent per use case: Create dedicated agents for different domains (support, sales, etc.)
- Use Velocity templates: Dynamic instructions with $agent.name, $currentDate, and $tools make agents more flexible
- Use CAPS for critical instructions: "NEVER make up information" is clearer than "don't make up information"
- Include negative prompts: Tell the agent what NOT to do to prevent common issues
- Reuse agents instead of creating and deleting them: Create an agent once and reuse it across sessions to minimize API calls
- Enable compaction for long chats: Prevents context window overflow in multi-turn conversations
- Session metadata for tracking: Attach user IDs and channel info for analytics
- Clean up sessions: Delete sessions after conversations end
Error handling
- 400 Bad Request: Check agent configuration. Common issues:
  - Missing first_step or first_step_name
  - Empty corpora list in tool configuration
  - Invalid model name
- 403 Forbidden: Verify API key has agent permissions (agent_user, agent_developer, or agent_administrator role)
- 404 Not Found: Ensure agent or session key is correct
- 500 Internal Server Error: Transient; the SDK retries automatically with exponential backoff. Override via request_options={"max_retries": N}
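The sketch below illustrates what retrying with exponential backoff means and how the other status codes map to the checks above. It is a generic illustration, not the SDK's retry logic. The delay values, the helper names, and the (status_code, body) return shape of call are assumptions, and in practice the SDK already handles retries for you.

```python
import time

# Short hints derived from the error list above.
HINTS = {
    400: "Check the agent configuration (first_step, corpora list, model name).",
    403: "Verify that the API key has an agent role.",
    404: "Check the agent or session key.",
}


def backoff_delays(max_retries, base=0.5, cap=8.0):
    """Delays in seconds before each retry: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_retries(call, max_retries=2, sleep=time.sleep):
    """Call call() until it returns a non-500 status or retries run out.

    call is any zero-argument callable returning (status_code, body).
    Returns (status_code, body, hint), where hint may be None.
    """
    delays = backoff_delays(max_retries)
    for attempt in range(max_retries + 1):
        status, body = call()
        if status != 500 or attempt == max_retries:
            return status, body, HINTS.get(status)
        sleep(delays[attempt])


# Simulated endpoint: fails twice with 500, then succeeds.
responses = iter([(500, "error"), (500, "error"), (200, "ok")])
slept = []  # record delays instead of actually sleeping
result = call_with_retries(lambda: next(responses), max_retries=2, sleep=slept.append)
print(result, slept)
```

Only the 500 case is retried here; 400, 403, and 404 point to a problem in the request and are returned immediately with a hint.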
Next steps
After understanding agent functionality:
- Tools: Explore custom tools and lambda tools
- Instructions: Fine-tune behavior with instruction templates
- Sessions: Learn more about session management
- Chats: Compare with simpler chat sessions for basic RAG
- API Reference: See the full Agents API reference