Version: 2.0

Chats

This guide covers how to use the Vectara Python SDK to manage chat conversations, which combine Retrieval Augmented Generation (RAG) with chat history. The methods described here let you create chats, maintain multi-turn conversations, and manage chat history, which makes them well suited to interactive applications such as support chatbots or customer service platforms.

Prerequisites

This guide assumes you have a corpus called my-docs with indexed documents. If you haven't created a corpus yet, follow the Quick Start guide to set up your first corpus and add some documents.

Create a chat session

Create a chat session that can maintain conversation context across multiple exchanges. The session handles RAG integration automatically, providing contextual responses based on your corpus content.

The create_chat_session method corresponds to the HTTP POST /v2/chats endpoint. For more details on request and response parameters, see the Create Chat REST API.

Key Parameters:

SearchCorporaParameters: Defines which corpora to search and filtering options
GenerationParameters: Controls response generation quality and style
ChatParameters: Enables conversation history storage for multi-turn interactions
store=True: Essential for maintaining context across conversation turns

Returns:

chat_id: Unique identifier for the conversation session
answer: AI-generated response based on corpus content
factual_consistency_score: Reliability score for the response
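
To make the mapping to POST /v2/chats concrete, here is a minimal sketch that builds (but does not send) the underlying HTTP request with the standard library only. The base URL, the x-api-key header, and the JSON field names are assumptions chosen to mirror the parameter groups listed above; check the Create Chat REST API reference before relying on them.

```python
import json
import urllib.request

# Assumed base URL; only the /v2/chats path is taken from this guide.
API_BASE = "https://api.vectara.io"


def build_create_chat_request(api_key, query, corpus_key="my-docs",
                              max_used_search_results=25):
    """Build (but do not send) a POST /v2/chats request."""
    body = {
        "query": query,
        # Mirrors SearchCorporaParameters: which corpora to search (assumed field names).
        "search": {"corpora": [{"corpus_key": corpus_key}]},
        # Mirrors GenerationParameters (assumed field names).
        "generation": {"max_used_search_results": max_used_search_results},
        # Mirrors ChatParameters: store=True keeps history for later turns.
        "chat": {"store": True},
    }
    return urllib.request.Request(
        url=f"{API_BASE}/v2/chats",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


request = build_create_chat_request("YOUR_API_KEY", "What is in my documents?")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it; the response body is expected to carry the chat_id, answer, and factual_consistency_score fields listed under Returns.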

Multi-turn conversation

In a multi-turn conversation, the AI maintains context across exchanges. Each subsequent message builds on the previous conversation history without requiring explicit context management.

The session.chat method corresponds to the HTTP POST /v2/chats/{chat_id}/turns endpoint. For more details on request and response parameters, see the Create Chat Turn REST API.

Conversation Flow:

Initial Question: Establishes the topic and context
Follow-up Questions: Reference previous answers using pronouns and implicit context
Automatic Context: The session maintains conversation history transparently

Benefits:

Natural conversation flow without manual context passing
Each response considers the full conversation history
Factual consistency maintained across all turns
Easy to implement - just call session.chat() for each turn
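
The conversation flow above can be sketched as a small helper that sends each question through the same session. It assumes that session is the object returned by create_chat_session, that session.chat accepts a query keyword argument, and that the response exposes answer and factual_consistency_score attributes; these names follow this guide but are not verified against the SDK here.

```python
def run_conversation(session, questions):
    """Send each question through the same session and collect the replies.

    `session` is assumed to expose chat(query=...) and to keep the
    conversation history itself, so no context is passed explicitly.
    """
    transcript = []
    for question in questions:
        response = session.chat(query=question)  # assumed keyword name
        transcript.append({
            "question": question,
            "answer": response.answer,
            "factual_consistency_score": response.factual_consistency_score,
        })
    return transcript
```

A follow-up such as "How do I configure it?" only makes sense because the session already holds the earlier turns; the helper never resends them.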

List chat conversations

Retrieve the list of chat conversations for monitoring, analytics, or display in a user interface. This is useful for building chat interfaces that show conversation lists.

The chats.list method corresponds to the HTTP GET /v2/chats endpoint. For more details on request and response parameters, see the List Chats REST API.

Chat Metadata Includes:

id: Unique chat identifier
first_query: Opening message of the conversation
created_at: Timestamp of chat creation
enabled: Whether the chat is active
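
As an illustration of how this metadata might feed a conversation list, the following sketch turns chat objects into display lines. It assumes each item returned by chats.list exposes the four fields above as attributes; summarize_chats and preview_length are hypothetical names introduced for this example.

```python
def summarize_chats(chats, preview_length=40):
    """Return one display line per enabled chat."""
    lines = []
    for chat in chats:
        if not chat.enabled:
            continue  # skip chats that are no longer active
        first_query = chat.first_query or ""
        preview = first_query[:preview_length]
        if len(first_query) > preview_length:
            preview += "..."
        lines.append(f"{chat.created_at}  {chat.id}  {preview}")
    return lines
```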

Streaming chat responses

Stream chat responses in real time so that users see each response as it is generated. This makes interactive chat interfaces feel more responsive.

The chat stream method corresponds to the HTTP POST /v2/chats/{chat_id}/turns/stream endpoint.

Streaming Benefits:

Immediate feedback as the response generates
Better perceived performance for longer responses
Natural conversation feel in interactive applications
Can be stopped early if needed
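
The consumer side of streaming, including an early stop, can be sketched independently of the SDK. The helper below takes any iterable of text pieces; how to extract text from the SDK's stream events is not shown and would need to be checked against the SDK. consume_stream, on_chunk, and max_chars are hypothetical names.

```python
def consume_stream(text_chunks, on_chunk=print, max_chars=None):
    """Display chunks as they arrive and return the text received so far.

    Stops pulling from the stream once max_chars characters have been seen.
    """
    parts = []
    total = 0
    for chunk in text_chunks:
        parts.append(chunk)
        total += len(chunk)
        on_chunk(chunk)  # e.g. append to a UI element
        if max_chars is not None and total >= max_chars:
            break  # stop early; remaining chunks are never requested
    return "".join(parts)
```

Because the loop pulls lazily from the iterable, breaking out of it means later chunks are not consumed, which is what makes early cancellation cheap on the client side.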

Chat history management

Access the complete conversation history of a specific chat session for audit, analysis, or display.

The chats.get method corresponds to the HTTP GET /v2/chats/{chat_id} endpoint. For more details on request and response parameters, see the Get Chat REST API.

The chats.turns.list method corresponds to the HTTP GET /v2/chats/{chat_id}/turns endpoint. For more details on request and response parameters, see the List Chat Turns REST API.

History Components:

Chat Metadata: Overall conversation information
Turns: Individual message exchanges between user and assistant
Turn Details: Each turn includes query, answer, and timestamp
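
For display or audit, the turns can be flattened into a readable transcript. This sketch assumes each turn exposes query, answer, and created_at attributes; the attribute name for the timestamp in particular is an assumption.

```python
def format_transcript(turns):
    """Render turns as alternating user and assistant lines."""
    lines = []
    for turn in turns:
        lines.append(f"[{turn.created_at}] User: {turn.query}")
        lines.append(f"[{turn.created_at}] Assistant: {turn.answer}")
    return "\n".join(lines)
```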

Best Practices

Monitor factual consistency scores for quality control
Use appropriate max_used_search_results (25-50 for most cases)
Enable chat storage (store=True) for multi-turn conversations
Implement session management for user conversations
Consider streaming for better user experience
Limit conversations to 50-100 turns to maintain context quality
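
Two of these practices, monitoring factual consistency scores and capping conversation length, can be combined in a small tracker. This is a minimal sketch: ConversationMonitor is a hypothetical helper, the 0.5 score threshold is an arbitrary illustrative value, and the default cap of 50 turns is the lower bound of the range above.

```python
class ConversationMonitor:
    """Track turn count and flag turns with low factual consistency."""

    def __init__(self, min_score=0.5, max_turns=50):
        self.min_score = min_score  # illustrative threshold, tune per use case
        self.max_turns = max_turns
        self.turns = 0
        self.low_score_turns = []

    def record(self, score):
        """Record one completed turn; score may be None if unavailable."""
        self.turns += 1
        if score is not None and score < self.min_score:
            self.low_score_turns.append(self.turns)

    @property
    def should_start_new_chat(self):
        return self.turns >= self.max_turns
```

Calling record(response.factual_consistency_score) after each turn keeps a list of turns worth reviewing, and should_start_new_chat signals when to open a fresh session.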

Error Handling

400 Bad Request: Check query parameters and corpus configuration
403 Forbidden: Verify API key has chat permissions
404 Not Found: Ensure corpus exists and is accessible
Rate Limiting: Implement retry logic with exponential backoff
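
Retry logic with exponential backoff can be sketched as follows. RateLimitError is a hypothetical placeholder; substitute whatever exception your client raises for rate-limited requests. The delay doubles after each failed attempt, is capped at max_delay, and gets a little random jitter so that concurrent clients do not retry in lockstep.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's rate-limit exception."""


def call_with_backoff(operation, retry_on=(RateLimitError,), max_attempts=5,
                      base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call operation(), retrying on the given exceptions with backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 10))  # add jitter
```

Errors such as 400, 403, and 404 indicate a problem with the request itself, so they should not be included in retry_on.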

Next steps

After understanding chat functionality:

Integration: Combine with document indexing for dynamic knowledge bases
Customization: Experiment with different generation presets and prompts
Analytics: Track conversation patterns and user satisfaction
Scaling: Implement session management for multiple concurrent users
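
As a starting point for the scaling item, here is a minimal sketch of per-user session management. SessionRegistry is a hypothetical helper; create_session is any zero-argument callable that returns a new chat session, for example a wrapper around create_chat_session.

```python
import threading


class SessionRegistry:
    """Keep one chat session per user, created lazily on first use."""

    def __init__(self, create_session):
        self._create_session = create_session
        self._sessions = {}
        self._lock = threading.Lock()  # guard the dict across threads

    def get(self, user_id):
        with self._lock:
            if user_id not in self._sessions:
                self._sessions[user_id] = self._create_session()
            return self._sessions[user_id]

    def end(self, user_id):
        """Forget a user's session so the next get() starts a new chat."""
        with self._lock:
            self._sessions.pop(user_id, None)
```

The registry is in-memory only; a multi-process deployment would need to persist the mapping from user to chat_id in shared storage instead.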
