Chats
This guide covers how to manage chat conversations with the Vectara Python SDK, which combines Retrieval Augmented Generation (RAG) with chat history to support conversational AI. The methods described here let you create chats, hold multi-turn conversations, and manage chat history, which makes them well suited to interactive applications such as support chatbots or customer service platforms.
This guide assumes you have a corpus called my-docs with indexed documents. If you haven't created a corpus yet, follow the Quick Start guide to set up your first corpus and add some documents.
Create a chat session
Create a chat session that can maintain conversation context across multiple exchanges. The session handles RAG integration automatically, providing contextual responses based on your corpus content.
The create_chat_session method corresponds to the HTTP POST /v2/chats endpoint. For more details on request and response parameters, see the Create Chat REST API.
Key Parameters:
- SearchCorporaParameters: Defines which corpora to search and filtering options
- GenerationParameters: Controls response generation quality and style
- ChatParameters: Enables conversation history storage for multi-turn interactions
- store=True: Essential for maintaining context across conversation turns
Returns:
- chat_id: Unique identifier for the conversation session
- answer: AI-generated response based on corpus content
- factual_consistency_score: Reliability score for the response
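As a rough illustration of what a create call sends, the sketch below builds (but does not send) an equivalent POST /v2/chats request using only the Python standard library. The base URL, the x-api-key header, and the JSON field names (search, corpora, corpus_key, generation, chat) are assumptions inferred from the parameter names above, not details confirmed by this guide; check the Create Chat REST API reference before relying on them.

```python
import json
import urllib.request

# Assumed base URL; confirm against the REST API reference.
BASE_URL = "https://api.vectara.io"


def build_create_chat_request(api_key, corpus_key, query):
    """Build (without sending) a request for POST /v2/chats."""
    body = {
        "query": query,
        # Assumed counterpart of SearchCorporaParameters.
        "search": {"corpora": [{"corpus_key": corpus_key}]},
        # Assumed counterpart of GenerationParameters.
        "generation": {"max_used_search_results": 25},
        # Assumed counterpart of ChatParameters; store=True keeps history.
        "chat": {"store": True},
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v2/chats",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


# Placeholder key and question, for illustration only.
request = build_create_chat_request("YOUR_API_KEY", "my-docs", "What is RAG?")
print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes the payload easy to inspect or log; in practice the SDK's create_chat_session method handles this step for you.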
Multi-turn conversation
Hold a natural multi-turn conversation in which the AI maintains context across exchanges. Each subsequent message builds on the previous conversation history without requiring explicit context management.
The chat turn method corresponds to the HTTP POST /v2/chats/{chat_id}/turns endpoint. For more details on request and response parameters, see the Create Chat Turn REST API.
Conversation Flow:
- Initial Question: Establishes the topic and context
- Follow-up Questions: Reference previous answers using pronouns and implicit context
- Automatic Context: The session maintains conversation history transparently
Benefits:
- Natural conversation flow without manual context passing
- Each response considers the full conversation history
- Factual consistency maintained across all turns
- Easy to implement: just call session.chat() for each turn
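The conversation flow above can be sketched as a small helper that sends each question as one turn. This is a minimal sketch, not the SDK's own code: it assumes session is the object returned by create_chat_session, that session.chat() accepts the message as a query keyword argument, and that the response exposes the answer field listed under Returns. The helper name run_conversation is hypothetical.

```python
def run_conversation(session, questions):
    """Send each question as one turn and collect (question, answer) pairs."""
    transcript = []
    for question in questions:
        # Assumption: chat() takes the message as `query` and the
        # response exposes `answer`.
        response = session.chat(query=question)
        transcript.append((question, response.answer))
    return transcript
```

A call might look like run_conversation(session, ["What is our refund policy?", "Does it apply to digital goods?"]), where the second question relies on the stored history to resolve "it".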
List chat conversations
Retrieve and display chat conversation history for monitoring, analytics, or user interface display. Useful for building chat interfaces that show conversation lists.
The chats.list method corresponds to the HTTP GET /v2/chats endpoint. For more details on request and response parameters, see the List Chats REST API.
Chat Metadata Includes:
- id: Unique chat identifier
- first_query: Opening message of the conversation
- created_at: Timestamp of chat creation
- enabled: Whether the chat is active
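To show how these metadata fields might feed a conversation list in a UI, here is a minimal sketch that turns chat objects into display lines. The helper summarize_chats is hypothetical, and the sample objects are stand-ins so the code runs offline; real code would presumably pass the result of chats.list, assuming it can be iterated and that each item exposes the four fields above as attributes.

```python
from types import SimpleNamespace


def summarize_chats(chats, only_enabled=True):
    """Return one display line per chat from its metadata fields."""
    lines = []
    for chat in chats:
        if only_enabled and not chat.enabled:
            continue  # skip inactive chats
        lines.append(f"{chat.id} | {chat.created_at} | {chat.first_query}")
    return lines


# Stand-in data (not real API output) so the sketch runs offline.
sample = [
    SimpleNamespace(id="cht_1", first_query="What is RAG?",
                    created_at="2024-01-01T00:00:00Z", enabled=True),
    SimpleNamespace(id="cht_2", first_query="Old thread",
                    created_at="2024-01-02T00:00:00Z", enabled=False),
]
for line in summarize_chats(sample):
    print(line)
```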
Streaming chat responses
Stream chat responses in real time for a better user experience in interactive applications. Streaming suits responsive chat interfaces where users see responses as they're generated.
The chat stream method corresponds to the HTTP POST /v2/chats/{chat_id}/turns/stream endpoint.
Streaming Benefits:
- Immediate feedback as the response generates
- Better perceived performance for longer responses
- Natural conversation feel in interactive applications
- Can be stopped early if needed
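The benefits above, including stopping early, can be sketched with a helper that consumes a stream of events. This is a sketch under stated assumptions: it assumes the streaming call returns an iterable of event objects and that text-bearing events expose a generation_chunk attribute, neither of which this guide confirms. The helper name collect_stream and its max_chars parameter are hypothetical.

```python
def collect_stream(events, max_chars=None):
    """Join streamed text chunks, optionally stopping early."""
    parts = []
    total = 0
    for event in events:
        # Assumption: text events expose `generation_chunk`;
        # any other event type is skipped.
        chunk = getattr(event, "generation_chunk", None)
        if not chunk:
            continue
        print(chunk, end="", flush=True)  # show text as it arrives
        parts.append(chunk)
        total += len(chunk)
        if max_chars is not None and total >= max_chars:
            break  # stop consuming the stream early
    return "".join(parts)
```

Printing each chunk as it arrives gives the immediate feedback described above, while the returned string preserves the full text for logging or storage.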
Chat history management
Retrieve the complete conversation history of a specific chat session for audit, analysis, or display purposes.
The chats.get method corresponds to the HTTP GET /v2/chats/{chat_id} endpoint. For more details on request and response parameters, see the Get Chat REST API.
The chats.turns.list method corresponds to the HTTP GET /v2/chats/{chat_id}/turns endpoint. For more details on request and response parameters, see the List Chat Turns REST API.
History Components:
- Chat Metadata: Overall conversation information
- Turns: Individual message exchanges between user and assistant
- Turn Details: Each turn includes query, answer, and timestamp
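As one way to work with these history components, the sketch below flattens a list of turns into a chronological transcript. It assumes each turn exposes query, answer, and a created_at timestamp as attributes; the guide names a timestamp but not its field name, so created_at is a guess borrowed from the chat metadata. The helper build_transcript is hypothetical.

```python
def build_transcript(turns):
    """Flatten turns into (timestamp, role, text) rows, oldest first."""
    rows = []
    # Assumption: each turn exposes query, answer, and created_at.
    for turn in sorted(turns, key=lambda t: t.created_at):
        rows.append((turn.created_at, "user", turn.query))
        rows.append((turn.created_at, "assistant", turn.answer))
    return rows
```

Sorting works for either datetime objects or ISO 8601 strings in a consistent format and time zone, since both compare chronologically.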
Best Practices
- Monitor factual consistency scores for quality control
- Use an appropriate max_used_search_results value (25-50 for most cases)
- Enable chat storage (store=True) for multi-turn conversations
- Implement session management for user conversations
- Consider streaming for better user experience
- Limit conversations to 50-100 turns to maintain context quality
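Two of these practices, monitoring factual consistency scores and capping conversation length, lend themselves to a simple per-turn check. The sketch below is illustrative only: the 0.5 score threshold is a hypothetical value to tune against your own data, and review_turn is not part of the SDK.

```python
MAX_TURNS = 50  # lower end of the 50-100 turn guideline
MIN_SCORE = 0.5  # hypothetical review threshold; tune for your data


def review_turn(turn_count, factual_consistency_score):
    """Return a list of warnings for one completed turn."""
    warnings = []
    score = factual_consistency_score
    if score is not None and score < MIN_SCORE:
        warnings.append("low factual consistency score")
    if turn_count >= MAX_TURNS:
        warnings.append("turn limit reached, start a new chat")
    return warnings
```

An application could log these warnings or, when the turn limit is reached, create a fresh chat session for the user.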
Error Handling
- 400 Bad Request: Check query parameters and corpus configuration
- 403 Forbidden: Verify API key has chat permissions
- 404 Not Found: Ensure corpus exists and is accessible
- Rate Limiting: Implement retry logic with exponential backoff
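The retry advice above can be sketched as a generic wrapper with exponential backoff. This is a minimal sketch, assuming that rate-limit failures surface as exceptions carrying a status_code attribute equal to 429; the guide does not say how the SDK reports them, so adapt the check to the exception type you actually see. The helper name call_with_backoff is hypothetical.

```python
import random
import time


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying rate-limit errors with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Assumption: rate-limit errors carry status_code == 429.
            retryable = getattr(exc, "status_code", None) == 429
            if not retryable or attempt == max_attempts - 1:
                raise  # other errors, or out of attempts
            # Wait 1x, 2x, 4x, ... the base delay, plus a little jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
```

A turn could then be wrapped as call_with_backoff(lambda: session.chat(query="...")). Errors such as 400, 403, and 404 are re-raised immediately, since retrying will not fix them.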
Next steps
After understanding chat functionality:
- Integration: Combine with document indexing for dynamic knowledge bases
- Customization: Experiment with different generation presets and prompts
- Analytics: Track conversation patterns and user satisfaction
- Scaling: Implement session management for multiple concurrent users