Version: 2.0

Integrations

Integrations with a Vectara agent come in three flavors: feeding data in, giving the agent something it can call mid-turn, and routing messages to the agent from another surface.

A data source pulls records in on a schedule. A tool lets an agent reach out during a turn. A connector lets a third-party surface deliver messages to an agent and receive its replies. The three are independent (wiring up one doesn't imply the others), but most real applications use at least two.

Data sources

A data source is the input to a pipeline. The pipeline runs on a trigger, fetches new or changed records from the source, and hands each record to an agent session as its first input. One record becomes one session. Today the only first-class source is S3, with more on the way. For the data model, the trigger types, and the sync modes, see Pipeline concepts.
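The run loop described above can be pictured with a small conceptual model. This is only a sketch of the idea (fetch what is new or changed, start one session per record), not Vectara's implementation; the Record and Pipeline classes and the version field are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    key: str      # hypothetical: identifies the record in the source
    version: str  # hypothetical: changes whenever the record's content changes
    body: str


@dataclass
class Pipeline:
    # Versions seen on earlier runs, keyed by record key.
    seen: dict = field(default_factory=dict)

    def run(self, source):
        """One triggered run: start a session for each new or changed record."""
        sessions = []
        for record in source:
            if self.seen.get(record.key) == record.version:
                continue  # unchanged since the last run, so skip it
            self.seen[record.key] = record.version
            # One record becomes one session; the record is its first input.
            sessions.append({"record_key": record.key, "first_input": record.body})
        return sessions
```

Running the same source twice yields sessions only on the first run; changing one record's version yields exactly one new session on the next run.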

If your data lives somewhere a pipeline can't reach yet, vectara-ingest is the older path. It is a Python crawler library you operate yourself, with much wider source coverage and arbitrary customization, at the cost of scheduling and running it on your own infrastructure. Reach for a pipeline when one fits the source you have, and fall back to vectara-ingest when no first-class source covers it yet.

For agentic ingestion pipelines, attach the built-in document processing tools to the ingestion agent. These are tools for analyzing, converting, chunking, merging, and indexing documents. The agent reads documents from a source, passes them through these tools in sequence, and writes the result to a corpus. See Pipelines for the full setup.

Tools

A tool is a capability an agent can call mid-conversation. Built-in tools cover corpus search, the web, artifact handling, and a handful of other primitives. Custom tools cover the rest.

The fastest way to expose a third-party REST API to an agent is web_get, the built-in HTTP fetcher. On paper an agent can already call any URL with it, but a generic tool is a poor tool: the LLM has to guess the endpoint, the headers, and the body, and gets the whole page back. With a focused name, a tight description, locked-down parameters, and an output transform, the same primitive becomes a purpose-built API client without writing any code. The recipe is in Wrap a REST API with web_get.
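To make the idea concrete, here is a minimal Python sketch of what "wrapping" a generic fetcher means: the endpoint is pinned, the parameters are restricted to an allow-list, and an output transform trims the response. It illustrates the principle only and is not the web_get configuration format; make_api_tool, fake_fetch, the example URL, and the ticket fields are all hypothetical.

```python
from typing import Any, Callable, Set
from urllib.parse import urlencode


def make_api_tool(
    fetch: Callable[[str], Any],
    base_url: str,
    allowed_params: Set[str],
    transform: Callable[[Any], Any],
) -> Callable[..., Any]:
    """Turn a generic URL fetcher into a narrow, purpose-built client."""

    def tool(**params: str) -> Any:
        unknown = set(params) - allowed_params
        if unknown:
            # Locked-down parameters: the caller cannot invent new ones.
            raise ValueError(f"unsupported parameters: {sorted(unknown)}")
        query = urlencode(sorted(params.items()))
        url = f"{base_url}?{query}" if query else base_url
        # Output transform: return only the fields the agent needs.
        return transform(fetch(url))

    return tool


def fake_fetch(url: str) -> dict:
    """Stand-in for a real HTTP call so the sketch runs offline."""
    return {
        "url": url,
        "results": [{"id": 1, "title": "Printer down", "internal_notes": "n/a"}],
    }


search_tickets = make_api_tool(
    fetch=fake_fetch,
    base_url="https://api.example.com/tickets",  # hypothetical endpoint
    allowed_params={"status", "limit"},
    transform=lambda payload: [
        {"id": r["id"], "title": r["title"]} for r in payload["results"]
    ],
)
```

Calling search_tickets(status="open") returns only the id and title of each ticket, and any parameter outside the allow-list is rejected before a request is made.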

For the credential side — how to wire user-OAuth, service-account OAuth, and static-bearer auth onto web_get tools without leaking secrets into the prompt — see Connector authentication patterns.

When web_get isn't enough (multi-step auth flows, response pagination, side effects you don't want to encode in a description), write a lambda tool or expose your service through MCP.

To attach any tool to an agent in the console:

  1. Open the agent.
  2. Select the Settings tab.
  3. Click the Edit button.
  4. Navigate to step 3, Abilities, and click + Add tools.
  5. Switch to the All tools tab, search by name, and click Add. Pre-built integrations for specific systems (such as Wolken, Jira, and Slack) display in that tab once they are registered in your account.

Connectors

A connector is a venue. It is the inverse of a tool: instead of the agent reaching out, a third-party surface delivers an event to the agent and the agent replies through the same channel. A connector owns its webhook, its credentials, and the translation between the venue's message shape and an agent session.

The only connector type supported today is Slack. A Slack connector binds an agent to a Slack app and signing secret, exposes a webhook path under /v2/agents/{agent_key}/connectors/{connector_id}/input, and turns threads in your workspace into agent sessions. We add new connector types as customers ask for them, so if you need Teams, Discord, or a custom surface, tell us. For the configuration shape, see the agent connectors section of the API reference.
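For readers wondering what the signing secret is for, the sketch below shows Slack's documented request-signing scheme in general terms: Slack signs each webhook request with HMAC-SHA256 over the timestamp and raw body, and the receiver recomputes the signature to confirm the request is authentic and recent. This is not Vectara's connector code; the function name, parameters, and the five-minute window are illustrative choices.

```python
import hashlib
import hmac
import time


def verify_slack_signature(
    signing_secret: str,
    timestamp: str,   # value of the X-Slack-Request-Timestamp header
    body: bytes,      # raw request body, exactly as received
    signature: str,   # value of the X-Slack-Signature header
    now: float = None,
    max_age_seconds: int = 300,
) -> bool:
    """Return True if the request was signed with signing_secret and is recent."""
    now = time.time() if now is None else now
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    # Reject stale requests to limit replay attacks.
    if abs(now - sent_at) > max_age_seconds:
        return False
    base = b"v0:" + timestamp.encode() + b":" + body
    digest = hmac.new(signing_secret.encode(), base, hashlib.sha256).hexdigest()
    expected = "v0=" + digest
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```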

Where permissions live

Vectara supports two distinct patterns for agent access to enterprise data: indexing data into a corpus, or reaching an external system through a tool. These patterns serve different use cases and are not interchangeable. The right choice depends on whether the data should be optimized for retrieval inside Vectara, or whether the source system should remain the live point of access-control enforcement.

Data sources index data into a corpus. Once data is ingested, Vectara stores the indexed content in the corpus and serves retrieval from it. The source system’s permissions do not automatically travel with each record. Vectara enforces corpus-level RBAC, which controls who can query the corpus, but it does not natively inherit per-document ACLs from the source system. To preserve user-specific access boundaries, replicate the source system’s access metadata onto each document at index time, then apply a metadata_filter on every query based on the user’s verified attributes. See attribute-based access control. You are also responsible for keeping that metadata synchronized when permissions change in the source system.
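As an illustration of the query-time half of this pattern, the sketch below builds a filter string from a user's verified groups. Treat it as a sketch under assumptions: the field name doc.allowed_group is hypothetical, and the SQL-like IN syntax is assumed rather than taken from this page, so check the filter expression reference before relying on it. The function fails closed, refusing to produce a filter when the user has no groups or a group name contains unexpected characters.

```python
import re

# Conservative allow-list for group names, so no quoting or escaping is needed.
_SAFE_GROUP = re.compile(r"[A-Za-z0-9_.\-]+")


def build_acl_filter(user_groups, field="doc.allowed_group"):
    """Build a metadata filter limiting results to the user's groups.

    The field name and the IN syntax are assumptions for illustration.
    """
    groups = sorted(set(user_groups))
    if not groups:
        # Fail closed: never run an unfiltered query for a user with no groups.
        raise PermissionError("user has no groups; refusing to build a filter")
    for group in groups:
        if not _SAFE_GROUP.fullmatch(group):
            raise ValueError(f"unsafe group name: {group!r}")
    quoted = ", ".join(f"'{group}'" for group in groups)
    return f"{field} IN ({quoted})"
```

The groups passed in should come from a verified identity (for example, claims in a validated token), never from user-supplied request parameters.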

Tools access external systems at request time. When an agent calls a tool that reaches an external system, the data remains in the system of record. If the tool runs under the user’s own credentials, the upstream system enforces its permissions on each request. In this model, the user’s access travels with the request because the request is evaluated by the source system at runtime. See connector authentication patterns.
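The following toy model shows why user-scoped credentials matter: the tool simply forwards the caller's token, and the upstream system decides what that caller may see. Everything here (UpstreamSystem, fetch_document_tool, the tokens and document IDs) is hypothetical and stands in for a real system of record.

```python
class UpstreamSystem:
    """Toy system of record that enforces its own per-user permissions."""

    def __init__(self, tokens, acl, documents):
        self.tokens = tokens        # token -> user name
        self.acl = acl              # user name -> set of readable document IDs
        self.documents = documents  # document ID -> content

    def get(self, token, doc_id):
        user = self.tokens.get(token)
        if user is None:
            raise PermissionError("invalid token")
        if doc_id not in self.acl.get(user, set()):
            raise PermissionError(f"{user} may not read {doc_id}")
        return self.documents[doc_id]


def fetch_document_tool(upstream, user_token, doc_id):
    """Tool body: forward the user's own token and let upstream decide."""
    return upstream.get(user_token, doc_id)


upstream = UpstreamSystem(
    tokens={"tok-alice": "alice", "tok-bob": "bob"},
    acl={"alice": {"doc-1"}, "bob": set()},
    documents={"doc-1": "Q3 forecast"},
)
```

With this setup, the same tool call succeeds for Alice's token and raises PermissionError for Bob's, without the tool holding any access rules of its own.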

Choose the pattern based on the data’s permission model, freshness requirements, and retrieval needs. Use a data source when you want fast, grounded RAG over indexed content and can maintain the required access metadata in the corpus. Use a tool when permissions are volatile, highly audited, compliance-sensitive, or best enforced directly by the source system. The tool pattern lets agents retrieve from external systems without accumulating duplicate copies of enterprise data, and full ingestion remains available when indexed retrieval is the better fit.