Version: 2.0

Skills

A common mistake when building an agent is to stuff every specialist instruction into the system prompt up front: a code-review checklist, a customer-support playbook, a five-page SQL style guide, rules for how to handle refunds. The result is a 15,000-character system prompt that costs tokens on every turn, spreads the model's attention thin, and still doesn't help the agent pick the right guidance for the situation.

Skills are the progressive-disclosure pattern for this problem. A skill has a short description that lives in the system prompt and a larger body of content that's loaded into the conversation only when the agent decides it needs it. This is a standard technique in modern agent harnesses (Claude Code uses it for its built-in skills); Vectara exposes it as a first-class field on the agent.

How skills work

A skill has two parts: a short description that lives in the system prompt and tells the LLM what the skill is good for, and a longer content body that's hidden until the skill is invoked. Only the description pays attention cost on every turn.

When any skills are configured on an agent, the platform makes an invoke_skill tool available to the agent. Calling it with a skill name emits a skill_load event, which inserts the skill's content into the conversation as a user message. From that point on, the skill content is part of the session's history and the agent can act on it.

Loaded skill content is not persisted back to the agent's instructions. If the session continues past the point where the skill is useful, compaction can fold the loaded content into a summary like any other history.
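The mechanics above can be sketched in a few lines of plain Python. This is a simulation of the flow, not the platform's implementation, and every class and method name here is illustrative:

```python
# Illustrative simulation of progressive disclosure via skills.
# Only descriptions appear in the system prompt; content is deferred.
class Skill:
    def __init__(self, description, content):
        self.description = description  # always visible up front
        self.content = content          # hidden until the skill is invoked

class Session:
    def __init__(self, skills):
        self.skills = skills   # name -> Skill, the agent's skills map
        self.history = []      # conversation messages

    def system_prompt(self):
        # Only names and descriptions pay attention cost on every turn.
        return "\n".join(
            f"<skill><name>{name}</name>"
            f"<description>{s.description}</description></skill>"
            for name, s in self.skills.items()
        )

    def invoke_skill(self, name):
        # skill_load: the content enters the history as a user message.
        self.history.append({"role": "user", "content": self.skills[name].content})

session = Session({
    "code_review": Skill(
        "Reviews code for bugs, security issues, and style.",
        "Full checklist: check error handling, injection risks, naming...",
    )
})
assert "code_review" in session.system_prompt()  # description visible up front
assert session.history == []                     # content not loaded yet
session.invoke_skill("code_review")
assert session.history[0]["content"].startswith("Full checklist")
```

Note that after `invoke_skill`, the content is ordinary history: nothing distinguishes it from any other user message, which is why compaction can fold it into a summary later.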

Write descriptions the LLM can act on

The description is the only thing the model sees up front, so it alone determines whether the skill ever gets picked up. Write it as a decision aid: what the skill is for and when to reach for it.

Weak: "Knowledge base search."

The model has no signal about when to invoke this over any other lookup. It will either over-invoke it on every turn or ignore it.

Strong: "Looks up billing and subscription facts from the customer knowledge base. Use when the user asks about plan pricing, invoice history, or why a charge appeared."

The model now has both the domain (billing, subscriptions) and the trigger conditions (pricing, invoices, charges).

Define skills on the agent

Skills live in the skills map on the agent, keyed by skill name. Keep the name short and action-oriented — the agent chooses among skills by name, and a clear name makes that easier.

AGENT WITH SKILLS CONFIGURED

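A sketch of what the configuration might look like. The exact field names below are assumptions, inferred from the fields this page describes (a skills map keyed by name, with a description and content per skill):

```json
{
  "name": "support-agent",
  "instructions": "If the user shares code, invoke the code_review skill before commenting.",
  "skills": {
    "code_review": {
      "description": "Reviews code for bugs, security issues, and style. Use when the user shares code or asks for a review.",
      "content": "Code review checklist:\n1. Error handling and edge cases\n2. Injection and other security risks\n3. Naming and style conventions\n..."
    }
  }
}
```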

When this agent starts a session, its system prompt includes a block like:

<skill>
<name>code_review</name>
<description>Reviews code for bugs, security issues, and style...</description>
</skill>

The LLM sees the name and description; everything in content is deferred until the skill is actually needed.

Invoking a skill

The underlying flow (invoke_skill → skill_load → user message) is covered in How skills work. What matters here is how to get the agent to actually call it.

No configuration is needed to enable invoke_skill — it's automatically present whenever the agent has skills defined. The agent's instructions should explain when to reach for a skill (e.g., "If the user shares code, invoke the code_review skill before commenting"), so the LLM knows it has that option. Without that nudge, a model with strong prior behaviors may ignore the tool.

Client-triggered skill invocation

A client can also invoke a skill directly, without waiting for the model to choose one, by sending an input of type skill with the skill's name instead of a text input. The platform loads the skill content exactly as if the model had called invoke_skill.

Use this for deterministic UI flows where the user has explicitly chosen the skill — a "Run code review" button, a support ticket pre-classified as a refund request, a step that must always start by loading a specific playbook. Let the model choose otherwise.
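For illustration, a client-triggered invocation might send an input like the following. The field names are assumptions; the page only specifies an input of type skill that carries the skill's name:

```json
{ "type": "skill", "skill": "code_review" }
```

Compare a normal turn, which would send a text input instead; everything downstream of the skill_load is identical in both paths.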

Scope skills per step

If you're using steps, each step has an allowed_skills field that scopes which skills are visible during that step. The semantics match allowed_tools: leaving it unset exposes every skill, an empty list hides the invoke_skill tool entirely, and a named list filters down to just those skills.

Scoping skills per step keeps the system prompt lean when a step doesn't need them. A classifier step that only routes to other steps probably doesn't need code_review in its prompt; a draft step does.
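The three-way semantics of allowed_skills can be pinned down with a small sketch (illustrative Python, not platform code):

```python
# Illustrative: resolve which skills a step exposes,
# mirroring the allowed_tools semantics described above.
def visible_skills(all_skills, allowed_skills=None):
    if allowed_skills is None:   # unset: every skill is visible
        return dict(all_skills)
    if not allowed_skills:       # empty list: no skills; invoke_skill is hidden
        return {}
    return {name: s for name, s in all_skills.items() if name in allowed_skills}

skills = {"code_review": "...", "refund_playbook": "..."}
assert visible_skills(skills) == skills                            # unset -> all
assert visible_skills(skills, []) == {}                            # empty -> none
assert visible_skills(skills, ["code_review"]) == {"code_review": "..."}
```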

Skills vs instructions vs reminders

All three put text into the agent's context. Each solves a different problem:

|                | Instructions | Skills | Reminders |
|----------------|--------------|--------|-----------|
| When           | Every turn. | Only after the agent (or client) invokes them. | Re-asserted on every matching turn. |
| Attention cost | Paid on every request, regardless of need. | Paid only in sessions that actually load them. | Paid on every matching turn, but kept short. |
| Lifecycle      | Persistent agent-level config. | Persistent config; loaded content is per-session. | Persistent config; text re-appended near recency. |
| Best for       | Persona, rules that always apply, defaults. | Long specialist guidance for specific situations. | Short constraints the model tends to forget. |

A good heuristic: if the guidance applies to every turn, it's an instruction. If it applies to a specific situation and is too long to always include, it's a skill. If it's short and needs to stay near the end of the prompt to fight recency bias, it's a reminder.

When to reach for a skill

Typical uses:

  • Playbooks and procedures — step-by-step guidance for a specific task (refund handling, incident triage, code review).
  • Style guides — long-form rules that only apply when the agent is writing something of that kind (SQL, API responses, formal letters).
  • Domain-specific reasoning templates — "how to work through a financial reconciliation," "how to draft a security-review response."
  • Less-common escalation paths — procedures that exist for completeness but shouldn't clutter the prompt when they're not needed.

Skills are a poor fit for content that the agent will need on almost every turn — at that point, the loaded content is just the system prompt with an extra step. Promote that content to the instructions instead.

Limits and trade-offs

  • Descriptions matter. The only thing the LLM sees up front is the name and description. If the description is vague, the model won't know when to invoke the skill. Treat each description as a pitch to a reasonable-but-busy reader.
  • Skill content sizes add up. Each loaded skill stays in the conversation for the rest of the session (or until compaction folds it in). Loading many skills in quick succession can blow past the same context limits you were trying to avoid.
  • One invocation loads one skill. The agent must decide, per invocation, which skill to load. There is no "load these three at once" primitive.
  • Not a retrieval system. Skills are a fixed set defined on the agent. If you need the agent to dynamically pull specialist content from a large corpus, that's what corpora_search is for.