Skills
A common mistake when building an agent is to stuff every specialist instruction into the system prompt up front: a code-review checklist, a customer-support playbook, a five-page SQL style guide, rules for how to handle refunds. The result is a 15,000-character system prompt that costs tokens on every turn, spreads the model's attention thin, and still doesn't help the agent pick the right guidance for the situation.
Skills are the progressive-disclosure pattern for this problem. A skill has a short description that lives in the system prompt and a larger body of content that's loaded into the conversation only when the agent decides it needs it. This is a standard technique in modern agent harnesses (Claude Code uses it for its built-in skills); Vectara exposes it as a first-class field on the agent.
How skills work
A skill has two parts: a short description that lives in the system prompt and tells the LLM what the skill is good for, and a longer content body that's hidden until the skill is invoked. Only the description pays attention cost on every turn.
When any skills are configured on an agent, the platform makes an `invoke_skill` tool available to the agent. Calling it with a skill name emits a `skill_load` event, which inserts the skill's content into the conversation as a user message. From that point on, the skill content is part of the session's history and the agent can act on it.
Loaded skill content is not persisted back to the agent's instructions. If the session continues past the point where the skill is useful, compaction can fold the loaded content into a summary like any other history.
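The round trip above can be sketched as a sequence of payloads. All field names here are illustrative, not the platform's exact wire format; only the tool name, event type, and message role are taken from the text:

```json
{
  "tool_call":        { "name": "invoke_skill", "arguments": { "skill_name": "code_review" } },
  "emitted_event":    { "type": "skill_load", "skill_name": "code_review" },
  "appended_message": { "role": "user", "content": "...full code_review skill body..." }
}
```

The last step is the important one: once the content lands in history as a user message, it behaves like any other turn, which is why compaction can later summarize it.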
Write descriptions the LLM can act on
The description is the only thing the model sees up front, so it decides whether the skill ever gets picked up. Write it as a decision aid — what the skill is for and when to reach for it.
Weak: "Knowledge base search."
The model has no signal about when to invoke this over any other lookup. It will either over-invoke it on every turn or ignore it.
Strong: "Looks up billing and subscription facts from the customer knowledge base. Use when the user asks about plan pricing, invoice history, or why a charge appeared."
The model now has both the domain (billing, subscriptions) and the trigger conditions (pricing, invoices, charges).
Define skills on the agent
Skills live in the `skills` map on the agent, keyed by skill name. Keep the name short and action-oriented — the agent chooses among skills by name, and a clear name makes that easier.
AGENT WITH SKILLS CONFIGURED
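A sketch of such an agent, assuming the minimal shape the text describes — a `skills` map keyed by name, each entry with a `description` and a `content` body. The surrounding agent fields (`name`, `instructions`) and the exact schema are illustrative:

```json
{
  "name": "support-agent",
  "instructions": "You are a support engineer. If the user shares code, invoke the code_review skill before commenting.",
  "skills": {
    "code_review": {
      "description": "Reviews code for bugs, security issues, and style. Use when the user shares a diff or asks for feedback on code.",
      "content": "Code review checklist:\n1. Correctness: off-by-one errors, null handling, error paths.\n2. Security: unsanitized input, injection, secrets in code.\n3. Style: naming, dead code, function length.\n...the full multi-page checklist lives here..."
    },
    "refund_playbook": {
      "description": "Step-by-step refund handling. Use when the user disputes a charge or requests a refund.",
      "content": "Refund procedure:\n1. Verify the charge against invoice history.\n...the full playbook lives here..."
    }
  }
}
```

Note that each `content` body can be arbitrarily long without affecting per-turn cost — only the `description` strings are paid for on every request.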
When this agent starts a session, its system prompt includes a block like:
```xml
<skill>
  <name>code_review</name>
  <description>Reviews code for bugs, security issues, and style...</description>
</skill>
```
The LLM sees the name and description; everything in `content` is deferred until the skill is actually needed.
Invoking a skill
The underlying flow (`invoke_skill` → `skill_load` → user message) is covered in How skills work. What matters here is how to get the agent to actually call it.

No configuration is needed to enable `invoke_skill` — it's automatically present whenever the agent has skills defined. The agent's instructions should explain when to reach for a skill (e.g., "If the user shares code, invoke the code_review skill before commenting"), so the LLM knows it has that option. Without that nudge, a model with strong prior behaviors may ignore the tool.
Client-triggered skill invocation
A client can also invoke a skill directly, without waiting for the model to choose one, by sending an input of type `skill` with the skill's name instead of a text input. The platform loads the skill content exactly as if the model had called `invoke_skill`.
Use this for deterministic UI flows where the user has explicitly chosen the skill — a "Run code review" button, a support ticket pre-classified as a refund request, a step that must always start by loading a specific playbook. Let the model choose otherwise.
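Concretely, a "Run code review" button might send an input like the following instead of a text input. The text specifies only that the input has type `skill` and carries the skill's name; the field names in both payloads are assumptions:

```json
{ "type": "skill", "skill_name": "code_review" }
```

versus the ordinary `{ "type": "text", "text": "Please review this diff..." }`. The session history after either path is identical to a model-initiated `invoke_skill` call.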
Scope skills per step
If you're using steps, each step has an `allowed_skills` field that scopes which skills are visible during that step. The semantics match `allowed_tools`: leaving it unset exposes every skill, an empty list hides the `invoke_skill` tool entirely, and a named list filters down to just those skills.
Scoping skills per step keeps the system prompt lean when a step doesn't need them. A classifier step that only routes to other steps probably doesn't need `code_review` in its prompt; a draft step does.
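A minimal sketch of that split, where only `allowed_skills` and its three semantics (unset, empty, named list) come from the text and the other step fields are illustrative:

```json
{
  "steps": {
    "classify": {
      "instructions": "Route the request to the right step.",
      "allowed_skills": []
    },
    "draft": {
      "instructions": "Write the code review.",
      "allowed_skills": ["code_review"]
    }
  }
}
```

The classify step's empty list hides `invoke_skill` entirely; the draft step sees only `code_review`. A step with no `allowed_skills` key at all would see every skill on the agent.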
Skills vs instructions vs reminders
All three put text into the agent's context. Each solves a different problem:
| | Instructions | Skills | Reminders |
|---|---|---|---|
| When | Every turn. | Only after the agent (or client) invokes them. | Re-asserted on every matching turn. |
| Attention cost | Paid on every request, regardless of need. | Paid only in sessions that actually load them. | Paid on every matching turn, but kept short. |
| Lifecycle | Persistent agent-level config. | Persistent config; loaded content is per-session. | Persistent config; text re-appended near recency. |
| Best for | Persona, rules that always apply, defaults. | Long specialist guidance for specific situations. | Short constraints the model tends to forget. |
A good heuristic: if the guidance applies to every turn, it's an instruction. If it applies to a specific situation and is too long to always include, it's a skill. If it's short and needs to stay near the end of the prompt to fight recency bias, it's a reminder.
When to reach for a skill
Typical uses:
- Playbooks and procedures — step-by-step guidance for a specific task (refund handling, incident triage, code review).
- Style guides — long-form rules that only apply when the agent is writing something of that kind (SQL, API responses, formal letters).
- Domain-specific reasoning templates — "how to work through a financial reconciliation," "how to draft a security-review response."
- Less-common escalation paths — procedures that exist for completeness but shouldn't clutter the prompt when they're not needed.
Skills are a poor fit for content that the agent will need on almost every turn — at that point, the loaded content is just the system prompt with an extra step. Promote that content to the instructions instead.
Limits and trade-offs
- Descriptions matter. The only thing the LLM sees up front is the name and description. If the description is vague, the model won't know when to invoke the skill. Treat each description as a pitch to a reasonable-but-busy reader.
- Skill content sizes add up. Each loaded skill stays in the conversation for the rest of the session (or until compaction folds it in). Loading many skills in quick succession can blow past the same context limits you were trying to avoid.
- One invocation loads one skill. The agent must decide, per invocation, which skill to load. There is no "load these three at once" primitive.
- Not a retrieval system. Skills are a fixed set defined on the agent. If you need the agent to dynamically pull specialist content from a large corpus, that's what `corpora_search` is for.
Related
- Context engineering overview
- Instructions
- Reminders
- Compaction — interacts with loaded skill content in long sessions.
- Steps — per-step `allowed_skills` scoping.