Version: 2.0

Structured outputs

Structured outputs let you attach a JSON schema to a step's output parser so the model returns JSON that downstream code, another step, or an automation can consume without re-parsing. Reach for them whenever a step's final response feeds machine logic rather than a human reader.

This is particularly useful in multi-step agents: a classifier step that emits a structured label can be read by next_steps conditions like get('$.output.intent') == 'sales' without string parsing. It's also a clean integration point for any pipeline or downstream system that expects a well-typed response.

Configure on a step

output_parser is a field on every agent step. By default it uses the default parser, which returns free-form text. To get structured output, switch it to structured and attach a json_schema:

CLASSIFIER STEP EMITTING AN INTENT AND A CORPUS FILTER

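As a rough illustration of what such a step might look like, the sketch below builds a classifier step as a Python dict and prints it as JSON. Only output_parser, structured, json_schema, and strict are named in this page; the nesting, the "type" and "name" keys, and the intent and corpus_filter fields are assumptions made for the example, so check the API reference for the real shape.

```python
import json

# Hypothetical schema for a classifier that emits an intent and a corpus filter.
intent_schema = {
    "type": "object",
    "properties": {
        "intent": {"type": "string", "enum": ["sales", "support", "other"]},
        "corpus_filter": {"type": "string"},
    },
    "required": ["intent", "corpus_filter"],
    "additionalProperties": False,
}

# Assumed step layout: the key names around json_schema are illustrative only.
classifier_step = {
    "name": "classify",
    "output_parser": {
        "type": "structured",
        "json_schema": intent_schema,
        "strict": True,
    },
}

print(json.dumps(classifier_step, indent=2))
```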

The schema constrains only the agent's final response. Tool calls still work normally mid-turn; the schema only applies when the agent is ready to end the turn and produce its final text output.

Strict mode and provider support

Vectara forwards the JSON schema to every provider that has a native structured-output mode, but the strict flag is honored only on OpenAI-family clients (chat completions and the Responses API). Anthropic and Vertex drop strict and receive the schema on its own, so setting strict on those models is a no-op. The schema is still respected wherever the provider supports JSON-schema responses.

Strict mode enforces a subset of JSON Schema; consult OpenAI's docs or the API reference for the exact rules.
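To catch problems before a request goes out, a schema can be linted locally. The sketch below checks only two rules that OpenAI's structured-output documentation describes for strict mode: every object sets additionalProperties to false, and every property is listed in required. It is not a complete check of the subset, and the function name is hypothetical.

```python
def strict_mode_problems(schema, path="$"):
    """Return messages for objects that break the two rules checked here."""
    problems = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        missing = sorted(set(props) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: not listed in required: {missing}")
        # Recurse into nested object properties.
        for name, sub in props.items():
            problems.extend(strict_mode_problems(sub, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(strict_mode_problems(schema["items"], f"{path}[]"))
    return problems


loose = {
    "type": "object",
    "properties": {"intent": {"type": "string"}, "score": {"type": "number"}},
    "required": ["intent"],
}
tight = dict(loose, required=["intent", "score"], additionalProperties=False)

print(strict_mode_problems(loose))
print(strict_mode_problems(tight))
```

Schemas that use a list for "type" (for example a nullable object) are not handled by this sketch.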

Vectara does not re-validate the returned JSON; it trusts the provider's structured-output enforcement. If the provider returns malformed or schema-violating JSON, the structured_output event carries whatever content came back, so validate at your application boundary if you need guarantees beyond the provider's.
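A minimal sketch of boundary validation follows, using only the standard library. The intent and corpus_filter fields and the allowed intent values are hypothetical; in practice a full JSON Schema validator run against the same schema you sent would likely be a better fit than hand-written checks.

```python
import json

ALLOWED_INTENTS = {"sales", "support", "other"}  # hypothetical enum values


def parse_intent_payload(raw):
    """Parse and check a structured payload; raise ValueError on any problem."""
    try:
        data = json.loads(raw) if isinstance(raw, str) else raw
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    intent = data.get("intent")
    if not isinstance(intent, str) or intent not in ALLOWED_INTENTS:
        raise ValueError(f"unexpected intent: {intent!r}")
    if not isinstance(data.get("corpus_filter"), str):
        raise ValueError("corpus_filter must be a string")
    return data


ok = parse_intent_payload('{"intent": "sales", "corpus_filter": "doc.type = \'pricing\'"}')
print(ok["intent"])
```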

What the agent emits

With the structured parser, the agent emits a structured_output event in place of the usual agent_output event. Downstream code, as well as next_steps conditions within the same agent, can key on that event type and address fields on the payload with paths like $.output.intent. See Steps for the full condition syntax.
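For downstream code, keying on the event type might look like the sketch below. The event shape (a dict with "type" and "output" keys) is an assumption inferred from the $.output.intent path, and the step names returned are hypothetical.

```python
def route(events):
    """Pick a next step from the first structured_output event in a sequence."""
    for event in events:
        if event.get("type") == "structured_output":
            intent = event.get("output", {}).get("intent")
            return "sales_step" if intent == "sales" else "fallback_step"
    # No structured payload arrived, so fall back.
    return "fallback_step"


events = [
    {"type": "tool_call", "name": "search"},  # assumed intermediate event
    {"type": "structured_output", "output": {"intent": "sales"}},
]
print(route(events))
```

Because the payload is already parsed, the routing decision is a plain dictionary lookup instead of a string match on model text.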

Streaming

Unlike free-form text (which streams token by token), structured output is delivered as a single complete event once the model finishes producing valid JSON. There is no partial/streaming rendering of the payload, since a partial JSON object is usually unusable anyway.

When to reach for structured outputs

Typical uses:

  • Classifier / router steps. The first step in a multi-step agent emits a structured label the next steps dispatch on.
  • Information extraction. The agent reads an unstructured input (an email, a document, a transcript) and returns the extracted fields in a well-defined shape.
  • Tool-like agent responses. The agent is invoked by another system that expects a JSON response in a specific format, for example a pipeline calling an agent to decide what to do with a record.
  • Scoring and ranking. The agent returns numeric scores and justifications that downstream systems can sort on.
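The scoring and ranking case can be sketched as follows. The candidate, score, and justification fields are hypothetical; the point is only that numeric fields in a structured payload can be sorted directly.

```python
def rank(payloads):
    """Sort scored payloads from highest to lowest score."""
    return sorted(payloads, key=lambda p: p["score"], reverse=True)


scored = [
    {"candidate": "doc-a", "score": 0.42, "justification": "partial match"},
    {"candidate": "doc-b", "score": 0.91, "justification": "covers the question"},
    {"candidate": "doc-c", "score": 0.67, "justification": "related topic"},
]
print([p["candidate"] for p in rank(scored)])
```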

Limits and trade-offs

  • Schema is enforced on the final response only. Intermediate tool calls are free-form. If you need structured tool output, that belongs in the tool's own input/output schema, not the agent's output parser.
  • Strict mode only binds on OpenAI-family models. See Strict mode and provider support for the full story.
  • Not every provider honors the schema the same way. OpenAI chat completions, the OpenAI Responses API, Anthropic, and Vertex all accept the schema, but only the OpenAI-family clients enforce strict. Other providers return best-effort JSON.
  • Large schemas cost more. The schema is sent to the model on every turn the step runs. Keep it focused on what the final response actually needs.