Multi-client steering
This page covers two related capabilities:
- Multi-client consumption — multiple simultaneous clients can read the same session's events without losing messages, and any client can resume from a known point.
- Mid-turn steering and interrupts — a user can steer the agent in a new direction without waiting for it to finish, or cancel its current turn entirely.
Event stream model
Every session has a single, durably-ordered event list. Events are appended by the platform as the agent produces them (LLM outputs, tool calls, tool outputs), and by clients when they send input. Any client can read the list at any time, from anywhere in it.
There is no per-client queue and no "which client owns this turn?" concept.
Reading and streaming events
GET /v2/agents/{agent_key}/sessions/{session_key}/events returns
the session's events as a paginated JSON list. It's the right call
for fetching history or polling as a read-only follower; it does not
stream.
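As a sketch, a page of events from the GET endpoint might look like the following. Only the aev_* id format is stated on this page; the envelope, field names, and event types shown here are illustrative assumptions:

```json
{
  "events": [
    { "id": "aev_101", "type": "input_message", "content": "Find the Q3 numbers" },
    { "id": "aev_102", "type": "tool_call", "name": "sql_query" },
    { "id": "aev_103", "type": "tool_output", "call_id": "aev_102" }
  ],
  "has_more": true
}
```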
Live event streaming happens on POST to the same path with
stream_response: true. The POST queues input, and the response is a
server-sent events (SSE) stream from the since cursor onward,
including events produced as a result of the new input. Live SSE
tailing is tied to a POST that queues input; there is no separate
"subscribe" endpoint.
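A minimal streaming POST body might look like the following sketch. The since, stream_response, and behavior fields are described on this page; the content field name and the flat body shape are assumptions:

```json
{
  "content": "Summarize the latest results",
  "behavior": "follow_up",
  "since": "aev_103",
  "stream_response": true
}
```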
Resuming from a cursor
Every event has an id of the form aev_*. A client that was
disconnected, or that just opened the session in a new tab, sends its
last-seen event id as since in its next POST; the platform returns
everything after that id and then tails live. The special value
"start" returns the whole history from the beginning.
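For example, a second client joining an in-progress session after last seeing event aev_118 might send the sketch below. Since live tailing is tied to a POST that queues input, it includes a message along with its cursor; the content field name and the event id are illustrative:

```json
{
  "content": "Joining from my laptop — keep going",
  "behavior": "follow_up",
  "since": "aev_118",
  "stream_response": true
}
```

The response streams everything after aev_118, then tails live.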
Steering is just as useful when the agent is retrieving the wrong thing. If the user sees the agent searching a corpus that won't have the answer, they can redirect it between tool calls without waiting for the current retrieval to finish:
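Such a redirect might look like the following steer input, a sketch in which the content field name is an assumption and behavior is the field described in the next section:

```json
{
  "content": "The answer won't be in the support docs. Search the engineering wiki instead.",
  "behavior": "steer"
}
```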
Input behavior: steer vs follow_up
When a client sends input to a session that is already running, the
behavior field controls when the input is applied:
- steer — insert the input as soon as possible on the next iteration of the agent loop. Use this when the user is correcting or redirecting the agent mid-turn. The agent notices the new input between tool calls or before its next LLM call.
- follow_up — queue the input for after the current turn completes. Follow-ups are consumed one at a time, so each gets a full agent loop iteration. This is the right behavior when the user is adding a new message that should be treated as a fresh turn, not a mid-course correction.
An input that omits type defaults to input_message.
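A queued follow-up turn might look like the following sketch, with field shapes assumed from the descriptions above and type omitted so it defaults to input_message:

```json
{
  "content": "Also, when you're done, draft a summary email.",
  "behavior": "follow_up"
}
```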
Both behaviors are safe to send concurrently from multiple clients. Inputs are appended to a queue and processed in order.
Interrupting the agent
If the user wants to stop the agent mid-turn without sending new
input, send an interrupt request:
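To cancel the current turn, the request might be as small as the sketch below. Only the existence of an interrupt request is stated on this page; expressing it as an input with type: "interrupt" is an assumption:

```json
{
  "type": "interrupt"
}
```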
An interrupt causes the platform to:
- Stop consuming the current LLM stream. Partial output produced so far is discarded, not persisted.
- Cancel any in-flight tool calls. Each running tool call ends with a synthetic error event so the agent sees it as a failed call rather than a hanging one.
- Emit a session_interrupted event to every connected client so they know the turn ended.
The session remains open and ready for a new input message. An
interrupt on its own doesn't steer — it just cancels. To interrupt
and redirect in a single request, send an input_message with
behavior: "steer" instead; it interrupts the current iteration and
injects the new message at the next safe point.
What multiple clients can and can't do
Multi-client consumption on a session is fully supported:
- Any number of clients can read the same session's events concurrently. Clients that are also sending input can stream the response; pure followers page the GET endpoint.
- Any client can send input concurrently. Inputs queue in the order received.
- Any client can interrupt.
- Each client can independently resume from its own cursor.
The one restriction is that only one agent loop runs at a time
per session. If a client attempts to start a new loop on a session
that is already running — for example, by sending an input without a
since value — the request fails with 409 Conflict and a message
telling the caller to include since and treat the session as
in-progress. This prevents two concurrent loops from producing
interleaved events on the same session.
The fix is what the error message says: include a since cursor and
treat the session as in-progress rather than starting a new one.
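A client that hits the 409 can retry the same input with a since cursor, joining the running loop instead of trying to start a new one. As above, the content field name and event id in this sketch are illustrative:

```json
{
  "content": "Check the EU numbers too",
  "behavior": "steer",
  "since": "aev_205",
  "stream_response": true
}
```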
If inputs arrive faster than the agent can consume them and the
internal queue fills, the platform returns 429 Too Many Requests.
In practice this only happens under abuse or a client bug; normal
multi-device usage does not come close.
Ordering and consistency guarantees
- Events are durably ordered. Every connected client sees the same events in the same order. No client sees "future" events before "past" ones.
- No dropped messages. Inputs accepted by the platform are always reflected in the event list; a client that resumes from its last cursor cannot miss them.
- Steer inputs interleave at safe points. A steer input lands between tool calls or before the next LLM call, never in the middle of a partial LLM response.
- Interrupted turns are visible. Interrupts emit a session_interrupted event so every client can render the state correctly.
Common patterns
- Mobile + web. The same user keeps both open; events stream to both. Sending from either one just works.
- Agent-pairing UI. A human operator watches an agent handle a customer. The operator's UI reads the session events in real time and can send a steer input to redirect the agent when needed.
- Transfer a session between clients. A user logs out on device A and opens the same session on device B. The new client pages the GET endpoint to fetch history, then sends its first input POST with a recent since cursor and stream_response: true to catch up and tail live.
- Cancel a long tool chain. The user sees the agent running a slow sql_query they realize is wrong. They send interrupt and the agent stops cleanly.
Limits and trade-offs
- Partial output is discarded on interrupt. If you need the agent to commit what it has so far before stopping, don't use interrupt. Send a steering message that tells the agent to wrap up what it's doing instead.
- Steer inputs land at safe points. The agent does not edit an LLM response mid-stream; it finishes the current in-flight LLM call, then checks for steer input. For most use cases this feels instantaneous; under a long-running tool call the perceived latency to steer is the time until the tool returns.
- Queue capacity is finite. If your product fans a lot of concurrent input into a single session, you can hit the queue limit and receive 429s. This is a design smell; consider whether you should be using sub-agents for the parallel work.
Related
- Sessions — creating and managing sessions, session metadata.
- Agent events — full event schema and event types.
- Context engineering overview — adjacent, but a separate set of concerns.