Concepts
A pipeline has five parts: a source (where data comes from), a trigger (when it runs), a transform (the agent that processes each record), an optional verification (how to confirm success), and a sync mode (incremental vs full refresh).
Source
The source defines where data comes from. Each record in the source is sent as a separate file upload to a new agent session.
See Sources for details on each supported source type and configuration.
Source credentials are encrypted at rest using a customer-specific encryption key and are never returned in API responses.
Trigger
The trigger defines when the pipeline runs.
Cron
Run on a cron schedule. The expression is a standard 5-field cron expression (minute, hour, day-of-month, month, day-of-week) evaluated in UTC.
TRIGGER FIELD (CRON)
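A sketch of what the trigger field might look like for a cron schedule. The property names (`type`, `expression`) are illustrative assumptions, not a confirmed schema; the expression itself is a standard 5-field cron string, here meaning "every day at 06:00 UTC":

```json
{
  "trigger": {
    "type": "cron",
    "expression": "0 6 * * *"
  }
}
```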
Interval
Run at a fixed interval. The duration is an ISO-8601 duration string.
TRIGGER FIELD (INTERVAL)
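A sketch of an interval trigger, assuming the same illustrative field names as above. The value is an ISO-8601 duration; `PT6H` means "every six hours":

```json
{
  "trigger": {
    "type": "interval",
    "duration": "PT6H"
  }
}
```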
Manual
The pipeline never runs automatically — it only runs when explicitly
triggered via the /trigger endpoint.
TRIGGER FIELD (MANUAL)
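A manual trigger carries no schedule, so the sketch (field names assumed, as above) reduces to the type alone:

```json
{
  "trigger": {
    "type": "manual"
  }
}
```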
Overlap policy
Only one run can be active per pipeline at a time. If a scheduled trigger
fires while a previous run is still in progress, the new run is skipped.
Manual triggers return 409 Conflict if a run is already in progress.
Transform
The transform defines how source records are processed. Currently only agent transforms are supported.
Each source record creates a fresh agent session, and the source file is uploaded to that session as the first input. The agent then processes the file according to its configured instructions and tools. One source record maps to one agent session.
TRANSFORM FIELD
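A sketch of the transform field. The property names and the agent identifier (`invoice-processor`) are hypothetical placeholders; substitute the key of an agent you have configured:

```json
{
  "transform": {
    "type": "agent",
    "agent": "invoice-processor"
  }
}
```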
Verification
By default, any agent session that completes without throwing an exception
is treated as success. You can configure stricter success criteria by
adding a verification field to the transform — either a UserFn condition
expression or a separate judge agent. Records that fail verification are
added to the dead letter queue.
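As a rough illustration, a condition-based verification might be attached to the transform like this. The field names and the expression syntax are assumptions for the sake of the example; the Verification page documents the actual schema and available context fields:

```json
{
  "transform": {
    "type": "agent",
    "agent": "invoice-processor",
    "verification": {
      "type": "condition",
      "expression": "session.status == 'completed' && output.record_count > 0"
    }
  }
}
```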
See Verification for configuration, available context fields, and examples.
Sync mode
| Mode | Behavior |
|---|---|
| incremental | Only processes records that are new or changed since the last successful run. Uses an internal watermark to track progress. |
| full_refresh | Processes all records from the source on every run. |
Incremental mode is the default and recommended for most use cases. The pipeline tracks a watermark (source-specific, e.g. a timestamp for S3) after each successful run. The next run only fetches records whose watermark is after the stored value.
Use full_refresh when you need to reprocess the entire source, such as
after changing the agent's instructions or to recover from data
corruption.
Pipeline vs pipeline run
A pipeline is the persistent configuration — source, trigger, transform, and sync mode. It has a stable key and can be enabled, disabled, updated, or deleted.
A pipeline run is a single execution. Each run fetches records from the source and creates one agent session per record. Runs have their own status (running, completed, failed, cancelled) and track how many records were fetched, processed, and failed.
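Putting the parts together, a complete pipeline configuration might look like the sketch below. Every field name, the S3 source shape, and the identifiers (`invoice-sync`, `acme-invoices`, `invoice-processor`) are illustrative assumptions; consult the Sources and Verification pages for the real schemas:

```json
{
  "key": "invoice-sync",
  "source": {
    "type": "s3",
    "bucket": "acme-invoices",
    "prefix": "incoming/"
  },
  "trigger": {
    "type": "cron",
    "expression": "0 6 * * *"
  },
  "transform": {
    "type": "agent",
    "agent": "invoice-processor"
  },
  "sync_mode": "incremental"
}
```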
Comparison to other features
| Feature | Purpose | Unit of work |
|---|---|---|
| Pipeline | Automated flow of all source data through an agent | One session per source record |
| Agent schedule | Recurring single execution of an agent with a fixed message | One session per trigger |
| Agent connector (e.g. Slack) | Bidirectional chat integration | One session per conversation |