Version: 2.0

Concepts

A pipeline has five parts: a source (where data comes from), a trigger (when it runs), a transform (the agent that processes each record), an optional verification (how to confirm success), and a sync mode (incremental vs full refresh).
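Putting the five parts together, a full pipeline configuration might look like the following sketch. The top-level field names (`source`, `trigger`, `transform`, `sync_mode`) follow the concepts above, but the exact key names, the `key` identifier, and the S3 source shape are illustrative assumptions:

```json
{
  "key": "invoice-sync",
  "source": { "type": "s3", "bucket": "incoming-invoices" },
  "trigger": { "type": "cron", "expression": "0 6 * * *" },
  "transform": { "type": "agent", "agent_key": "invoice-extractor" },
  "sync_mode": "incremental"
}
```

Verification is omitted here; without it, any session that completes without an exception counts as success.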

Source

The source defines where data comes from. Each record in the source is sent as a separate file upload to a new agent session.

See Sources for details on each supported source type and configuration.

Source credentials are encrypted at rest using a customer-specific encryption key and are never returned in API responses.

Trigger

The trigger defines when the pipeline runs.

Cron

Run on a cron schedule. The expression is a standard 5-field cron expression (minute, hour, day-of-month, month, day-of-week) evaluated in UTC.

TRIGGER FIELD (CRON)

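As a sketch, a cron trigger could be configured like this. The `expression` field carries the standard 5-field cron expression evaluated in UTC; the `type` discriminator is an assumption:

```json
{
  "trigger": {
    "type": "cron",
    "expression": "0 6 * * 1-5"
  }
}
```

This example fires at 06:00 UTC on weekdays (minute 0, hour 6, any day-of-month, any month, Monday through Friday).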

Interval

Run at a fixed interval. The duration is an ISO-8601 duration string.

TRIGGER FIELD (INTERVAL)

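An interval trigger might look like this sketch, where `duration` is an ISO-8601 duration string as described above and the `type` discriminator is an assumption:

```json
{
  "trigger": {
    "type": "interval",
    "duration": "PT30M"
  }
}
```

`PT30M` runs the pipeline every 30 minutes; `P1D` would run it once a day.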

Manual

The pipeline never runs automatically — it only runs when explicitly triggered via the /trigger endpoint.

TRIGGER FIELD (MANUAL)

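A manual trigger plausibly needs no fields beyond a discriminator (the `type` value is an assumption):

```json
{
  "trigger": {
    "type": "manual"
  }
}
```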

Overlap policy

Only one run can be active per pipeline at a time. If a scheduled trigger fires while a previous run is still in progress, the new run is skipped. Manual triggers return 409 Conflict if a run is already in progress.

Transform

The transform defines how source records are processed. Currently only agent transforms are supported.

Each source record creates a fresh agent session, and the source file is uploaded to that session as the first input. The agent then processes the file according to its configured instructions and tools. One source record maps to one agent session.

TRANSFORM FIELD

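Since only agent transforms are supported, the transform field presumably names the agent to run. Both the `type` discriminator and the `agent_key` field name are illustrative assumptions:

```json
{
  "transform": {
    "type": "agent",
    "agent_key": "invoice-extractor"
  }
}
```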

Verification

By default, any agent session that completes without throwing an exception is treated as success. You can configure stricter success criteria by adding a verification field to the transform — either a UserFn condition expression or a separate judge agent. Records that fail verification are added to the dead letter queue.
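As a sketch of the condition-expression variant, a verification block nested in the transform might look like this. The `verification` field name is stated above, but the inner shape, the `condition` discriminator, and the `output.total` context field are assumptions; see the Verification page for the actual schema:

```json
{
  "transform": {
    "type": "agent",
    "agent_key": "invoice-extractor",
    "verification": {
      "type": "condition",
      "expression": "output.total != null"
    }
  }
}
```

A record whose session completes but fails this check would be routed to the dead letter queue rather than counted as success.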

See Verification for configuration, available context fields, and examples.

Sync mode

Mode | Behavior
incremental | Only processes records that are new or changed since the last successful run. Uses an internal watermark to track progress.
full_refresh | Processes all records from the source on every run.

Incremental mode is the default and recommended for most use cases. The pipeline tracks a watermark (source-specific, e.g. a timestamp for S3) after each successful run. The next run only fetches records whose watermark is after the stored value.

Use full_refresh when you need to reprocess the entire source, such as after changing the agent's instructions or to recover from data corruption.
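Under the assumption that sync mode is a top-level string field, switching a pipeline to reprocess everything could be as simple as:

```json
{
  "sync_mode": "full_refresh"
}
```

Switching back to `incremental` afterward would resume watermark-based processing from the next successful run.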

Pipeline vs pipeline run

A pipeline is the persistent configuration — source, trigger, transform, and sync mode. It has a stable key and can be enabled, disabled, updated, or deleted.

A pipeline run is a single execution. Each run fetches records from the source and creates one agent session per record. Runs have their own status (running, completed, failed, cancelled) and track how many records were fetched, processed, and failed.
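For illustration, a run record might carry fields like the following. The status values and the fetched/processed/failed counts come from the description above; the exact field names and the `pipeline_key` reference are assumptions:

```json
{
  "pipeline_key": "invoice-sync",
  "status": "completed",
  "records_fetched": 120,
  "records_processed": 118,
  "records_failed": 2
}
```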

Comparison to other features

Feature | Purpose | Unit of work
Pipeline | Automated flow of all source data through an agent | One session per source record
Agent schedule | Recurring single execution of an agent with a fixed message | One session per trigger
Agent connector (e.g. Slack) | Bidirectional chat integration | One session per conversation