
Object storage connector

Class: object_storage. Sink-only in the UI: subscribes to a stream and writes objects to S3-compatible storage as Parquet or JSON Lines, with optional compression and batching for large uploads.

Create and edit under Sinks only. Advanced Settings may expose more runtime options depending on deployment and permissions.

Source and sink behavior

| Role | Behavior |
| --- | --- |
| Source | Not supported in the UI for this class (type must be sink). |
| Sink | Consumes the subscribed stream, batches events, and uploads objects using bucket, region, optional custom endpoint, and format settings. |
| Streams | Upstream tasks / pipelines must publish to the stream id this sink consumes (Streams). |

Required fields

Every connector row

| Field | Required | Notes |
| --- | --- | --- |
| name | Yes | Display name; id derived from it. |
| class | Yes | Must be object_storage. |
| stream | Yes | Resolved stream id (the sink reads here). |
| type | Yes | Must be sink (the UI rejects source for this class). |
| config | Yes | Class-specific object; see below. |

Class object_storage — required configuration

| Setting | Required | Notes |
| --- | --- | --- |
| bucket | Yes | Target bucket (the UI errors if empty). |
| region | Conditional | Required for default AWS-style endpoints; optional when a custom endpoint is set. Confirm for your deployment. |
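For illustration, a minimal sink definition covering the required fields might look like the sketch below. The field names come from the tables above; all values are placeholders, and the exact payload shape depends on how your deployment stores connector definitions.

```json
{
  "name": "events-to-object-storage",
  "class": "object_storage",
  "stream": "events",
  "type": "sink",
  "config": {
    "bucket": "my-data-bucket",
    "region": "us-east-1"
  }
}
```

If a custom endpoint is configured, region may be optional, as noted above; confirm for your deployment.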

UI validation

  • emit_verification_manifest requires integrity features to be enabled.
  • max_batch_bytes must not exceed max_object_bytes when both are set (the same relationship applies to the nested upload.batch.max_bytes and upload.object.max_bytes when present).
  • format: parquet must not be combined with the outer compression: gzip setting (enforced in Core and the UI).
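As a sketch of the size-limit rule, the flat keys below keep the batch limit at or below the object cap (64 MiB flushing into objects capped at 128 MiB); builds that expose the nested upload.batch.max_bytes / upload.object.max_bytes names apply the same relationship. Values are illustrative only.

```json
{
  "max_batch_bytes": 67108864,
  "max_object_bytes": 134217728
}
```

Likewise, when object_storage_format is parquet, choose a parquet_compression codec instead of the outer compression: gzip setting.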

Create connector

  1. Open Sinks → Create.
  2. Set Class to Object storage (S3-compatible), set Sink Name, stream behavior, and Enabled.
  3. Enter bucket and region (or custom endpoint per form).
  4. Choose format, compression, batch/object size limits, and credentials.
  5. Save, then ensure tasks publish into the subscribed stream.
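Putting the steps together, a saved sink might resolve to something like the following sketch. Key names follow the Configuration section below; every value, including the compression codec and the role ARN, is a placeholder to adapt to your environment and the options your form exposes.

```json
{
  "name": "orders-to-object-storage",
  "class": "object_storage",
  "stream": "orders",
  "type": "sink",
  "config": {
    "bucket": "acme-orders",
    "region": "eu-west-1",
    "prefix": "orders/",
    "object_storage_format": "parquet",
    "parquet_compression": "snappy",
    "max_batch_events": 10000,
    "max_batch_age_ms": 60000,
    "max_batch_bytes": 67108864,
    "max_object_bytes": 134217728,
    "role_arn": "arn:aws:iam::123456789012:role/object-storage-sink"
  }
}
```

After saving, confirm that upstream tasks publish into the subscribed stream; otherwise the sink has nothing to drain.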

Sink (UI)

Screenshot: the Object storage sink connector form (Create New Sink modal) showing class Object storage (S3-compatible), Auto Create Stream checked, and the Bucket and Region fields.
| UI area | Connector settings (typical) |
| --- | --- |
| Bucket / region | bucket, region |
| Endpoint | object_storage_endpoint, force_path_style |
| Prefix / layout | prefix |
| Format | object_storage_format, parquet_compression, … |
| Batch / size | max_batch_*, max_object_bytes |
| Credentials | role_arn, keys, or instance metadata patterns per build |

Configuration

Bucket and layout

  • bucket, region, prefix, object_storage_endpoint, force_path_style — Namespace and S3-compatible API targets (MinIO, ECS, etc.).
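For an S3-compatible target such as MinIO, a layout fragment might look like the sketch below; the endpoint URL, prefix, and path-style flag are placeholders, and whether region is still required with a custom endpoint depends on your deployment.

```json
{
  "bucket": "archive",
  "prefix": "prod/events/",
  "object_storage_endpoint": "https://minio.internal.example.com:9000",
  "force_path_style": true
}
```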

Format

  • object_storage_format selects json_lines or parquet; Parquet row-group and compression options apply when Parquet is chosen.
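A format fragment choosing JSON Lines with outer gzip compression (a combination the validation rules allow) could look like this sketch; for Parquet output you would instead set object_storage_format to parquet and pick a parquet_compression codec allowed by your build.

```json
{
  "object_storage_format": "json_lines",
  "compression": "gzip"
}
```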

Batching and size limits

  • max_batch_events, max_batch_bytes, max_batch_age_ms, max_object_bytes — Flush cadence and object caps.
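A batching fragment tying flush cadence to the object cap might look like the following; the numbers are placeholders (flush at 5,000 events, 32 MiB, or 30 seconds, with objects capped at 64 MiB), and the exact flush semantics should be confirmed for your build.

```json
{
  "max_batch_events": 5000,
  "max_batch_bytes": 33554432,
  "max_batch_age_ms": 30000,
  "max_object_bytes": 67108864
}
```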

Credentials and IAM

  • role_arn, role_session_name, external_id — Assume-role style access when supported.
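Where assume-role access is supported, a credentials fragment might look like this; the ARN, session name, and external ID are placeholders.

```json
{
  "role_arn": "arn:aws:iam::123456789012:role/object-storage-sink",
  "role_session_name": "object-storage-sink",
  "external_id": "example-external-id"
}
```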

Reliability

  • max_retries, retry_initial_backoff_ms, integrity_enabled, emit_verification_manifest — Retries and optional integrity features when present.
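A reliability fragment might combine retries with the integrity options; note that, per the UI validation rules above, emit_verification_manifest is only accepted when the integrity features are enabled. Values are illustrative.

```json
{
  "max_retries": 5,
  "retry_initial_backoff_ms": 500,
  "integrity_enabled": true,
  "emit_verification_manifest": true
}
```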

Timestamps

  • timestamp — How event time is reflected in object metadata or columns.

Runtime behavior

  • The sink runs after deployment when Enabled; it drains the stream and uploads asynchronously according to batch settings.
  • Disabled sinks do not write objects.

Performance and operational notes

  • Right-size batch thresholds to object size limits and storage rate policies.
  • Prefer integrity manifests when compliance requires end-to-end object verification.