r[ae]ym
Streams

Event Archival

Automatic archival of old stream events to S3-compatible storage

Events are stored in Postgres for fast, low-latency access. Over time, older events that are no longer needed for active processing accumulate, and the hot database grows without bound. The archival feature addresses this by periodically moving older events out of Postgres and into S3-compatible object storage, keeping the database lean while preserving the full event history.

Consumers are not affected by archival. When the service needs to serve events that have already been archived, it fetches them transparently from S3 and returns them to the consumer as if they were still in the database.
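The transparent read path can be pictured with a small in-memory sketch. It is not the service's implementation: the function name `read_events`, the `id` field, and the assumption that archived events always have lower IDs than events still in Postgres are all hypothetical and used only for illustration.

```python
def read_events(from_id: int, limit: int, hot: list[dict], archived: list[dict]) -> list[dict]:
    """Return up to `limit` events with id >= from_id, oldest first.

    `archived` stands in for events fetched from S3 and `hot` for events
    still in Postgres. Both lists are assumed to be sorted by id, with every
    archived id lower than every hot id (an assumption of this sketch).
    """
    # Serve from the archive first, because it holds the oldest events.
    result = [e for e in archived if e["id"] >= from_id][:limit]
    # Fill the rest of the page from the hot store.
    if len(result) < limit:
        remaining = limit - len(result)
        result += [e for e in hot if e["id"] >= from_id][:remaining]
    return result
```

The caller receives one ordered list and cannot tell which store each event came from, which is the behaviour described above.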

Configuration

Configure archival using the following environment variables:

| Variable | Default | Description |
| --- | --- | --- |
| STREAMS_ARCHIVE | false | Set to true to enable the archival feature. When disabled, no S3 connection is established and the background process does not run |
| S3_ENDPOINT | | Hostname and port of the S3-compatible storage endpoint (e.g. s3:9000 for MinIO) |
| S3_ACCESS_KEY | | Access key used to authenticate with the storage backend |
| S3_SECRET_KEY | | Secret key used to authenticate with the storage backend |
| S3_SSL | true | Whether to use TLS when connecting to the storage endpoint |
| S3_BUCKET | | Name of the bucket to store archived events in. The bucket is created automatically on startup if it does not exist |
| STREAMS_ARCHIVE_SIZE | 100000 | Minimum number of eligible events required to trigger an archive run. Also the maximum number of events written per batch |
| STREAMS_ARCHIVE_OLDER_THAN_DAYS | 30 | Events must be at least this many days old (by raised_at) to be eligible for archival |
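How STREAMS_ARCHIVE_SIZE and STREAMS_ARCHIVE_OLDER_THAN_DAYS interact can be sketched as follows. This is an illustration under assumptions, not the service's code: the function name `select_batch`, the event dictionaries, and the oldest-first ordering are hypothetical, and the documentation does not say whether eligibility is counted globally or per topic.

```python
from datetime import datetime, timedelta, timezone


def select_batch(events: list[dict], now: datetime,
                 archive_size: int = 100_000, older_than_days: int = 30) -> list[dict]:
    """Return the events to archive in one run, or [] if the run is not triggered."""
    cutoff = now - timedelta(days=older_than_days)
    # An event is eligible once its raised_at is at least `older_than_days` old.
    eligible = sorted(
        (e for e in events if e["raised_at"] <= cutoff),
        key=lambda e: e["raised_at"],  # oldest first (assumption of this sketch)
    )
    # The size setting is the trigger threshold ...
    if len(eligible) < archive_size:
        return []
    # ... and also the upper bound on the batch.
    return eligible[:archive_size]
```

With the defaults, nothing is archived until at least 100000 events are older than 30 days, and a single batch never contains more than 100000 events.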

S3 variables are not required when archival is disabled

When STREAMS_ARCHIVE=false, none of the S3 environment variables need to be set.
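A minimal sketch of reading these variables, assuming the S3 settings without a default become mandatory once archival is enabled. The class `ArchiveConfig`, the function `load_archive_config`, and the boolean parsing rule are hypothetical; the service's own parsing may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveConfig:
    enabled: bool
    endpoint: str = ""
    access_key: str = ""
    secret_key: str = ""
    use_tls: bool = True
    bucket: str = ""
    archive_size: int = 100_000
    older_than_days: int = 30


def _as_bool(value: str) -> bool:
    # Assumption: only the literal "true" (case-insensitive) enables a flag.
    return value.strip().lower() == "true"


def load_archive_config(env=None) -> ArchiveConfig:
    env = os.environ if env is None else env
    if not _as_bool(env.get("STREAMS_ARCHIVE", "false")):
        # Archival disabled: the S3 variables are not read at all.
        return ArchiveConfig(enabled=False)
    required = ["S3_ENDPOINT", "S3_ACCESS_KEY", "S3_SECRET_KEY", "S3_BUCKET"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing S3 settings: " + ", ".join(missing))
    return ArchiveConfig(
        enabled=True,
        endpoint=env["S3_ENDPOINT"],
        access_key=env["S3_ACCESS_KEY"],
        secret_key=env["S3_SECRET_KEY"],
        use_tls=_as_bool(env.get("S3_SSL", "true")),
        bucket=env["S3_BUCKET"],
        archive_size=int(env.get("STREAMS_ARCHIVE_SIZE", "100000")),
        older_than_days=int(env.get("STREAMS_ARCHIVE_OLDER_THAN_DAYS", "30")),
    )
```

Passing a plain dictionary instead of the process environment makes the loader easy to exercise in tests.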

Object Layout

Each archive batch is stored as a single object under:

{tenantId}/{topicName}/{archiveId}.json.gz

The object is a gzip-compressed JSON array of events. If the tenant ID is empty, _ is used as the first path segment.
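The layout can be reproduced with the Python standard library. The helper names below are hypothetical, and the event fields used in the usage example are placeholders, since the event schema is not specified here.

```python
import gzip
import json


def archive_object_key(tenant_id: str, topic_name: str, archive_id: str) -> str:
    """Build the object key {tenantId}/{topicName}/{archiveId}.json.gz."""
    tenant_segment = tenant_id if tenant_id else "_"  # empty tenant IDs become "_"
    return f"{tenant_segment}/{topic_name}/{archive_id}.json.gz"


def encode_archive(events: list[dict]) -> bytes:
    """Serialize a batch as a gzip-compressed JSON array."""
    return gzip.compress(json.dumps(events).encode("utf-8"))


def decode_archive(blob: bytes) -> list[dict]:
    """Inverse of encode_archive."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))
```

For example, `archive_object_key("", "orders", "a1")` yields `_/orders/a1.json.gz`, and `decode_archive(encode_archive(events))` returns the original list.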

Excluded Topics

Topics that are configured for partial persistence (deleteAfterConsumption or deleteAfterParallelConsumption) are excluded from archival entirely. Because events on these topics are deleted once they have been consumed, there is nothing to archive.

See the partial persistence configuration in the introduction for details on how to configure these topics.
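The exclusion rule amounts to a simple check on the topic configuration. In this sketch the configuration is modelled as a dictionary of boolean flags, which is an assumption; only the two option names come from the text above.

```python
# Options that mark a topic as partially persisted (names from the documentation).
PARTIAL_PERSISTENCE_FLAGS = ("deleteAfterConsumption", "deleteAfterParallelConsumption")


def is_archivable(topic_config: dict) -> bool:
    """Return False for topics whose events are deleted after consumption."""
    return not any(topic_config.get(flag, False) for flag in PARTIAL_PERSISTENCE_FLAGS)
```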

GDPR

Archival does not prevent GDPR payload updates. When a payload is updated for an event that has already been archived, the service rewrites the affected archive object in place on S3.
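Because an archive object is a single gzip-compressed JSON array, updating one payload means decoding the whole object, changing the event, and writing the object back under the same key. The sketch below shows only the transformation of the bytes; the function name and the `id` and `payload` field names are hypothetical, and the upload step is left out.

```python
import gzip
import json


def rewrite_archived_payload(blob: bytes, event_id, new_payload) -> bytes:
    """Return a new archive blob with the payload of `event_id` replaced."""
    events = json.loads(gzip.decompress(blob).decode("utf-8"))
    found = False
    for event in events:
        if event.get("id") == event_id:
            event["payload"] = new_payload
            found = True
    if not found:
        raise KeyError(event_id)
    # All other events are written back unchanged.
    return gzip.compress(json.dumps(events).encode("utf-8"))
```

Uploading the returned bytes to the original object key would complete the rewrite.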
