Storage Engines
LogTide uses Reservoir, a pluggable storage abstraction that lets you choose between TimescaleDB and ClickHouse for log and span storage. Both engines share the same API — your application code, queries, and dashboards work identically regardless of which engine you run.
Overview
Reservoir is LogTide's internal storage layer (@logtide/reservoir).
It provides a single, typed interface for all log and span operations — ingestion, querying, aggregation,
deletion — and translates them into optimized native queries for the chosen engine.
```
App → Reservoir API → TimescaleDB (PostgreSQL)
                    → ClickHouse (columnar OLAP)
```

All log operations go through Reservoir: ingestion, search, filtering, aggregation, retention cleanup, alert evaluation, Sigma detection, correlation, dashboard stats, and admin monitoring. The engine choice is transparent to the rest of the platform.
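To make the abstraction concrete, here is a minimal sketch of what a unified storage interface with per-engine implementations could look like. The names (`LogRecord`, `Reservoir`, `createReservoir`) and the in-memory store are illustrative assumptions, not the actual `@logtide/reservoir` API.

```typescript
// Hypothetical sketch of a pluggable storage abstraction.
// NOTE: types and function names are assumed for illustration —
// they are not the real @logtide/reservoir exports.
interface LogRecord {
  id: string;
  projectId: string;
  time: Date;
  level: string;
  message: string;
}

interface Reservoir {
  ingest(logs: LogRecord[]): Promise<void>;
  count(projectId: string): Promise<number>;
  getEngineType(): "timescale" | "clickhouse";
}

// Each engine provides its own implementation behind the shared interface;
// here both are backed by a plain array as a stand-in for a real database.
function createReservoir(engine: "timescale" | "clickhouse"): Reservoir {
  const store: LogRecord[] = [];
  return {
    async ingest(logs) {
      store.push(...logs);
    },
    async count(projectId) {
      return store.filter((l) => l.projectId === projectId).length;
    },
    getEngineType: () => engine,
  };
}
```

Application code depends only on the `Reservoir` interface, so swapping engines is a configuration change rather than a code change.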
Choosing an Engine
TimescaleDB — the default, recommended for most users
- Single database for everything (logs + app data)
- No extra services to run
- Continuous aggregates for fast dashboards
- Built-in compression (10x+ reduction)
- Full PostgreSQL ecosystem (extensions, tooling)
- Ideal for up to ~50K logs/sec
ClickHouse — for high-volume and analytics-heavy workloads
- Columnar storage — blazing-fast analytical queries
- Handles 100K+ logs/sec with ease
- Superior compression ratios (20-40x)
- Native n-gram indexes for full-text search
- Async inserts for maximum write throughput
- Best when you already operate ClickHouse
TimescaleDB is the default and requires zero extra configuration. It runs inside the same PostgreSQL instance that LogTide already uses for application data. You can always switch to ClickHouse later if your volume grows.
TimescaleDB (Default)
TimescaleDB is a PostgreSQL extension for time-series data. LogTide stores logs in a hypertable that is automatically partitioned by time, compressed, and pruned according to your retention policy.
Hypertables: Automatic time-based partitioning (chunks). Queries only scan relevant chunks for the requested time range.
Continuous Aggregates: Pre-computed hourly and daily stats (logs_hourly_stats, logs_daily_stats) that power dashboard charts in milliseconds.
Compression: Chunks older than 1 day are automatically compressed, reducing storage by 10x or more.
Retention: Configurable per-organization. Uses drop_chunks() for efficient bulk deletion.
Full-text Search: PostgreSQL tsvector for word-based search, pg_trgm GIN index for substring search.
Batch Inserts: Uses UNNEST-based batch inserts for high-throughput ingestion.
TimescaleDB is included in the default docker-compose.yml — no extra setup needed.
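The UNNEST-based batch insert mentioned above can be sketched as follows. The idea is to pass one parameter per column (each an array of values) instead of one per row, so the statement shape stays constant regardless of batch size. Table and column names here are illustrative assumptions, not LogTide's actual schema.

```typescript
// Sketch of an UNNEST-based batch insert builder for PostgreSQL/TimescaleDB.
// NOTE: the `logs` table and its columns are assumed for illustration.
interface LogRow {
  time: string;
  level: string;
  message: string;
}

function buildBatchInsert(rows: LogRow[]): { sql: string; params: string[][] } {
  // One bind parameter per COLUMN — each is an array covering the whole batch.
  const sql =
    "INSERT INTO logs (time, level, message) " +
    "SELECT * FROM UNNEST($1::timestamptz[], $2::text[], $3::text[])";
  const params = [
    rows.map((r) => r.time),
    rows.map((r) => r.level),
    rows.map((r) => r.message),
  ];
  return { sql, params };
}
```

Because the SQL text never changes with batch size, the database can reuse a single prepared statement for every batch, which is what makes this pattern fast for ingestion.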
ClickHouse
ClickHouse is a columnar OLAP database built for analytical queries at scale. When selected as the storage engine, LogTide creates the logs, spans, and traces tables in ClickHouse automatically on startup.
Columnar Storage: Only reads the columns needed for each query, making aggregations and scans extremely fast.
MergeTree Engine: Logs are stored in a MergeTree table ordered by (project_id, time) for optimal query performance.
Async Inserts: Writes are batched asynchronously by ClickHouse for maximum throughput without blocking the application.
N-gram Indexes: ngrambf_v1 bloom filter indexes on the message column for fast full-text search.
PREWHERE: Reservoir uses PREWHERE clauses for partition-level filtering before reading data blocks.
Compression: Native LZ4/ZSTD compression achieves 20-40x reduction on log data.
- Extra service: ClickHouse runs as a separate container alongside PostgreSQL (which is still needed for app data like users, orgs, alert rules, etc.).
- No continuous aggregates: Dashboard queries run against raw data. Still fast for most datasets thanks to columnar storage.
- Async deletes: DELETE operations in ClickHouse are asynchronous mutations — they execute in the background and may not be reflected immediately.
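To illustrate how the MergeTree ordering and n-gram index fit together, here is the kind of DDL Reservoir might issue on startup. This is a sketch only — the real schema is generated automatically, and the column set shown here is an assumption.

```typescript
// Illustrative ClickHouse DDL for a logs table — NOT LogTide's actual schema.
// Shows the features discussed above: MergeTree ordered by (project_id, time)
// and an ngrambf_v1 bloom-filter index on the message column.
const createLogsTable = `
CREATE TABLE IF NOT EXISTS logs (
  project_id UUID,
  time       DateTime64(3),
  level      LowCardinality(String),
  message    String,
  -- n-gram bloom filter skip index: accelerates substring/full-text search
  INDEX idx_message message TYPE ngrambf_v1(4, 1024, 3, 0) GRANULARITY 4
)
ENGINE = MergeTree
-- ordering matches the dominant query pattern: per-project time ranges
ORDER BY (project_id, time)
`;
```

Ordering by `(project_id, time)` means a typical "logs for project X in the last hour" query touches only a narrow, contiguous slice of each data part.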
Configuration
The storage engine is selected via a single environment variable. All other settings are optional and have sensible defaults.
Engine Selection
```bash
# In your .env file

# Use TimescaleDB (default — no configuration needed)
STORAGE_ENGINE=timescale

# Or use ClickHouse
STORAGE_ENGINE=clickhouse
```

ClickHouse Environment Variables
Required only when STORAGE_ENGINE=clickhouse:
| Variable | Description | Default |
|---|---|---|
| CLICKHOUSE_HOST | ClickHouse server hostname | localhost |
| CLICKHOUSE_PORT | HTTP interface port | 8123 |
| CLICKHOUSE_DATABASE | Database name | logtide |
| CLICKHOUSE_USERNAME | Authentication username | default |
| CLICKHOUSE_PASSWORD | Authentication password | (empty) |
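A small sketch of reading these variables with the documented defaults — the loader function itself is hypothetical, but the variable names and fallback values match the table above.

```typescript
// Sketch of loading the ClickHouse connection settings from the environment.
// Variable names and defaults come from the table above; the function itself
// is illustrative, not part of LogTide.
type Env = Record<string, string | undefined>;

interface ClickHouseConfig {
  host: string;
  port: number;
  database: string;
  username: string;
  password: string;
}

function loadClickHouseConfig(env: Env): ClickHouseConfig {
  return {
    host: env.CLICKHOUSE_HOST ?? "localhost",
    port: Number(env.CLICKHOUSE_PORT ?? "8123"),
    database: env.CLICKHOUSE_DATABASE ?? "logtide",
    username: env.CLICKHOUSE_USERNAME ?? "default",
    password: env.CLICKHOUSE_PASSWORD ?? "", // empty by default
  };
}
```

In a real process you would call `loadClickHouseConfig(process.env)` once at startup, only when `STORAGE_ENGINE=clickhouse`.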
Docker Compose Setup
TimescaleDB (Default)
No changes needed — TimescaleDB is the default engine. Just run:
```bash
docker compose up -d
```

ClickHouse
The default docker-compose.yml includes a ClickHouse service behind a Docker Compose profile. To enable it:
```bash
# 1. Set the storage engine in your .env
STORAGE_ENGINE=clickhouse

# 2. Start with the ClickHouse profile
docker compose --profile clickhouse up -d

# With Fluent Bit log collection:
docker compose --profile clickhouse --profile logging up -d
```
This starts a clickhouse/clickhouse-server:24.1 container
alongside the existing PostgreSQL instance. PostgreSQL is still required for application data
(users, organizations, projects, alert rules, etc.) — only logs, spans, and traces are stored in ClickHouse.
When using ClickHouse, Reservoir automatically creates the required tables
(logs, spans,
traces) on first startup.
No manual migration is needed. For TimescaleDB, tables are managed by the existing Kysely migration system.
Switching Engines
Changing the STORAGE_ENGINE variable only changes where
new logs are written. Existing data stays in the previous engine.
Use the migration script to
copy historical logs, spans, and traces to the new engine.
TimescaleDB → ClickHouse
1. Stop the backend and worker: `docker compose stop backend worker`
2. Set `STORAGE_ENGINE=clickhouse` in your `.env`
3. Start with the ClickHouse profile: `docker compose --profile clickhouse up -d`
4. Reservoir creates the ClickHouse tables on startup
5. Run the migration script to copy historical data
6. Start backend and worker — new logs flow into ClickHouse
ClickHouse → TimescaleDB
1. Stop the backend and worker
2. Set `STORAGE_ENGINE=timescale` (or remove the variable entirely)
3. Run the migration script to copy historical data
4. Restart: `docker compose up -d`
5. Optionally stop the ClickHouse container if no longer needed
Data Migration
LogTide includes a built-in migration script that copies logs, spans, and traces between engines. It preserves log IDs so that existing references (detection events, exceptions, error groups) continue to work after migration.
Only the 3 tables managed by Reservoir are migrated:
- logs — core log data (IDs preserved)
- spans — OpenTelemetry spans
- traces — aggregated trace records
Everything else (users, organizations, projects, alert rules, Sigma rules, incidents, etc.) stays in PostgreSQL and is not affected.
Basic Usage
```bash
# From the backend directory
cd packages/backend

# Auto-detect direction (current STORAGE_ENGINE → the other)
pnpm migrate:storage

# Explicit direction
pnpm migrate:storage -- --from timescale --to clickhouse

# Dry run — show record counts without migrating
pnpm migrate:storage -- --dry-run
```

Options
| Flag | Description | Default |
|---|---|---|
| --from <engine> | Source engine | current STORAGE_ENGINE |
| --to <engine> | Destination engine | the other engine |
| --dry-run | Show record counts only, don't write | false |
| --project-id <uuid> | Migrate a single project | all projects |
| --batch-size <n> | Records per batch | 5000 |
| --skip-validation | Skip post-migration count check | false |
The migration is safe to interrupt and resume. It tracks how many records already exist in the destination and picks up where it left off. If you stop the script mid-way (Ctrl+C), just run the same command again to continue.
Log IDs are preserved during migration. This is important because detection events, exceptions, and error groups reference logs by ID. Without ID preservation, those cross-references would break silently after switching engines.
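The resume behavior can be sketched with a simple cursor: the number of records already in the destination tells the script where to restart. The `Engine` interface and array-backed stores below are simplifications for illustration, not the migration script's real internals.

```typescript
// Sketch of interrupt-safe, resumable batch migration.
// The destination's existing record count acts as the cursor, so re-running
// the same command after a Ctrl+C continues where it left off.
// NOTE: Engine and makeArrayEngine are illustrative stand-ins.
interface Engine {
  fetch(offset: number, limit: number): string[];
  count(): number;
}

function makeArrayEngine(rows: string[]): Engine {
  return {
    fetch: (offset, limit) => rows.slice(offset, offset + limit),
    count: () => rows.length,
  };
}

function migrate(source: Engine, dest: string[], batchSize = 5000): number {
  let migrated = 0;
  // Resume point: skip everything the destination already holds.
  let offset = dest.length;
  while (offset < source.count()) {
    const batch = source.fetch(offset, batchSize);
    dest.push(...batch); // rows carry their IDs, preserving cross-references
    offset += batch.length;
    migrated += batch.length;
  }
  return migrated;
}
```

This design only works because records are copied in a stable order and IDs travel with the rows — which is exactly why ID preservation matters for resumability as well as for cross-references.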
Example Output
```
Storage Migration: timescale → clickhouse
==================================================
Found 3 projects across 2 organizations

[1/3] api-service (acme-corp)
Logs: source=1,234,567 | dest=500,000 | to migrate=734,567
Spans: source=456,789 | dest=0 | to migrate=456,789
Traces: source=12,345 | dest=0 | to migrate=12,345
Logs: 100.0% | 734,567/734,567 | done in 2m 24s
Spans: 100.0% | 456,789/456,789 | done in 58s
Traces: 100.0% | 12,345/12,345 | done in 4s
Validation: logs OK | spans OK | traces OK

[2/3] web-app (acme-corp)
Already migrated.

Summary
==================================================
Projects: 3/3 completed
Logs: 2,500,000 migrated
Spans: 800,000 migrated
Traces: 25,000 migrated
Time: 5m 12s
Errors: 0
```

How Reservoir Works
Reservoir is an internal package (@logtide/reservoir) that
abstracts all log and span I/O behind a unified TypeScript API. Here's what it handles:
Log operations:
- ingest() — Batch insert logs
- query() — Search with filters, pagination
- aggregate() — Time-bucketed stats
- count() — Filtered log counts
- distinct() — Unique field values
- topValues() — Top N values by count
- getById() / getByIds() — Direct lookups
- deleteByTimeRange() — Retention cleanup

Span and trace operations:
- ingestSpans() — Batch insert spans
- upsertTrace() — Create/update trace records
- querySpans() — Search spans
- queryTraces() — Search traces
- getSpansByTraceId() — Trace detail view
- getServiceDependencies() — Service map
- healthCheck() — Engine connectivity
- getEngineType() — Runtime engine info
Reservoir doesn't just wrap SQL — each engine has its own query translator that generates optimized native queries:
| Feature | TimescaleDB | ClickHouse |
|---|---|---|
| Batch insert | UNNEST($1::text[], ...) | INSERT ... FORMAT JSONEachRow |
| Full-text search | tsvector @@ tsquery | hasToken() / positionCaseInsensitive() |
| Substring search | pg_trgm GIN index | ngrambf_v1 bloom filter |
| Time filtering | Chunk pruning | PREWHERE partition key |
| Aggregation | Continuous aggregates | Raw GROUP BY (columnar scan) |
| Deletion | drop_chunks() / DELETE | ALTER TABLE DELETE (async) |
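The full-text search row in the table above is a good example of per-engine translation: the same logical search compiles to different native SQL. A minimal sketch, with assumed column names (`message_tsv`, `message`) and simplified to query templates:

```typescript
// Sketch of per-engine query translation for full-text search.
// NOTE: column names and the exact SQL shapes are illustrative assumptions.
type EngineType = "timescale" | "clickhouse";

function fullTextSearchSql(engine: EngineType): string {
  if (engine === "timescale") {
    // PostgreSQL: match a tsvector column against a parsed query
    return "SELECT * FROM logs WHERE message_tsv @@ plainto_tsquery($1)";
  }
  // ClickHouse: token match; an ngrambf_v1 skip index accelerates the scan
  return "SELECT * FROM logs WHERE hasToken(message, {term:String})";
}
```

Callers never see either SQL dialect — they issue one logical query, and the translator picks the native form for the active engine.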
Some features have engine-specific fast paths. For example, dashboard charts use
TimescaleDB continuous aggregates when available, but fall back to raw Reservoir
queries on ClickHouse. The application detects the engine at runtime via
reservoir.getEngineType() and chooses
the optimal code path automatically.