Storage Engines
LogTide uses Reservoir, a pluggable storage abstraction that lets you choose TimescaleDB, ClickHouse, or MongoDB for log and span storage. All engines share the same API — your application code, queries, and dashboards work identically regardless of which engine you run.
Overview
Reservoir is LogTide's internal storage layer (@logtide/reservoir). It provides a single, typed interface for all log and span operations — ingestion, querying, aggregation, deletion — and translates them into optimized native queries for the chosen engine.
App → Reservoir API → TimescaleDB (PostgreSQL)
                    → ClickHouse (columnar OLAP)
                    → MongoDB (document-based)

All log operations go through Reservoir: ingestion, search, filtering, aggregation, retention cleanup, alert evaluation, Sigma detection, correlation, dashboard stats, and admin monitoring. The engine choice is transparent to the rest of the platform.
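The shape of that unified interface isn't spelled out here, but a minimal sketch can be inferred from the operations listed in this doc. The type and method names below are illustrative; the real signatures in @logtide/reservoir may differ.

```typescript
// Hypothetical sketch of the Reservoir interface. Method names follow the
// operations listed in this doc; exact signatures are assumptions.
interface LogRecord {
  id: string;
  projectId: string;
  time: Date;
  level: string;
  message: string;
  metadata?: Record<string, unknown>;
}

interface QueryFilter {
  projectId: string;
  from: Date;
  to: Date;
  search?: string;
}

interface Reservoir {
  ingest(logs: LogRecord[]): Promise<void>;          // batch insert
  query(filter: QueryFilter): Promise<LogRecord[]>;  // filtered search
  healthCheck(): Promise<boolean>;                   // engine connectivity
  getEngineType(): "timescale" | "clickhouse" | "mongodb";
}

// Minimal in-memory implementation — useful for unit tests that should not
// depend on a running database.
class MemoryReservoir implements Reservoir {
  private logs: LogRecord[] = [];
  async ingest(logs: LogRecord[]): Promise<void> {
    this.logs.push(...logs);
  }
  async query(f: QueryFilter): Promise<LogRecord[]> {
    return this.logs.filter(
      (l) =>
        l.projectId === f.projectId &&
        l.time >= f.from &&
        l.time <= f.to &&
        (!f.search || l.message.includes(f.search)),
    );
  }
  async healthCheck(): Promise<boolean> {
    return true;
  }
  getEngineType() {
    return "timescale" as const;
  }
}
```

Because every engine sits behind the same interface, swapping TimescaleDB for ClickHouse (or a test double like the one above) requires no changes to calling code.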
Choosing an Engine
TimescaleDB — default, recommended for most users
- Single database for everything (logs + app data)
- No extra services to run
- Continuous aggregates for fast dashboards
- Built-in compression (10x+ reduction)
- Ideal for up to ~50K logs/sec
ClickHouse — high-volume & analytics-heavy workloads
- Blazing-fast analytical queries
- Handles 100K+ logs/sec with ease
- Superior compression ratios (20-40x)
- Native n-gram indexes for search
- Best when you already operate ClickHouse
MongoDB — flexible schema & easy horizontal scaling
- Dynamic metadata — no fixed schema
- Time-series collections (MongoDB 5.0+)
- Native TTL indexes for retention
- Easy horizontal scaling via sharding
- Ideal for flexible, unstructured logs
TimescaleDB is the default and requires zero extra configuration. It runs inside the same PostgreSQL instance that LogTide already uses for application data. You can always switch to ClickHouse later if your volume grows.
TimescaleDB (Default)
TimescaleDB is a PostgreSQL extension for time-series data. LogTide stores logs in a hypertable that is automatically partitioned by time, compressed, and pruned according to your retention policy.
Hypertables: Automatic time-based partitioning (chunks). Queries only scan relevant chunks for the requested time range.
Continuous Aggregates: Pre-computed hourly and daily stats (logs_hourly_stats, logs_daily_stats) that power dashboard charts in milliseconds.
Compression: Chunks older than 1 day are automatically compressed, reducing storage by 10x or more.
Retention: Configurable per-organization. Uses drop_chunks() for efficient bulk deletion.
Full-text Search: PostgreSQL tsvector for word-based search, pg_trgm GIN index for substring search.
Batch Inserts: Uses UNNEST-based batch inserts for high-throughput ingestion.
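The UNNEST batch-insert pattern mentioned above can be sketched as follows. One parameter per column — each an array of values — lets a single round trip insert N rows. The table and column names here are illustrative, not LogTide's actual schema, and the real query builder in Reservoir will differ.

```typescript
// Hypothetical sketch of an UNNEST-based batch insert for PostgreSQL/
// TimescaleDB: each $n parameter is an array holding one column's values.
interface LogRow {
  time: string;    // ISO timestamp
  level: string;
  message: string;
}

function buildUnnestInsert(rows: LogRow[]): { sql: string; params: unknown[] } {
  const sql =
    "INSERT INTO logs (time, level, message) " +
    "SELECT * FROM UNNEST($1::timestamptz[], $2::text[], $3::text[])";
  return {
    sql,
    params: [
      rows.map((r) => r.time),     // $1 — all timestamps
      rows.map((r) => r.level),    // $2 — all levels
      rows.map((r) => r.message),  // $3 — all messages
    ],
  };
}
```

Compared with one INSERT per row (or a long multi-VALUES statement), this keeps the statement text constant regardless of batch size, which lets the driver reuse the prepared plan.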
TimescaleDB is included in the default docker-compose.yml — no extra setup needed.
ClickHouse
ClickHouse is a columnar OLAP database built for analytical queries at scale. When selected as the storage engine, LogTide creates the logs, spans, and traces tables in ClickHouse automatically on startup.
Columnar Storage: Only reads the columns needed for each query, making aggregations and scans extremely fast.
MergeTree Engine: Logs are stored in a MergeTree table ordered by (project_id, time) for optimal query performance.
Async Inserts: Writes are batched asynchronously by ClickHouse for maximum throughput without blocking the application.
N-gram Indexes: ngrambf_v1 bloom filter indexes on the message column for fast full-text search.
PREWHERE: Reservoir uses PREWHERE clauses for partition-level filtering before reading data blocks.
Compression: Native LZ4/ZSTD compression achieves 20-40x reduction on log data.
- Extra service: ClickHouse runs as a separate container alongside PostgreSQL (which is still needed for app data like users, orgs, alert rules, etc.).
- No continuous aggregates: Dashboard queries run against raw data. Still fast for most datasets thanks to columnar storage.
- Async deletes: DELETE operations in ClickHouse are asynchronous mutations — they execute in the background and may not take effect immediately.
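The JSONEachRow insert format mentioned above is simple to picture: one JSON object per line, sent as the body of an INSERT statement. A minimal sketch (the endpoint and query string are illustrative of ClickHouse's HTTP interface, not LogTide's actual client code):

```typescript
// Hypothetical sketch: serialize a batch of rows into ClickHouse's
// JSONEachRow wire format — one JSON object per line.
function toJSONEachRow(rows: Record<string, unknown>[]): string {
  return rows.map((r) => JSON.stringify(r)).join("\n");
}

// Illustrative HTTP request shape (port 8123 is ClickHouse's HTTP interface):
//   POST http://clickhouse:8123/?query=INSERT%20INTO%20logs%20FORMAT%20JSONEachRow
//   body: toJSONEachRow(batch)
```

Because each line is an independent JSON object, the server can parse and buffer rows as they stream in, which pairs well with the async-insert batching described above.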
MongoDB
MongoDB is a document-oriented NoSQL database. LogTide uses MongoDB for flexible log storage, leveraging Time-Series Collections (available in MongoDB 5.0+) for optimized ingestion and storage.
Time-Series Collections: Stores sequences of measurements over a period of time, optimized for high-throughput ingestion and real-time analytics.
Dynamic Metadata: No fixed schema — metadata fields are stored as flexible documents, allowing you to send any nested JSON without prior configuration.
TTL Indexes: Automatic retention cleanup via native Time-To-Live indexes on the timestamp field.
Atomic Upserts: Trace records are updated using single bulkWrite operations with $min/$max/$inc for maximum efficiency.
Search: Uses $text indexes for word-based search and regex for flexible substring matching.
- Version 5.0+: Required for Time-Series collections support.
- Version 6.0+: Recommended for advanced aggregation performance and better compression.
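The atomic trace upsert described above can be sketched as a bulkWrite operation: $min/$max widen the trace's time window as spans arrive in any order, while $inc bumps the counters. Field names here are illustrative, not LogTide's actual document schema.

```typescript
// Hypothetical shape of a trace-upsert bulkWrite op using $min/$max/$inc.
interface SpanSummary {
  traceId: string;
  start: Date;
  end: Date;
  isError: boolean;
}

function traceUpsertOp(span: SpanSummary) {
  return {
    updateOne: {
      filter: { _id: span.traceId },
      update: {
        $min: { startTime: span.start },              // earliest span start wins
        $max: { endTime: span.end },                  // latest span end wins
        $inc: { spanCount: 1, errorCount: span.isError ? 1 : 0 },
      },
      upsert: true,                                   // create trace on first span
    },
  };
}

// Usage sketch: collection.bulkWrite(spans.map(traceUpsertOp))
```

A single bulkWrite of these ops updates many traces in one round trip, and each updateOne is atomic per document — no read-modify-write race between concurrent ingest workers.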
Configuration
The storage engine is selected via a single environment variable. All other settings are optional and have sensible defaults.
Engine Selection
# In your .env file
# Use TimescaleDB (default)
STORAGE_ENGINE=timescale
# Use ClickHouse
STORAGE_ENGINE=clickhouse
# Use MongoDB
STORAGE_ENGINE=mongodb

ClickHouse Environment Variables
Required only when STORAGE_ENGINE=clickhouse.
| Variable | Default |
|---|---|
| CLICKHOUSE_HOST | localhost |
| CLICKHOUSE_PORT | 8123 |
| CLICKHOUSE_DATABASE | logtide |
| CLICKHOUSE_USERNAME | default |
| CLICKHOUSE_PASSWORD | (empty) |
MongoDB Environment Variables
Required only when STORAGE_ENGINE=mongodb.
| Variable | Default |
|---|---|
| MONGODB_URI | mongodb://localhost:27017/logtide |
| MONGODB_DB | logtide |
Docker Compose Setup
ClickHouse
# In your .env
STORAGE_ENGINE=clickhouse
# Start with the ClickHouse profile
docker compose --profile clickhouse up -d

MongoDB
The default docker-compose.yml includes a MongoDB service
behind a profile:
# In your .env
STORAGE_ENGINE=mongodb
# Start with the MongoDB profile
docker compose --profile mongodb up -d

Switching Engines
Changing the STORAGE_ENGINE variable only changes where new logs are written. Existing data stays in the previous engine. The migration script supports TimescaleDB, ClickHouse, and MongoDB.
Example Usage with MongoDB
# Migrate from TimescaleDB to MongoDB
pnpm migrate:storage -- --from timescale --to mongodb
# Migrate from MongoDB to ClickHouse
pnpm migrate:storage -- --from mongodb --to clickhouse

Data Migration
LogTide includes a built-in migration script that copies logs, spans, and traces between engines. It preserves log IDs so that existing references (detection events, exceptions, error groups) continue to work after migration.
Only the 3 tables managed by Reservoir are migrated:
- logs — core log data (IDs preserved)
- spans — OpenTelemetry spans
- traces — aggregated trace records
Everything else (users, organizations, projects, alert rules, Sigma rules, incidents, etc.) stays in PostgreSQL and is not affected.
Basic Usage
# From the backend directory
cd packages/backend
# Auto-detect direction (current STORAGE_ENGINE → the other)
pnpm migrate:storage
# Explicit direction
pnpm migrate:storage -- --from timescale --to clickhouse
# Dry run — show record counts without migrating
pnpm migrate:storage -- --dry-run

Options
| Flag | Description | Default |
|---|---|---|
| --from <engine> | Source engine | current STORAGE_ENGINE |
| --to <engine> | Destination engine | the other engine |
| --dry-run | Show record counts only, don't write | false |
| --project-id <uuid> | Migrate a single project | all projects |
| --batch-size <n> | Records per batch | 5000 |
| --skip-validation | Skip post-migration count check | false |
The migration is safe to interrupt and resume. It tracks how many records already exist in the destination and picks up where it left off. If you stop the script mid-way (Ctrl+C), just run the same command again to continue.
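The resume behavior described above can be sketched simply: the starting offset is derived from the destination's record count, so re-running after an interrupt continues where the last run stopped. This is an illustrative sketch, not the actual script; the source/destination interfaces are assumptions.

```typescript
// Hypothetical sketch of interrupt-safe migration: the destination count
// doubles as the resume offset, so no separate checkpoint file is needed.
interface Source {
  count(): Promise<number>;
  readBatch(offset: number, limit: number): Promise<unknown[]>;
}
interface Destination {
  count(): Promise<number>;
  writeBatch(rows: unknown[]): Promise<void>;
}

async function migrate(src: Source, dest: Destination, batchSize = 5000): Promise<number> {
  const total = await src.count();
  let offset = await dest.count(); // records already migrated by a prior run
  while (offset < total) {
    const batch = await src.readBatch(offset, batchSize);
    await dest.writeBatch(batch);
    offset += batch.length;
  }
  return offset;
}
```

Note this assumes records are read in a stable order (e.g. by time, then ID), so that "first N records" means the same thing on every run.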
Log IDs are preserved during migration. This is important because detection events, exceptions, and error groups reference logs by ID. Without ID preservation, those cross-references would break silently after switching engines.
Example Output
Storage Migration: timescale → clickhouse
==================================================
Found 3 projects across 2 organizations
[1/3] api-service (acme-corp)
Logs: source=1,234,567 | dest=500,000 | to migrate=734,567
Spans: source=456,789 | dest=0 | to migrate=456,789
Traces: source=12,345 | dest=0 | to migrate=12,345
Logs: 100.0% | 734,567/734,567 | done in 2m 24s
Spans: 100.0% | 456,789/456,789 | done in 58s
Traces: 100.0% | 12,345/12,345 | done in 4s
Validation: logs OK | spans OK | traces OK
[2/3] web-app (acme-corp)
Already migrated.
Summary
==================================================
Projects: 3/3 completed
Logs: 2,500,000 migrated
Spans: 800,000 migrated
Traces: 25,000 migrated
Time: 5m 12s
Errors: 0

How Reservoir Works
Reservoir is an internal package (@logtide/reservoir) that
abstracts all log and span I/O behind a unified TypeScript API. Here's what it handles:
- ingest() — Batch insert logs
- query() — Search with filters, pagination
- aggregate() — Time-bucketed stats
- count() — Filtered log counts
- distinct() — Unique field values
- topValues() — Top N values by count
- getById() / getByIds() — Direct lookups
- deleteByTimeRange() — Retention cleanup
- ingestSpans() — Batch insert spans
- upsertTrace() — Create/update trace records
- querySpans() — Search spans
- queryTraces() — Search traces
- getSpansByTraceId() — Trace detail view
- getServiceDependencies() — Service map
- healthCheck() — Engine connectivity
- getEngineType() — Runtime engine info
Reservoir doesn't just wrap SQL — each engine has its own query translator that generates optimized native queries:
| Feature | TimescaleDB | ClickHouse |
|---|---|---|
| Batch insert | UNNEST($1::text[], ...) | INSERT ... FORMAT JSONEachRow |
| Full-text search | tsvector @@ tsquery | hasToken() / positionCaseInsensitive() |
| Substring search | pg_trgm GIN index | ngrambf_v1 bloom filter |
| Time filtering | Chunk pruning | PREWHERE partition key |
| Aggregation | Continuous aggregates | Raw GROUP BY (columnar scan) |
| Deletion | drop_chunks() / DELETE | ALTER TABLE DELETE (async) |
Some features have engine-specific fast paths. For example, dashboard charts use
TimescaleDB continuous aggregates when available, but fall back to raw Reservoir
queries on ClickHouse. The application detects the engine at runtime via
reservoir.getEngineType() and chooses
the optimal code path automatically.
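That runtime dispatch can be sketched in a few lines. The function and path names here are illustrative, not LogTide's actual code; only getEngineType() and the continuous-aggregate fast path are taken from this doc.

```typescript
// Hypothetical sketch of engine-aware code-path selection for dashboards.
type Engine = "timescale" | "clickhouse" | "mongodb";

function dashboardStatsPath(engine: Engine): "continuous_aggregates" | "raw_query" {
  // Only TimescaleDB maintains pre-computed hourly/daily rollups
  // (logs_hourly_stats, logs_daily_stats); the other engines answer
  // dashboard queries from raw data.
  return engine === "timescale" ? "continuous_aggregates" : "raw_query";
}

// Usage sketch: const path = dashboardStatsPath(reservoir.getEngineType());
```

Keeping this branching in one place means the rest of the dashboard code stays engine-agnostic, consistent with Reservoir's design.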