API Monitoring and Analytics
Monitor API performance, track latency percentiles, detect error spikes, and analyze consumer usage patterns with LogTide.
Your API is your product. When response times creep up, error rates spike, or a specific consumer hammers your endpoints, you need to know immediately — not when customers start complaining. This guide shows how to build comprehensive API monitoring with LogTide, from basic request logging to latency percentile tracking and consumer analytics.
The Problem
Most teams start with no API monitoring at all. The first sign of trouble is a customer support ticket: “Your API is slow.”
❌ API monitoring anti-patterns:
1. No request logging → "How many requests do we actually get?"
2. Average-only metrics → P99 is 10x worse than average, but you can't see it
3. No error breakdown → "500 errors are up" -- which endpoint? which consumer?
4. No rate limit tracking → Abusive consumers degrade everyone's experience
5. No version tracking → Can't tell if v2 is faster than v1
| Problem | Business Impact |
|---|---|
| Undetected latency spikes | Users abandon requests, revenue drops |
| Silent error rate increase | Data corruption, broken integrations |
| No consumer visibility | One bad actor degrades service for everyone |
| Missing audit trail | Cannot debug partner integration issues |
The LogTide Approach
Turn every API request into a structured, queryable event:
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Mobile │ │ Web │ │ Partners │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
│ │ │
▼ ▼ ▼
┌────────────────────────────────────────────────────────┐
│ Express / Fastify API │
│ ┌──────────────────────────────────────┐ │
│ │ LogTide Request Middleware │ │
│ │ - method, path, status, latency │ │
│ │ - consumer identity, API version │ │
│ │ - request/response sizes │ │
│ └──────────────────┬───────────────────┘ │
└─────────────────────┼──────────────────────────────────┘
│ Batched, async
▼
┌────────────────┐
│ LogTide │
│ Dashboards │
│ Alerts │
│ Analytics │
└────────────────┘
Implementation
1. Express Request Logging Middleware
A production-ready middleware that captures everything you need:
// middleware/api-logger.ts
import { LogTideClient } from '@logtide/node';
import { Request, Response, NextFunction } from 'express';
const client = new LogTideClient({
dsn: process.env.LOGTIDE_DSN!,
service: 'api',
batchSize: 200,
flushInterval: 3000,
compress: true,
});
export function apiLogger() {
return (req: Request, res: Response, next: NextFunction) => {
const startTime = process.hrtime.bigint();
const requestSize = parseInt(req.headers['content-length'] || '0', 10);
// Track response size (cast needed because we widen res.end's signature)
let responseSize = 0;
const originalEnd = res.end.bind(res);
res.end = ((chunk?: any, ...args: any[]) => {
// res.end may receive a callback instead of a body chunk
if (chunk && typeof chunk !== 'function') responseSize += Buffer.byteLength(chunk);
return originalEnd(chunk, ...args);
}) as any;
res.on('finish', () => {
const durationMs = Number((process.hrtime.bigint() - startTime) / 1_000_000n);
const consumerId = extractConsumerId(req);
const apiVersion = extractApiVersion(req);
const metrics = {
method: req.method,
path: req.path,
route: req.route?.path || req.path, // Route pattern when matched; falls back to the raw path
statusCode: res.statusCode,
durationMs,
requestSize,
responseSize,
consumerId,
apiVersion,
userAgent: req.headers['user-agent'] || null,
ip: req.ip,
};
if (res.statusCode >= 500) {
client.error(`${req.method} ${req.path} ${res.statusCode}`, metrics);
} else if (res.statusCode >= 400) {
client.warn(`${req.method} ${req.path} ${res.statusCode}`, metrics);
} else {
client.info(`${req.method} ${req.path} ${res.statusCode}`, metrics);
}
});
next();
};
}
function extractConsumerId(req: Request): string | null {
const apiKey = req.headers['x-api-key'] as string;
if (apiKey) return `key:${apiKey.slice(0, 8)}...`;
const auth = req.headers.authorization;
if (auth?.startsWith('Bearer ')) {
try {
const payload = JSON.parse(Buffer.from(auth.split('.')[1], 'base64url').toString()); // JWT segments are base64url, not base64
return payload.client_id || payload.sub || null;
} catch { return null; }
}
return null;
}
function extractApiVersion(req: Request): string | null {
const pathMatch = req.path.match(/^\/(v\d+)\//);
if (pathMatch) return pathMatch[1];
return req.headers['x-api-version'] as string || null;
}
2. Fastify Request Logging Plugin
// plugins/api-logger.ts
import { FastifyPluginAsync } from 'fastify';
import fp from 'fastify-plugin';
import { LogTideClient } from '@logtide/node';
const client = new LogTideClient({
dsn: process.env.LOGTIDE_DSN!,
service: 'api',
batchSize: 200,
flushInterval: 3000,
});
// Let TypeScript know about the timing field attached in the onRequest hook
declare module 'fastify' {
interface FastifyRequest { startTime: bigint; }
}
const apiLoggerPlugin: FastifyPluginAsync = async (fastify) => {
fastify.addHook('onRequest', async (request) => {
request.startTime = process.hrtime.bigint();
});
fastify.addHook('onResponse', async (request, reply) => {
const durationMs = Number((process.hrtime.bigint() - request.startTime) / 1_000_000n);
const metrics = {
method: request.method,
route: request.routeOptions?.url || request.url,
statusCode: reply.statusCode,
durationMs,
consumerId: request.headers['x-api-key'] || null,
ip: request.ip,
};
if (reply.statusCode >= 500) {
client.error(`${request.method} ${request.url} ${reply.statusCode}`, metrics);
} else if (reply.statusCode >= 400) {
client.warn(`${request.method} ${request.url} ${reply.statusCode}`, metrics);
} else {
client.info(`${request.method} ${request.url} ${reply.statusCode}`, metrics);
}
});
fastify.addHook('onClose', async () => { await client.flush(); });
};
export default fp(apiLoggerPlugin, { name: 'api-logger' });
3. Latency Percentile Tracking
Average latency hides problems. A p50 of 50ms with a p99 of 5000ms means 1 in 100 users waits 100x longer:
// middleware/latency-tracker.ts
import { LogTideClient } from '@logtide/node';
const client = new LogTideClient({ dsn: process.env.LOGTIDE_DSN!, service: 'api-metrics' });
const latencyBuffer: Map<string, number[]> = new Map();
export function trackLatency(route: string, durationMs: number) {
if (!latencyBuffer.has(route)) latencyBuffer.set(route, []);
latencyBuffer.get(route)!.push(durationMs);
}
// Report percentiles every 60 seconds
setInterval(() => {
for (const [route, latencies] of latencyBuffer.entries()) {
if (latencies.length === 0) continue;
latencies.sort((a, b) => a - b);
const pct = (p: number) => latencies[Math.max(0, Math.ceil((p / 100) * latencies.length) - 1)];
client.info('API latency report', {
event: 'api.latency_report',
route,
request_count: latencies.length,
p50_ms: pct(50),
p95_ms: pct(95),
p99_ms: pct(99),
max_ms: latencies[latencies.length - 1],
avg_ms: Math.round(latencies.reduce((a, b) => a + b, 0) / latencies.length),
});
latencyBuffer.set(route, []);
}
}, 60_000).unref(); // unref so the reporter timer doesn't hold the process open
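The nearest-rank percentile math in the reporter is worth verifying in isolation. This standalone sketch uses a toy dataset to show why the average hides tail latency:

```typescript
// Nearest-rank percentile over a sorted copy of the samples.
function percentile(samples: number[], p: number): number {
  const sorted = [...samples].sort((a, b) => a - b);
  return sorted[Math.max(0, Math.ceil((p / 100) * sorted.length) - 1)];
}

// 100 requests: 98 fast ones at 10ms plus two 5000ms outliers.
const samples = [...Array(98).fill(10), 5000, 5000];
const avg = samples.reduce((a, b) => a + b, 0) / samples.length;

console.log(avg);                     // 109.8, looks healthy
console.log(percentile(samples, 50)); // 10
console.log(percentile(samples, 95)); // 10
console.log(percentile(samples, 99)); // 5000, the tail only shows up here
```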
4. Rate Limit Logging
When you enforce rate limits, log the events for analysis:
// middleware/rate-limiter.ts
import { LogTideClient } from '@logtide/node';
import { Request, Response, NextFunction } from 'express';
const client = new LogTideClient({ dsn: process.env.LOGTIDE_DSN!, service: 'api' });
// In-memory counters; stale entries are only replaced on reuse, so long-lived processes may want periodic cleanup
const requestCounts: Map<string, { count: number; resetAt: number }> = new Map();
export function rateLimiter(windowMs: number, maxRequests: number) {
return (req: Request, res: Response, next: NextFunction) => {
const key = (req.headers['x-api-key'] as string) || req.ip || 'anonymous';
const now = Date.now();
let entry = requestCounts.get(key);
if (!entry || now > entry.resetAt) {
entry = { count: 0, resetAt: now + windowMs };
requestCounts.set(key, entry);
}
entry.count++;
res.setHeader('X-RateLimit-Limit', maxRequests);
res.setHeader('X-RateLimit-Remaining', Math.max(0, maxRequests - entry.count));
if (entry.count > maxRequests) {
client.warn('Rate limit exceeded', {
event: 'api.rate_limit_exceeded',
consumer: key,
path: req.path,
request_count: entry.count,
limit: maxRequests,
});
return res.status(429).json({ error: 'Rate limit exceeded' });
}
// Warn once when crossing 80% of the limit (avoids one log line per request afterwards)
if (entry.count === Math.ceil(maxRequests * 0.8)) {
client.info('Consumer approaching rate limit', {
event: 'api.rate_limit_warning',
consumer: key,
request_count: entry.count,
limit: maxRequests,
});
}
next();
};
}
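The fixed-window bookkeeping above can be exercised without Express. This standalone sketch (same logic, no HTTP types) simulates one consumer crossing the limit and recovering after the window resets:

```typescript
// Minimal fixed-window counter mirroring the middleware's bookkeeping.
function makeLimiter(windowMs: number, maxRequests: number) {
  const counts = new Map<string, { count: number; resetAt: number }>();
  return (key: string, now: number): boolean => {
    let entry = counts.get(key);
    if (!entry || now > entry.resetAt) {
      entry = { count: 0, resetAt: now + windowMs };
      counts.set(key, entry);
    }
    entry.count++;
    return entry.count <= maxRequests; // false would translate to a 429
  };
}

const allow = makeLimiter(60_000, 3);
const results = [1, 2, 3, 4].map((i) => allow('key:abc', 1_000 * i));
console.log(results); // [ true, true, true, false ]

// Once the window has elapsed, the counter resets and requests pass again.
console.log(allow('key:abc', 70_000)); // true
```

Note that a fixed window can admit up to twice the limit across a window boundary; a sliding window or token bucket smooths that out if it matters for your SLAs.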
5. Consumer Analytics
Track which consumers use which endpoints and how much:
// analytics/consumer-tracker.ts
import { LogTideClient } from '@logtide/node';
const client = new LogTideClient({ dsn: process.env.LOGTIDE_DSN!, service: 'api-analytics' });
const activity: Map<string, {
requests: number; errors: number;
totalLatency: number; endpoints: Record<string, number>;
}> = new Map();
export function trackConsumer(consumerId: string | null, route: string, status: number, durationMs: number) {
if (!consumerId) return;
if (!activity.has(consumerId)) {
activity.set(consumerId, { requests: 0, errors: 0, totalLatency: 0, endpoints: {} });
}
const a = activity.get(consumerId)!;
a.requests++;
if (status >= 500) a.errors++;
a.totalLatency += durationMs;
a.endpoints[route] = (a.endpoints[route] || 0) + 1;
}
// Report every 5 minutes
setInterval(() => {
for (const [id, a] of activity.entries()) {
if (a.requests === 0) continue;
client.info('Consumer activity report', {
event: 'api.consumer_report',
consumer_id: id,
request_count: a.requests,
error_rate_pct: Math.round((a.errors / a.requests) * 10000) / 100,
avg_latency_ms: Math.round(a.totalLatency / a.requests),
top_endpoints: Object.fromEntries(
Object.entries(a.endpoints).sort((x, y) => y[1] - x[1]).slice(0, 10)
),
});
activity.set(id, { requests: 0, errors: 0, totalLatency: 0, endpoints: {} });
}
}, 300_000).unref(); // unref so the reporter timer doesn't hold the process open
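The derived fields in the report (error rate to two decimal places, average latency, top endpoints by volume) are plain arithmetic, so they can be sanity-checked standalone; the endpoint names below are hypothetical:

```typescript
// One consumer's 5-minute window, with made-up endpoint counts.
const bucket = {
  requests: 800,
  errors: 3,
  totalLatency: 96_000,
  endpoints: { '/v1/payments': 500, '/v1/refunds': 250, '/v1/status': 50 } as Record<string, number>,
};

const errorRatePct = Math.round((bucket.errors / bucket.requests) * 10_000) / 100;
const avgLatencyMs = Math.round(bucket.totalLatency / bucket.requests);
const topEndpoints = Object.entries(bucket.endpoints)
  .sort((x, y) => y[1] - x[1])
  .slice(0, 2);

console.log(errorRatePct); // 0.38
console.log(avgLatencyMs); // 120
console.log(topEndpoints.map(([route]) => route)); // [ '/v1/payments', '/v1/refunds' ]
```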
6. nginx Access Log Integration
Capture nginx access logs upstream of your application for a complete picture:
# nginx.conf - Structured JSON access log
log_format logtide_json escape=json
'{'
'"timestamp":"$time_iso8601",'
'"method":"$request_method",'
'"path":"$uri",'
'"status":$status,'
'"body_bytes_sent":$body_bytes_sent,'
'"request_time":$request_time,'
'"upstream_response_time":"$upstream_response_time",'
'"remote_addr":"$remote_addr",'
'"http_user_agent":"$http_user_agent",'
'"http_x_api_key":"$http_x_api_key"'
'}';
access_log /var/log/nginx/api-access.log logtide_json;
Ship nginx JSON logs to LogTide with the nginx integration or a log shipper container. Note that this format records the full x-api-key header value; truncate or omit it if your keys are secrets.
Real-World Example: Fintech API Platform
A fintech company exposes a payments API to 200+ partners, processing 2 million requests per day.
Before LogTide:
- Latency SLAs measured only by synthetic checks every 60s
- Consumer complaints about slowness took days to investigate
- Rate limit violations discovered after the fact
- No per-consumer usage analytics for billing
After LogTide:
1. Alert: "p99 latency > 2000ms on POST /v1/payments"
2. Query: Recent slow requests
route:/v1/payments AND durationMs:>2000 AND time:>30m
3. Discovery: 80% of slow requests from consumer "key:a1b2c3d4..."
They're sending 500-item batch requests (normal is 10-20)
4. Resolution: Contact consumer about batch size limits.
Add request body size validation.
Total investigation time: 8 minutes
Results:
- Latency issues detected in minutes, not days
- Per-consumer SLA tracking automated
- Rate limit abusers identified proactively
- API usage reports generated for billing
Query Patterns for API Monitoring
# Top endpoints by request volume
service:api AND time:>1h | group by route | count | sort desc
# Endpoints with highest error rate
service:api AND time:>1h
| group by route | ratio(statusCode:>=500) | sort desc
# Slowest endpoints by p99
event:api.latency_report AND time:>1h | sort by p99_ms desc
# Consumers hitting rate limits most
event:api.rate_limit_exceeded AND time:>24h
| group by consumer | count | sort desc
# API version adoption
service:api AND time:>7d | group by apiVersion | count
# Large response payloads (optimization targets)
service:api AND responseSize:>1000000 AND time:>24h
| group by route | avg(responseSize) | sort desc
Alerting Configuration
# High error rate on any endpoint
- name: api-error-rate
query: 'service:api AND statusCode:>=500 AND time:>5m'
threshold: 50
window: 5m
severity: critical
# Latency spike
- name: api-latency-spike
query: 'event:api.latency_report AND p99_ms:>2000'
threshold: 1
window: 5m
severity: warning
# Rate limit abuse
- name: api-rate-limit-abuse
query: 'event:api.rate_limit_exceeded AND time:>10m'
threshold: 100
window: 10m
severity: warning
# Zero traffic (API may be down)
- name: api-zero-traffic
query: 'service:api AND time:>5m'
threshold: 0
condition: equals
window: 5m
severity: critical
API Monitoring Checklist
Request Logging
- Every request logged with method, path, status, and latency
- Route patterns used (not actual paths with IDs) for grouping
- Request and response sizes captured
- Consumer identity extracted from API key or JWT
- API version tracked (header or path-based)
Latency Tracking
- High-resolution timing with process.hrtime.bigint()
- Percentile reports generated (p50, p95, p99)
- Slow request threshold alerts configured
- Upstream (nginx) and application latency correlated
Error and Rate Limit Monitoring
- Error rates calculated per endpoint
- Status code breakdown tracked
- Rate limit events logged with consumer identity
- Approaching-limit warnings generated
- Abuse patterns detectable through queries
Common Pitfalls
1. “Using the actual path instead of the route pattern”
Logging /v1/products/abc123 instead of /v1/products/:id creates thousands of unique “endpoints” that cannot be grouped or analyzed.
Solution: Always use the route pattern from your framework (req.route.path in Express, request.routeOptions.url in Fastify).
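When no framework route pattern is available, for instance when logging at a proxy, one fallback is to normalize ID-like path segments yourself. This heuristic sketch (an assumption of the author's setup, not a LogTide feature) collapses numeric, UUID, and long-hex segments:

```typescript
// Heuristic: treat numeric, UUID, and 12+ character hex segments as IDs.
// Prefer the framework's route pattern whenever you have it.
const ID_SEGMENT =
  /^(\d+|[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}|[0-9a-f]{12,})$/i;

function normalizePath(path: string): string {
  return path
    .split('/')
    .map((seg) => (ID_SEGMENT.test(seg) ? ':id' : seg))
    .join('/');
}

console.log(normalizePath('/v1/products/12345')); // /v1/products/:id
console.log(normalizePath('/v1/users/550e8400-e29b-41d4-a716-446655440000/orders')); // /v1/users/:id/orders
console.log(normalizePath('/v1/products')); // /v1/products
```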
2. “Setting alerts on averages”
Average latency of 100ms sounds great. But if p99 is 5000ms, 1% of your users are having a terrible experience.
Solution: Alert on percentiles (p95 or p99), not averages. A p99 spike is often the first sign of a systemic issue.
3. “Only monitoring from the application layer”
Your application may report 200ms latency, but the user experiences 800ms because of TLS handshake, nginx buffering, and network overhead.
Solution: Monitor at both the nginx layer and the application layer. Compare upstream vs. downstream latency to identify infrastructure bottlenecks.
Next Steps
- Express Integration - Full Express SDK setup
- Fastify Integration - Fastify plugin configuration
- nginx Integration - Upstream access log shipping
- Real-Time Alerting - Configure alert rules
- Incident Response - Debug API issues fast
Ready to monitor your API?
- Deploy LogTide - Free, open-source
- Join GitHub Discussions - Share your API monitoring setup