The Best Logging Strategy for Node.js APIs

Why console.log doesn't belong in production

Every Node.js developer starts with console.log. It's fast to write, shows up immediately, and requires zero setup. In development, it's fine. In production, it's a liability.

The problems: unstructured strings can't be queried. You can't filter all logs for userId: "usr_123" if userId is buried in "User 123 performed action checkout". You can't set alert thresholds on log levels. You can't correlate a request through multiple async operations. And console.log writes to process.stdout, which Node.js documents as synchronous when stdout is a file and, on POSIX systems, when it is a terminal, so under heavy logging each write can stall the event loop.

The alternative is structured logging: every log entry is a JSON object with consistent fields. The major log aggregators (Datadog, Grafana Loki, CloudWatch, Elastic) all ingest JSON, so you can query, filter, and alert on structured fields instead of parsing strings.
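To make the difference concrete, here is a minimal sketch that needs no logging library. The CheckoutEvent shape and both function names are made up for illustration; a real logger adds more fields, but the idea is the same.

```typescript
// Hypothetical event shape used only for this illustration.
interface CheckoutEvent {
  userId: string;
  action: string;
  cartTotalCents: number;
}

// Unstructured: the fields are fused into one string and must be re-parsed with regexes.
function unstructuredLine(e: CheckoutEvent): string {
  return `User ${e.userId} performed action ${e.action}`;
}

// Structured: one JSON object per line, with the same field names on every entry.
function structuredLine(level: string, msg: string, e: CheckoutEvent): string {
  return JSON.stringify({ level, time: new Date().toISOString(), msg, ...e });
}

const event: CheckoutEvent = { userId: 'usr_123', action: 'checkout', cartTotalCents: 4999 };

console.log(unstructuredLine(event));
console.log(structuredLine('info', 'checkout completed', event));

// An aggregator (or a quick script) can now filter on the field directly.
const lines = [structuredLine('info', 'checkout completed', event)];
const matches = lines
  .map((line) => JSON.parse(line))
  .filter((entry) => entry.userId === 'usr_123');
```

The structured version costs one JSON.stringify call, and every field becomes something you can index and query.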

Choosing a logger: pino vs winston

pino is the right default for most Node.js apps. It is consistently among the fastest Node.js loggers in published benchmarks, can run transports (pretty-printing, shipping to remote services) in a worker thread so that work stays off the main event loop, has excellent TypeScript support, and produces clean newline-delimited JSON.

winston is more flexible and has a larger ecosystem of transports. It's a good choice if you need to write to multiple destinations simultaneously (file + console + HTTP endpoint) or if your team is already familiar with it. It's measurably slower than pino under load.

import pino from 'pino';

export const log = pino({
  level: process.env.LOG_LEVEL ?? 'info',

  // Pretty print for local development, raw JSON for production
  transport: process.env.NODE_ENV !== 'production'
    ? {
        target: 'pino-pretty',
        options: { colorize: true, translateTime: 'HH:MM:ss', ignore: 'pid,hostname' },
      }
    : undefined,

  // Redact before the log ever leaves your process
  redact: {
    paths: [
      'req.headers.authorization',
      'req.headers.cookie',
      'req.body.password',
      'req.body.creditCard',
      'user.password',
      'user.ssn',
    ],
    censor: '[REDACTED]',
  },

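  // Emit ISO 8601 timestamps (pino's default is epoch milliseconds)
  timestamp: pino.stdTimeFunctions.isoTime,

  // Add service-level fields to every log entry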
  base: {
    pid: process.pid,
    service: process.env.SERVICE_NAME ?? 'api',
    env: process.env.NODE_ENV,
    version: process.env.APP_VERSION,
  },
});

Log levels: a practical guide

Using the right log level matters because it determines what gets stored, what gets alerted on, and how much noise operators have to deal with.

fatal — The process is about to crash. A dependency is completely unavailable. Someone should be paged immediately. Example: database connection pool exhausted and can't recover
error — A request failed that shouldn't have. The user got a 500. The app is still running. Alert if error rate increases. Example: unhandled exception in a route handler
warn — Something unexpected happened but was handled gracefully. Track trends. Example: rate limit threshold reached, deprecated API version called, retry succeeded after first failure
info — Normal operational events worth knowing about. Low noise, high signal. Example: server started, user authenticated, background job completed
debug — Detailed information useful for diagnosing specific issues. Off by default in production. Example: SQL query executed, cache hit/miss, function input/output
trace — Extremely verbose. Everything that happens. Only for deep debugging sessions.

Set LOG_LEVEL=info in production. Set LOG_LEVEL=debug in staging. When you need more detail in production, enable debug for a single request via a signed header rather than lowering the level for the whole process or restarting it.
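One way the signed-header check could work is sketched below. Everything here is an assumption for illustration: the token format ("<expiryEpochMs>.<hexSignature>"), the function names, and the use of a shared HMAC secret. Only the node:crypto calls are real APIs.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Assumed token format: "<expiryEpochMs>.<hexSignature>", signed with a shared secret.
// An operator generates a short-lived token and sends it in a request header.
function signDebugToken(expiresAtMs: number, secret: string): string {
  const sig = createHmac('sha256', secret).update(String(expiresAtMs)).digest('hex');
  return `${expiresAtMs}.${sig}`;
}

function isDebugTokenValid(
  token: string | undefined,
  secret: string,
  nowMs: number = Date.now(),
): boolean {
  if (!token) return false;
  const [expiry, sig] = token.split('.');
  if (!expiry || !sig) return false;

  // Reject malformed or expired tokens before doing any crypto work.
  const expiresAtMs = Number(expiry);
  if (!Number.isFinite(expiresAtMs) || expiresAtMs < nowMs) return false;

  const expected = createHmac('sha256', secret).update(expiry).digest();
  const provided = Buffer.from(sig, 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return provided.length === expected.length && timingSafeEqual(provided, expected);
}
```

In a middleware you would then pick the logger per request. With pino, a child logger can carry its own level (for example log.child({}, { level: 'debug' })), so a valid token raises verbosity for that request only. The header name is up to you.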

Request correlation IDs: the key to distributed debugging

When a user reports a bug, you need to find every log entry related to their specific request — through your API handler, into async jobs, and into downstream services. Without a correlation ID, you're searching through millions of logs with nothing to filter on.

The pattern: assign a unique ID to every request, propagate it through all async operations using AsyncLocalStorage, and include it in every log entry automatically.

import { AsyncLocalStorage } from 'async_hooks';
import crypto from 'crypto';

const requestStore = new AsyncLocalStorage<{ requestId: string; userId?: string }>();

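// Middleware: assign requestId to every request
app.use((req, res, next) => {
  // Accept an upstream ID only if it looks sane; otherwise clients could inject arbitrary log content
  const incoming = req.get('x-request-id');
  const requestId = incoming && /^[\w-]{1,64}$/.test(incoming) ? incoming : crypto.randomUUID();
  res.setHeader('x-request-id', requestId);

  // Everything called from next(), including awaited work, sees this store
  requestStore.run({ requestId }, next);
});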

// Helper: call from your auth middleware once the user is known
export function attachUserToContext(userId: string) {
  const store = requestStore.getStore();
  if (store) store.userId = userId;
}

// Logger that automatically picks up context
export const contextLog = {
  info: (obj: object, msg: string) => {
    const ctx = requestStore.getStore() ?? {};
    log.info({ ...ctx, ...obj }, msg);
  },
  error: (obj: object, msg: string) => {
    const ctx = requestStore.getStore() ?? {};
    log.error({ ...ctx, ...obj }, msg);
  },
  // ... other levels
};

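Every entry written through contextLog now includes requestId and, once the user is authenticated, userId. To debug a user's issue, filter your log aggregator by the requestId (returned to the client in the x-request-id response header) and you'll see the entire request lifecycle in sequence.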
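If AsyncLocalStorage is new to you, the following self-contained sketch shows the property the pattern relies on: two overlapping requests each see their own store, even across an await. The logWithContext function and the captured array are stand-ins for a real logger, not part of any library.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

interface RequestContext {
  requestId: string;
  userId?: string;
}

const store = new AsyncLocalStorage<RequestContext>();
const captured: Array<Record<string, unknown>> = [];

// Stand-in for a real logger: merges the current context into each entry.
function logWithContext(msg: string): void {
  captured.push({ ...(store.getStore() ?? {}), msg });
}

// Simulates a handler that awaits I/O, then attaches the user after "authentication".
async function handle(requestId: string, userId: string): Promise<void> {
  await store.run({ requestId }, async () => {
    logWithContext('request received');
    await new Promise((resolve) => setTimeout(resolve, 5));
    const ctx = store.getStore();
    if (ctx) ctx.userId = userId;
    logWithContext('user authenticated');
  });
}

// Two overlapping requests keep separate contexts; code outside run() sees none.
const done = Promise.all([handle('req-a', 'usr_1'), handle('req-b', 'usr_2')]).then(() => {
  logWithContext('outside any request');
});
```

Mutating the store object, as attachUserToContext does, is what lets later entries in the same request pick up userId without passing it through every function call.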

Automatic request logging middleware

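Log every request automatically with consistent structure. Use the res.on('finish') hook so the entry records the final status code and total latency once the response has been fully written. Note that 'finish' does not fire if the client disconnects first; listen for 'close' as well if you need those requests logged.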

app.use((req, res, next) => {
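  const start = Date.now();
  // Register this after the requestStore middleware. Capture the context now,
  // because the 'finish' listener is not guaranteed to run inside the AsyncLocalStorage scope.
  const ctx = requestStore.getStore();

  res.on('finish', () => {
    const latencyMs = Date.now() - start;
    const logFn = res.statusCode >= 500 ? 'error' : res.statusCode >= 400 ? 'warn' : 'info';

    log[logFn]({
      ...ctx, // requestId, plus userId if attachUserToContext() ran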
      method: req.method,
      path: req.path,
      query: req.query,
      statusCode: res.statusCode,
      latencyMs,
      contentLength: res.getHeader('content-length'),
      userAgent: req.headers['user-agent'],
      ip: req.ip,
    }, 'HTTP request');
  });

  next();
});

Error logging: capture the full context

When errors occur, log everything you'll need to diagnose the problem — the error, its stack trace, the request context, and any relevant business state.

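// Express error handler: register it after all routes. The four-argument signature is what marks it as one.
app.use((err: Error, req: Request, res: Response, next: NextFunction) => {
  // If the response has already started, delegate to Express's default handler
  if (res.headersSent) return next(err);

  const status = (err as any).statusCode ?? 500;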

  if (status >= 500) {
    log.error({
      err: {
        message: err.message,
        stack: err.stack,
        name: err.name,
        code: (err as any).code,
      },
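      // Logged under `req` so the redact paths configured on the logger (req.body.password, ...) actually match
      req: {
        method: req.method,
        path: req.path,
        body: req.body,   // Only paths listed in `redact` are censored; audit what your bodies contain
        query: req.query,
      },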
    }, 'Unhandled error');
  }

  // Never send stack traces to clients in production
  res.status(status).json({
    error: {
      message: status < 500 ? err.message : 'An unexpected error occurred',
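      // Internal codes such as ECONNREFUSED stay in the logs; clients only see codes for 4xx errors
      code: status < 500 ? ((err as any).code ?? 'REQUEST_ERROR') : 'INTERNAL_ERROR',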
    },
  });
});
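The handler above reads statusCode and code off the error through `as any`. One way to make that contract explicit is a small error class that routes throw deliberately. HttpError and toClientError are hypothetical names for this sketch; the mapping mirrors the rule in the handler (4xx messages are safe to show, 5xx messages are not).

```typescript
// Hypothetical error type carrying the two properties the handler looks for.
class HttpError extends Error {
  readonly statusCode: number;
  readonly code: string;

  constructor(statusCode: number, code: string, message: string) {
    super(message);
    this.name = 'HttpError';
    this.statusCode = statusCode;
    this.code = code;
  }
}

interface ClientError {
  status: number;
  body: { error: { message: string; code: string } };
}

// Maps any thrown value to what the client is allowed to see.
function toClientError(err: unknown): ClientError {
  if (err instanceof HttpError && err.statusCode < 500) {
    // Deliberate 4xx errors: the message was written for the client.
    return { status: err.statusCode, body: { error: { message: err.message, code: err.code } } };
  }
  // Anything else is treated as internal: details stay in the logs only.
  const status = err instanceof HttpError ? err.statusCode : 500;
  return {
    status,
    body: { error: { message: 'An unexpected error occurred', code: 'INTERNAL_ERROR' } },
  };
}
```

A route would then write `throw new HttpError(404, 'NOT_FOUND', 'Project not found')`, and anything thrown by accident (a driver error, a TypeError) falls through to the generic 500 response.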

What every log entry should contain

A well-structured request log gives you what you need to answer most production questions: who made the request, what they asked for, what happened, and how long it took.

{
  "level": "info",
  "time": "2026-06-20T14:32:01.234Z",
  "service": "api",
  "version": "1.4.2",
  "requestId": "550e8400-e29b-41d4-a716-446655440000",
  "userId": "usr_abc123",
  "method": "POST",
  "path": "/projects",
  "statusCode": 201,
  "latencyMs": 142,
  "ip": "203.0.113.42",
  "msg": "HTTP request"
}

Log aggregation: stdout is the contract

In containerized environments, your app should write logs to stdout/stderr only. Never write to log files inside a container — they'll be lost when the container restarts, and they can fill up the container's ephemeral filesystem. Your orchestration platform (Kubernetes, ECS, Docker) collects stdout and ships it to your log aggregator.

The log aggregator receives structured JSON and indexes the fields for search. You can then query: "show all requests with statusCode: 500 in the last hour for userId: usr_123".

Log sampling for high-traffic services

At 50k req/s, logging every request is expensive. A single server producing 50,000 log entries per second at ~300 bytes each generates ~15 MB/s of log data, or roughly 1.3 TB per day. At $0.50/GB ingest, that's $648/day just for request logs on one server.
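The arithmetic behind that figure, as a small sketch you can rerun with your own traffic and pricing. The function name is illustrative, and the inputs are the example figures from the paragraph above, not a quote from any vendor.

```typescript
// Estimates daily ingest cost from request rate, entry size, and price per (decimal) GB.
function dailyLogCostUsd(requestsPerSecond: number, bytesPerEntry: number, usdPerGb: number): number {
  const secondsPerDay = 86_400;
  const bytesPerDay = requestsPerSecond * bytesPerEntry * secondsPerDay;
  const gbPerDay = bytesPerDay / 1e9;
  return gbPerDay * usdPerGb;
}

// 50,000 req/s * 300 B = 15 MB/s; * 86,400 s = 1,296 GB/day; * $0.50 = $648/day
const fullCost = dailyLogCostUsd(50_000, 300, 0.5);
console.log(`$${fullCost.toFixed(2)} per day`);
```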

The solution is sampling: log 100% of errors and slow requests, a fraction of successful fast requests.

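const SAMPLE_RATE = 0.1; // keep 10% of fast, successful requests

// Inside the request-logging middleware, in place of the unconditional 'finish' listener
res.on('finish', () => {
  const latencyMs = Date.now() - start;
  const isError = res.statusCode >= 400;
  const isSlow = latencyMs > 1000;

  // Always log errors and slow requests; sample the rest
  const sampled = !isError && !isSlow;
  if (sampled && Math.random() >= SAMPLE_RATE) return;

  const level = res.statusCode >= 500 ? 'error' : isError ? 'warn' : 'info';
  log[level]({
    method: req.method,
    path: req.path,
    statusCode: res.statusCode,
    latencyMs,
    // Record the rate so dashboards can scale sampled counts back up (count / sampleRate)
    sampleRate: sampled ? SAMPLE_RATE : 1,
  }, 'HTTP request');
});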
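The sampling decision is easier to test if it is pulled out of the event listener into a pure function. A minimal sketch follows; SamplingOptions and shouldLog are illustrative names, and the thresholds are the ones used in the snippet above.

```typescript
// Hypothetical names: SamplingOptions and shouldLog are not part of any library.
interface SamplingOptions {
  sampleRate: number;      // fraction of ordinary requests to keep, between 0 and 1
  slowThresholdMs: number; // requests slower than this are always kept
}

// `random` is injected so tests can pass a fixed value instead of Math.random().
function shouldLog(
  statusCode: number,
  latencyMs: number,
  opts: SamplingOptions,
  random: () => number = Math.random,
): boolean {
  const isError = statusCode >= 400;
  const isSlow = latencyMs > opts.slowThresholdMs;
  if (isError || isSlow) return true; // never drop the interesting requests
  return random() < opts.sampleRate;
}

const opts: SamplingOptions = { sampleRate: 0.1, slowThresholdMs: 1000 };
```

Inside the 'finish' listener this becomes `if (!shouldLog(res.statusCode, latencyMs, opts)) return;`, and the rate can come from configuration instead of a literal.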