APIs · May 16, 2026 · 8 min read

Designing REST APIs That Scale

Versioning, pagination, error responses, idempotency keys, rate limit headers, OpenAPI specs — the design decisions that separate a maintainable API from one that becomes a liability.

Version from day one

Prefix your API with /v1/. It costs nothing to add now and is expensive to add later. When you ship a breaking change, add /v2/ and give clients time to migrate. Never break existing clients without a deprecation window.
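As a rough sketch of how a deprecation window can be signalled, the snippet below splits the version prefix off a request path and derives a Sunset header (RFC 8594) for versions that are being retired. The VERSIONS table, its dates, and both function names are illustrative assumptions, not part of any framework.

```javascript
// Hypothetical registry of supported versions; the sunset date is made up.
const VERSIONS = {
  v1: { sunset: new Date('2026-12-31T23:59:59Z') }, // deprecated but still served
  v2: { sunset: null },                             // current version
};

// Split "/v1/projects/42" into { version: 'v1', rest: '/projects/42' }.
// Returns null when the path has no version prefix.
function splitVersion(path) {
  const match = /^\/(v\d+)(\/.*)?$/.exec(path);
  if (!match) return null;
  return { version: match[1], rest: match[2] || '/' };
}

// Headers that tell clients when a deprecated version stops being served.
function deprecationHeaders(version) {
  const entry = VERSIONS[version];
  if (!entry || !entry.sunset) return {};
  return { Sunset: entry.sunset.toUTCString() };
}
```

In an Express app, a small middleware could call deprecationHeaders for the matched version and pass the result to res.set() before the route handler runs.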

Consistent error responses

Every error should have the same shape. Clients shouldn't need to guess whether to look at error, message, errors, or detail.

// Always return this shape for errors:
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Request body is invalid",
    "details": [
      { "field": "email", "message": "Must be a valid email address" }
    ]
  }
}

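// Express error handler (register it after all routes):
app.use((err, req, res, next) => {
  // If the response has already started, delegate to Express's default handler.
  if (res.headersSent) return next(err);
  const status = err.statusCode || 500;
  res.status(status).json({
    error: {
      code: err.code || 'INTERNAL_ERROR',
      // Never leak internal error messages on 5xx responses.
      message: status < 500 ? err.message : 'An unexpected error occurred',
      // Keep the documented shape: details is always an array.
      details: status < 500 && Array.isArray(err.details) ? err.details : [],
    }
  });
});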
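The handler above reads statusCode and code from whatever error reaches it, so route code needs a way to attach them. One possible approach is a small error class; ApiError, validationError, and validateSignup are hypothetical names used only for this sketch, and the email check is deliberately simplistic.

```javascript
// Hypothetical error type carrying the fields the error handler looks for.
class ApiError extends Error {
  constructor(statusCode, code, message, details = []) {
    super(message);
    this.name = 'ApiError';
    this.statusCode = statusCode;
    this.code = code;
    this.details = details;
  }
}

// Convenience factory for the common case: a 400 with per-field details.
function validationError(details) {
  return new ApiError(400, 'VALIDATION_ERROR', 'Request body is invalid', details);
}

// Example check; a real service would more likely use a schema validator.
function validateSignup(body) {
  const details = [];
  if (typeof body.email !== 'string' || !/^[^@\s]+@[^@\s]+$/.test(body.email)) {
    details.push({ field: 'email', message: 'Must be a valid email address' });
  }
  if (details.length > 0) throw validationError(details);
  return body;
}
```

A route would call validateSignup(req.body) and let the thrown error travel to the error handler, which turns it into the shared error shape.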

Pagination that doesn't break

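Offset pagination (?page=2&limit=20) is simple but breaks under concurrent writes: a row inserted or deleted between requests shifts every later page, so clients see duplicates or miss items. Cursor pagination anchors each page to the last item the client saw, so it stays stable: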

// Response:
{
  "data": [...],
  "pagination": {
    "cursor": "eyJpZCI6MTAwfQ==",
    "hasMore": true
  }
}

// Next request:
GET /projects?cursor=eyJpZCI6MTAwfQ==&limit=20

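// Implementation:
const items = await prisma.project.findMany({
  take: limit + 1, // fetch one extra row to detect whether another page exists
  cursor: cursor ? { id: decodeCursor(cursor) } : undefined,
  skip: cursor ? 1 : 0, // Prisma includes the cursor row itself, so skip it
  orderBy: [{ createdAt: 'desc' }, { id: 'desc' }], // id breaks ties for a stable order
});
const hasMore = items.length > limit;
if (hasMore) items.pop();
// Point the next cursor at the last item returned (encodeCursor is an assumed inverse of decodeCursor).
const nextCursor = hasMore ? encodeCursor(items[items.length - 1].id) : null;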
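The implementation above calls decodeCursor without defining it. A minimal sketch follows, assuming the cursor is base64-encoded JSON of the form {"id": ...}, which is what the sample cursor eyJpZCI6MTAwfQ== decodes to. pageAfter is a hypothetical in-memory stand-in for the database query, included only to show that a row inserted between requests does not disturb later pages.

```javascript
// Encode an id as an opaque token (base64 JSON is an assumption).
function encodeCursor(id) {
  return Buffer.from(JSON.stringify({ id })).toString('base64');
}

// Decode a cursor back to the id, rejecting anything malformed.
function decodeCursor(cursor) {
  let parsed;
  try {
    parsed = JSON.parse(Buffer.from(cursor, 'base64').toString('utf8'));
  } catch {
    throw new Error('Invalid cursor');
  }
  if (!parsed || !Number.isInteger(parsed.id)) throw new Error('Invalid cursor');
  return parsed.id;
}

// In-memory stand-in for the query: `rows` must already be sorted.
function pageAfter(rows, cursor, limit) {
  let start = 0;
  if (cursor) {
    const id = decodeCursor(cursor);
    const index = rows.findIndex((row) => row.id === id);
    if (index === -1) throw new Error('Invalid cursor');
    start = index + 1; // resume right after the last item the client saw
  }
  const items = rows.slice(start, start + limit + 1); // one extra row to detect more
  const hasMore = items.length > limit;
  if (hasMore) items.pop();
  const last = items[items.length - 1];
  return {
    data: items,
    pagination: { cursor: hasMore ? encodeCursor(last.id) : null, hasMore },
  };
}
```

Because base64 output can contain +, / and =, clients should URL-encode the cursor when placing it in a query string.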

Idempotency keys

For write operations (payments, email sends, state mutations), support an idempotency key in the request header. If the client retries a failed request with the same key, return the cached first response instead of executing twice.

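app.post('/payments', async (req, res, next) => {
  try {
    const idempotencyKey = req.headers['idempotency-key'];
    const cacheKey = `idempotency:${idempotencyKey}`;
    if (idempotencyKey) {
      const cached = await redis.get(cacheKey);
      // Replay the original response, including its 201 status.
      if (cached) return res.status(201).json(JSON.parse(cached));
    }
    // Note: two concurrent requests with the same key can both reach this point.
    // Claim the key atomically (for example with SET ... NX) to close that gap.
    const result = await processPayment(req.body);
    if (idempotencyKey) {
      // Keep the result for 24 hours (86400 seconds).
      await redis.setex(cacheKey, 86400, JSON.stringify(result));
    }
    res.status(201).json(result);
  } catch (err) {
    next(err); // Express 4 does not catch rejections from async handlers
  }
});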
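A check-then-write sequence like the one above leaves a window in which two concurrent retries both execute. One way to close it, sketched here with a hypothetical in-memory IdempotencyStore, is to record the in-flight operation under the key before awaiting it, so that later callers share the same outcome. A production version would need a store shared across processes and an expiry, neither of which this sketch provides.

```javascript
// In-memory stand-in for a shared store; for illustration only.
class IdempotencyStore {
  constructor() {
    this.entries = new Map();
  }

  // Run `operation` at most once per key. Concurrent and later callers
  // receive the promise created by the first call.
  run(key, operation) {
    const existing = this.entries.get(key);
    if (existing) return existing;
    const pending = Promise.resolve().then(operation);
    // Storing the promise before it settles is what closes the race window.
    this.entries.set(key, pending);
    // A failed attempt should not poison the key: drop it so a retry can run.
    pending.catch(() => this.entries.delete(key));
    return pending;
  }
}
```

A handler could wrap its work as store.run(idempotencyKey, () => processPayment(req.body)) and send the resolved value with the same status code every time.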

Rate limit headers

Always return rate limit information in response headers. Clients can back off proactively rather than hitting 429s.

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 987
X-RateLimit-Reset: 1751000000   // Unix time (seconds) when the window resets
Retry-After: 60   // seconds to wait; only on 429
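The headers above can be derived from whatever counter the limiter keeps. The sketch below assumes a fixed-window limiter whose state is { limit, used, resetAt }, where used counts the current request and resetAt is in milliseconds; that state shape and the function name are assumptions made for illustration.

```javascript
// Build rate limit headers from a fixed-window counter.
// `used` includes the current request; `resetAt` and `now` are epoch milliseconds.
function rateLimitHeaders({ limit, used, resetAt }, now = Date.now()) {
  const headers = {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(Math.max(0, limit - used)),
    'X-RateLimit-Reset': String(Math.ceil(resetAt / 1000)), // Unix seconds
  };
  // The request is over the limit, so the response will be a 429:
  // tell the client how many seconds to wait.
  if (used > limit) {
    headers['Retry-After'] = String(Math.max(0, Math.ceil((resetAt - now) / 1000)));
  }
  return headers;
}
```

For example, with a limit of 1000 and 13 requests used in the current window, Remaining comes out as 987 and no Retry-After header is set.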

OpenAPI spec as the source of truth

Define your API in an OpenAPI YAML/JSON spec. Generate types, validation schemas, and client SDKs from it so that docs, code, and client contracts stay in sync. Tools: openapi-typescript, zod-openapi, @fastify/swagger (formerly fastify-swagger).
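To give a sense of what such a spec contains, here is a minimal OpenAPI 3.0 document written as a plain JavaScript object; the same structure serializes directly to JSON or YAML. The API title, the /v1/projects path, and the listOperations helper are illustrative assumptions rather than output of any of the tools named above.

```javascript
// Minimal OpenAPI 3.0 document describing one paginated endpoint.
const spec = {
  openapi: '3.0.3',
  info: { title: 'Projects API', version: '1.0.0' },
  paths: {
    '/v1/projects': {
      get: {
        operationId: 'listProjects',
        parameters: [
          { name: 'cursor', in: 'query', required: false, schema: { type: 'string' } },
          { name: 'limit', in: 'query', required: false, schema: { type: 'integer', minimum: 1, default: 20 } },
        ],
        responses: {
          '200': { description: 'A page of projects' },
          '400': {
            description: 'Validation error',
            content: { 'application/json': { schema: { $ref: '#/components/schemas/Error' } } },
          },
        },
      },
    },
  },
  components: {
    schemas: {
      Error: {
        type: 'object',
        required: ['error'],
        properties: {
          error: {
            type: 'object',
            required: ['code', 'message'],
            properties: { code: { type: 'string' }, message: { type: 'string' } },
          },
        },
      },
    },
  },
};

// List every operation as "METHOD /path", e.g. to compare against registered routes.
const HTTP_METHODS = ['get', 'put', 'post', 'delete', 'patch', 'head', 'options'];
function listOperations(doc) {
  const operations = [];
  for (const [path, item] of Object.entries(doc.paths)) {
    for (const method of HTTP_METHODS) {
      if (item[method]) operations.push(`${method.toUpperCase()} ${path}`);
    }
  }
  return operations;
}
```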

HATEOAS (use sparingly)

Including related resource links in responses (_links) makes APIs self-describing. This is useful for complex workflows but adds overhead. Implement it for public APIs where discoverability matters; skip it for internal services.
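A small sketch of what _links can look like in practice, loosely modeled on the HAL convention. The project fields, the route layout, and the withLinks helper are hypothetical.

```javascript
// Attach _links to a project resource. Route shapes are made up for illustration.
function withLinks(project, baseUrl = '/v1') {
  const self = `${baseUrl}/projects/${project.id}`;
  const links = {
    self: { href: self },
    members: { href: `${self}/members` },
  };
  // Only advertise actions that are valid in the resource's current state.
  if (project.status === 'active') {
    links.archive = { href: `${self}/archive` };
  }
  return { ...project, _links: links };
}
```

Clients can then decide whether to offer an "archive" action by checking for the link instead of re-implementing the state rules.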

Response envelope vs flat

Flat is simpler: { "id": "123", "name": "..." }. Enveloped adds metadata: { "data": {...}, "meta": {...} }. Use envelopes for list responses (you need pagination metadata). Use flat for single-resource responses.
