Scaling · May 4, 2026 · 5 min read

Horizontal vs Vertical Scaling for Node.js

Scaling up vs scaling out — when each makes sense for Node.js apps, and how statelessness is the prerequisite for horizontal scaling to actually work.

Vertical scaling: more resources, same instance

Vertical scaling means giving your existing instances more compute resources — more vCPUs, more RAM, faster storage. It's the path of least resistance: no code changes, no architecture changes, no load balancer setup. You just upgrade the instance size and restart.

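For Node.js, there's a critical nuance: additional vCPUs only help if you run more than one process, for example with cluster mode. A single Node.js process runs your JavaScript on one thread, so it can keep roughly one CPU core busy (libuv's thread pool and the garbage collector use extra threads, but your request handlers do not). Moving from a 2 vCPU to an 8 vCPU machine without cluster mode gives you almost no extra throughput: you're paying for 6 additional cores that sit mostly idle.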

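// Without cluster mode: 1 process, JavaScript runs on 1 core regardless of machine size
// With cluster mode: one worker process per core, all CPUs utilized

import cluster from 'node:cluster';
import { availableParallelism } from 'node:os'; // Node 18.14+; older versions can use os.cpus().length

if (cluster.isPrimary) {
  for (let i = 0; i < availableParallelism(); i++) cluster.fork();
  // Auto-restart on crash. A worker that fails at startup will respawn in a loop,
  // so consider adding a delay or a crash limit in production.
  cluster.on('exit', () => cluster.fork());
} else {
  startServer(); // Your app's entry point; each worker runs the full app
}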
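
One caveat with re-forking on every `exit` event: a worker that crashes during startup (bad config, missing environment variable) gets respawned in a tight loop. A minimal sketch of a capped exponential backoff that could sit in front of `cluster.fork()`. The helper name `restartDelayMs` and the 500 ms / 30 s values are assumptions for illustration, not recommendations:

```typescript
// Hypothetical helper: how long to wait before re-forking a crashed worker.
// `recentCrashes` is the number of crashes already seen (0 = this is the first one).
function restartDelayMs(recentCrashes: number, baseMs = 500, capMs = 30_000): number {
  const exponent = Math.max(0, Math.floor(recentCrashes));
  // Double the delay for each consecutive crash, but never exceed the cap.
  return Math.min(capMs, baseMs * 2 ** exponent);
}

// In the primary process this could replace the immediate re-fork, for example:
//   cluster.on('exit', () => setTimeout(() => cluster.fork(), restartDelayMs(crashes++)));
console.log([0, 1, 2, 3, 10].map((n) => restartDelayMs(n)));
// prints [ 500, 1000, 2000, 4000, 30000 ]
```

How `crashes` is counted and reset (per worker, per time window) is a design choice left open here.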

Vertical scaling is the right first choice when:

  • Your database is the bottleneck (more RAM = larger buffer pool = fewer disk reads)
  • You haven't implemented cluster mode yet (on a 4 to 8 core machine, that alone can bring up to roughly 4–8x throughput for CPU-bound workloads)
  • You need more memory per process for large working sets
  • Simplicity matters more than cost efficiency at your current scale

Horizontal scaling: more instances, smaller each

Horizontal scaling adds more instances of your application behind a load balancer, distributing traffic across all of them. The ceiling is far higher than a single machine's: capacity grows roughly linearly with instance count until a shared dependency, typically the database, becomes the bottleneck.

The prerequisite: your application must be stateless, so that any request can hit any instance. If instance A holds state that instance B doesn't know about, requests will fail intermittently, depending on which instance the load balancer happens to pick.
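
To make that failure mode concrete, here is a small simulation (not real server code) of two instances behind a round-robin load balancer. All names (`Instance`, `SessionStore`, `roundRobin`) are hypothetical, and the shared store is a plain `Map` standing in for an external store such as Redis:

```typescript
// A session store maps session IDs to user names.
type SessionStore = Map<string, string>;

// A simulated app instance. Its session store is either private to it or shared.
class Instance {
  private sessions: SessionStore;

  constructor(sessions: SessionStore) {
    this.sessions = sessions;
  }

  login(sessionId: string, user: string): void {
    this.sessions.set(sessionId, user);
  }

  // Returns the user for the session, or null if this instance cannot see it.
  whoAmI(sessionId: string): string | null {
    return this.sessions.get(sessionId) ?? null;
  }
}

// Round-robin load balancer: each call returns the next instance in turn.
function roundRobin(instances: Instance[]): () => Instance {
  let next = 0;
  return () => instances[next++ % instances.length];
}

// Stateful: each instance keeps its own in-memory sessions.
const statefulLb = roundRobin([new Instance(new Map()), new Instance(new Map())]);
statefulLb().login("s1", "alice");                // handled by instance A
const statefulResult = statefulLb().whoAmI("s1"); // handled by B, which never saw the login

// Stateless: both instances read and write one external store.
const shared: SessionStore = new Map();
const statelessLb = roundRobin([new Instance(shared), new Instance(shared)]);
statelessLb().login("s1", "alice");                 // handled by instance A
const statelessResult = statelessLb().whoAmI("s1"); // handled by B, which reads the shared store

console.log(statefulResult, statelessResult); // prints: null alice
```

The second request fails in the stateful setup only because it landed on a different instance, which is why this class of bug tends to show up as intermittent logouts rather than a clean error.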

Horizontal scaling is the right choice when:

  • You need high availability — multiple instances means a single instance failure doesn't cause downtime
  • Your traffic patterns require elastic capacity — scale out for daytime traffic spikes, scale in at night
  • You've hit the vertical ceiling for your use case
  • You want cost efficiency at scale: many small instances are often cheaper per req/s than a few large ones
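
Elastic capacity usually comes down to a target-tracking rule: keep a per-instance metric (CPU, requests per second) near a target by adjusting the instance count. A minimal sketch of the proportional rule that the Kubernetes Horizontal Pod Autoscaler documents, desired = ceil(current × currentMetric / targetMetric). The function name and the min/max bounds are assumptions, and real autoscalers add tolerances and cooldowns that are omitted here:

```typescript
// Hypothetical helper: how many instances are needed to bring the average
// per-instance metric back to the target.
function desiredInstances(
  current: number,
  currentMetric: number, // e.g. average CPU % across instances
  targetMetric: number,  // e.g. the CPU % each instance should sit at
  min = 1,
  max = 20,
): number {
  if (targetMetric <= 0) throw new RangeError("targetMetric must be positive");
  // Multiply before dividing to keep the arithmetic exact for whole-number inputs.
  const raw = Math.ceil((current * currentMetric) / targetMetric);
  // Clamp to the configured bounds so the fleet never scales to zero or runs away.
  return Math.min(max, Math.max(min, raw));
}

console.log(desiredInstances(4, 90, 60)); // daytime spike: prints 6
console.log(desiredInstances(6, 20, 60)); // quiet night: prints 2
```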

Making your Node.js app stateless

This is the necessary prerequisite for horizontal scaling. Audit your application for these common statefulness issues:

// ✗ In-memory sessions — only instance A has this session
app.use(session({ store: new MemoryStore() }));

// ✓ Redis-backed sessions — any instance can read any session
app.use(session({ store: new RedisStore({ client: redis }) }));

// ✗ In-process rate limiting — users can bypass by hitting different instances
const counts = new Map<string, number>();

// ✓ Redis-backed rate limiting — shared state across all instances
const limiter = new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(100, '1m') });

// ✗ File uploads saved to local disk — only available on that instance
app.post('/upload', upload.single('file'), (req, res) => {
  fs.writeFileSync('/tmp/uploads/' + req.file.originalname, req.file.buffer);
});

// ✓ Files in object storage — available to all instances
app.post('/upload', upload.single('file'), async (req, res) => {
  const url = await uploadToS3(req.file.buffer, req.file.originalname);
  res.json({ url });
});

// ✗ Scheduled job assuming single-process execution
cron.schedule('0 * * * *', () => sendHourlyDigests()); // Runs N times with N replicas

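// ✓ Distributed lock prevents duplicate execution
cron.schedule('0 * * * *', async () => {
  // TTL (55 min) is shorter than the 1-hour interval; with EX: 3600 the previous
  // lock could still exist when the next tick fires, and that run would be skipped.
  const lock = await redis.set('cron:hourly-digest', '1', { NX: true, EX: 3300 });
  if (lock) await sendHourlyDigests(); // Only one instance wins the lock
});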
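
The rate-limiting pair above shows only where the counters live, not the limiting logic. A minimal fixed-window sketch, written against a hypothetical `CounterStore` interface so the same limiter works with any shared backend. This is a simpler algorithm than the sliding window named in the snippet above, and the in-memory store is only a stand-in so the example runs on its own. With Redis, `increment` would typically map to `INCR` plus an `EXPIRE` on first write.

```typescript
// Hypothetical interface: atomically increment a counter and return its new value.
// The counter should expire after `ttlSeconds` (the stand-in below ignores this).
interface CounterStore {
  increment(key: string, ttlSeconds: number): Promise<number>;
}

// Stand-in for a shared store so the sketch is runnable. In production every
// instance must talk to the same external store, not to its own Map.
class InMemoryCounterStore implements CounterStore {
  private counts = new Map<string, number>();

  async increment(key: string, _ttlSeconds: number): Promise<number> {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }
}

// Fixed-window limiter: at most `limit` requests per `windowSeconds` per user.
async function isAllowed(
  store: CounterStore,
  userId: string,
  limit: number,
  windowSeconds: number,
  nowMs: number = Date.now(),
): Promise<boolean> {
  // Every instance derives the same key for the same window, so they share one counter.
  const windowId = Math.floor(nowMs / (windowSeconds * 1000));
  const count = await store.increment(`ratelimit:${userId}:${windowId}`, windowSeconds);
  return count <= limit;
}

// Three requests from one user with a limit of 2; each could arrive at a different instance.
async function demo(): Promise<boolean[]> {
  const store = new InMemoryCounterStore();
  const now = 1_700_000_000_000;
  const results: boolean[] = [];
  for (let i = 0; i < 3; i++) results.push(await isAllowed(store, "user-1", 2, 60, now));
  return results;
}

demo().then((results) => console.log(results)); // prints [ true, true, false ]
```

Fixed windows allow short bursts across a window boundary, which is the trade-off a sliding-window limiter avoids at the cost of more bookkeeping.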

The cost comparison

At lower scale, vertical scaling is usually cheaper because every extra instance carries fixed overhead (monitoring and logging agents, load balancer capacity units). At higher scale, horizontal scaling typically wins on cost per req/s because capacity can follow demand: unused instances scale in instead of sitting idle, and smaller instance sizes are sometimes priced more favorably.

// Rough cost model (prices vary by provider):
// 1× 16 vCPU / 32GB instance:  $0.60/hr
// 4× 4 vCPU / 8GB instances:   $0.60/hr (same cost, better availability)
// 8× 2 vCPU / 4GB instances:   $0.48/hr (cheaper, easier to scale elastically)
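
The scale-in effect is easy to quantify with the illustrative prices above. A back-of-the-envelope sketch, assuming a hypothetical traffic shape (12 peak hours needing all 8 small instances, 12 off-peak hours needing 3) and the per-instance price implied by the 8× line ($0.48 / 8 = $0.06/hr). Prices are kept in whole cents to avoid floating-point rounding:

```typescript
// Illustrative prices from the cost model above, in cents per hour.
const LARGE_INSTANCE_CENTS_PER_HOUR = 60; // 1x 16 vCPU / 32GB
const SMALL_INSTANCE_CENTS_PER_HOUR = 6;  // one 2 vCPU / 4GB instance (48 / 8)

// Hypothetical daily schedule: how many instances run, and for how many hours.
interface ScheduleSlot {
  instances: number;
  hours: number;
}

// Sum instance-hours across the schedule and price them.
function dailyCostCents(schedule: ScheduleSlot[], centsPerInstanceHour: number): number {
  return schedule.reduce(
    (sum, slot) => sum + slot.instances * slot.hours * centsPerInstanceHour,
    0,
  );
}

const fixedLarge = dailyCostCents([{ instances: 1, hours: 24 }], LARGE_INSTANCE_CENTS_PER_HOUR);
const elasticSmall = dailyCostCents(
  [{ instances: 8, hours: 12 }, { instances: 3, hours: 12 }],
  SMALL_INSTANCE_CENTS_PER_HOUR,
);

console.log(
  `fixed: $${(fixedLarge / 100).toFixed(2)}/day, elastic: $${(elasticSmall / 100).toFixed(2)}/day`,
);
// prints: fixed: $14.40/day, elastic: $7.92/day
```

Under these assumptions the elastic fleet costs a little over half as much. With flat traffic and all 8 instances running around the clock, the gap shrinks to the 20% difference in list price.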

The practical decision framework

  1. Start with vertical scaling — If you haven't hit a ceiling, the simplest option is the right option.
  2. Add cluster mode before upgrading instance size — Free throughput increase before paying for more hardware.
  3. Move to horizontal before you need high availability — Running multiple instances means a single failure doesn't cause downtime. This is often the first reason to scale out, even before traffic requires it.
  4. At 5+ instances, revisit instance size — Smaller instances with better auto-scaling policies often provide better cost/performance.
