The Runtime — FM India

You already write JavaScript. You know how fetch works, how the DOM behaves, how to reason about async code in the browser. This chapter is about the other place JavaScript runs: a server, with no browser around it, talking to files, networks, and databases.

The good news is that the language is the same. The surprising news is that the runtime around it is not. Let's look at the machine.

Fuzzy on the big picture first?

If terms like load balancer, reverse proxy, nginx, and microservices feel like a black box, read "What happens when the UI calls the backend?" first. It builds the whole architecture up from one server, in plain language with everyday analogies. This chapter then zooms into the runtime that powers each of those boxes.

How JavaScript actually runs on the server

Node is three things stacked on top of each other:

V8: runs your JS

libuv: async I/O

bindings: C++ glue

V8 is Google's JavaScript engine, the same one inside Chrome. Its job is to run your code. libuv is a C library that handles slow work like reading files and waiting on network sockets. Bindings are the glue that lets your JavaScript reach into operating-system features.

V8 runs your code in two passes. First an interpreter called Ignition turns it into bytecode and starts running right away. Then a compiler called TurboFan watches which functions get called a lot and recompiles those "hot" ones into fast machine code. This is JIT, or just-in-time compilation.

Two things follow from that, and both matter in production:

Your code gets faster the longer it runs
V8 needs time to spot the hot paths and optimise them. A freshly started process is genuinely slower than one that's been serving traffic for a few minutes. This is why cold starts feel sluggish.
V8 likes consistent object shapes
If you keep passing the same shape of object into a function (the same properties, added in the same order), V8 stays fast. If you pass wildly different shapes, it deoptimises that function and falls back to slower generic code. Consistent shapes aren't just tidy, they keep your code quick.
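
Here is a small sketch of what "consistent shape" means in practice. The names makePoint and lengthSquared are made up for illustration, and how much speed you gain depends on the V8 version and the workload, so treat this as a habit rather than a guarantee.

```javascript
// Hypothetical example: every point is built the same way, with the same
// properties in the same order, so all of them share one shape.
function makePoint(x, y) {
  return { x, y };
}

// A hot function that only ever sees one shape is easy for V8 to optimise.
function lengthSquared(p) {
  return p.x * p.x + p.y * p.y;
}

const consistent = [makePoint(1, 2), makePoint(3, 4)];

// Same kind of data, but different property order and an extra field:
// these objects have different shapes, so one call site must handle several.
const mixed = [{ x: 1, y: 2 }, { y: 4, x: 3 }, { x: 5, y: 6, label: 'c' }];

console.log(consistent.map(lengthSquared)); // [ 5, 25 ]
console.log(mixed.map(lengthSquared));      // [ 5, 25, 61 ]
```

Both calls return correct results. The difference is only in how much work the engine has to do to get there, which is why a single factory function per object type is a cheap habit to adopt.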

V8 on its own can't open a file or a socket. That's libuv's job. When you call fs.readFile, V8 doesn't read anything. It asks libuv to. libuv either uses the operating system's own async features (for network sockets) or a small pool of background threads (for files, DNS, and crypto). When the work finishes, libuv drops a callback on a queue, and the event loop picks it up.

What people mean by single-threaded

When someone says "Node is single-threaded," they mean your JavaScript runs on one thread. The runtime around it uses many threads. That one JS thread is precious. Anything that blocks it (a slow loop, a huge JSON.parse, heavy synchronous work) freezes the entire server for every user at once.
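
You can make that freeze visible with a timer. The sketch below schedules a 0 ms timeout and then hogs the JS thread; blockFor and measureTimerDelay are illustrative names, and the busy loop stands in for any heavy synchronous work.

```javascript
// Busy-wait on the JS thread for roughly `ms` milliseconds.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

// Schedule a 0 ms timer, then block. The timer cannot fire until the
// synchronous work is done, however short its delay is.
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now();
    setTimeout(() => resolve(Date.now() - scheduled), 0);
    blockFor(blockMs);
  });
}

measureTimerDelay(200).then((delay) => {
  console.log(`0 ms timer fired after ${delay} ms`);
});
```

On a real server, every pending request is in the same position as that timer: it waits until the blocking work finishes.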

The event loop, properly this time

The event loop is a loop that does one job: pick the next callback that's ready, run it, repeat. It moves through six phases in order.

Timers
Runs setTimeout and setInterval callbacks whose time is up.
Pending callbacks
A few system-level callbacks. You'll rarely think about these.
Idle / prepare
Internal Node housekeeping.
Poll (the big one)
Waits here for I/O: a socket got data, a file finished reading. Runs their callbacks.
Check
Runs setImmediate callbacks.
Close callbacks
Cleanup, like socket.on('close').
…then back to the top, forever.

One trip around the loop. Between every phase, two queues drain completely first.

There's a twist. Between every single phase (and, in current Node versions, after each individual callback within a phase), two special queues drain all the way: the nextTick queue (process.nextTick) and the microtask queue (resolved promises, queueMicrotask). nextTick goes first, then microtasks. Both empty completely before the loop moves on.

Run this and watch the order. The browser doesn't have process.nextTick, so we use queueMicrotask, which sits on the same queue as promises.
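
A minimal sketch of that experiment follows. It is not necessarily the chapter's original snippet, and runOrderExperiment is an illustrative name. It sticks to APIs that exist in both browsers and Node, so process.nextTick and setImmediate are left out.

```javascript
// Record the order in which callbacks run.
function runOrderExperiment() {
  const order = [];
  return new Promise((resolve) => {
    // Macrotask: runs in the timers phase of a later loop turn.
    setTimeout(() => {
      order.push('timeout');
      resolve(order);
    }, 0);
    // Two microtasks, queued in this order on the same queue.
    Promise.resolve().then(() => order.push('promise'));
    queueMicrotask(() => order.push('microtask'));
    // Plain synchronous code.
    order.push('sync');
  });
}

runOrderExperiment().then((order) => console.log(order.join(' -> ')));
// sync -> promise -> microtask -> timeout
```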


Synchronous code finishes first. Then microtasks drain before the loop even starts its next phase. Only then do timer and immediate callbacks run.

The starvation trap

A function like const loop = () => process.nextTick(loop) queues another callback every time it runs. Since the nextTick queue drains completely before the loop advances, your server stops handling requests. The same thing happens with an endless Promise.resolve().then(loop) chain on the microtask queue. When you have long work to do, break it up with setImmediate so the loop can get back to real I/O.
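
Breaking work up with setImmediate might look like the sketch below. processInChunks and its chunkSize default are illustrative choices, not a Node API, and the right chunk size depends on how expensive each item is.

```javascript
// Process a large array in slices, yielding to the event loop between
// slices so pending I/O callbacks and timers get a turn.
function processInChunks(items, handleItem, chunkSize = 1000) {
  return new Promise((resolve, reject) => {
    const results = [];
    let index = 0;

    function runChunk() {
      try {
        const end = Math.min(index + chunkSize, items.length);
        for (; index < end; index++) {
          results.push(handleItem(items[index]));
        }
      } catch (err) {
        reject(err);
        return;
      }
      if (index < items.length) {
        setImmediate(runChunk); // yield: the next slice runs on a later loop turn
      } else {
        resolve(results);
      }
    }

    runChunk();
  });
}

const numbers = Array.from({ length: 5000 }, (_, i) => i);
processInChunks(numbers, (n) => n * 2).then((doubled) => {
  console.log(doubled.length, doubled[4999]); // 5000 9998
});
```

The difference from the nextTick version is where the continuation is queued: setImmediate callbacks run in the check phase, so timers and I/O get serviced between slices.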

libuv and the thread pool

Network I/O is genuinely async at the operating-system level. The OS tells Node when a socket has data, so no extra thread is needed. Node uses this for HTTP, TCP, and UDP.

File I/O is different. On most systems it isn't async at the OS level, so libuv hands it to a background thread. By default the thread pool has four threads, and it also handles DNS lookups (dns.lookup), some crypto (pbkdf2, scrypt, and native add-ons such as the bcrypt package), and zlib compression.

Network scales for free, file and crypto work does not

If your login endpoint hashes a password with bcrypt on every request, you get exactly four parallel logins by default. The fifth waits for a thread. Under load, this shows up as latency spikes that look mysterious until you remember the pool. You can raise it with UV_THREADPOOL_SIZE=16 (set it in the environment before the process starts), but past your CPU count the threads just fight each other.

The pattern of "many jobs, only a few running at once" shows up everywhere in backend work. Here it is in plain JavaScript: run a batch of tasks with a fixed concurrency limit, the same way the thread pool does.
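
One way to write it is a small pool of async "workers" that pull tasks off a shared list. This is a sketch, not the chapter's original snippet; runWithLimit, sleep, and the fake jobs are all illustrative names.

```javascript
// Run async task functions with at most `limit` in flight at once.
// Results come back in the same order as the input tasks.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker pulls the next unclaimed task until none are left.
  async function worker() {
    while (next < tasks.length) {
      const i = next++;
      results[i] = await tasks[i]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}

// Usage: eight fake jobs, four at a time, like the default thread pool.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
let running = 0;
let peak = 0;
const jobs = Array.from({ length: 8 }, (_, i) => async () => {
  running++;
  peak = Math.max(peak, running);
  await sleep(20);
  running--;
  return i * i;
});

runWithLimit(jobs, 4).then((results) => {
  console.log(results);                   // [ 0, 1, 4, 9, 16, 25, 36, 49 ]
  console.log('peak concurrency:', peak); // peak concurrency: 4
});
```

If any task rejects, Promise.all rejects and the remaining workers stop being awaited. A production version would decide explicitly whether to fail fast or collect errors.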


Streams and backpressure

A stream is how you handle data that's too big to hold in memory all at once, or whose size you don't know yet. Instead of "give me the whole 5 GB file," you say "give it to me in chunks, and tell me when it's done."

There are four kinds: readable (a source), writable (a sink), duplex (both, like a socket), and transform (changes data as it passes through, like gzip).

The whole reason streams exist is backpressure. Imagine reading from a fast disk and writing to a slow network. If you pull data as fast as you can and shove it at the slow side, it piles up in memory until the process dies.

Leaks memory under load

// write() returns false when the buffer is full,
// but this loop ignores that and keeps going.
for (const chunk of source) writable.write(chunk);

Respects backpressure

import { pipeline } from 'node:stream/promises';
// pipeline pauses the source when the sink is full.
await pipeline(readable, transformStream, writable);

Prefer pipeline over chaining .pipe(). It handles errors and cleanup correctly, while .pipe() is known for leaving streams dangling when one of them errors. Modern Node also lets you treat a readable stream as an async iterable, which is often the cleanest option of all:

for await (const chunk of readable) {
  await handleChunk(chunk); // handleChunk: your own async function
}

The loop doesn't ask for the next chunk until handleChunk(chunk) has settled, so backpressure happens naturally.
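
When neither pipeline nor async iteration fits, you can respect backpressure by hand: check what write() returns and wait for 'drain' when it says stop. The sketch below assumes a Node Writable; writeAll is an illustrative helper name, and error handling is left out to keep the idea visible (pipeline does that part for you).

```javascript
// Write chunks one by one, pausing whenever write() reports a full buffer.
async function writeAll(writable, chunks) {
  for (const chunk of chunks) {
    const canContinue = writable.write(chunk);
    if (!canContinue) {
      // The internal buffer is over its limit: wait until it has flushed.
      await new Promise((resolve) => writable.once('drain', resolve));
    }
  }
  writable.end(); // signal that no more data is coming
}
```

You would call it with something like a file stream from fs.createWriteStream and an array or generator of chunks. Memory use stays bounded because the loop never gets more than one buffer's worth ahead of the sink.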

HTTP, without the magic

Before any framework, know what an HTTP server really does. The raw http module is small:

import { createServer } from 'node:http';
const server = createServer((req, res) => {
  res.writeHead(200, { 'content-type': 'application/json' });
  res.end(JSON.stringify({ ok: true }));
});
server.listen(3000);

A request arrives in pieces: the TCP connection opens, the request line shows up (GET /foo HTTP/1.1), then headers, then the body if there is one. Frameworks parse all of that for you.
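
To see the "in pieces" part for yourself, here is a sketch of reading a body without a framework. It relies on the request object being an async iterable of chunks, which Node's IncomingMessage is. readBody, maxBytes, and fakeRequest are illustrative names, not part of node:http.

```javascript
// Collect a request body that arrives in chunks.
// `req` is any async iterable of Buffers or strings.
async function readBody(req, maxBytes = 1_000_000) {
  const chunks = [];
  let size = 0;
  for await (const chunk of req) {
    const buf = Buffer.from(chunk);
    size += buf.length;
    // Refuse oversized bodies instead of buffering them all in memory.
    if (size > maxBytes) throw new Error('Request body too large');
    chunks.push(buf);
  }
  // Join first, decode once, so multi-byte characters split across
  // chunks are decoded correctly.
  return Buffer.concat(chunks).toString('utf8');
}

// Simulate a body split across two network reads.
async function* fakeRequest() {
  yield Buffer.from('{"name":');
  yield Buffer.from('"Asha"}');
}

readBody(fakeRequest()).then((body) => {
  console.log(JSON.parse(body)); // { name: 'Asha' }
});
```

Inside a createServer handler you would pass req itself. This is roughly the work a framework's body parser does before your route runs.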

A handful of status codes are worth memorising:

2xx (Success): 200 OK · 201 created · 204 no content

4xx (You messed up): 400 bad · 401 / 403 auth · 404 missing · 429 too many

5xx (We messed up): 500 bug · 502 / 503 / 504 upstream / overload / timeout

One distinction trips people up constantly: 401 means "I don't know who you are," and 403 means "I know who you are, and you can't do this." Mixing them up is a classic bug.
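
As a sketch, the decision comes down to two separate checks in a fixed order. The authStatus function, the roles field, and the requiredRole argument are hypothetical; your own user model will differ.

```javascript
// Pick a status code for a protected action.
// `user` is null when the request carried no valid credentials.
function authStatus(user, requiredRole) {
  if (!user) return 401;                              // who are you?
  if (!user.roles.includes(requiredRole)) return 403; // known, but not allowed
  return 200;
}

console.log(authStatus(null, 'admin'));                         // 401
console.log(authStatus({ id: 1, roles: ['viewer'] }, 'admin')); // 403
console.log(authStatus({ id: 2, roles: ['admin'] }, 'admin'));  // 200
```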

Express vs Fastify vs Hono

You'll spend most of your time inside one of three frameworks. They share the same idea (routes and middleware) but differ in ways worth knowing.

Express

The historic default. Huge ecosystem, fine performance, easy to hire for. Its weak spot is async errors: a thrown error inside async middleware doesn't propagate unless you wrap it. Express 5 improves this, but adoption is slow.

Fastify

Modern and schema-first. Often benchmarks at roughly two to three times Express's throughput, with built-in JSON-schema validation for request bodies and responses. A good pick when you're starting fresh.

Hono

The newcomer, built on web standards (Request, Response, fetch). The same code runs on Node, Cloudflare Workers, Bun, and Deno. If you're thinking about edge deployment, or you want the same fetch model you already use on the frontend, Hono fits.

How to choose

Hono for edge or multi-runtime deploys. Fastify for a fresh Node server where you want speed and schemas. Express when you're inheriting a codebase or onboarding people fast.
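
The async-error weak spot mentioned for Express is easiest to understand through the usual workaround. This is a sketch of the wrap-it-yourself approach for Express 4; asyncHandler is an illustrative helper rather than part of Express, and loadUser in the usage comment is hypothetical.

```javascript
// Wrap an async route handler so a rejected promise reaches next(err),
// which is how Express expects errors to be reported.
function asyncHandler(handler) {
  return (req, res, next) => {
    Promise.resolve()
      .then(() => handler(req, res, next))
      .catch(next);
  };
}

// Usage with a hypothetical route:
// app.get('/users/:id', asyncHandler(async (req, res) => {
//   const user = await loadUser(req.params.id); // may throw
//   res.json(user);
// }));
```

Starting the chain with Promise.resolve().then(...) means a synchronous throw inside the handler is forwarded the same way as an async rejection.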

Middleware patterns

Middleware is the backend's version of composition. A request flows through a stack of functions (logging, auth, validation, your actual logic), and each one can change the request, send a response early, or pass control onward.

The shape is called the onion model: each layer wraps the next, so code can run both before and after the inner work.

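// A timing middleware. Note the work happening after next().
// Written in the Hono/Koa style, where next() returns a promise. Express's
// next() does not, so there you'd log from res.on('finish') instead.
app.use(async (c, next) => {
  const start = Date.now();
  await next();
  console.log(`${c.req.method} ${c.req.path} ${Date.now() - start}ms`);
});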
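
To see why it's called an onion, it helps to build a toy version. The compose function below is written for illustration only (real frameworks ship their own, with more error handling), and it assumes each middleware receives a shared context plus a next() that returns a promise for the inner layers.

```javascript
// A tiny middleware runner in the onion style.
function compose(middlewares) {
  return function run(ctx) {
    function dispatch(i) {
      // Past the last layer there is nothing left to call.
      if (i === middlewares.length) return Promise.resolve();
      const fn = middlewares[i];
      // next() for layer i simply runs layer i + 1.
      return Promise.resolve(fn(ctx, () => dispatch(i + 1)));
    }
    return dispatch(0);
  };
}

const trace = [];
const handle = compose([
  async (ctx, next) => { trace.push('outer in'); await next(); trace.push('outer out'); },
  async (ctx, next) => { trace.push('inner in'); await next(); trace.push('inner out'); },
  async (ctx) => { trace.push('handler'); ctx.body = 'ok'; },
]);

handle({}).then(() => console.log(trace.join(' > ')));
// outer in > inner in > handler > inner out > outer out
```

Each layer runs its "before" code on the way in and its "after" code on the way out, which is exactly what a timing or logging middleware needs.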

Sometimes you need the current request's ID or user from deep inside your code, without threading it through every argument. Node's answer is AsyncLocalStorage, which keeps a per-request "store" alive across async boundaries. A function five levels deep can still read the right request ID, because each request gets its own store. A plain global wouldn't work, since concurrent requests would overwrite each other.
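
Here is a sketch of that idea with two overlapping fake requests. AsyncLocalStorage, run, and getStore are the real node:async_hooks API; handleRequest, logWithRequestId, and the request IDs are hypothetical. The dynamic import is only there so the snippet runs whether your file is CommonJS or an ES module.

```javascript
async function demoRequestContext() {
  const { AsyncLocalStorage } = await import('node:async_hooks');
  const requestContext = new AsyncLocalStorage();
  const lines = [];

  // Deep in the call stack: no requestId parameter, yet it finds the right one.
  function logWithRequestId(message) {
    const store = requestContext.getStore();
    lines.push(`[${store.requestId}] ${message}`);
  }

  async function handleRequest(requestId, delayMs) {
    // Everything started inside run(), including code after an await,
    // sees this request's store.
    await requestContext.run({ requestId }, async () => {
      logWithRequestId('start');
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      logWithRequestId('end');
    });
  }

  // Two requests in flight at once: each keeps its own ID across the await.
  await Promise.all([handleRequest('req-A', 20), handleRequest('req-B', 5)]);
  return lines;
}

demoRequestContext().then((lines) => console.log(lines.join('\n')));
// [req-A] start
// [req-B] start
// [req-B] end
// [req-A] end
```

In a real server you would call run() once per request in an early middleware, and read the store from loggers or database helpers further down.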

Process model and graceful shutdown

One Node process uses one CPU core for your JavaScript. To use more cores you can run cluster (fork worker processes that share a socket) or worker_threads (threads for CPU-heavy work like image processing). In practice, the modern answer to "use more cores" is usually "run more containers" and let your platform schedule copies of your app.

The part people forget is shutdown. When your container is told to stop (a deploy, a scale-down), the orchestrator sends SIGTERM and gives you a grace period, often 30 seconds, before it forces the issue. If you ignore that signal, in-flight requests get dropped mid-response.

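const server = app.listen(3000);
let shuttingDown = false;
const shutdown = async () => {
  if (shuttingDown) return;         // ignore a second signal
  shuttingDown = true;
  server.close();                   // stop taking new connections
  await drainInFlightRequests();    // let current ones finish (your own helper)
  await pgPool.end();               // then release database connections
  await redis.quit();               // and the Redis client
  process.exit(0);
};
process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);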

Pair this with a readiness check that returns 503 once shutdown starts, so the load balancer stops sending you traffic before your grace window runs out. Skip this and you'll see 503 errors on every single deploy.
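
A readiness check can be as small as a flag and a handler. The sketch below is one possible shape, not a standard API: createReadiness and markShuttingDown are illustrative names, and the handler only assumes a response object with writeHead() and end(), like the one node:http gives you.

```javascript
// Readiness state shared between the signal handler and the health route.
function createReadiness() {
  let ready = true;
  return {
    markShuttingDown() { ready = false; },
    // Mount this on whatever path your load balancer probes.
    handler(req, res) {
      const status = ready ? 200 : 503;
      res.writeHead(status, { 'content-type': 'application/json' });
      res.end(JSON.stringify({ ready }));
    },
  };
}

// Wiring sketch: flip readiness first, then begin the shutdown steps.
// const readiness = createReadiness();
// process.on('SIGTERM', () => { readiness.markShuttingDown(); shutdown(); });
```

The ordering matters: readiness flips before connections start closing, so the load balancer has a chance to notice and route elsewhere while in-flight requests drain.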

Test yourself

Questions: say the answer out loud before you open it. If you can't, the chapter isn't done.

Q: What is the output order, and why? setTimeout(…,0); setImmediate(…); Promise.resolve().then(…); process.nextTick(…); log('S');

S, then nextTick, then promise, then timeout or immediate. Synchronous code runs first. Before the loop advances, the nextTick queue drains, then the microtask queue. Only then does the loop pick up timer and immediate callbacks. The order between those last two can vary with timing.

Q: An endpoint calls bcrypt.compare. Under load, p99 latency spikes. Why?

bcrypt is deliberately slow and runs on the libuv thread pool, which defaults to four threads. Once all four are busy hashing, the fifth request waits. Raise UV_THREADPOOL_SIZE (past your CPU count the returns diminish), move hashing to a dedicated worker, or switch to argon2.

Q: What's wrong with `for (const chunk of bigArray) await writable.write(chunk)`?

It ignores backpressure. write returns a boolean, not a promise, so awaiting it does nothing useful. Under load you buffer unbounded data in memory. Use stream.pipeline, or check the return value and wait for the 'drain' event when it's false.

Q: Difference between process.nextTick and queueMicrotask?

Both run before the next loop phase. nextTick is Node-specific and runs before the microtask queue. queueMicrotask is web-standard and shares the queue with resolved promises. Both can starve the loop if used recursively. Prefer queueMicrotask unless you specifically need to run before promise callbacks.

Q: Your Express handler throws inside an async function and the client hangs. Why?

Express 4 doesn't catch rejections from async handlers, only synchronous throws, so the error never reaches the error handler and no response is sent. Wrap handlers in try/catch that calls next(err), use express-async-errors, or move to Express 5 or Fastify, which handle this natively.

Q: When would you use worker_threads over cluster?

worker_threads for CPU-heavy work you want off the main loop: image processing, parsing huge JSON, synchronous crypto. cluster when you want to handle more concurrent requests by running multiple processes on one machine. In modern deploys, cluster is usually replaced by running more containers.

Q: Why is JSON.parse on a 200 MB string a problem, even though it returns quickly in a script?

It's synchronous and blocks the event loop for the whole parse, possibly for seconds. During that time the process serves nobody. For large payloads, stream-parse with a library like stream-json, or push the parse to a worker thread.

Q: You see 503 errors on every deploy. What's happening?

Your app is being killed mid-request because it doesn't handle SIGTERM, or the load balancer keeps routing to it after shutdown begins. Listen for SIGTERM, stop accepting new connections, drain in-flight requests, then exit. Add a readiness check that flips to 503 at shutdown so traffic drains first.