Most Node.js middleware assumes buffered bodies — great for JSON APIs, but terrible for performance when working with large files, proxied responses, or real-time content.
In this article, you’ll learn how to build streaming middleware in Node.js that operates on-the-fly, without ever buffering the full response in memory — ideal for:
- HTML injection without latency
- JSON rewriting in proxies
- Compression/encryption on the fly
- Streaming large logs/files through a filter
Step 1: Understand the Problem with Traditional Middleware
Common middleware like body-parser, or response rewriting in Express, assumes the request/response is fully buffered:
// This waits until the full request body has arrived
app.use((req, res, next) => {
  const chunks = [];
  req.on('data', (chunk) => chunks.push(chunk));
  req.on('end', () => {
    const body = Buffer.concat(chunks).toString();
    // modify body here (too late for streaming)
    next();
  });
});
This doesn't scale. For large files or real-time proxies, we want transformations mid-stream, before the full body is received.
Step 2: Use on-headers to Hook Into the Streaming Response
We’ll write a middleware that replaces res.write and res.end with our own streaming pipeline, and uses on-headers to fix up the response headers just before they are sent.
Install on-headers:
npm install on-headers
Then create a middleware like this:
const onHeaders = require('on-headers');
const { Transform } = require('stream');

function streamingTransformMiddleware(rewriteFn) {
  return (req, res, next) => {
    const originalWrite = res.write.bind(res);
    const originalEnd = res.end.bind(res);

    const transformStream = new Transform({
      transform(chunk, encoding, callback) {
        const output = rewriteFn(chunk.toString());
        callback(null, output);
      }
    });

    // The rewritten body will usually differ in length, so drop the
    // stale Content-Length just before the headers are flushed
    onHeaders(res, () => {
      res.removeHeader('Content-Length');
    });

    // Route all writes through the transform...
    res.write = (...args) => transformStream.write(...args);
    res.end = (...args) => transformStream.end(...args);

    // ...and forward each transformed chunk to the real response
    transformStream.on('data', (chunk) => originalWrite(chunk));
    transformStream.on('end', () => originalEnd());

    next();
  };
}
This routes every chunk through a transform that operates chunk-by-chunk, ideal for streaming, while on-headers strips the stale Content-Length before the headers go out.
Step 3: Use It in Your Express App
Let’s apply a simple example: rewrite every instance of “dog” to “cat” in streamed HTML:
app.use(streamingTransformMiddleware((chunk) => {
return chunk.replace(/dog/g, 'cat');
}));
Now this middleware will modify every chunk as it’s being streamed to the client — no full buffer, no delay.
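For a self-contained picture, here is a minimal sketch of the whole app; the /demo route and its two-chunk body are made up for illustration:
const express = require('express');
const app = express();

// Rewrite "dog" to "cat" in every outgoing chunk
app.use(streamingTransformMiddleware((chunk) => {
  return chunk.replace(/dog/g, 'cat');
}));

// Hypothetical route that streams its body in two separate writes
app.get('/demo', (req, res) => {
  res.setHeader('Content-Type', 'text/html');
  res.write('<p>I walk my dog every morning.</p>');
  setTimeout(() => res.end('<p>My dog naps all afternoon.</p>'), 100);
});

app.listen(3000);
The first rewritten chunk reaches the client before the second is even written, which is exactly the no-buffering behavior we were after.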
You can also pipe incoming proxy streams (e.g., via http-proxy) directly through this transform for on-the-fly rewriting, as in the sketch below.
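For instance, http-proxy’s selfHandleResponse option hands you the upstream response as a readable stream, so you can pipe it through the same kind of transform yourself. A sketch, with the upstream target assumed:
const http = require('http');
const httpProxy = require('http-proxy');
const { Transform } = require('stream');

// selfHandleResponse leaves the upstream body for us to forward
const proxy = httpProxy.createProxyServer({
  target: 'http://localhost:8080', // assumed upstream server
  selfHandleResponse: true
});

proxy.on('proxyRes', (proxyRes, req, res) => {
  const rewrite = new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().replace(/dog/g, 'cat'));
    }
  });

  // Forward upstream status and headers, minus the stale Content-Length
  const headers = { ...proxyRes.headers };
  delete headers['content-length'];
  res.writeHead(proxyRes.statusCode, headers);

  proxyRes.pipe(rewrite).pipe(res);
});

http.createServer((req, res) => proxy.web(req, res)).listen(3000);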
Step 4: Bonus – Add Compression in the Same Stream
Need gzip on top? Just add another transform layer using zlib, replacing the 'data'/'end' wiring from Step 2 and advertising the new encoding:
const zlib = require('zlib');
const gzip = zlib.createGzip();

// Inside the onHeaders callback, also advertise the new encoding:
//   res.setHeader('Content-Encoding', 'gzip');

// Replace the 'data'/'end' handlers from Step 2 with:
transformStream
  .pipe(gzip)
  .on('data', (chunk) => originalWrite(chunk))
  .on('end', () => originalEnd());
This allows stacked streaming transforms, such as:
- HTML injection
- Content rewriting
- Minification
- Compression
All in a single pass, fully streamed.
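If you control the source stream yourself, stream.pipeline is an idiomatic way to wire such a stack together, since it also propagates errors between stages. A sketch assuming a local large.html file as the source:
const http = require('http');
const fs = require('fs');
const zlib = require('zlib');
const { pipeline, Transform } = require('stream');

http.createServer((req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/html',
    'Content-Encoding': 'gzip'
  });

  const rewrite = new Transform({
    transform(chunk, encoding, callback) {
      callback(null, chunk.toString().replace(/dog/g, 'cat'));
    }
  });

  // One pass: file -> rewrite -> gzip -> client
  pipeline(
    fs.createReadStream('large.html'), // assumed input file
    rewrite,
    zlib.createGzip(),
    res,
    (err) => { if (err) res.destroy(err); }
  );
}).listen(3000);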
✅ Pros:
- 🧠 No buffering = instant response start
- 🚀 Works with huge payloads (MBs/GBs) without memory spikes
- 🧱 Fully composable with gzip, proxying, and chunked streams
- 🌐 Ideal for proxy servers, SSR, CDN-like behavior
⚠️ Cons:
- 🐛 Care needed with chunk boundaries (a match can straddle two chunks; see the sketch after this list)
- 🔧 Harder to debug than buffered content
- 🚫 Not suitable for JSON parsing unless done line-by-line
- ⚠️ Express 5+ may change some internal assumptions
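The chunk-boundary issue has a common mitigation: hold back the last few characters of every chunk and re-scan them together with the next one. A minimal sketch, assuming a fixed search string whose replacement cannot recombine with later input into a new match (true for dog/cat):
const { Transform } = require('stream');

function boundarySafeReplace(search, replacement) {
  let tail = '';
  return new Transform({
    transform(chunk, encoding, callback) {
      const text = (tail + chunk.toString()).split(search).join(replacement);
      // Hold back search.length - 1 chars: they could begin a match
      // that completes in the next chunk
      const cut = Math.max(0, text.length - (search.length - 1));
      tail = text.slice(cut);
      callback(null, text.slice(0, cut));
    },
    flush(callback) {
      // Emit whatever is still held back once the stream ends
      callback(null, tail);
    }
  });
}
Swap this in for the Transform in Step 2 when a pattern must survive arbitrary chunk splits.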
Summary
Most middleware assumes buffered content, making it a bad fit for large or real-time responses. This article showed how to build streaming transformation middleware in Node.js, enabling mid-stream content rewriting with minimal latency and memory use. It’s an ideal pattern for proxies, CDNs, or custom edge logic in Node without waiting for the full response.
If this was helpful, you can support me here: Buy Me a Coffee ☕