Streams can be tricky. Not doing it wrong can help. 😉
## TL;DR

You pipe from a `ReadableStream` into a `WritableStream`:

- a `WritableStream` has a `.write()` method that takes in data (it can be written to)
- a `ReadableStream` has a `.read()` method that can be used to write data elsewhere
- a full-duplex stream (like a WebSocket or TCP connection) is both

Note: it's always spelled `Writable` in the API, but is sometimes spelled "writeable" in the docs.
Here's a copy/paste example of typical stream usage, piping from a readable file stream to a writable http response stream (assume `res` comes from an http handler, and `filePath` is a stand-in for the file to serve):
```js
let Fs = require('node:fs');

// `res` comes from an http request handler;
// `filePath` is a stand-in for the file to serve
let f = Fs.createReadStream(filePath);

// Handlers MUST be set FIRST
// (otherwise some events may not fire, or fire unexpectedly)
f.once('error', function () {
  // the WritableStream must be closed manually on (read) error
  res.end();
});
res.once('error', function () {
  // the ReadableStream must be closed manually on (write) error
  f.close();
});
res.once('close', function () {
  // final cleanup
});

// Pipe from the file stream to the response stream,
// 'finish'ing the response when the file ends (the default)
f.pipe(res, { end: true });
```
Note: if there are multiple `pipe()` operations, each one must have an `'error'` handler BEFORE each pipe. For this reason, I don't recommend chaining 🚫⛓ - just write each pipe on its own line instead, as in the sketch below.
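For example, here's a minimal sketch with a hypothetical gzip stage in the middle - each stream gets its `'error'` handler before any `pipe()` is called, and each pipe is its own statement:

```js
let Zlib = require('node:zlib');

let gz = Zlib.createGzip();

// handlers first - one 'error' handler per stream
f.once('error', cleanup);
gz.once('error', cleanup);
res.once('error', cleanup);

// then each pipe on its own line (no chaining)
f.pipe(gz);
gz.pipe(res);

// hypothetical helper: close out whichever streams remain open
function cleanup() {
  f.close();
  gz.end();
  res.end();
}
```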
On the happy path, the events will be handled in the order shown below.

## Ideal Stream Event Sequence

This is the Happy Path order of stream events - when all the handlers are connected correctly, in the right order, and there are no errors.
| ReadableStream  | WritableStream |
| --------------- | -------------- |
|                 | 'pipe'         |
| 'open' (files)  |                |
| 'ready' (files) |                |
| 'readable' (1+) |                |
|                 | 'drain' (0+)   |
| 'end'           |                |
|                 | 'finish'       |
|                 | 'unpipe'       |
| 'close'         |                |
|                 | 'close'        |
## Rule #1: Always Pipe, if you can
Most of the time you have at least one target (a WritableStream) that you want to write to (in other words: have it read, or "sink", data) from typically just one source (a ReadableStream).

You may also want to do something else simultaneously - like shasum the data stream or whatnot - but the bottleneck is typically the WritableStream. At the very least you'll create a memory leak if you read in (fill up memory) faster than you write out (free up memory) - or rather a temporary, but possibly crash-inducing, memory balloon.
```js
rs.pipe(ws);
```
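For a concrete sketch, here's a minimal file copy (the `src.txt` and `copy.txt` paths are hypothetical), with the cross-error handlers from Rule #2 below:

```js
let Fs = require('node:fs');

// hypothetical file paths, for illustration only
let rs = Fs.createReadStream('./src.txt');
let ws = Fs.createWriteStream('./copy.txt');

rs.once('error', function () {
  ws.end();
});
ws.once('error', function () {
  rs.close();
});
ws.once('close', function () {
  // final cleanup
});

rs.pipe(ws);
```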
A `ReadableStream` may also emit `'data'` (dangerous! DO NOT USE) and `'pause'` and `'resume'` (better to use `ws.cork()` and `ws.uncork()`, except as a rare workaround to certain bugs).
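As a minimal sketch of the `cork()`/`uncork()` approach - batching several small writes into one flush, rather than pausing the readable side:

```js
// buffer the next writes in memory...
ws.cork();
ws.write('some header: ');
ws.write('some value\r\n');

// ...then flush them as a single chunk
// (the docs recommend deferring uncork with process.nextTick)
process.nextTick(function () {
  ws.uncork();
});
```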
## Rule #2: You MUST Cross-Handle Errors
If there's an error on one stream, the default behavior is that the other will hang in limbo - neither `'error'`ing nor `'close'`ing 😬, possibly with memory leaks 😱.
So it’s important to add error handlers on both streams.
Here’s what the events look like with and without error handling for each side of the stream:
### Without Error Handling

```js
rs.pipe(ws);
```
#### On Readable Error
| ReadableStream | WritableStream |
| -------------- | -------------- |
|                | 'pipe'         |
| 'error'        |                |
| 'close'        |                |
The `WritableStream` is hung, waiting forever…
#### On Writable Error
| ReadableStream  | WritableStream |
| --------------- | -------------- |
|                 | 'pipe'         |
| 'open' (files)  |                |
| 'ready' (files) |                |
|                 | 'unpipe'       |
|                 | 'error'        |
|                 | 'close'        |
| 'readable'      |                |
The `ReadableStream` is waiting with a full buffer, waiting for a `rs.read()` that will never come… (that's a Memory Leak, btw)
### With Error Handling
This is what you can expect when your code has the proper handlers, as in the TL;DR example up above - both streams will receive the `'close'` event (along with other applicable cleanup events) and free their buffers.
```js
rs.once('error', function () {
  ws.end();
});
ws.once('error', function () {
  rs.close();
});
ws.once('close', function () {
  // final cleanup
});
```
#### On Handled Readable Error
| ReadableStream      | WritableStream |
| ------------------- | -------------- |
|                     | 'pipe'         |
| 'error' => ws.end() |                |
| 'close'             |                |
|                     | 'finish'       |
|                     | 'unpipe'       |
|                     | 'close'        |
#### On Handled Writable Error
| ReadableStream  | WritableStream        |
| --------------- | --------------------- |
|                 | 'pipe'                |
| 'open' (files)  |                       |
| 'ready' (files) |                       |
|                 | 'unpipe'              |
|                 | 'error' => rs.close() |
|                 | 'close'               |
| 'close'         |                       |
Depending on the initial state of the streams, these events could come in a different order, but this gives a clear picture nonetheless.
## Rule #3: Always `readable`, NEVER `data`

There are a lot of old, dangerous tutorials out there that recommend the `'data'` ⛔️ event. DO NOT DO THIS.
As explained in Rule #1, reading into memory faster than you write out (free up the memory) will cause a balloon, which can cause bad performance and possibly a crash.

The source you're reading data from (e.g. an SSD) is almost always faster than the target you're writing to (e.g. a Web Browser on mobile, or a low-tier Comcast customer).
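For contrast, this is the anti-pattern those old tutorials teach - a sketch of what NOT to do:

```js
// ⛔️ DO NOT DO THIS: 'data' has no backpressure -
// chunks are read in as fast as the source can produce them,
// and the return value of ws.write() (the "slow down" signal)
// is ignored, so memory balloons
rs.on('data', function (chunk) {
  ws.write(chunk);
});
```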
If you want to do something with the data that is incompatible or inconvenient to use with `pipe`, use `'readable'` instead.
```js
rs.on('readable', function () {
  for (;;) {
    let chunk = rs.read();
    if (!chunk) {
      break;
    }
    // process `chunk`
  }
});

rs.on('close', function () {
  // cleanup
});
```
Here’s an example of taking the sha256 sum of an upload:
```js
let Crypto = require('node:crypto');

app.use('/api/upload', sha256sum);

function sha256sum(req, res, next) {
  let sha256 = Crypto.createHash('sha256');

  // handlers first - the request body is the ReadableStream
  req.on('readable', function () {
    for (let chunk = req.read(); chunk; chunk = req.read()) {
      sha256.update(chunk);
    }
  });
  req.on('close', function () {
    req.sha256sum = sha256.digest('hex');
  });

  next();
}
```
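And, as a hypothetical downstream handler on the same route, the digest can be used once the request has fully closed (the middleware's `'close'` listener was attached first, so it runs first):

```js
// hypothetical downstream handler, for illustration only
app.use('/api/upload', function (req, res) {
  req.once('close', function () {
    res.json({ sha256: req.sha256sum });
  });
});
```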