Introduction to Piping Streams in Node.js

The pipe method is one of the most well-known features of streams. It allows us to compose advanced streaming pipelines as a single line of code.

As a part of Node core it can be useful for cases where process uptime isn’t important (such as CLI tools).

Unfortunately, it lacks a very important feature: error handling.

If one of the streams in a pipeline composed with pipe fails, the pipeline is simply unpiped. It is up to us to detect the error and then destroy the remaining streams so they do not leak any resources. Forgetting to do so can easily lead to memory leaks.

Let’s consider the following example:

const http = require('http') 
const fs = require('fs') 
 
const server = http.createServer((req, res) => { 
  fs.createReadStream('big.file').pipe(res) 
}) 
 
server.listen(8080) 

This is a simple, straightforward HTTP server that serves a big file to its users.

Since this server uses pipe to send back the file, there is a good chance that it will leak memory and file descriptors while running.

If the HTTP response were to close before the file has been fully streamed to the user (for instance, when the user closes their browser), we will leak a file descriptor and a piece of memory used by the file stream. The file stream stays in memory because it’s never closed.
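The behavior described above can be observed without an HTTP server. The following is a minimal sketch using only Node core streams: demoPipeLeak is a hypothetical helper name, the never-ending source stands in for the file stream, and the slow destination stands in for the HTTP response. After the destination is destroyed mid-transfer, the source should still report that it has not been destroyed.

```javascript
const { Readable, Writable } = require('stream')

// Hypothetical helper for illustration: pipes a source into a destination,
// closes the destination early, and reports the state of both streams.
function demoPipeLeak (callback) {
  // A source that never ends on its own, standing in for a large file stream.
  const source = new Readable({
    read () {
      this.push('chunk')
    }
  })

  // A slow destination, standing in for the HTTP response.
  const dest = new Writable({
    write (chunk, encoding, next) {
      setTimeout(next, 10)
    }
  })

  source.pipe(dest)

  dest.on('close', () => {
    // Let pipe's own 'close' handler (which only unpipes) run first.
    setImmediate(() => {
      callback({
        sourceDestroyed: source.destroyed,
        destDestroyed: dest.destroyed
      })
      // Manual cleanup, which pipe does not do for us.
      source.destroy()
    })
  })

  // Simulate the client going away mid-transfer.
  setTimeout(() => dest.destroy(), 50)
}

demoPipeLeak((state) => {
  console.log(state)
})
```

With a real file stream, the source that is left open here would be holding a file descriptor.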

We have to handle error and close events, and destroy other streams in the pipeline. This adds a lot of boilerplate, and can be difficult to cover in all cases.

In this article, we’re going to explore the pump module, which is built specifically to solve this problem.

Let’s Start

Let’s create a folder called file-server, initialize it as a package, install the pump module, and create an app.js file:

$ mkdir file-server 
$ cd file-server 
$ npm init -y  
$ npm install --save pump 
$ touch app.js 

We’ll also need a big file, so let’s create that quickly:

$ node -e "process.stdout.write(require('crypto').randomBytes(1e9))" > big.file

How to do it

We’ll begin, in our app.js file, by requiring the fs, http, and pump modules:

const fs = require('fs') 
const http = require('http') 
const pump = require('pump') 

Now let’s create our HTTP server and, instead of using pipe, use pump to send our big file stream to our response stream:

const server = http.createServer((req, res) => { 
  const stream = fs.createReadStream('big.file') 
  pump(stream, res, done) 
}) 
 
function done (err) { 
  if (err) { 
    return console.error('File was not fully streamed to the user', err) 
  } 
  console.log('File was fully streamed to the user') 
} 
 
server.listen(3000) 

Now let’s run our server:

$ node app.js 

If we use curl and hit Ctrl + C before finishing the download, we should be able to trigger the error state, with the server logging that the file was not fully streamed to the user:

$ curl http://localhost:3000 # hit Ctrl + C before finish 

How it works

Every stream we pass into the pump function will be piped to the next (as per order of arguments passed into pump). If the last argument passed to pump is a function, the pump module will call that function when all streams have finished (or one has errored).
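To illustrate the argument order with more than two streams, here is a hedged sketch that compresses a file through three streams. The gzipFile helper and the temp file names are hypothetical and are not part of the tutorial's app.js. As an assumption made purely so the sketch runs where pump is not installed, it falls back to Node core's stream.pipeline, which takes its arguments in the same order.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')
const zlib = require('zlib')

// Assumption: fall back to Node core's stream.pipeline (same argument order)
// so this sketch also runs where the pump module is not installed.
let pump
try {
  pump = require('pump')
} catch (err) {
  pump = require('stream').pipeline
}

// Hypothetical helper: compress `source` into `target` through three streams.
function gzipFile (source, target, callback) {
  pump(
    fs.createReadStream(source), // 1. read the file
    zlib.createGzip(), // 2. compress it
    fs.createWriteStream(target), // 3. write the result
    callback // called once, on success or on the first error
  )
}

// Usage with a throwaway file in the OS temp directory.
const input = path.join(os.tmpdir(), 'pump-demo.txt')
fs.writeFileSync(input, 'hello pump\n'.repeat(1000))

gzipFile(input, input + '.gz', (err) => {
  if (err) return console.error('Pipeline failed', err)
  console.log('Pipeline finished')
})
```

Each stream is piped to the one listed after it, and the final function receives the first error, if any.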

Internally, pump attaches close and error handlers, and also covers other esoteric cases where a stream in a pipeline may close without notifying other streams.

If one of the streams closes, the other streams are destroyed and the callback passed to pump is called.
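The idea can be sketched in a few lines. The miniPump function below is a deliberately simplified illustration written for this article and is not pump's actual implementation: it only watches the error, end, finish, and close events, and it ignores the more esoteric cases that pump covers.

```javascript
const { Readable, Writable } = require('stream')

// Simplified illustration only: NOT pump's real implementation.
// Pipes each stream into the next. If any stream errors or closes before
// it has finished, every stream is destroyed and the callback gets the error.
function miniPump (...args) {
  const last = args[args.length - 1]
  const callback = typeof last === 'function' ? args.pop() : () => {}
  const streams = args
  let finished = false

  function finish (err) {
    if (finished) return // report only once
    finished = true
    // On failure, destroy everything so nothing is left open.
    if (err) streams.forEach((s) => s.destroy())
    callback(err)
  }

  streams.forEach((stream, i) => {
    const isLast = i === streams.length - 1
    let ended = false

    stream.on('error', finish)
    if (isLast) {
      // The final writable emits 'finish' once all data has been flushed.
      stream.on('finish', () => { ended = true; finish() })
    } else {
      stream.on('end', () => { ended = true })
    }
    stream.on('close', () => {
      // A 'close' without a prior 'end' or 'finish' means the stream was cut short.
      if (!ended) finish(new Error('premature close'))
    })

    if (!isLast) stream.pipe(streams[i + 1])
  })
}

// Usage: a small in-memory source piped into a sink that discards data.
const source = Readable.from(['a', 'b', 'c'])
const sink = new Writable({
  write (chunk, encoding, next) { next() }
})

miniPump(source, sink, (err) => {
  console.log(err ? 'failed: ' + err.message : 'all streams finished')
})
```

Even this reduced version needs per-stream bookkeeping, which is the kind of detail pump takes off our hands.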

It is possible to handle this manually, but the boilerplate overhead and potential for missed cases is generally unacceptable for production code.

For instance, here’s our specific case from the tutorial, altered to handle the response closing:

const server = http.createServer((req, res) => { 
  const stream = fs.createReadStream('big.file') 
  stream.pipe(res) 
  res.on('close', () => { 
    stream.destroy() 
  }) 
}) 

If we multiply that by every stream in a pipeline, and then multiply it again by every possible case (mostly close and error but also esoteric cases), we end up with an extraordinary amount of boilerplate.

There are very few use cases where we would want pipe instead of pump (for example, when we want to apply manual error handling). Generally, for production purposes, it’s a lot safer to use pump.

Muhammad Mubeen

Mubeen is a full-stack web & mobile app developer who is very proficient in MEAN.js, Vue, Python, Ionic 4, Flutter, Firebase, ROR, and PHP. He has created multiple mobile and web applications. He is very passionate about sharing his knowledge.