🤖🚫 AI-free content. This post is 100% written by a human, as is everything on my blog. Enjoy!

Handle large JSONs effortlessly with jq

September 15, 2015 in Software

I wanted to see the distribution of file sizes in our Webpack build which is getting really big. To do this, you generate a JSON build profile, as described in the optimisation guide, and then you load it in the online Webpack analyze tool. Unfortunately, it doesn’t have the feature to order modules by size, so I was left on my own.

Now, how would you find sense in a 4-megabyte JSON file? Turns out, this is a great opportunity to use jq - a lightweight and flexible command-line JSON processor.

First, let’s see what keys does our file contain:

$ jq 'keys' webpack.stats.json

[
  "assets",
  "assetsByChunkName",
  "children",
  "chunks",
  "errors",
  "filteredModules",
  "hash",
  "modules",
  "publicPath",
  "time",
  "version",
  "warnings"
]

(The jq command takes a filter command - in this case the built-in keys function - and a JSON source file.)

Looks like modules is what we need - let’s peek inside that key:

$ jq '.modules' webpack.stats.json |less

[
  {
    "id": 0,
    ...

(A familiar dot notation is used to query the structure).

Bingo! There’s our list of modules (it’s really large, so I pipe it into less). Now let’s make sense of the module objects:

$ jq '.modules[1]|keys' webpack.stats.json
[
  "assets",
  "built",
  "cacheable",
  "chunks",
  "errors",
  "failed",
  "id",
  "identifier",
  "index",
  "index2",
  "issuer",
  "name",
  "optional",
  "prefetched",
  "reasons",
  "size",
  "warnings"
]

(jq uses the concept of chaining, or piping, operations. First we extract a single object from the initial input, then apply the keys function to it.)

OK, so we want to sort the modules by size, descending. Also let’s only keep the size and name of the module to keep the output short.

$ jq '.modules|sort_by(.size)|reverse|.[]|{size, name}' webpack.stats.json |less

{
  "size": 151827,
  "name": "./~/bluebird/js/browser/bluebird.js"
}
{
  "size": 141445,
  "name": "./~/immutable/dist/immutable.js"
}
{
  "size": 104409,
  "name": "./~/moment/moment.js"
}
...

(sort_by, reverse are built-in functions operating on an array. Then we convert the array structure into a stream of objects with .[], so that the final command applies to each separate object, which we re-construct using a notation similar to ES6)

This is just the information we need. As it’s intended for human consumption, let’s format the output better:

$ jq -r '.modules|sort_by(.size)|reverse|.[]|"\(.size) => \(.name)"' webpack.stats.json | less

151827 => ./~/bluebird/js/browser/bluebird.js
141445 => ./~/immutable/dist/immutable.js
104409 => ./~/moment/moment.js
...

(We pipe the output to a string formatter. The -r key in the command prints strings unquoted.)

Nice! Ready to copy and paste into a bug tracker.

As you see, jq is a succinct and easy way to inspect and process JSON files right from the command line. Lay off your REPL for this one.

Buy me a coffee Liked the post? Treat me to a coffee