Handle large JSONs effortlessly with jq
September 15, 2015 in SoftwareI wanted to see the distribution of file sizes in our Webpack build which is getting really big. To do this, you generate a JSON build profile, as described in the optimisation guide, and then you load it in the online Webpack analyze tool. Unfortunately, it doesn’t have the feature to order modules by size, so I was left on my own.
Now, how would you find sense in a 4-megabyte JSON file? Turns out, this is a great opportunity to use jq - a lightweight and flexible command-line JSON processor.
First, let’s see what keys does our file contain:
$ jq 'keys' webpack.stats.json
[
"assets",
"assetsByChunkName",
"children",
"chunks",
"errors",
"filteredModules",
"hash",
"modules",
"publicPath",
"time",
"version",
"warnings"
]
(The jq
command takes a filter command - in this case the built-in keys
function - and a JSON source file.)
Looks like modules
is what we need - let’s peek inside that key:
$ jq '.modules' webpack.stats.json |less
[
{
"id": 0,
...
(A familiar dot notation is used to query the structure).
Bingo! There’s our list of modules (it’s really large, so I pipe it into less
). Now let’s make sense of the module objects:
$ jq '.modules[1]|keys' webpack.stats.json
[
"assets",
"built",
"cacheable",
"chunks",
"errors",
"failed",
"id",
"identifier",
"index",
"index2",
"issuer",
"name",
"optional",
"prefetched",
"reasons",
"size",
"warnings"
]
(jq uses the concept of chaining, or piping, operations. First we extract a single object from the initial input, then apply the keys
function to it.)
OK, so we want to sort the modules by size, descending. Also let’s only keep the size and name of the module to keep the output short.
$ jq '.modules|sort_by(.size)|reverse|.[]|{size, name}' webpack.stats.json |less
{
"size": 151827,
"name": "./~/bluebird/js/browser/bluebird.js"
}
{
"size": 141445,
"name": "./~/immutable/dist/immutable.js"
}
{
"size": 104409,
"name": "./~/moment/moment.js"
}
...
(sort_by
, reverse
are built-in functions operating on an array. Then we convert the array structure into a stream of objects with .[]
, so that the final command applies to each separate object, which we re-construct using a notation similar to ES6)
This is just the information we need. As it’s intended for human consumption, let’s format the output better:
$ jq -r '.modules|sort_by(.size)|reverse|.[]|"\(.size) => \(.name)"' webpack.stats.json | less
151827 => ./~/bluebird/js/browser/bluebird.js
141445 => ./~/immutable/dist/immutable.js
104409 => ./~/moment/moment.js
...
(We pipe the output to a string formatter. The -r
key in the command prints strings unquoted.)
Nice! Ready to copy and paste into a bug tracker.
As you see, jq is a succinct and easy way to inspect and process JSON files right from the command line. Lay off your REPL for this one.
Liked the post? Treat me to a coffee