Disk space analysis. Point it at a directory, get the biggest files and directories back. Written in Rust, walks filesystems in parallel.
I used to work at Rackspace, over a decade ago now. The senior engineers there kept an internal wiki of useful scripts, and over a few years it turned into this massive collection of one-liners for every situation you'd run into on a customer's server.
Some of the scripts needed extra tools the original author had installed.
You'd copy a command, paste it on a box, and it'd fail because
ncdu or sar or some other utility wasn't
there. So you'd end up with multiple versions of the same script,
slightly different depending on which tools the author preferred.
Some people focused on writing versions that only used tools from
the base kickstart install so they'd work on any box, which got
you stuff like this:
```shell
FS='./';resize;clear;date;df -h $FS; echo "Largest Directories:"; \
nice -n19 find $FS -mount -type d -print0 2>/dev/null|xargs -0 du -k| \
sort -runk1|head -n20|awk -F'\t' '{printf "%8d MB\t%s\n",($1/1024),$NF}'; \
echo "Largest Files:"; nice -n 19 find $FS -mount -type f -print0 2>/dev/null| \
xargs -0 du -k | sort -rnk1| head -n20 | \
awk -F'\t' '{printf "%8d MB\t%s\n",($1/1024),$NF}';
```
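For anyone squinting at that: here is the same pipeline unrolled, minus the terminal housekeeping (resize, clear, date), with each stage annotated. The behavior is unchanged.

```shell
#!/bin/sh
FS='.'

df -h "$FS"

echo "Largest Directories:"
# -mount: stay on this filesystem; -print0 / xargs -0: survive spaces in
# names; du -k: sizes in KB; sort -runk1: numeric, descending, deduplicated
# on the first field; head keeps the top 20; awk converts KB to MB.
nice -n 19 find "$FS" -mount -type d -print0 2>/dev/null \
  | xargs -0 du -k \
  | sort -runk1 \
  | head -n 20 \
  | awk -F'\t' '{printf "%8d MB\t%s\n", ($1/1024), $NF}'

echo "Largest Files:"
nice -n 19 find "$FS" -mount -type f -print0 2>/dev/null \
  | xargs -0 du -k \
  | sort -rnk1 \
  | head -n 20 \
  | awk -F'\t' '{printf "%8d MB\t%s\n", ($1/1024), $NF}'
```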
It works. Nobody is typing that from memory, though. The knowledge base was probably the most visited page on our wiki; everyone had it bookmarked just for commands like this.
The other problem: on a filesystem with thousands of small files, it'd
take forever. Single-threaded find piped to
du, waiting, waiting.
So this is a Rust rewrite of that snippet. Same job, but parallel, and
you just type si /path.
```shell
# grab the latest release for your platform
$ curl -LO https://github.com/gremlinltd/space-investigator/releases/latest/download/si-<target>.tar.gz
$ tar xzf si-*.tar.gz
$ ./si --help
```

Or build from source:

```shell
$ git clone https://github.com/gremlinltd/space-investigator.git
$ cd space-investigator
$ cargo install --path .
```
```shell
$ si [OPTIONS] [PATH]
```
| Option | What it does | Default |
|---|---|---|
| `PATH` | Directory to scan | `.` |
| `-d, --dirs <N>` | Top directories to show | 20 |
| `-f, --files <N>` | Top files to show | 20 |
| `-j, --json` | Output JSON | off |
```shell
# scan current directory
$ si

# scan /var, show top 10 of each
$ si /var -d 10 -f 10

# JSON output, pipe to jq
$ si /home --json | jq '.largest_files[:5]'
```
The --json flag gives you everything in a machine-readable format:
```json
{
  "timestamp": "2026-04-09T20:57:03+03:00",
  "path": "/home",
  "filesystem": {
    "name": "Macintosh HD",
    "total_bytes": 1995218165760,
    "used_bytes": 1241711876544,
    "available_bytes": 753506289216,
    "use_percent": 62
  },
  "largest_directories": [
    { "path": "/home/user/projects", "size_bytes": 5368709120, "size_mb": 5120 }
  ],
  "largest_files": [
    { "path": "/home/user/backup.tar.gz", "size_bytes": 2147483648, "size_mb": 2048 }
  ]
}
```
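The JSON output is handy for cron-style monitoring. A sketch, in the spirit of the base-install scripts above (no jq required, just sed): the field name comes from the sample schema, the 90% threshold is arbitrary, and the heredoc stands in for a real `si / --json` call.

```shell
#!/bin/sh
# Hypothetical disk-usage check built on `si --json`.
# In real use, replace the heredoc with:  json=$(si / --json)
json=$(cat <<'EOF'
{
  "path": "/",
  "filesystem": {
    "name": "root",
    "use_percent": 62
  }
}
EOF
)

# Pull the integer after "use_percent" with plain sed -- present on any box.
pct=$(printf '%s' "$json" | sed -n 's/.*"use_percent": *\([0-9]*\).*/\1/p')

# Arbitrary alert threshold for illustration.
if [ "${pct:-0}" -ge 90 ]; then
  echo "ALERT: filesystem ${pct}% full"
else
  echo "OK: filesystem ${pct}% full"
fi
```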