What's eating all my disk? Using 'du' in Linux

I’ve just received an alert for a low disk space condition on a host.

If you are already a Linux user, then you might be familiar with the df command, which shows how much free space there is on each mounted file system. Running df produces output such as:

 $ df -h
 Filesystem                                   Size  Used    Avail Use% Mounted on
 /dev/sda4                                    118G  105.2G  106G  96% /
 none                                         4.0K     0    4.0K   0% /sys/fs/cgroup
 udev                                         7.9G  4.0K    7.9G   1% /dev
 tmpfs                                        1.6G  1.6M    1.6G   1% /run
 none                                         5.0M     0    5.0M   0% /run/lock
 none                                         7.9G   26M    7.8G   1% /run/shm
 none                                         100M   16K    100M   1% /run/user
 /dev/sda7                                     98M   34M     64M  35% /boot/efi
 /dev/sda5                                    102G  3.6G     93G   4% /home

Passing the ‘-h’ flag to df tells it to produce human-readable output. The root file system (/) is 96% full, so how do I find out what is using all that disk space? Enter du, the disk usage command. The following command starts to give a picture of how large each top-level folder in the root of our file system is:

du -sh /*

If you aren’t running as the root user, you may need to use sudo to avoid getting a series of permission denied errors:

sudo du -sh /*

You may still want to redirect standard error to /dev/null to avoid seeing errors for special files and directories, such as those under /proc. Here the command is run from the root directory (/), so * expands to the same top-level folders:

$ sudo du -sh * 2>/dev/null
9.8M	bin
80M		boot
4.0K	cdrom
4.0K	dev
25M		etc
7.1G	home
0		initrd.img
324M	lib
3.5M	lib32
4.0K	lib64
16K		lost+found
8.0K	media
28K		mnt
374M	opt
0		proc
150M	root
2.3M	run
16M		sbin
4.0K	srv
0		sys
52K		tmp
4.5G	usr
725M	var
0		vmlinuz
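
One caveat with the listing above: du follows every mount point it finds, and the df output shows /home and /boot/efi as separate file systems, so their contents do not count against /. The following is a minimal sketch, assuming GNU du and GNU sort, of how the report could be limited to a single file system. The function name usage_on_one_fs is made up for illustration.

```shell
# Hypothetical helper: per-directory usage for one file system only.
#   -x             stay on the file system DIR lives on (skip other mounts)
#   --max-depth=1  report DIR itself and its immediate subdirectories
# Without -a, du lists directories only, so loose files in DIR are
# included in DIR's total but do not get a line of their own.
# Assumes GNU du and GNU sort (sort -h understands sizes like 4.5G).
usage_on_one_fs() {
    du -xh --max-depth=1 -- "${1:-/}" 2>/dev/null | sort -h
}
```

Calling usage_on_one_fs / should list the subdirectories of / smallest first, with the total for / itself on the last line. Since sudo cannot call a shell function, the pipeline inside the function can be run directly with sudo when root access is needed.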

The -s flag in du -sh means summarise: in other words, report a single total for each file or folder given as an argument, rather than listing everything beneath it. The -h flag, as with df, means human-readable output. Now we can see, almost at a glance, what is using the most disk space in the root of our file system (/). The next step on this quest is to drill down into a folder. The /usr folder is using 4.5 GB of disk, so let’s take a peek.

$ sudo du -sh /usr/*
148M	/usr/bin
44K		/usr/games
39M		/usr/include
2.5G	/usr/lib
76M		/usr/lib32
33M		/usr/local
23M		/usr/sbin
1.6G	/usr/share
128M	/usr/src

We can see that the biggest child is the /usr/lib folder. There are a lot of entries in that folder, as it’s where most of the shared libraries live. Running the same command on it would produce a lot of output and flood our terminal. So I’m going to use one last trick and have du show just the ten biggest items in the /usr/lib folder:

$ sudo du -sh /usr/lib/* | sort -h | tail
103M	/usr/lib/thunderbird
151M	/usr/lib/firmware
161M	/usr/lib/gcc
181M	/usr/lib/jvm
192M	/usr/lib/slack
193M	/usr/lib/chromium
196M	/usr/lib/firefox-esr
289M	/usr/lib/libreoffice
622M	/usr/lib/modules
1.5G	/usr/lib/x86_64-linux-gnu

In this example, piping the output of du into sort -h sorts it by human-readable size in ascending order, which puts the largest files and folders at the bottom of the list. Finally, tail trims the output down to the 10 largest items in the /usr/lib/ folder, since it prints the last 10 lines by default.
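
Since the same du, sort and tail pipeline gets retyped at every level of the drill-down, it could be wrapped in a small shell function. This is only a sketch, assuming GNU sort for the -h option. The name biggest and its optional count argument are invented for illustration.

```shell
# Hypothetical helper: show the N largest entries directly under DIR.
# Usage: biggest [DIR] [N]    (DIR defaults to ., N defaults to 10)
# Note: the * glob skips hidden entries (names starting with a dot).
biggest() {
    du -sh -- "${1:-.}"/* 2>/dev/null | sort -h | tail -n "${2:-10}"
}
```

For example, biggest /usr/lib 5 would print the five largest entries in /usr/lib, largest last. For directories that need root access, run the pipeline itself with sudo, as sudo cannot call a shell function.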

Article Revised: 2021-07-03T13:35:00+0100
