Data deduplication

The Boomla Filesystem automatically deduplicates data on disk. If you have multiple copies of the same data, they take up disk space only once. You can also think of this as a form of filesystem compression.
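To make the idea concrete, here is a minimal sketch of deduplication using a content-addressed store, where data is keyed by its SHA-256 hash. This is only an illustration: the `DedupStore` class and its methods are hypothetical, and nothing here is meant to describe how Boomla implements deduplication internally.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store (hypothetical, not Boomla's implementation)."""

    def __init__(self):
        # Maps SHA-256 hex digest -> the bytes stored under that digest.
        self.blobs = {}

    def put(self, data: bytes) -> str:
        """Store data and return its key; identical data is kept only once."""
        key = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(key, data)
        return key

    def disk_usage(self) -> int:
        """Bytes actually held by the store, after deduplication."""
        return sum(len(blob) for blob in self.blobs.values())


store = DedupStore()
payload = b"x" * 1000

key_a = store.put(payload)      # first copy is stored
key_b = store.put(payload)      # second copy maps to the same key
store.put(b"y" * 500)           # different data is stored separately

logical_size = 1000 + 1000 + 500    # what a plain sum of sizes reports
physical_size = store.disk_usage()  # what the store actually holds

print(logical_size, physical_size)
```

Both copies of `payload` resolve to the same key, so the store holds fewer bytes than the plain sum of the sizes suggests.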

Deduplication applies only within a single website, across all of its branches. If you have multiple websites, their storage requirements are calculated independently. This is necessary because the websites may be stored on different servers.

Subtree size vs storage used

Calculating the storage space actually used by a website is a long process and cannot easily be sped up. For this reason, file size, subtree size and children size are simple mathematical sums over the given subtree; they do not take data deduplication into account.

Because of this, and because of how Boomla handles dependencies, you can easily see websites that are several GB in size while in reality they use only a few MB of disk space. Most often this is caused by packages installed under /sys/packages.
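The gap between the reported size and the disk space used can be illustrated with a small sketch. It models a file tree as nested dictionaries and compares a plain sum of sizes with the number of unique bytes. The tree layout, the helper names (`make_file`, `subtree_size`, `unique_bytes`) and the hash-based notion of "unique" are all assumptions made for illustration, not Boomla's actual data model.

```python
import hashlib


def make_file(data=b"", children=()):
    """Hypothetical file node: some data plus a list of child files."""
    return {"data": data, "children": list(children)}


def subtree_size(node):
    """Plain sum of all file sizes in the subtree; duplicates count every time."""
    return len(node["data"]) + sum(subtree_size(c) for c in node["children"])


def unique_bytes(node, seen=None):
    """Bytes left once identical file contents are counted only once."""
    seen = set() if seen is None else seen
    digest = hashlib.sha256(node["data"]).digest()
    total = 0
    if digest not in seen:
        seen.add(digest)
        total = len(node["data"])
    return total + sum(unique_bytes(c, seen) for c in node["children"])


# Three places in the tree carry a copy of the same 10,000-byte package.
package_data = b"p" * 10_000
root = make_file(children=[
    make_file(children=[make_file(package_data)]),
    make_file(children=[make_file(package_data)]),
    make_file(children=[make_file(package_data)]),
])

reported = subtree_size(root)  # what a simple sum shows
stored = unique_bytes(root)    # what deduplicated storage would need

print(reported, stored)
```

The simple sum is cheap to maintain because each node only needs its children's totals, whereas the deduplicated figure requires looking at every piece of data in the tree.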

So how are file sizes useful at all? First, if you subtract the size of the /sys file, the rest is usually pretty accurate. Second, file sizes give you a precise upper bound. If you see one file that is 1KB and another that is 10GB, the sizes give a good indication of which one contains more data.
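As a worked example of the first point, the sketch below subtracts the /sys entry from a website's total. The file names and byte counts are made up for illustration, and `size_without_sys` is a hypothetical helper, not a Boomla API.

```python
# Made-up subtree sizes (in bytes) of the top-level files of a website.
children_sizes = {
    "sys": 3_200_000_000,  # mostly installed packages
    "index": 4_096,
    "blog": 1_500_000,
    "images": 42_000_000,
}


def size_without_sys(sizes):
    """Sum all top-level subtree sizes except the one named 'sys'."""
    return sum(size for name, size in sizes.items() if name != "sys")


total = sum(children_sizes.values())
estimate = size_without_sys(children_sizes)

print(f"reported total: {total:,} bytes")
print(f"without /sys:   {estimate:,} bytes")
```

The reported total remains an upper bound on the storage used, and the figure without /sys is usually the closer estimate.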