Why Boomla doesn’t need Git

2018-07-21

Git is the most popular version control system used by software developers.

I’m sometimes asked if I plan to add Git integration to Boomla. I thought I would share my response here.

The short answer is no. Here is why it would be a bad idea and why we can do much better anyway.

We have been there before

Git integration is bleeding from multiple wounds. To be clear, I’m not making up excuses here: I wanted to integrate Git in the early days. It’s already the standard for version control - at least in the POSIX and similar worlds (looking at you, Windows).

In fact, in the early days, Boomla websites were version controlled with Git. Boomla had a command line feature for exporting a website to your POSIX filesystem. You could version control it there with Git, then re-import it any time you wanted. Over time, though, as Boomla evolved, this solution came into conflict with my plans for Boomla. One of them had to go, and my plans were to stay.

Filesystem differences

The Boomla filesystem is significantly different from POSIX and similar filesystems. It supports storing files in files, file attributes, file links, etc.

One way to overcome this issue is to define a mapping between the two worlds. This has been implemented and is currently supported, for example, by our SFTP server. The children of a file are accessible via a virtual directory, and file attributes are accessible via virtual files. We also supported exporting a website to this POSIX-compatible format, and the inverse functionality existed for importing from it.
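To make the idea of such a mapping concrete, here is a minimal sketch in Python. It is not the actual Boomla or SFTP layout: the `.children` and `.attrs` suffixes, the `Node` type and the `to_posix_paths` helper are all hypothetical names chosen for illustration. It only shows how a file that has both children and attributes could be flattened into plain paths.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical naming scheme, NOT the real Boomla/SFTP layout.
CHILDREN_DIR = ".children"  # virtual directory holding a file's children
ATTRS_DIR = ".attrs"        # virtual directory holding one file per attribute


@dataclass
class Node:
    """A file that can carry a body, attributes and child files at once."""
    name: str
    body: bytes = b""
    attrs: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def to_posix_paths(node: Node, prefix: str = "") -> Dict[str, bytes]:
    """Flatten a files-in-files tree into POSIX-style path -> content pairs."""
    path = f"{prefix}/{node.name}"
    out = {path: node.body}
    # Each attribute becomes a virtual file next to the real one.
    for key, value in node.attrs.items():
        out[f"{path}{ATTRS_DIR}/{key}"] = value.encode()
    # Children live under a virtual directory next to the real file.
    for child in node.children:
        out.update(to_posix_paths(child, f"{path}{CHILDREN_DIR}"))
    return out


page = Node("index", b"<h1>Hi</h1>", {"title": "Home"}, [Node("logo.png", b"PNG")])
paths = to_posix_paths(page)
```

With this made-up scheme, the page above maps to the three paths `/index`, `/index.attrs/title` and `/index.children/logo.png`.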

The SFTP layer is still great, but the POSIX export/import functionality had to be retired. The reason is that we added native dependency management support to Boomla. That is, you can import (mount) packages (websites) into your website via file links. Unfortunately for this blog post, we are using the word import with a different meaning here. Let’s call it mount import from now on, and let’s call importing from the POSIX mapping into Boomla POSIX import. One example would be to mount import a gallery package with the link:

import gallery.boomla.net 014db9833a9eab9ecee7f3361121cd399dd51d7cbe

This defines both the source of the package (gallery.boomla.net) and its exact state by the hash. That way, we can find out if your mount imported packages are outdated or not.
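As a rough illustration of how such a link can be used, the sketch below splits it into its source and hash and compares the pinned hash with the package’s latest one. The `MountImport` type and both function names are hypothetical, and the hash is treated as an opaque string; this is not Boomla’s actual implementation.

```python
from typing import NamedTuple


class MountImport(NamedTuple):
    source: str       # where the package comes from, e.g. "gallery.boomla.net"
    pinned_hash: str  # the exact package state the link pins (opaque here)


def parse_mount_import(link: str) -> MountImport:
    """Split a link of the form 'import <source> <hash>' into its parts."""
    parts = link.split()
    if len(parts) != 3 or parts[0] != "import":
        raise ValueError(f"not a mount import link: {link!r}")
    return MountImport(source=parts[1], pinned_hash=parts[2])


def is_outdated(link: MountImport, latest_hash: str) -> bool:
    """The mounted package is outdated if the source has a different latest hash."""
    return link.pinned_hash != latest_hash


link = parse_mount_import(
    "import gallery.boomla.net 014db9833a9eab9ecee7f3361121cd399dd51d7cbe"
)
```

Because the link pins an exact state, checking for updates is a single comparison once the latest hash of the source is known.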

The problem is that websites are not stored in normalized form, for performance reasons. Thus, when you POSIX exported and then POSIX imported your filesystem, you ended up with a different hash for the mounted subtree. This of course raised an error, forcing the operation to fail.

Hence, using the POSIX filesystem mapping to version control Boomla websites is not workable. Because Git works on the POSIX filesystem format, we can’t use Git.

Let me rephrase, in case it wasn’t clear. Mapping the Boomla filesystem to the POSIX format is a lossy conversion, because for performance reasons the Boomla filesystem is not stored in normalized form. This necessarily breaks mount import links.
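Here is a toy illustration of that effect. It assumes, purely for the sake of the example, that a file is stored as a list of chunks and that its hash covers the chunk layout; the real Boomla storage format is not described in this post, and all names below are made up. The point is only that identical content can hash differently after a round trip through a format that drops the layout.

```python
import hashlib
from typing import List


def layout_hash(chunks: List[bytes]) -> str:
    """Hash the stored layout: a hash over the hashes of the individual chunks."""
    h = hashlib.sha256()
    for chunk in chunks:
        h.update(hashlib.sha256(chunk).digest())
    return h.hexdigest()


def posix_export(chunks: List[bytes]) -> bytes:
    """Exporting flattens the chunks into plain bytes; the layout is lost."""
    return b"".join(chunks)


def posix_import(data: bytes, chunk_size: int) -> List[bytes]:
    """Importing has to pick some layout again; here, fixed-size chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


original = [b"hello ", b"wor", b"ld"]  # non-normalized layout
round_tripped = posix_import(posix_export(original), chunk_size=4)

same_content = posix_export(original) == posix_export(round_tripped)
same_hash = layout_hash(original) == layout_hash(round_tripped)
```

In this sketch `same_content` is true while `same_hash` is false, which is exactly the situation that would break a link pinning the hash.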

Benefits of using a custom version control system

Above we looked at why we can’t use Git for Boomla. Let’s look at why we wouldn’t want to use it anyway. I’ll keep it short.

Complexity

Git is super complex. We can make it way simpler. Just think about how complex it is to remove large files you have committed by accident.

Reclaim space

Git is about never reclaiming space, but websites are for storing documents, including videos, not only source code. There must be two version control solutions: one for source code, where you never want to delete from your history, and one for websites, where it must be simple to reclaim space.

Simultaneous access to branches

In Git, a working tree has only one branch checked out at a time. If we integrate a custom version control solution in Boomla, you can work with all of them at the same time - just on different sub-domains.

Undo, redo

If we integrate version control into the platform, we can provide you with instant undo/redo for your entire filesystem. (Instant as in < 0.001s.)
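One way to see why this can be so fast: if every state of the filesystem is an immutable snapshot identified by a root hash, then undo and redo only need to move a pointer. The sketch below assumes that model; the `History` class is a hypothetical illustration, not Boomla’s API.

```python
from typing import List


class History:
    """Tracks root hashes of immutable snapshots; undo/redo only move a cursor."""

    def __init__(self, initial_root: str) -> None:
        self._roots: List[str] = [initial_root]
        self._cursor = 0

    @property
    def current(self) -> str:
        return self._roots[self._cursor]

    def commit(self, root: str) -> None:
        # A new change discards any redo states past the cursor.
        del self._roots[self._cursor + 1:]
        self._roots.append(root)
        self._cursor += 1

    def undo(self) -> str:
        if self._cursor > 0:
            self._cursor -= 1
        return self.current

    def redo(self) -> str:
        if self._cursor < len(self._roots) - 1:
            self._cursor += 1
        return self.current


history = History("root-a")
history.commit("root-b")
history.commit("root-c")
```

No file data is copied in either direction, so the cost does not depend on the size of the website.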

Data deduplication

If we use a native, version controlled filesystem, you get data deduplication for free. If you upload the same file twice, it will only use space once.
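The general technique behind this is content addressing: data is stored under the hash of its content, so identical uploads land on the same key. Here is a minimal sketch, assuming SHA-256 and an in-memory dictionary; `BlobStore` is a hypothetical name and this is not how Boomla itself is implemented.

```python
import hashlib
from typing import Dict


class BlobStore:
    """A tiny content-addressed store: the key is the hash of the data."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self._blobs.values())


store = BlobStore()
video = b"pretend this is a large video file"
first_key = store.put(video)
second_key = store.put(video)  # same content uploaded a second time
```

Both uploads return the same key, and `store.stored_bytes()` equals the size of a single copy.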

Roll your own

Because data is already deduplicated for you, and hashes are readily available, you can use it to provide version control features for your users, in real time, on your website. For this, please look into volume links.

Conclusion

Not only is using Git impossible for Boomla, we are also better off rolling our own solution. So that’s what we do.


Cheers,

you can follow me on Twitter