These are some musings on build systems, and the rationale behind them, as I used them on this site. This article could probably also be subtitled: If I followed the most logical path at every step, I'd probably recreate CMake too.
I'm creating this website. I want to not only do the following, but to do it effortlessly:
I want to add and edit content on my site. Mainly blog posts like this one where I'm writing prose: I'm trying to express an idea to another person using written words, and sometimes photos, videos, and demos. My priority is that my point is clear.
I have languages which I prefer when writing prose. Typst is my preferred language, but I also write Markdown or DOCX (Microsoft Word / Google Docs) files sometimes. These languages (along with their corresponding editing software) allow me to place most of my focus on the content and clarity of what I'm saying. Meanwhile, when I write HTML, my focus is rarely on content or clarity. Therefore, Typst is my language for prose.
To express my idea clearly, I write in Typst. But for the sake of accessibility, I must distribute in HTML. Therefore, the most reasonable thing I can do is to translate from one language to another, an act known as compiling.
So the premise is that in order to share my ideas on my website, I need to:
This article is mainly focused on the middle two steps: compiling and storing, as the methods I've used have changed as the website has grown.
The first pages written for this site, such as the home page, were written in raw HTML. In fact, the homepage is the last page still in raw HTML.
Thus there was no compiling needed. I would write the HTML file on my computer, and then copy it over to my server using a command like:
scp /path/to/file nate.town:/path/to/storage/place
I believe pure HTML and copying was the right decision for the time. My priority then was to get a home page quickly, and I did not have the knowledge, experience, or infrastructure that I do right now. But now, I can get pages out much quicker and with greater quality. So if I were to restart the site with my current knowledge, I'd set up the infrastructure right after the homepage was up.
Very quickly I began writing in Typst instead of HTML for reasons mentioned earlier. Compiling was simple:
typst compile --format=html --features=html index.typ index.html
I also switched to using git to copy files rather than manually copying. I had cloned my git repo on my webserver and on my local machine. To update my site, I would push a new HTML file from whatever machine I was on, and then pull on the server. This worked okay for quite some time. One extra manual step was not too bad over the summer, when I had lots of time and motivation, and published articles in sporadic bursts.
A flaw of the git technique was that all my files were copied to the server, including non-HTML files. I did not want everything in the repo to be published on the internet. For a while, I avoided this problem by just being careful about which directory of the repo was made public on the webserver. But it wasn't a great solution.
Another issue was that when I was making many changes to many different pages, I would forget which Typst files I had already changed and which I hadn't. There were also a lot of commands to run. Makefiles solve this problem, so I created a simple one for this site. It looked like this:
# Usage:
# make SRC_DIR=path/to/src OUT_DIR=path/to/out
SRC_DIR ?= src
OUT_DIR ?= pub
# Find all source files
TYPST_SRCS := $(shell find $(SRC_DIR) -type f -name '*.typ')
OTHER_SRCS := $(shell find $(SRC_DIR) -type f ! -name '*.typ')
# Corresponding output files
TYPST_OUTS := $(patsubst $(SRC_DIR)/%.typ, $(OUT_DIR)/%.html, $(TYPST_SRCS))
OTHER_OUTS := $(patsubst $(SRC_DIR)/%, $(OUT_DIR)/%, $(OTHER_SRCS))
# Default target: build everything
all: $(TYPST_OUTS) $(OTHER_OUTS)
# Compile .typ files to .html files
$(OUT_DIR)/%.html: $(SRC_DIR)/%.typ
	@mkdir -p $(dir $@)
	typst compile --features=html --format=html $< $@
# Copy all other files
$(OUT_DIR)/%: $(SRC_DIR)/%
	@mkdir -p $(dir $@)
	cp $< $@
.PHONY: all clean
clean:
	rm -rf $(OUT_DIR)
At this point, the HTML files could be easily generated from the Typst files, so they became redundant information, and I removed them from the repo. Additionally, development became extremely convenient. I would host my website locally using live-server, and just run make after a save when I wanted to see a change.
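For a sense of what make does on each of those runs, here is a minimal shell sketch of the staleness check behind every rule: rebuild when the output is missing or older than its source. The helper name needs_rebuild is made up for this illustration and is not part of the site's build. Real make implementations do a good deal more, such as walking a whole dependency graph.

```shell
#!/bin/sh
# Sketch of make's core decision for one rule.
# needs_rebuild is a hypothetical helper, not something the site uses.
# Returns 0 (true) when $2 must be rebuilt from $1.
needs_rebuild() {
    src="$1"
    out="$2"
    # A missing output always needs building.
    [ -e "$out" ] || return 0
    # find prints $src only if it was modified more recently than $out.
    [ -n "$(find "$src" -newer "$out")" ]
}

# Usage sketch (paths are placeholders):
# if needs_rebuild src/index.typ pub/index.html; then
#     typst compile --features=html --format=html src/index.typ pub/index.html
# fi
```

This is exactly the bookkeeping I kept getting wrong by hand when many pages changed at once.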
But the above Makefile is firmly a GNU Makefile. You can tell by its use of functions like shell, patsubst and dir. Those aren't supported by other implementations of make, such as bmake, which is the default on OpenBSD (what this site runs on). macOS (another one of my development machines) does ship GNU Make, but only the old 3.81 release.
The workaround for this was to have a Makefile per OS. Then I added a git post-receive script that, whenever I pushed changes to the main branch, would build the changes and publish them to the website.
A Makefile per OS was not great for maintenance. I could have switched to bmake definitively, but it just did not have the convenience features that gmake had. But switching to gmake meant that I would have to give different build instructions per OS, because what is known as gmake on BSD is just make on Linux. My site is also a staging ground for ideas and workflows, so I didn't want any possible confusion during the build process. I want you to be able to just make and go.²
So what I really wanted was to use the subset of Makefile syntax that is portable, i.e. shared between all versions of make on all the platforms I care about. I would just define a big file of explicit rules on how to make each individual file, and then make would work flawlessly.
So, I decided to generate this file using a script. This script:
#!/bin/sh
#This file exists because this site is deployed on OpenBSD, but their default
#implementation of make does not support GNU Make functions like shell, dir,
#or patsubst
set -e # stop execution on first error
# set -x # print commands to stderr
SRC_DIR="$1"
OUT_DIR="$2"
find "$SRC_DIR" -type f | while read -r src; do
    rel="${src#$SRC_DIR/}"
    case "$src" in
        *.typ)
            out="$OUT_DIR/${rel%.typ}.html"
            printf '%s: %s\n' "$out" "$src"
            # Recipe lines must start with a tab, hence printf and \t
            printf '\tmkdir -p `dirname $@`\n'
            printf '\ttypst compile --features=html --format=html %s $@\n' "$src"
            ;;
        *)
            out="$OUT_DIR/$rel"
            printf '%s: %s\n' "$out" "$src"
            printf '\tmkdir -p `dirname $@`\n'
            printf '\tcp %s %s\n' "$src" "$out"
            ;;
    esac
    printf 'TARGETS += %s\n\n' "$out"
done
printf 'targets: $(TARGETS)\n'
And now this is the Makefile:
# Usage:
# make SRC_DIR=path/to/src OUT_DIR=path/to/out
# The ?= operator conditionally assigns a variable if it is not already assigned
# This allows it to be overridden from the make command line
SRC_DIR?=src
OUT_DIR?=build
# These contain both the list of targets and the rules to make them
RULE_FILE=rules.mk
.PHONY: all clean $(RULE_FILE)
all: $(RULE_FILE)
	$(MAKE) -f $(RULE_FILE) targets
$(RULE_FILE):
	./gen_rules.sh $(SRC_DIR) $(OUT_DIR) > $(RULE_FILE)
clean:
	rm -rf $(OUT_DIR) $(RULE_FILE)
Hmmmm… I've written a script which generates a Makefile. How interesting. I've reimplemented a subset of CMake.
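To make the generated file concrete, here is a condensed restatement of the generator wrapped in a function and run against a throwaway two-file tree. The function name gen_rules and the file names blog/post.typ and style.css are made up for this sketch, and the sort is added only to keep the demo output stable. Each source file should come out as one explicit rule followed by a TARGETS += line, which is about as portable as Makefile syntax gets.

```shell
#!/bin/sh
# Condensed restatement of the rule generator, for illustration only.
# gen_rules is a hypothetical name; the real script takes the same two args.
gen_rules() {
    src_dir="$1"
    out_dir="$2"
    find "$src_dir" -type f | sort | while read -r src; do
        rel="${src#$src_dir/}"
        case "$src" in
            *.typ)
                out="$out_dir/${rel%.typ}.html"
                printf '%s: %s\n' "$out" "$src"
                printf '\tmkdir -p `dirname $@`\n'
                printf '\ttypst compile --features=html --format=html %s $@\n' "$src"
                ;;
            *)
                out="$out_dir/$rel"
                printf '%s: %s\n' "$out" "$src"
                printf '\tmkdir -p `dirname $@`\n'
                printf '\tcp %s %s\n' "$src" "$out"
                ;;
        esac
        printf 'TARGETS += %s\n\n' "$out"
    done
    printf 'targets: $(TARGETS)\n'
}

# Demo: build a tiny source tree and print the rules generated for it.
demo=$(mktemp -d)
mkdir -p "$demo/src/blog"
: > "$demo/src/blog/post.typ"
: > "$demo/src/style.css"
(cd "$demo" && gen_rules src pub)
rm -rf "$demo"
```

Nothing in the output relies on functions, pattern rules, or automatic discovery, so any make that understands explicit rules and += should be able to read it.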
Let's extrapolate for a little bit. Should this site get even more complicated, I could eventually see this script becoming unwieldy. I'd want a domain-specific language for generating my Makefiles. By following the path of least resistance at every step, I would recreate something with identical functionality to CMake.
It still feels disturbing, though, that the end result was to create a build system for my build system. Surely we've gone too many layers of abstraction deep. I really only wanted to run a series of console commands. I know what compiler I want to use. With my mind on the end-to-end argument, should CMake really be deciding on my behalf which compiler would be best for my platform? Maybe sometimes. I'm unsure.
I suppose this is the logic behind the Nix build system, which I've heard allows you to specify exact dependencies to build with, for perfect reproducibility.³ At the risk of redescribing Nix, I'll say what I'd want from a build system right now:
I'd like a language which is able to:
Describe my build dependencies, and how to retrieve them (either by download or building):
git. Perhaps instructions for this could be built into the build system itself. I'm also unsure of what would count as an extremely common dependency. git absolutely, typst no way. What about clang?
It really just sounds like a Makefile with a little bit of extending.
The idea certainly sounds like a reinvention of many things: package managers, Nix, Gentoo, PKGBUILD, and especially Makefiles. A Makefile builds a dependency graph for your project with rules describing how to make everything in the graph. Again, my only issues with Makefiles are the package manager naming issue (gmake on one system is make on another) and the lack of enforced dependency management.
I am a big fan of domain-specific languages, so this could be a fun thought for another day.
My current workflow is fantastic. To update my site I do a git push. That's it. The rest of the process is automated by a post-receive script in the remote repo. I'll post the exact script later.
The only issue I have right now is that I don't do a clean build when creating the website, meaning I'll have stale files in my public site even if they're removed from the git repo. I don't want to delete the public site ever though, because I don't want downtime ever. What I'd like is to create a clean build of my site in a new directory, and then atomically move the new directory to my website's directory. On Linux, there is a syscall that does an atomic exchange for directories (renameat2 with the RENAME_EXCHANGE flag). There is no equivalent syscall on OpenBSD right now.
So instead, I just manually delete older files from the website every now and then. My philosophy with the web is to delete very little, so this isn't that big of a deal. I can think of ways around it, but none that are simpler than "manually delete every now and then".
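For what it's worth, the manual pass can at least be guided by a script. The sketch below is an assumption about how one might do it, not something the site runs: list_stale is a hypothetical helper that prints every published file with no matching source, using the same src-to-output mapping as the build (.typ becomes .html, everything else is copied verbatim). It only prints candidates and deletes nothing.

```shell
#!/bin/sh
# Sketch: list published files that no longer have a source file.
# list_stale is a hypothetical helper; it never deletes anything.
list_stale() {
    src_dir="$1"
    out_dir="$2"
    find "$out_dir" -type f | while read -r out; do
        rel="${out#$out_dir/}"
        # Still live if it was copied verbatim from the source tree...
        if [ -e "$src_dir/$rel" ]; then continue; fi
        # ...or if it is the HTML compiled from a .typ source.
        case "$rel" in
            *.html)
                if [ -e "$src_dir/${rel%.html}.typ" ]; then continue; fi
                ;;
        esac
        printf '%s\n' "$out"
    done
}

# Usage sketch (paths are placeholders):
# list_stale src /path/to/public/site
```

Reviewing the printed list before removing anything keeps the "delete very little" philosophy intact.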
As promised, the post-receive hook in git which does the work for me:
#!/bin/sh
set -e
set -x
PUB_SITE_DIR=/var/www/nate.town/pub
PUB_SITE_BUILD_DIR=/home/nate/site-builds/nate.town
DEV_SITE_DIR=/var/www/dev.nate.town
echo "Running post-receive script as $USER"
# Generate the site
# git feeds the hook one "oldrev newrev refname" line per updated ref
while read -r oldrev newrev refname; do
    branch=$(basename "$refname")
    case "$branch" in
        main)
            echo "Deploying main..."
            cd "$PUB_SITE_BUILD_DIR"
            # git sets GIT_DIR for hooks; unset it so git pull
            # operates on the build checkout instead
            unset GIT_DIR
            git pull
            make SRC_DIR=src OUT_DIR="$PUB_SITE_DIR"
            ;;
        # dev)
        #     echo "Deploying dev..."
        #     cd $DEV_SITE_DIR
        #     git pull
        #     make
        #     ;;
        *)
            echo "No action for branch: $branch"
            ;;
    esac
done
By following the path of least resistance at every step, I would probably end up with a build system that looks like CMake as well. But I feel like this path has led us to a local optimum. A build system which makes no assumptions about packages beyond the built-in OS utilities would be great. That may exist already, but I don't know what it is yet.
However, with all that being said, I'm extremely happy with how my site builds right now. It is understandable and effective. I want my build system to be extremely intuitive and simple.⁴ Because although the final product is what matters, a bad build system can make it harder to produce that product as often. A good build system allows us to flow. It is just as much a part of getting to the final product as anything else.
Build systems are important.