How to do hermetic builds

When Bazel 1.0 was released, people were happy. One of Bazel's features is hermetic builds, which means a build is "dependent only on a known set of inputs, which is essential for ensuring that builds are reproducible."

How does Bazel ensure this? It does not, really, but you get tips on how to achieve it. Bazel allows you to escape hermeticity but makes it easy to stay inside: just use the standard rules. It can also sandbox builds to detect escapes. Still, hermetic build rules are necessary but not sufficient for reproducible builds.

Nondeterminism in a build product can come from the environment or from the process itself.

Environment: If the build system assumes that tools like the compiler are available from the environment, then whichever version happens to be installed may influence the build result. Accessing the internet during the build can easily break determinism. GCC lets you use macros like __TIME__, which embed the time of compilation, and -march=native makes the build depend on whatever CPU it executes on. Archive tools like zip record the modification time of files.
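To illustrate the archive problem, here is a minimal sketch of packing files so that neither modification times nor insertion order leak into the result. It uses Python's standard zipfile module; the names deterministic_zip and FIXED_TIME are made up for this example, and the fixed permission bits are an assumption about what a build would want.

```python
import io
import zipfile

# Assumption: every entry gets the same timestamp. 1980-01-01 is the
# earliest date the zip format can represent.
FIXED_TIME = (1980, 1, 1, 0, 0, 0)


def deterministic_zip(files: dict[str, bytes]) -> bytes:
    """Pack `files` (archive name -> content) into a zip with stable bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Sort the names so the caller's insertion order does not matter.
        for name in sorted(files):
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_TIME)
            # Fixed permissions instead of whatever the file system reports.
            info.external_attr = 0o644 << 16
            archive.writestr(info, files[name])
    return buffer.getvalue()
```

Two calls with the same contents then yield identical bytes on the same platform, regardless of when the files were last touched or in which order they were passed in.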

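Process: A script will generally behave the same, even if the interpreter comes from the host system. However, scripts can be nondeterministic themselves, for example when iterating over a map or set whose order is not defined. Parallelism in builds can easily introduce nondeterminism as well, for example when results are collected in the order in which jobs happen to finish.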
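A small sketch of the set iteration problem and the usual fix, assuming a build script that writes a list of source files. The function name render_manifest is hypothetical. In Python, the iteration order of a set of strings can differ between interpreter runs because string hashing is randomized by default, so the script sorts before it emits anything.

```python
def render_manifest(sources: set[str]) -> str:
    """Render one source path per line in a stable order."""
    # Iterating over `sources` directly could produce a different line order
    # on every run; sorting makes the output depend on the contents only.
    return "".join(f"{path}\n" for path in sorted(sources))
```

The same applies to results gathered from parallel jobs: sort them by a stable key before writing them out.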

It is clear that the build system alone can never guarantee reproducible builds. Every tool in the chain must be deterministic. There is no way around testing in varied environments to find reproducibility problems. Still, the build system has a central role because it can detect or restrict environmental influences, and this is what hermeticity is about.
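Testing with varied environments can be as simple as building twice and comparing hashes. The following is only a sketch: build_digest and is_reproducible are hypothetical helpers, and the only things varied here are the working directory and the TZ variable. A real test would vary much more (user, locale, host name, clock).

```python
import hashlib
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def build_digest(command: list[str], output: str, env_overrides: dict[str, str]) -> str:
    """Run `command` in a fresh directory and return the SHA-256 of `output`."""
    with tempfile.TemporaryDirectory() as workdir:
        env = {**os.environ, **env_overrides}
        subprocess.run(command, cwd=workdir, env=env, check=True)
        return hashlib.sha256((Path(workdir) / output).read_bytes()).hexdigest()


def is_reproducible(command: list[str], output: str) -> bool:
    """Build twice under different conditions and compare the results."""
    # Each run gets its own temporary directory, so the build path varies too.
    first = build_digest(command, output, {"TZ": "UTC"})
    second = build_digest(command, output, {"TZ": "Asia/Tokyo"})
    return first == second


if __name__ == "__main__":
    # Stand-in for a real build step: it embeds its working directory.
    leaky = [sys.executable, "-c",
             "import os; open('out.txt', 'w').write(os.getcwd())"]
    print(is_reproducible(leaky, "out.txt"))
```

A matching digest proves nothing about influences that were not varied, which is why such a check complements hermeticity rather than replacing it.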

Every build system has a way to specify file dependencies. A build system may also track file accesses during the build and either analyse whether the specification is consistent with the actual build behavior or even enforce that consistency. Since build speed matters and tracing is not free, a special analysis mode makes sense to me. On the other hand, enforcement means a problem is detected immediately.
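The difference between analysing and enforcing comes down to what happens with the mismatch between declared and observed accesses. A sketch, assuming the observed reads and writes of a step are already available as sets of paths (how they are collected is a separate question). StepSpec, Violations, check_step and enforce are hypothetical names, not part of any real build system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepSpec:
    name: str
    inputs: frozenset[str]
    outputs: frozenset[str]


@dataclass(frozen=True)
class Violations:
    undeclared_reads: frozenset[str]
    undeclared_writes: frozenset[str]

    def __bool__(self) -> bool:
        return bool(self.undeclared_reads or self.undeclared_writes)


def check_step(spec: StepSpec, reads: set[str], writes: set[str]) -> Violations:
    """Analysis mode: report accesses that the specification does not cover."""
    # A step may read back its own outputs, so those count as allowed reads.
    return Violations(
        undeclared_reads=frozenset(reads) - spec.inputs - spec.outputs,
        undeclared_writes=frozenset(writes) - spec.outputs,
    )


def enforce(spec: StepSpec, reads: set[str], writes: set[str]) -> None:
    """Enforcement mode: fail the build as soon as a step misbehaves."""
    violations = check_step(spec, reads, writes)
    if violations:
        raise RuntimeError(f"step {spec.name!r} is not hermetic: {violations}")
```

For a compile step that declares main.c as input and main.o as output, an observed read of /usr/include/stdio.h would show up in undeclared_reads, which is exactly the kind of environmental influence the specification is supposed to make visible.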

For example, tup hooks into the file system to track file accesses. It creates a virtual file system with FUSE for the build instead of using whatever file system it is executed on. The fuse_get_context API tells tup which process is performing an access, so it can trace file operations back to build commands. If the build specification does not match the actual behavior, the build fails, so consistency is enforced.

Another option would be to trace build steps with strace. Build systems like fabricate and stroll use this approach. However, they are out of scope here since dependencies are not specified but only discovered. Without a specification, there is nothing to enforce or analyse.
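To give an idea of what such discovery looks like, here is a rough sketch that extracts file accesses from strace-style output. It is not how fabricate or stroll are implemented. The regular expression only covers open and openat calls relative to the current directory, the function name accessed_files is made up, and real traces contain many more syscalls and line formats than this handles.

```python
import re

# Matches lines such as:
#   openat(AT_FDCWD, "main.c", O_RDONLY) = 3
#   openat(AT_FDCWD, "main.o", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 5
OPEN_CALL = re.compile(
    r'open(?:at)?\((?:AT_FDCWD, )?"(?P<path>[^"]+)", '
    r'(?P<flags>[A-Z_|]+)[^)]*\)\s*= (?P<result>-?\d+)'
)


def accessed_files(trace: str) -> tuple[set[str], set[str]]:
    """Return (reads, writes) for all successful open calls in `trace`."""
    reads: set[str] = set()
    writes: set[str] = set()
    for match in OPEN_CALL.finditer(trace):
        if int(match["result"]) < 0:
            continue  # the call failed, e.g. with ENOENT
        flags = match["flags"].split("|")
        if "O_WRONLY" in flags or "O_RDWR" in flags:
            writes.add(match["path"])
        else:
            reads.add(match["path"])
    return reads, writes
```

The resulting sets are discovered dependencies. They only become a hermeticity check once there is a declared specification to compare them against.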

When I asked what Bazel actually does, the answers surprised me by how little it is. Using Linux user namespaces, a sandbox program makes everything but the working directory read-only. It seems to be concerned with preventing writes of unspecified data, but reading is not restricted at all. However, for reproducibility, limiting the inputs is necessary. This makes me wonder why people claim that Bazel is better at reproducibility at all, since issues can creep in as easily as with other build systems.
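Limiting inputs does not necessarily need kernel features. A weak but portable approximation is to run each step in a scratch directory that contains only the declared inputs. This sketch is not what Bazel's sandbox does, and run_isolated is a hypothetical helper. It catches undeclared reads through relative paths only, because absolute paths such as /usr/include remain readable.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_isolated(command: list[str], source_root: str,
                 declared_inputs: list[str], output: str) -> bytes:
    """Run `command` where only `declared_inputs` exist; return the output file."""
    with tempfile.TemporaryDirectory() as scratch:
        for relative in declared_inputs:
            target = Path(scratch) / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy(Path(source_root) / relative, target)
        # Undeclared files are simply absent, so a step that needs them fails.
        subprocess.run(command, cwd=scratch, check=True)
        return (Path(scratch) / output).read_bytes()
```

A step that tries to open a file it did not declare then fails with a non-zero exit status, which run_isolated surfaces as subprocess.CalledProcessError.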

Conclusions: FUSE, strace, and namespacing were all the mechanisms I found. Bazel uses a separate wrapper program, which you could reuse in other build systems, so there is no fundamental problem with adding a "hermetic builds" feature to other build systems like Meson or CMake.