Pondering Amazon's Manyrepo Build System

A while ago, I pondered monorepo version control systems. This article looks at the opposite end of the spectrum: manyrepos.

Why Manyrepo?

Monorepos are alluring because Google, Facebook, and Microsoft use that approach. Is it part of their secret sauce, or accidental? There is another big tech company that does the opposite. At Amazon, teams work more independently. Some even use version control systems other than git, such as Subversion and Perforce. I guess that most companies do not have a monorepo because, from an organizational point of view, it is really easy to split off a separate project but hard to merge projects back together. So maybe we can learn more from Amazon than from Google?

A common pattern is that Amazon, like the others, built its own infrastructure, and its engineers love it and miss it after they leave.

I've heard descriptions and seen blog entries about many other large companies build systems, but to be honest, nothing even comes close to the amazing technology Amazon has produced. I would probably argue that what Google, Facebook, and most other companies of comparable size and larger do is at best objectively less good and at worst wasting millions of dollars of lost productivity. –terabyte

Once you understand the build and deployment tools you first wonder how you ever did anything before and then start to fear how you'll do anything once you leave. –hohle

Just as ex-Googlers reinvented their build system on the outside with Pants and Buck, QBT reinvents the Amazon build system. Unfortunately, QBT is less well known and less mature.

Amazon's Build System

Unfortunately, there is less public information about Amazon. Mine comes from this gist and from discussions on HN and lobste.rs. If I got something wrong, please tell me. The short version is that Amazon's "Brazil" tool is more of a package system than a build system. It is closer to Nix than to Bazel.

Brazil delegates the actual build process to language-specific tools. Tools such as tmux are also packaged with Brazil. The interesting part is how packages are managed, and the core concept to understand is the version set.

If you create a package at Amazon, you specify an interface version like "1.1". As long as changes are backwards-compatible, the interface version is not changed. When Brazil builds a package, it appends an additional number to turn it into a build version like "1.1.3847523". You can only specify dependencies on interface versions.
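To make the two kinds of version concrete, here is a minimal sketch. It is not Brazil's actual data model; the class and field names are my own assumptions, chosen only to mirror the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildVersion:
    """A concrete build of a package (hypothetical model, not Brazil's API)."""
    package: str
    interface: str  # chosen by the author, e.g. "1.1"
    build_id: int   # appended by the build tool, e.g. 3847523

    def __str__(self) -> str:
        return f"{self.package}-{self.interface}.{self.build_id}"


@dataclass(frozen=True)
class Dependency:
    """A dependency names an interface version only, never a build."""
    package: str
    interface: str

    def accepts(self, candidate: BuildVersion) -> bool:
        # Any build of the same interface version satisfies the dependency.
        return (candidate.package, candidate.interface) == (self.package, self.interface)


dep = Dependency("Foo", "1.1")
old = BuildVersion("Foo", "1.1", 3847523)
new = BuildVersion("Foo", "1.1", 3847524)    # backwards-compatible rebuild
other = BuildVersion("Foo", "2.0", 3847525)  # new interface version
```

In this sketch, `dep.accepts(old)` and `dep.accepts(new)` are both true, while `dep.accepts(other)` is false: consumers float across builds but never across interface versions.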

Another thing Brazil does when building a package is to record the transitive closure of its dependencies with their build versions. Modern packaging tools distinguish between "dependencies you want" and "dependencies you actually used", for example Cargo.toml and Cargo.lock in Rust.
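The "wanted" versus "used" distinction can be sketched as a small resolver. This is an illustration under my own assumptions: the function name `lock`, the dictionary layout, and the version strings are hypothetical and not taken from Brazil.

```python
def lock(root, wanted, version_set):
    """Return {package: build version} for every transitive dependency of root.

    wanted:      package -> {dependency: interface version}  ("what you want")
    version_set: package -> build version string             (source of "what you used")
    """
    used = {}
    stack = [root]
    while stack:
        pkg = stack.pop()
        for dep, interface in wanted.get(pkg, {}).items():
            if dep in used:
                continue  # already pinned via another path
            build = version_set[dep]
            # The pinned build must belong to the requested interface version.
            if not build.startswith(interface + "."):
                raise ValueError(f"{pkg} wants {dep}-{interface}, set has {build}")
            used[dep] = build
            stack.append(dep)
    return used


wanted = {"Service": {"Foo": "1.1"}, "Foo": {"Guava": "25"}}
version_set = {"Foo": "1.1.3847523", "Guava": "25.1200"}
locked = lock("Service", wanted, version_set)
```

Here `wanted` plays the role of Cargo.toml and `locked` the role of Cargo.lock: the first says which interfaces are acceptable, the second records exactly which builds went in.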

A version set is a collection of packages. The "live" version set is the special global one, which corresponds to a trunk branch in version control. When you build a package "against" a version set, the tests of all packages in the version set are executed, and if they pass, the version set is incremented. Thus the individual package is updated (or published, if it was not part of the version set before).
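As I understand the description, building against a version set behaves like an atomic, test-gated update. The following is a sketch of that idea only; `VersionSet`, `build_against`, and the `run_tests` callback are hypothetical names, and the real system surely does much more.

```python
from dataclasses import dataclass, field


@dataclass
class VersionSet:
    name: str
    revision: int = 0
    packages: dict = field(default_factory=dict)  # package -> build version


def build_against(vs, package, new_build, run_tests):
    """Try to put new_build into vs; return the list of packages whose tests failed.

    run_tests(pkg, candidate) is an assumed callback that runs pkg's tests
    against the candidate package set and returns True on success.
    """
    candidate = {**vs.packages, package: new_build}
    failed = [p for p in sorted(candidate) if not run_tests(p, candidate)]
    if failed:
        return failed            # version set stays untouched
    vs.packages = candidate      # update or publish the package
    vs.revision += 1             # the version set is incremented
    return []


vs = VersionSet("live", packages={"Foo": "1.1.1"})
first = build_against(vs, "Bar", "1.0.2", lambda pkg, cand: True)
second = build_against(vs, "Foo", "1.1.3", lambda pkg, cand: pkg != "Bar")
```

The first call publishes Bar and bumps the revision; the second is rejected because Bar's tests fail against the new Foo, so the version set keeps the old Foo build.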

Brazil dependencies are classified as runtime, build, and test dependencies. So for deployment, it can strip everything but the runtime dependencies from a version set.
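Stripping a version set for deployment amounts to walking only the runtime edges of the dependency graph. A minimal sketch, assuming a hypothetical `deps` mapping and package names picked purely for illustration:

```python
def runtime_closure(roots, deps):
    """Return the packages needed at runtime for the given roots.

    deps: package -> list of (dependency, kind),
          where kind is "runtime", "build", or "test".
    """
    keep = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in keep:
            continue
        keep.add(pkg)
        # Follow runtime edges only; build and test edges are ignored.
        stack.extend(d for d, kind in deps.get(pkg, []) if kind == "runtime")
    return keep


deps = {
    "Service": [("Foo", "runtime"), ("JUnit", "test"), ("Compiler", "build")],
    "Foo": [("Guava", "runtime"), ("Mockito", "test")],
}
deployed = runtime_closure(["Service"], deps)
```

Only Service, Foo, and Guava end up in `deployed`; the test and build tooling never reaches the host.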

Dependencies must be carefully managed in this environment as a "dependency hell" scenario is possible.

One of the biggest ways that brazil was misused was around handling of major versions [aka interface version]. For context, only a single major version of a package is allowed to exist in a versionset at a time. If you tried to merge in a different major version of a package into your versionset, your pipeline would fail to build due to "Major version conflicts". One of the biggest sins was around bumping the major versions of the dependencies in a library without bumping the major version of that library at the same time. This would lead to many broken pipelines. Let's say you have a library Foo-1.0 with a bunch of users on other teams. You decide to bump up the Guava version from 25 to 29 and publish the new version of Foo-1.0. Anyone consuming Foo-1.0 would automatically pick up the new version of that lib because it's just a minor version change, however the merge would fail with a "major version conflict" because the major version of Guava they're using in their versionset is still 25. This means you would either have to pin that library back at a previous version, or bump your dependency on Guava in all of you packages to 29. –pentlander

This is an insight that generalizes: Updating a dependency major version is a breaking change even if your API is stable.
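The Guava scenario from the quote can be replayed in a few lines. This is only a sketch of the rule "one major version per package per version set"; the function name and data layout are my assumptions, not Brazil's implementation.

```python
def major_version_conflicts(version_set, incoming):
    """Compare the majors a new build requires with those already in the set.

    version_set: package -> major version currently in the version set
    incoming:    package -> major version required by the build being merged
    Returns {package: (current, required)} for every mismatch.
    """
    return {
        pkg: (version_set[pkg], major)
        for pkg, major in incoming.items()
        if pkg in version_set and version_set[pkg] != major
    }


# A consumer's version set still uses Guava 25.
consumer_set = {"Foo": "1.0", "Guava": "25"}
# The new build of Foo-1.0 was made against Guava 29.
new_foo_requires = {"Guava": "29"}

conflicts = major_version_conflicts(consumer_set, new_foo_requires)
```

Foo's own interface version did not change, yet `conflicts` reports Guava 25 versus 29, which is exactly why the dependency bump should have been treated as a breaking change.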

The Point?

Overall, it sounds a lot like a distribution package manager such as apt or Nix. What distinguishes version sets is that they provide a branching mechanism, which is how teams can work independently. How is that unique though? You can fork with apt and Nix as well. In a monorepo, it would be a branch. It must be about something different.

One advantage of monorepos is that one can track all users of a piece of code. Version sets in Brazil provide a similar mechanism since they live in a central database. This is important in case of security updates, for example. Unfortunately, in manyrepo environments this information is usually not available, and when an issue arises it must be arduously researched. So maybe companies should build such infrastructure instead of dreaming about monorepos?
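With a central record of version sets, the "who uses this?" question becomes a simple lookup. The sketch below assumes a hypothetical in-memory mapping; a real system would query a database, and all names here are illustrative.

```python
def affected_version_sets(version_sets, package, bad_builds):
    """Return the names of version sets that pin a vulnerable build.

    version_sets: version set name -> {package: build version}
    bad_builds:   set of build versions of `package` known to be vulnerable
    """
    return sorted(
        name
        for name, packages in version_sets.items()
        if packages.get(package) in bad_builds
    )


version_sets = {
    "live": {"Foo": "1.1.3847523"},
    "TeamA": {"Foo": "1.1.3847000", "Guava": "25.1200"},
    "TeamB": {"Guava": "25.1200"},
}
to_notify = affected_version_sets(version_sets, "Foo", {"1.1.3847000"})
```

Only TeamA still pins the vulnerable build, so only TeamA needs to be notified.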

Coming back to the advantage of manyrepos: referring to Amazon, we can describe it concretely. Version sets allow you to use multiple interface versions of the same package at once (not multiple build versions though). This mixture is not technically possible with a git monorepo (though it is with Subversion or Perforce). This is at least one example of the general tradeoff.

I don’t really think there’s a better or worse between the Amazon and Google/FB style build/deploy/source control systems, it’s primarily a reflection of the org/team structure and what they prioritize - there’s tension between team independence/velocity and crosscutting changes that optimize the whole codebase. –revert

I would like to see more discussion online about this. Many companies should value team independence over crosscutting changes. So the question is: how can we get the advantages we currently attribute uniquely to monorepos in a manyrepo setting? Amazon's Brazil has valuable ideas to contribute and should be more widely known.

Amazon's build system provides valuable insights for manyrepo environments.