Caching Golang tests in CI

May 30, 2023
8 min read

We use Golang (go) at Airplane and run all of our unit tests in CI on each change in our monorepo. These tests were very slow, but we were able to use go test caching to speed them up by an order of magnitude (from minutes to seconds in many cases!).

In the remainder of this post, we provide a high-level overview of how Golang test caching works and then describe how we adapted our CI workflows to fully take advantage of it.

Background

At Airplane, all of our backend systems are written in Golang, and we maintain everything in a single monorepo. Every time a change is made in this repo, our CI system (currently GitHub Actions) runs all of the tests in the repo to ensure that developers aren’t introducing regressions.

Initially, these tests were relatively fast to run end-to-end, but as we added more of them, our test-related CI workflows got slower and slower. This was costing us extra money (GitHub charges us by the minute) and also increasing the time required for developers to get their changes merged and deployed to production.

The fundamental problem here, and one that affects other languages and build systems as well, is that we were running all tests on all changes, even if the change had no possible impact on a particular test.

There are a couple of possible solutions for this, but the easiest one to use is the test caching feature built into the go test tooling. Although this is intuitive for local development, we ran into a number of quirks and challenges (described below) when trying to use it in a cloud-based CI environment.

Introduction to go test caching

Test caching was added in Golang 1.10. The idea is simple: if we run the tests for a package and they pass, then subsequent executions of the same tests in the "same environment" (defined later) will use the cached results instead of re-running the tests.

Let’s suppose, for example, that we have a hello package with a function, HelloStr, that generates a hello message given a person’s name.

We then write a simple unit test for it.

The first time we run the test with go test ., go actually executes it, as indicated by the time duration shown next to the package name in the output.

If we run it again without making any changes, however, the cached result is returned without actually executing anything, and (cached) is shown in place of the duration.

When the tests are as small as they are here, it doesn’t really matter whether the results are cached or not. But, in a large monorepo like ours, which has thousands of tests across dozens of packages, cache hits versus misses can have a significant impact on performance.

Factors that affect caching

It was mentioned above that the cached results will be used if the same tests are run again in the "same environment". This leads to the question of what go considers when computing the "environment" for a test.

The first, and most obvious, factor is the code itself. Let’s suppose, for instance, that we modify the body of our HelloStr function a bit.

Now, the cache from before isn’t used, and the test is executed again.

This makes intuitive sense: our code change might have changed the behavior we were trying to test, so we need to re-run everything to make sure the tests still pass. The same applies if we’re making changes to anything that our code depends on, such as an upstream package that we’re importing.

Beyond code updates, the go test caching logic considers other types of changes that have the potential for altering the test environment:

  1. Environment variables that are read during tests
  2. Files in the GOPATH that are accessed during tests
  3. Certain flags that are passed to the go test binary, e.g., -timeout or -run; see the documentation for a full list

If any of these things changes from one test run to the next, then go will, in effect, use a new cache key for looking up the results. For example, let’s suppose that our code is modified to read a DEBUG_HELLO environment variable.

Now, changes to the value of that environment variable cause the tests to be re-executed rather than served from the cache.

The same idea applies if our code is reading files, e.g. test fixtures checked in to the repo: any changes to those files will "bust" the existing cache state.

How go test caching works

Before going into the CI use case, it’s helpful to dig into some of the details of how go’s caching actually works behind the scenes. The full code is in src/cmd/go/internal/test/test.go in the golang/go GitHub repo, but below is a high-level summary.

Note that we use the term "test" in the discussion here to refer to the set of tests for an entire package. As described in the documentation, test caching only applies if the go test tool is run in "package list mode". This is also why, in the examples above, we have a . after go test: it tells go to run all tests in the hello package.

The first time the test is run (assuming an empty cache):

  1. Go computes a "test ID" from a hash of the test binary and its arguments
  2. As the test is executed, go keeps a log of the external operations conducted by the test, including files accessed and environment variables read
  3. When the test finishes, go reads through the log and computes an "inputs ID". This is a hash that includes:
    • The name and value of each environment variable read
    • The path, size, and last modified time of each file in the GOPATH that was accessed (i.e., either opened or stat’ed)
  4. Two entries are written into a file-system-based cache:
    • <test ID> → test log
    • <test ID, inputs ID> → test outputs

Then, the next time the test is run:

  1. Go computes a new "test ID", and looks up the log for that in the cache
  2. If the log is found, go re-computes an "inputs ID" using the procedure from above with the log from (1)
  3. Go looks up the <test ID, inputs ID> key in the cache to try to find the outputs
  4. If the outputs are found and they indicate that the test was successful, the tests are skipped and (cached) is printed out instead
  5. Otherwise, if the log isn’t found, the outputs aren’t found, or the outputs indicate a test failure, the test is re-executed starting from step (2) in the previous list
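The two procedures above can be modeled in a few dozen lines. This is a simplified, in-memory sketch and not the code in the go tool: the type and function names are hypothetical, the environment and file system are replaced by maps, and the hash layout is an assumption.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Op is one logged external operation: an environment lookup or a file access.
type Op struct {
	Kind string // "getenv" or "stat"
	Name string // variable name or file path
}

// FileMeta holds the only file attributes this model hashes.
type FileMeta struct {
	Size    int64
	ModTime int64 // Unix nanoseconds
}

// World is a stand-in for the real environment and file system.
type World struct {
	Env   map[string]string
	Files map[string]FileMeta
}

// inputsID replays the log against the current world and hashes what it finds.
func inputsID(log []Op, w World) string {
	h := sha256.New()
	for _, op := range log {
		switch op.Kind {
		case "getenv":
			fmt.Fprintf(h, "env %s %q\n", op.Name, w.Env[op.Name])
		case "stat":
			m := w.Files[op.Name]
			fmt.Fprintf(h, "stat %s %d %d\n", op.Name, m.Size, m.ModTime)
		}
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

// Cache holds the two kinds of entries described above.
type Cache struct {
	Logs    map[string][]Op   // test ID -> log
	Outputs map[string]string // test ID + inputs ID -> output of a passing run
}

// Store records the log and output of a passing run.
func (c *Cache) Store(testID string, log []Op, w World, out string) {
	c.Logs[testID] = log
	c.Outputs[testID+"/"+inputsID(log, w)] = out
}

// Lookup reports whether a passing result is cached for testID in world w.
func (c *Cache) Lookup(testID string, w World) (string, bool) {
	log, ok := c.Logs[testID]
	if !ok {
		return "", false
	}
	out, ok := c.Outputs[testID+"/"+inputsID(log, w)]
	return out, ok
}

func main() {
	w := World{
		Env:   map[string]string{"DEBUG_HELLO": ""},
		Files: map[string]FileMeta{"testdata/names.txt": {Size: 12, ModTime: 1}},
	}
	log := []Op{{"getenv", "DEBUG_HELLO"}, {"stat", "testdata/names.txt"}}
	c := &Cache{Logs: map[string][]Op{}, Outputs: map[string]string{}}
	c.Store("test-1", log, w, "ok")

	_, hit := c.Lookup("test-1", w)
	fmt.Println("unchanged:", hit) // prints "unchanged: true"

	// Same size, new modified time: the inputs ID changes and the lookup misses.
	w.Files["testdata/names.txt"] = FileMeta{Size: 12, ModTime: 2}
	_, hit = c.Lookup("test-1", w)
	fmt.Println("after mtime change:", hit) // prints "after mtime change: false"
}
```

Only the variables and files named in the log feed the hash, so changes to anything the test never touched leave the inputs ID, and therefore the cache hit, intact.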

The cache itself is implemented using the file system, with the keys mapping to file names and the values stored as file contents. The default location of this cache varies by platform; on Linux, it’s in $HOME/.cache/go-build. Running go env GOCACHE prints the location in use.

The hashing of the test binary and environment variables is fairly straightforward, but there are a few nuances around file access that are important to note. First, for performance reasons, go doesn’t hash the file contents, only the size and last modified time; this has implications for caching in a CI environment, as described later.

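Second, only files under the package’s source root are considered in the hash calculation. This is the GOPATH for legacy GOPATH-mode builds and the module root when using Go modules. File operations and associated file changes outside of that root, e.g. in temporary directories, are ignored.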

Test caching in CI

Given the details above, it follows that we can take advantage of go test caching in a CI environment by preserving the contents of the cache directory (i.e., $HOME/.cache/go-build assuming Linux) from run to run.

The exact mechanics of this vary based on the CI system used, but in GitHub Actions, this is as simple as including an actions/cache step in the test workflow.
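A sketch of such a step is shown below. The two paths are the ones discussed in this post; the step name, action version, and cache key scheme are assumptions, and the original workflow may have used different ones.

```yaml
- name: Restore and save Go caches
  uses: actions/cache@v3
  with:
    path: |
      ~/.cache/go-build
      ~/go/pkg/mod
    # A unique key per commit means there is never an exact hit,
    # so an updated cache is saved at the end of every run.
    key: ${{ runner.os }}-go-test-${{ github.sha }}
    # Fall back to the most recently saved cache with this prefix.
    restore-keys: |
      ${{ runner.os }}-go-test-
```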

The full details here are described in the GitHub documentation, but the idea is that this will restore the contents of $HOME/.cache/go-build (the cache we care about for testing purposes) and $HOME/go/pkg/mod (the go module cache, which is also helpful to preserve) from the most recent test run, then save an updated version at the end.

Assuming that the "test ID" and "inputs ID" for each package don’t change, the results should be cached from run to run.

CI test cache gotchas

We added a step like the one above to our CI, and it helped a lot, but we noticed that there were some packages for which the tests were always re-run. These tests had a few common traits, described along with the associated fixes in the sections below.

Writing files

First, a few of our tests were writing new, git-ignored files back into the repo during execution. This is a no-no for caching because these files will never be present after a fresh git clone and thus won’t be included in the "inputs ID" calculated at the beginning of a test run.

The fix here was to switch these tests to making their writes in a system temporary directory. As described previously, go ignores file system operations outside of the GOPATH, so operations in /tmp or a similar location won’t break caching. We also updated the tests to delete these directories after use, which is nice from a cleanliness perspective.

Altering environment variables

Some other tests were setting environment variables without cleaning them up, which, like writing files into the repo, breaks the consistency of the inputs hash.

The testing.T type has a method, Setenv, that sets an environment variable for the duration of a particular test and then automatically restores it to its previous state, so using this fixed that problem.

Reading fixtures

A third issue, and the most perplexing, was that caching never worked for test packages that read checked-in fixtures.

After some investigation, we realized that this was because the last modified timestamps on these fixture files were changing from run to run. This happens because these timestamps are set based on when git clone runs in the CI, not when the files were actually last modified in git. Thus, any test that read fixtures like this couldn’t take advantage of the cache.

To address this, we created a Python script that sets the modified time of each file in the repo based on a hash of the contents. This script is run before go test so the go test caching logic always sees consistent timestamps and can compute a consistent "inputs ID" for tests that read fixtures. If the file contents change, even without the length changing, the modified at timestamp also changes, which busts the cache.

The script is too big to include here, but is available in the airplanedev/blog-examples repo.

Results

After addressing the issues above and implementing some other tweaks like making our packages more granular, we reduced many of the CI test runs in our Golang monorepo from minutes to seconds!

[Image: test cache screenshot]

Go’s test caching is a little bit quirky, particularly when used in combination with GitHub Actions, but it’s very effective at improving CI run times if set up correctly.

Benjamin Yolken
Benjamin Yolken is a software engineer at Airplane, where he focuses on backend infrastructure. Prior to Airplane, he worked at Twilio, Segment, Stripe, Airbnb, and Google.
