Distributed Caching - docker/buildx

Distributed caching is a technique used to improve build performance in containerized applications. It allows multiple build agents to share a common cache, reducing the time and resources needed to build and test applications. The docker buildx build command, part of the Docker CLI, includes built-in support for distributed caching.

There are several options for implementing distributed caching with docker buildx build. These include:

Inline caching: This option embeds the build cache into the image, and is only supported when using the image exporter.
Registry caching: This option embeds the build cache into a separate image, and pushes it to a dedicated location in the registry.
Local caching: This option writes the build cache to a local directory on the filesystem.
GitHub Actions caching (beta): This option uploads the build cache to the GitHub Actions cache.
AWS S3 caching (unreleased): This option uploads the build cache to an AWS S3 bucket.
Azure Blob Storage caching (unreleased): This option uploads the build cache to Azure Blob Storage.

To use any of these backends, export the cache during a build with the --cache-to option, selecting the backend with its type= parameter. On later builds, import the cache from wherever it was stored with the --cache-from option.
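The two options do not have to name the same backend type. As a minimal sketch, the inline backend exports with type=inline but imports with type=registry, because the cache travels inside the pushed image itself. The image name below is a hypothetical placeholder, and the helper only prints the command (a dry run) rather than invoking Docker.

```shell
#!/bin/sh
# Hypothetical image reference, used for illustration only.
IMAGE="registry.example.com/team/app:latest"

# Print a buildx command that embeds the cache in the image (type=inline)
# and reads it back from that same image on later builds (type=registry).
inline_cache_cmd() {
  printf 'docker buildx build --push -t %s --cache-to type=inline --cache-from type=registry,ref=%s .\n' "$1" "$1"
}

# Dry run: show the command instead of executing it.
inline_cache_cmd "$IMAGE"
```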

Here is an example of using local caching with docker buildx build:

$ docker buildx build --push -t <registry>/<image> \
    --cache-to type=local,dest=path/to/local/dir \
    --cache-from type=local,src=path/to/local/dir \
    .

In this example, the build cache is written to path/to/local/dir on the local filesystem and imported from the same directory in subsequent builds. The local backend takes dest on export and src on import, and the final argument (.) is the build context.
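For build agents that do not share a filesystem, the registry backend follows the same export and import pattern. The sketch below assumes a hypothetical image name and a separate cache reference (both placeholders, not taken from the text above), and uses mode=max to export layers from all build stages rather than only the final one. It prints the command as a dry run instead of invoking Docker.

```shell
#!/bin/sh
# Hypothetical references, used for illustration only.
IMAGE="registry.example.com/team/app:latest"
CACHE_REF="registry.example.com/team/app:buildcache"

# Print a buildx command that pushes the cache to a dedicated registry
# reference and imports it from that same reference.
#   $1 = image to build and push, $2 = cache reference
registry_cache_cmd() {
  printf 'docker buildx build --push -t %s --cache-to type=registry,ref=%s,mode=max --cache-from type=registry,ref=%s .\n' "$1" "$2" "$2"
}

# Dry run: show the command instead of executing it.
registry_cache_cmd "$IMAGE" "$CACHE_REF"
```

Every agent that runs the printed command against the same cache reference reads and updates one shared cache, which is the distributed setup described at the start of this section.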

For more information on caching with docker buildx build, see the Docker documentation on caching.

Separately from these cache backends, a Dockerfile can declare a cache mount on a RUN instruction with the --mount option and the type=cache mount type. A cache mount keeps a directory between builds on the same builder. Its sharing option accepts shared (the default, which allows concurrent writers), private (which creates a new mount when there are multiple writers), or locked (which makes a second writer wait until the first releases the mount). The locked mode is useful when parallel builds in a CI/CD pipeline use a cache that is not safe to access concurrently. For example:

FROM some-base-image

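# Use a cache mount that parallel builds lock in turn.
# A cache mount needs a target; /path/to/cache is a placeholder.
RUN --mount=type=cache,id=mycache,target=/path/to/cache,sharing=locked \
    my-build-command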

In this example, the cache mount is identified by id=mycache, and sharing=locked makes parallel builds that use it run this step one at a time instead of accessing the same cache files concurrently.

For more information on the --mount option, see the Dockerfile reference.

In addition to the caching options provided by docker buildx build, it is also possible to use a distributed, immutable cache with the werf build tool. This cache is based on the ideas of Multi-Version Concurrency Control (MVCC) and optimistic locking, and allows an arbitrary number of builders to use the shared cache in the container registry without breaking the reproducibility of previously assembled layers and images.

For more information on using the distributed, immutable cache with werf, see the werf documentation.

Finally, outside the scope of build caching, CacheLib is an open source caching engine developed by Meta for building application caches. It comes with CacheBench, a benchmarking tool for evaluating cache performance under different workloads. Both are available as open source projects. For more information on CacheLib, see the CacheLib website.