One Simple Trick for Building Images Faster

October 17, 2020
ops docker circleci

First off, I apologize for the clickbait title. It hurt me just writing it.

A common step in a Continuous Integration/Continuous Delivery (CI/CD) pipeline is building container images. Fast image builds depend heavily on the layer cache, which in many cases is used by default. The layer cache is what allows a docker build to skip a complex or long-running build step by reusing the already-built layer instead.

This obviously requires the already-built layers to be present in the context in which the build is executing, and many common hosted solutions (such as CircleCI, TravisCI, and Drone) do not always have the cache available. Your build jobs are scheduled onto whichever machine is available at the time, and that machine usually does not have your previous builds.

To address this issue, hosted providers offer different solutions. For example, CircleCI can provide a Docker Layer Cache; however, it is not cheap, and it is not available in their free offering at all.

When we pull a docker image, we are actually pulling down each layer that was pushed to the remote registry. So what if we could pull those layers down into our new context, and force docker to use them as a layer cache? Luckily, we can.

`--cache-from`

We can use the --cache-from option of docker build to specify an image to use as a cache, but before we do that, we need the image on the local machine. We already know how to do this: we just docker pull the image that will have the most overlap with our current build. You will have to determine which image(s) work best for you; in my experience this is typically either the latest master/main branch build, or a previous build of the current branch.
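Choosing between a branch build and the latest main build can be scripted. The following is a minimal sketch, not part of docker itself: pick_cache_image is a hypothetical helper that tries each candidate in order and prints the first one that can be pulled. The PULL_CMD variable is an assumption added only so the logic can be exercised without a Docker daemon, and the branch tag assumes your pipeline pushes a tag-safe image per branch.

```shell
#!/usr/bin/env bash
# pick_cache_image is a hypothetical helper, not a docker command.
# It tries to pull each candidate image in order and prints the first
# one that succeeds. PULL_CMD defaults to `docker pull`; it can be
# overridden to test the selection logic without a Docker daemon.
pick_cache_image() {
  local image
  for image in "$@"; do
    if ${PULL_CMD:-docker pull} "$image" >/dev/null 2>&1; then
      echo "$image"
      return 0
    fi
  done
  # No candidate could be pulled (for example, on the very first build).
  return 1
}

# Example use inside a build step (IMAGE and BRANCH_TAG are placeholders):
#   cache_image=$(pick_cache_image "$IMAGE:$BRANCH_TAG" "$IMAGE:latest") || cache_image=""
#   docker build ${cache_image:+--cache-from "$cache_image"} -t "$IMAGE:$CIRCLE_SHA1" .
```

When nothing can be pulled, the build simply runs without a --cache-from flag rather than failing.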

As an example, I previously had the following build step in my CircleCI config:

  - run:
      name: Build docker image
      command: |

        <authenticate docker>

        docker build -t <registry>/<app>:$CIRCLE_SHA1 .
        docker push <registry>/<app>:$CIRCLE_SHA1

        if [ "$CIRCLE_TAG" ]; then
          docker tag <registry>/<app>:$CIRCLE_SHA1 \
            <registry>/<app>:$CIRCLE_TAG

          docker push <registry>/<app>:$CIRCLE_TAG
        fi

By updating our build step, we can pull in an image to use as a layer cache.

  - run:
      name: Build docker image
      command: |

        <authenticate docker>

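        # Add a `docker pull`; `|| true` keeps the very first build from
        # failing before a `latest` image has been pushed
        docker pull <registry>/<app>:latest || true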

        # Add our `--cache-from` line
        docker build \
          --cache-from <registry>/<app>:latest \
          -t <registry>/<app>:$CIRCLE_SHA1 .
        docker push <registry>/<app>:$CIRCLE_SHA1

        if [ "$CIRCLE_TAG" ]; then
          docker tag \
            <registry>/<app>:$CIRCLE_SHA1 \
            <registry>/<app>:$CIRCLE_TAG

          docker push <registry>/<app>:$CIRCLE_TAG
        fi

        # Add a push to update the `latest` tag
        docker tag \
          <registry>/<app>:$CIRCLE_SHA1 \
          <registry>/<app>:latest

        docker push <registry>/<app>:latest

Because this uses only the final pushed image as a cache, it will not help with images that are squashed before being pushed, or with multi-stage builds, where the layers of intermediate stages are not part of the final image.
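One possible workaround for the multi-stage case, sketched here under assumptions rather than taken from my own config, is to build and push the intermediate stage as its own image and pass it as an additional --cache-from source. The stage name "builder", the ":builder" tag, and the DOCKER override (which lets you print the commands with echo instead of running them) are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: "builder" is a hypothetical stage name (FROM ... AS builder)
# and $1 stands in for <registry>/<app>. Set DOCKER=echo to print the
# commands instead of running them.
build_with_stage_cache() {
  local image="$1" tag="$2" docker="${DOCKER:-docker}"

  # Warm the local cache; a missing image should not fail the job.
  $docker pull "$image:builder" || true
  $docker pull "$image:latest" || true

  # Build and tag the intermediate stage so that it can be pushed too.
  $docker build --target builder \
    --cache-from "$image:builder" \
    -t "$image:builder" .

  # Build the final image, offering both images as cache sources.
  $docker build \
    --cache-from "$image:builder" \
    --cache-from "$image:latest" \
    -t "$image:$tag" .

  $docker push "$image:builder"
  $docker push "$image:$tag"
}
```

The cost is one extra push per build, so whether this pays off depends on how expensive the intermediate stage is to rebuild.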

I recently added this to a project which I am using Circle’s free tier to build. It immediately dropped my build times from approximately 6 minutes to 40 seconds, an improvement of ~89% from this change alone. Not only is the speed much more convenient, but it also lets me get much farther on the free tier. With Circle’s default free tier configuration, you get 250 build-minutes per week, so saving 5 minutes per build has made a non-trivial difference to my workflow.
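To make those numbers concrete, here is the arithmetic as a small script. It assumes, as a simplification, that the whole weekly quota is spent on this one build and that usage is counted by the second.

```shell
#!/usr/bin/env bash
# Figures quoted in the paragraph above.
before_s=360               # roughly 6 minutes
after_s=40                 # 40 seconds
quota_s=$((250 * 60))      # 250 build-minutes per week, in seconds

# Percentage reduction, rounded to the nearest whole percent.
reduction=$(awk -v b="$before_s" -v a="$after_s" \
  'BEGIN { printf "%.0f", (b - a) * 100 / b }')

# Whole builds that fit in the weekly quota (integer division rounds down).
builds_before=$((quota_s / before_s))
builds_after=$((quota_s / after_s))

echo "reduction: ${reduction}%"                                  # reduction: 89%
echo "builds per week: ${builds_before} -> ${builds_after}"      # builds per week: 41 -> 375
```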

BuildKit

This method has been built upon by BuildKit, which allows cache usage information to be built into images for subsequent use as an “external cache”. This removes the need to pull the entire image down first, as docker will pull the metadata first, and only pull the layers that are needed as a cache. BuildKit also includes functionality for utilizing this metadata as part of a multi-stage build. BuildKit has to be enabled for this to work (on Docker versions where it is not the default, by setting DOCKER_BUILDKIT=1), and the image must be built with the BUILDKIT_INLINE_CACHE=1 build-arg. For example:

$ DOCKER_BUILDKIT=1 docker build -t <registry>/<app> --build-arg BUILDKIT_INLINE_CACHE=1 .
$ docker push <registry>/<app>

On another machine, you can then:

$ DOCKER_BUILDKIT=1 docker build --cache-from <registry>/<app> .

You can read more regarding using images as external caches here.

Hopefully this works out for you, or was at least helpful. If you have any questions, don’t hesitate to shoot me an email, or follow me on twitter @nrmitchi.
