Using Kaniko cache

Overview

Kaniko cache is a Cloud Build feature that caches container build artifacts by storing and indexing intermediate layers in a container image registry, such as Artifact Registry, where later builds can reuse them. To learn about additional use cases, see the Kaniko repository on GitHub.

Kaniko cache works as follows:

Cloud Build uploads container image layers directly to the registry as they are built so there is no explicit push step. If all layers are built successfully, an image manifest containing those layers is written to the registry.
Kaniko caches each layer according to the contents of the Dockerfile directive that created it, plus all directives that preceded it, up to the digest of the image in the FROM line.
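
The keying scheme described above can be pictured as a chained hash. The following is a minimal sketch for illustration only, not Kaniko's actual implementation: the function name chained_cache_keys, the choice of SHA-256, and the sample digest are all assumptions.

```python
import hashlib


def chained_cache_keys(base_digest, directives):
    """Return one key per directive (hypothetical helper).

    Each key covers the base image digest, the directive itself,
    and every directive that preceded it.
    """
    running = hashlib.sha256(base_digest.encode("utf-8"))
    keys = []
    for directive in directives:
        # Fold this directive into the running hash, then snapshot it.
        running.update(b"\n" + directive.encode("utf-8"))
        keys.append(running.copy().hexdigest())
    return keys


# Hypothetical digest standing in for the image named in the FROM line.
keys = chained_cache_keys(
    "sha256:0000",
    ["WORKDIR /usr/src/app", "RUN npm install"],
)
```

Because every key depends on everything before it, changing an early directive or the base image changes the keys of all later layers, while changing a late directive leaves the earlier keys intact.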

Enabling Kaniko cache in your builds

You can enable Kaniko cache in a Docker build by replacing the cloud-builders/docker builder with the kaniko-project/executor builder in your cloudbuild.yaml file as follows:

Kaniko Build

steps:
- name: 'gcr.io/kaniko-project/executor:latest'
  args:
  - --destination=${_LOCATION}-docker.pkg.dev/$PROJECT_ID/${_REPOSITORY}/${_IMAGE}
  - --cache=true
  - --cache-ttl=XXh

Docker Build

steps:
- name: gcr.io/cloud-builders/docker
  args: ['build', '-t', '${_LOCATION}-docker.pkg.dev/$PROJECT_ID/${_REPOSITORY}/${_IMAGE}', '.']
images:
- '${_LOCATION}-docker.pkg.dev/$PROJECT_ID/${_REPOSITORY}/${_IMAGE}'

where:

--destination=${_LOCATION}-docker.pkg.dev/$PROJECT_ID/${_REPOSITORY}/${_IMAGE} is the target container image. Cloud Build automatically substitutes the PROJECT_ID from the project containing the Dockerfile. LOCATION, REPOSITORY, and IMAGE are user-defined substitutions, written in the build config as ${_LOCATION}, ${_REPOSITORY}, and ${_IMAGE}.
LOCATION is the regional or multi-regional location of the repository where the image is stored, for example us-east1.
REPOSITORY is the name of the repository where the image is stored.
IMAGE is the image's name.
--cache=true enables Kaniko cache.
--cache-ttl=XXh sets the cache expiration time, where XX is hours until cache expiration. See Configuring the cache expiration time.

If you run builds using the gcloud builds submit --tag [IMAGE] command, you can enable Kaniko cache by setting the property builds/use_kaniko to True as shown below:

gcloud config set builds/use_kaniko True

Example: Using Kaniko cache in a Node.js build

This example shows how to run incremental builds for Node.js apps using general Dockerfile best practices. The practices here apply to builds in all other supported languages.

Here, you move directives that are unlikely to change between builds to the top of your Dockerfile, and move directives that are likely to change to the bottom. This makes the build process more incremental and can increase build speeds.

Consider the following Dockerfile:

FROM node:8
WORKDIR /usr/src/app
COPY package*.json ./
RUN npm install
COPY . .
CMD [ "npm", "start" ]

This Dockerfile does the following:

Installs a Node.js app based on Node.js best practices.
Runs the app when the image is run.

When you run this build without Kaniko cache, Cloud Build must run each step every time the build runs, because Cloud Build has no cross-build image layer cache. However, when you run this build with Kaniko cache enabled, the following happens:

During the first run, Cloud Build runs every step and each directive writes a layer to the container image registry.
Kaniko tags each layer with a cache key that it derives from the contents of the directive that produced that layer plus all preceding directives.
The next time Cloud Build runs the build from the same Dockerfile, it checks whether the Dockerfile and the files it copies have changed. If they have not, Cloud Build uses the cached layers stored in the registry to complete the build, which allows the build to complete faster. See below:

FROM node:8                # no change -> cached!
COPY package*.json ./      # no change -> cached!
RUN npm install            # no change -> cached!
COPY . .                   # no change -> cached!
CMD [ "npm", "start" ]     # metadata, nothing to do

If you modify your package.json file, Cloud Build doesn't need to run the directives before the COPY package*.json ./ step, as their contents haven't changed. However, because modifying the package.json file changes the contents of that COPY step, Cloud Build must re-run it and all steps after it. See below:

FROM node:8                # no change -> cached!
COPY package*.json ./      # changed, must re-run
RUN npm install            # preceding layer changed
COPY . .                   # preceding layer changed
CMD [ "npm", "start" ]     # metadata, nothing to do

If only the app's contents change but its dependencies don't (the most common scenario), the package.json file remains unchanged and Cloud Build only needs to re-run the final COPY . . step. This speeds up the build, because that step simply copies source contents to a layer in the container image registry.

FROM node:8                # no change -> cached!
COPY package*.json ./      # no change -> cached!
RUN npm install            # no change -> cached!
COPY . .                   # changed, must re-run
CMD [ "npm", "start" ]     # metadata, nothing to do
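
The scenarios above can be reproduced with a small simulation. This is a sketch under stated assumptions rather than a description of Kaniko's internals: the helpers layer_keys and plan, the base digest, and the fingerprints such as pkg-v1 and src-v1 (standing in for hashes of the copied files) are all hypothetical. CMD is left out because it only sets metadata.

```python
import hashlib

BASE = "sha256:base"  # hypothetical digest of the node:8 base image


def layer_keys(base_digest, steps):
    """steps is a list of (directive_text, input_fingerprint) pairs."""
    running = hashlib.sha256(base_digest.encode("utf-8"))
    keys = []
    for text, fingerprint in steps:
        # Each key depends on this step and on every step before it.
        running.update(("\n" + text + "|" + fingerprint).encode("utf-8"))
        keys.append(running.copy().hexdigest())
    return keys


def plan(cache, base_digest, steps):
    """Report 'cached' or 're-run' for each step, given the known keys."""
    keys = layer_keys(base_digest, steps)
    return ["cached" if key in cache else "re-run" for key in keys]


def dockerfile_steps(pkg, src):
    """Layer-producing directives of the example Dockerfile."""
    return [
        ("WORKDIR /usr/src/app", ""),
        ("COPY package*.json ./", pkg),
        ("RUN npm install", ""),
        ("COPY . .", src),
    ]


# First build: every layer is written to the cache.
cache = set(layer_keys(BASE, dockerfile_steps("pkg-v1", "src-v1")))

# Only application source changed.
print(plan(cache, BASE, dockerfile_steps("pkg-v1", "src-v2")))
# ['cached', 'cached', 'cached', 're-run']

# package.json changed, so everything from that COPY onward re-runs.
print(plan(cache, BASE, dockerfile_steps("pkg-v2", "src-v2")))
# ['cached', 're-run', 're-run', 're-run']
```

The second result shows why the ordering of directives matters: placing COPY . . above RUN npm install would invalidate the dependency layer on every source change.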

Configuring the cache expiration time

The --cache-ttl flag directs Kaniko to ignore cached layers that were pushed longer ago than the specified expiration time.

The syntax is --cache-ttl=XXh where XX is time in hours. For example, --cache-ttl=6h sets the cache expiration to 6 hours. If you run builds using the gcloud builds submit --tag [IMAGE] command, the default value of the --cache-ttl flag is 6 hours. If you are using the Kaniko executor image directly, the default value is 2 weeks.

A longer expiration time gives faster builds when you don't expect dependencies to change often, while a shorter expiration time ensures that your build picks up updated dependencies (such as Maven packages or Node.js modules) more quickly, at the cost of reusing cached layers less often.
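
The expiration rule can be sketched as a simple age check. This is an illustrative model, not Kaniko's code: the helper names parse_ttl_hours and is_layer_usable are hypothetical, only the XXh form described above is parsed, and treating a layer that is exactly as old as the TTL as still usable is an assumption.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_ttl_hours(value):
    """Parse a value such as '6h' into a timedelta (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)h", value)
    if match is None:
        raise ValueError(f"expected a value like '6h', got {value!r}")
    return timedelta(hours=int(match.group(1)))


def is_layer_usable(pushed_at, now, ttl):
    """A cached layer is reused only if it was pushed within the TTL window."""
    return now - pushed_at <= ttl


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)  # sample timestamp
ttl = parse_ttl_hours("6h")

# Pushed 5 hours ago: still inside the 6-hour window, so it is reused.
print(is_layer_usable(now - timedelta(hours=5), now, ttl))  # True

# Pushed 7 hours ago: expired, so the directive runs again.
print(is_layer_usable(now - timedelta(hours=7), now, ttl))  # False
```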

To set the cache expiration time from the command line, run the following command:

gcloud config set builds/kaniko_cache_ttl XX

where XX is the cache expiration time in hours.

In our Node.js example, the contents of the RUN npm install directive never change, so its cached layer would otherwise be reused indefinitely. To pick up updated dependencies, we need to periodically re-run the directive even though it has been cached. Setting the --cache-ttl parameter to 6 hours is a good compromise, as it ensures Cloud Build runs the directive at least once per workday, but not each time the build runs.