Multi-architecture container images for IoT devices

This document is the first part of a series that discusses building an automated continuous integration (CI) pipeline to build multi-architecture container images on Google Cloud. The concepts that are explained in this document apply to any cloud environment.

The series consists of this document and an accompanying tutorial. This document explains the structure of a pipeline for building container images and outlines its high-level steps. The tutorial guides you through building an example pipeline.

This series is intended for IT professionals who want to simplify and streamline complex pipelines for building container images, or to extend those pipelines to build multi-architecture images. It assumes that you are familiar with cloud technologies and containers.

When you implement a CI pipeline, you streamline your artifact-building procedures. You don't need to maintain dedicated tools and hardware to build container images for a given architecture. For example, if your current pipeline runs on an x86_64 architecture and produces container images for that architecture only, you might need to maintain additional tooling and hardware to build container images for other architectures, such as those in the ARM family.

The Internet of Things (IoT) domain often requires multi-architecture builds. When you have a large fleet of devices with different hardware and OS software stacks, the process of building, testing, and managing software applications for a specific device becomes a huge challenge. Using a multi-architecture build process helps to simplify the management of IoT applications.

The challenge of building multi-architecture container images

In most implementations, container images are architecture-dependent. For example, if you build a container image for the x86_64 architecture, it cannot run on an architecture of the ARM family.
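As a rough illustration of this dependency, the following Python sketch compares the platform that an image was built for with the architecture of a host. The mapping table and the function names are assumptions made for this example, and the check deliberately ignores emulation and any compatibility modes that some processors offer.

```python
import platform

# Assumed mapping from machine names, as reported by the operating system,
# to the os/architecture[/variant] strings that container tooling uses.
_MACHINE_TO_PLATFORM = {
    "x86_64": "linux/amd64",
    "amd64": "linux/amd64",
    "aarch64": "linux/arm64",
    "arm64": "linux/arm64",
    "armv7l": "linux/arm/v7",
}


def host_platform(machine=None):
    """Return the container platform string for a machine name, or None."""
    name = (machine or platform.machine()).lower()
    return _MACHINE_TO_PLATFORM.get(name)


def runs_natively(image_platform, machine=None):
    """Return True if the image platform matches the host architecture."""
    host = host_platform(machine)
    return host is not None and host == image_platform
```

For example, runs_natively("linux/amd64", "armv7l") returns False, which mirrors the x86_64 and ARM case described above.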

You can overcome this limitation in several ways:

  1. Build the container images on the target architectures for which you need the container image.
  2. Maintain dedicated tooling and a hardware fleet. Your fleet needs at least one device for each architecture that you need to build a container image for.
  3. Build multi-architecture container images.

Which strategy is best for you depends on various factors, including the following:

  • Pipeline complexity
  • Automation requirements
  • Available resources to design, implement, and maintain the environment for building container images

For example, if your runtime environment has limited access to a power supply, you might need to build the container images in a separate environment from your runtime environment.

The following diagram illustrates the decision points in choosing a viable strategy.

Flowchart for deciding the best strategy for building multi-architecture container images.

Build the container images on the target architectures

One strategy is to build each container image that you need directly in the runtime environment that supports the container itself, as the following diagram shows.

Build path from source code repository to runtime environment.

For each build, you do the following:

  1. Download the source code of the container image from a source code repository on each device in the runtime environment.
  2. Build the container image in the runtime environment.
  3. Store the container image in the container image repository that is local to each device in the runtime environment.

The advantage of this strategy is that you don't need to provision and maintain hardware in addition to what you need for your runtime environments.

This strategy also has disadvantages. First, you have to repeat the build process on every hardware instance in your runtime environment, which wastes resources. For example, if you deploy your containerized workloads in a runtime environment where devices don't have access to a continuous power supply, you waste time and power by running build tasks on those devices every time you need to deploy a new version of a workload. Second, every device in your runtime environment needs the tooling to access the source code of each container image and to build it, and you have to maintain that tooling.

Maintain dedicated tooling and hardware fleets

A second strategy is to maintain a hardware fleet that is dedicated only to tasks that build container images. The following diagram illustrates the architecture of this strategy.

Build path from source code repository to a dedicated build environment.

For each build, you do the following:

  1. Download the source code of the container image on a device in the fleet that has the required hardware architecture and the resources to build the container image.
  2. Build the container image.
  3. Store the container image in a centralized container image repository.
  4. Download the container image on each device in the runtime environment when you need to deploy a new instance of that image.

For this strategy, you provision at least one instance of each hardware architecture that you need to build container images for. In a non-trivial production environment, you might have more than one instance to increase the fault tolerance of your environment and to shorten build times if you have multiple concurrent build jobs.

This strategy has two advantages. First, you run each build job only once and store the resulting container image in a centralized container image repository, such as Container Registry. Second, you can run test suites on devices in the build fleet, which closely resemble the hardware architectures in your runtime environments.

The main disadvantage of this strategy is that you have to provision and maintain dedicated infrastructure and tooling to run the tasks that build container images. Because each build task is usually designed to consume few resources and little time, this infrastructure sits idle most of the time.
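To make the difference in build effort concrete, the following Python sketch counts how many image builds a single release needs when you build on every target device, compared with building once per architecture in a dedicated fleet. The function name and the fleet composition in the usage note are hypothetical.

```python
from collections import Counter


def builds_per_release(device_architectures):
    """Count the image builds needed for one release under each strategy.

    device_architectures holds one architecture name per device in the
    runtime environment.
    """
    per_architecture = Counter(device_architectures)
    return {
        # Building on the target devices repeats the build on every device.
        "on_target_devices": len(device_architectures),
        # A dedicated fleet builds once per architecture and shares the
        # result through a centralized container image repository.
        "dedicated_fleet": len(per_architecture),
    }
```

For a hypothetical fleet of 40 ARMv7 devices and 10 ARM64 devices, the first strategy runs 50 builds for each release, and the dedicated fleet runs 2.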

Build multi-architecture container images

In this third strategy, you use a general-purpose pipeline to build multi-architecture container images, as the following diagram shows.

Build path from source code repository to general-purpose multi-architecture pipeline.

For each build, you do the following:

  1. Download the source code of the container image.
  2. Build the container image.
  3. Store the container image in a centralized container image repository.
  4. Download the container image on each device in the runtime environment when you need to deploy a new instance of that image.

The main advantage of this strategy is that you don't have to provision and maintain dedicated hardware or tooling. For example, you can use your existing continuous integration/continuous deployment (CI/CD) pipelines and tooling to build multi-architecture container images. You might also benefit from the better performance of a general-purpose hardware architecture, such as x86_64, compared to an energy-efficient one, such as one in the ARM family.

This strategy might also be part of a broader initiative where you adopt DevOps principles. For example, you can implement a CI/CD pipeline for specialized hardware.

Implementing a pipeline for building multi-architecture container images

In this section, we describe a reference implementation of a CI/CD pipeline that follows the third strategy: building multi-architecture container images.

The reference implementation has the following components:

  • A source code repository to manage the source code for container images. For example, you might use Cloud Source Repositories or GitLab repositories.
  • A CI/CD runtime environment to build container images, such as Cloud Build.
  • A platform to manage containers and container images that supports multi-architecture container images, such as Docker.
  • A container image registry such as Container Registry. If you want to store your container images closer to the nodes where the images are needed, you might run a container image registry, such as Docker Registry, directly in your current environment.

This reference architecture uses Moby BuildKit and QEMU to build multi-architecture Docker container images. In this case, Moby BuildKit automatically detects the architectures that are available through QEMU hardware emulation and automatically loads the appropriate binaries that are registered in the binfmt_misc capability of the Linux kernel.
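If you want to check which emulators are registered on a build host, the following Python sketch lists the enabled QEMU handlers. It assumes that binfmt_misc is mounted at /proc/sys/fs/binfmt_misc and that each entry file starts with an enabled or disabled line followed by an interpreter line, which is the layout commonly found on Linux systems. The function names are hypothetical, and the sketch only inspects the registrations; it is not how BuildKit itself performs detection.

```python
from pathlib import Path

# Assumed mount point of the binfmt_misc file system.
BINFMT_DIR = Path("/proc/sys/fs/binfmt_misc")


def parse_binfmt_entry(text):
    """Return (enabled, interpreter) parsed from the text of one entry file."""
    lines = text.splitlines()
    enabled = bool(lines) and lines[0].strip() == "enabled"
    interpreter = None
    for line in lines:
        if line.startswith("interpreter "):
            interpreter = line.split(" ", 1)[1].strip()
    return enabled, interpreter


def registered_qemu_handlers(binfmt_dir=BINFMT_DIR):
    """Map each enabled entry name to its QEMU interpreter path."""
    handlers = {}
    if not binfmt_dir.is_dir():
        return handlers
    for entry in sorted(binfmt_dir.iterdir()):
        # The register and status files are control files, not handlers.
        if entry.name in ("register", "status"):
            continue
        try:
            enabled, interpreter = parse_binfmt_entry(entry.read_text())
        except OSError:
            continue
        if enabled and interpreter and "qemu" in interpreter:
            handlers[entry.name] = interpreter
    return handlers
```

On a host without any registered emulators, or without binfmt_misc mounted, registered_qemu_handlers() returns an empty dictionary.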

The following diagram illustrates the technical stack that is responsible for each multi-architecture container image build that is supported by this reference architecture.

Related components for this multi-architecture reference architecture.

Because this reference architecture uses Docker manifest lists, you don't need to provide a container image tag for each target hardware architecture; you can use the same tag for multiple architectures. For example, if you build the 1.0.0 version of a multi-architecture container image, you don't need a unique tag for each hardware architecture, such as 1.0.0-x86_64 or 1.0.0_ARMv7. You use the same 1.0.0 tag for all the hardware architectures that you build for, and the manifest list identifies the correct container image for each architecture.
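The following Python sketch shows how a pipeline step might compose a single build invocation that publishes one tag for several architectures. It assumes that you use Docker Buildx as the command-line front end to BuildKit, which this document doesn't prescribe. The image name and the function name are hypothetical.

```python
import shlex


def buildx_command(image, tag, platforms, context="."):
    """Compose a docker buildx invocation that publishes one tag for several platforms."""
    if not platforms:
        raise ValueError("at least one target platform is required")
    return [
        "docker", "buildx", "build",
        # One comma-separated list of target platforms for a single build.
        "--platform", ",".join(platforms),
        # The same tag applies to every platform in the list.
        "--tag", f"{image}:{tag}",
        # Push the resulting images and their manifest list to the registry.
        "--push",
        context,
    ]


command = buildx_command(
    "registry.example.com/sensor-agent",  # hypothetical image name
    "1.0.0",
    ["linux/amd64", "linux/arm/v7", "linux/arm64"],
)
print(shlex.join(command))
```

Returning the command as a list of arguments lets the pipeline pass it to a process runner without shell quoting issues; the printed form is only for logging.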

The following example shows the manifest list for the official Alpine Linux image. It lists the architectures that a particular version of that container image supports:

{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
   "manifests": [
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 528,
         "digest": "sha256:ddba4d27a7ffc3f86dd6c2f92041af252a1f23a8e742c90e6e1297bfa1bc0c45",
         "platform": {
            "architecture": "amd64",
            "os": "linux"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 528,
         "digest": "sha256:401f030aa35e86bafd31c6cc292b01659cbde72d77e8c24737bd63283837f02c",
         "platform": {
            "architecture": "arm",
            "os": "linux",
            "variant": "v7"
         }
      },
      {
         "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
         "size": 528,
         "digest": "sha256:2c26a655f6e38294e859edac46230210bbed3591d6ff57060b8671cda09756d4",
         "platform": {
            "architecture": "arm64",
            "os": "linux"
         }
      }
   ]
}
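To show how a client might use such a manifest, the following Python sketch looks up the digest of the image that matches a given platform. The function name is hypothetical, the sample data is an abbreviated copy of the preceding example, and the matching is simplified compared to what a container runtime does.

```python
import json


def digest_for_platform(manifest_list, architecture, os_name="linux", variant=None):
    """Return the digest of the first entry that matches the platform, or None."""
    for entry in manifest_list.get("manifests", []):
        platform = entry.get("platform", {})
        if platform.get("architecture") != architecture:
            continue
        if platform.get("os") != os_name:
            continue
        # Compare the variant only when the caller asks for a specific one.
        if variant is not None and platform.get("variant") != variant:
            continue
        return entry["digest"]
    return None


# Abbreviated copy of the manifest shown above.
manifest_list = json.loads("""
{
  "schemaVersion": 2,
  "manifests": [
    {"digest": "sha256:ddba4d27a7ffc3f86dd6c2f92041af252a1f23a8e742c90e6e1297bfa1bc0c45",
     "platform": {"architecture": "amd64", "os": "linux"}},
    {"digest": "sha256:401f030aa35e86bafd31c6cc292b01659cbde72d77e8c24737bd63283837f02c",
     "platform": {"architecture": "arm", "os": "linux", "variant": "v7"}}
  ]
}
""")

arm_digest = digest_for_platform(manifest_list, "arm", variant="v7")
```

A lookup for an architecture that the manifest doesn't include returns None, which a deployment tool can treat as "no image available for this device."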

When you design an automated pipeline for building container images, we recommend that you include comprehensive test suites that validate compliance with the requirements of each container image. For example, you might use tools such as Chef InSpec, Serverspec, and RSpec to execute compliance test suites against your container images as one of the tasks of the build pipeline.
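One check that such a test suite might include is whether a published image covers every architecture in your fleet. The following Python sketch illustrates the idea on a manifest that has the same structure as the preceding example. It is not based on the tools named above, and the function names and the list of required platforms are assumptions.

```python
def platform_string(platform):
    """Format a manifest platform object as os/architecture[/variant]."""
    parts = [platform["os"], platform["architecture"]]
    if platform.get("variant"):
        parts.append(platform["variant"])
    return "/".join(parts)


def missing_platforms(manifest_list, required):
    """Return the required platforms that the manifest list doesn't cover."""
    available = {
        platform_string(entry["platform"])
        for entry in manifest_list.get("manifests", [])
    }
    return sorted(set(required) - available)


# Hypothetical manifest that lacks an arm64 image.
published = {
    "manifests": [
        {"platform": {"architecture": "amd64", "os": "linux"}},
        {"platform": {"architecture": "arm", "os": "linux", "variant": "v7"}},
    ]
}
gaps = missing_platforms(published, ["linux/amd64", "linux/arm/v7", "linux/arm64"])
```

A pipeline task could fail the build when the returned list isn't empty, so that an image with a missing architecture never reaches the devices.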

Optimizing the pipeline for building container images

After you validate and consolidate the pipelines for building your container images, you can optimize them. For guidance about optimizing your environment, see Migration to Google Cloud: Optimizing your environment. That document describes an optimization framework that you can adopt to make your environment more efficient than it is in its current state. By following the optimization framework, you go through several iterations in which you modify the state of your environment.

One of the first activities of each optimization iteration is to establish a set of requirements and goals for that iteration. For example, one requirement might be to modernize your deployment processes, migrating from manual deployment processes to fully automated, containerized ones. For more information about modernizing your deployment processes, see Migration to Google Cloud: Migrating from manual deployments to automated, containerized deployments.

What's next