Image processing using microservices and asynchronous messaging

Last reviewed 2023-07-17 UTC

When you design a web application that's based on a microservices architecture, you decide how to split your application's features into microservices, and how those microservices are called as part of the application. For long-running services, you might want to use asynchronous service calls. This reference architecture discusses how to deploy a containerized application that invokes long-running processes asynchronously.

This reference architecture document is intended for developers and architects who want to implement microservices in an asynchronous manner using modern technologies, including Google Kubernetes Engine (GKE) and Pub/Sub. The document assumes that you're familiar with microservices in general, and with Pub/Sub and GKE on Google Cloud.


The following diagram illustrates an example scenario where an application generates thumbnail images. Generating thumbnail images can be a resource-intensive task and can therefore take some time.

Architecture of a thumbnail-generation application that's deployed on Compute Engine.

Figure 1. Original architecture for image processing that's based on using VMs.

In the preceding diagram, the application receives image files from clients and then generates thumbnails. In this architecture, the application is implemented by using virtual machine (VM) instances on Compute Engine and by using backend file storage on Cloud Storage. The application stores metadata using Cloud SQL. Cloud Load Balancing distributes requests to multiple VMs.

To reduce the operational overhead of maintaining VMs, you can migrate this system to a new architecture that doesn't use VMs.

The following diagram shows how you can implement this flow by using managed services, with notifications and microservices providing asynchronous calls between the components of the system.

Architecture of the thumbnail-generation application that's deployed without VMs.

Figure 2. New architecture for image processing that's based on using containers and asynchronous messaging.

In the new architecture, the client submits an image to the application, and the application uploads the image to Cloud Storage. Pub/Sub notifications for Cloud Storage then publish a message to a Pub/Sub topic. The message triggers a microservice that runs on GKE. The microservice retrieves the image from Cloud Storage, generates a thumbnail, and uploads the thumbnail to Cloud Storage.
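The microservice's part of this flow can be sketched as follows. This is a minimal illustration, not the reference implementation: the function name `handle_notification`, the injected `download`, `make_thumbnail`, and `upload` callables, and the separate `thumbnail_bucket` are all assumptions made for the example. The attribute names `eventType`, `bucketId`, and `objectId` are the ones that Pub/Sub notifications for Cloud Storage attach to each message.

```python
from typing import Callable, Mapping, Optional


def handle_notification(
    attributes: Mapping[str, str],
    download: Callable[[str, str], bytes],
    make_thumbnail: Callable[[bytes], bytes],
    upload: Callable[[str, str, bytes], None],
    thumbnail_bucket: str,
) -> Optional[str]:
    """Process one Cloud Storage notification and return the thumbnail name.

    The storage and image operations are passed in as callables (an
    assumption of this sketch) so that the flow can be read and tested
    without any cloud client library.
    """
    # Only react to newly created objects; ignore deletes, archives, and so on.
    if attributes.get("eventType") != "OBJECT_FINALIZE":
        return None

    bucket = attributes["bucketId"]
    name = attributes["objectId"]

    image = download(bucket, name)
    thumbnail = make_thumbnail(image)

    # Writing to a different bucket (hypothetical choice) avoids triggering
    # a new notification for the thumbnail itself.
    upload(thumbnail_bucket, name, thumbnail)
    return name
```

In a real service, the callables would wrap a Cloud Storage client and an image library, and the function would be invoked from a Pub/Sub subscriber callback that acknowledges the message only after the upload succeeds.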

Design considerations

The following guidelines can help you to develop an architecture that meets your organization's requirements for operational efficiency and performance.

Operational efficiency

The new architecture has the following advantages:

  • Independent scalability: In the original architecture, the application that's running on Compute Engine addresses two core tasks. One task is to receive files, and the other task is to generate a thumbnail from the original image. Receiving uploaded files consumes network bandwidth, and thumbnail generation is a CPU-intensive task. The Compute Engine instances might run out of CPU resources to generate thumbnails but still have enough network resources to receive files. In the new architecture, these tasks are divided between Cloud Storage and GKE, which makes the tasks independently scalable.

  • Easy to add new functionality: In the original architecture, if you want to add functionality, you have to deploy it on the same Compute Engine instances. In the new architecture, you can develop an application and add it independently—for example, you can add a mail sender application to notify you when a new thumbnail is generated. Pub/Sub can connect to the thumbnail-generation application and to the mail-sender application in an asynchronous manner without modifying the original code that runs on GKE.

  • Reduced coupling: In the original architecture, a common problem is temporal coupling. For example, if a mail relay server is unavailable when the application tries to send a notification, the notification fails. Because those processes are tightly coupled, the client might not get a successful response from the application. In the new architecture, the client gets a successful response because generating a thumbnail and sending a notification are loosely coupled.
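The last two advantages can be illustrated with a toy, in-memory model of a topic that has independent subscriptions. The `Topic` class below is a hypothetical stand-in written for this example; it is not the Pub/Sub client library, and the subscription names are made up. It shows that adding a mail-sender subscription requires no change to the thumbnail consumer, and that a failing mail relay leaves only the mail-sender's message unacknowledged.

```python
from collections import deque
from typing import Callable, Deque, Dict


class Topic:
    """Toy stand-in for a topic with independent subscriptions."""

    def __init__(self) -> None:
        self._queues: Dict[str, Deque[dict]] = {}

    def subscribe(self, subscription: str) -> None:
        self._queues.setdefault(subscription, deque())

    def publish(self, message: dict) -> None:
        # Each subscription gets its own copy; the publisher does not wait
        # for any consumer to finish.
        for queue in self._queues.values():
            queue.append(dict(message))

    def deliver(self, subscription: str, handler: Callable[[dict], None]) -> int:
        """Run handler on pending messages and return how many were acked."""
        queue = self._queues[subscription]
        acked = 0
        for _ in range(len(queue)):
            message = queue.popleft()
            try:
                handler(message)
                acked += 1
            except Exception:
                # Not acknowledged: keep the message for a later redelivery.
                queue.append(message)
        return acked

    def pending(self, subscription: str) -> int:
        return len(self._queues[subscription])


topic = Topic()
topic.subscribe("thumbnail-generator")
topic.subscribe("mail-sender")  # added later without touching the generator
topic.publish({"bucketId": "uploads", "objectId": "cat.png"})

thumbnails = []


def generate_thumbnail(message: dict) -> None:
    thumbnails.append(message["objectId"])


def send_mail(message: dict) -> None:
    raise ConnectionError("mail relay unavailable")


topic.deliver("thumbnail-generator", generate_thumbnail)
topic.deliver("mail-sender", send_mail)
```

After this runs, the thumbnail subscription is empty and the thumbnail has been generated, while the mail-sender subscription still holds its message and can retry when the relay is back.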

This new architecture has the following disadvantages:

  • Extra effort to modernize the application: Containerizing an application takes time and effort. The new architecture uses more services and requires a different approach to observability, which includes changes to monitoring the application, the deployment process, and resource management.

  • Requirement to handle duplication on the application side: Pub/Sub guarantees at-least-once message delivery, which means that the same message might be delivered more than once. Your application must handle this possibility, for example by making message processing idempotent.
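One way to handle duplicate deliveries is to skip notifications that have already been processed. The sketch below assumes that the combination of the `bucketId`, `objectId`, and `objectGeneration` notification attributes identifies one version of an uploaded object. The `DedupingHandler` class is hypothetical, and its in-memory set only works for a single process; a deployment with several replicas would need a shared store, or thumbnail writes that are safe to repeat.

```python
from typing import Callable, Mapping, Set, Tuple


class DedupingHandler:
    """Wrap a processing function so that repeated deliveries are skipped."""

    def __init__(self, process: Callable[[Mapping[str, str]], None]) -> None:
        self._process = process
        self._seen: Set[Tuple[str, str, str]] = set()

    def __call__(self, attributes: Mapping[str, str]) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        key = (
            attributes["bucketId"],
            attributes["objectId"],
            attributes["objectGeneration"],
        )
        if key in self._seen:
            return False
        self._process(attributes)
        # Record the key only after success, so a failed attempt is retried
        # when the message is redelivered.
        self._seen.add(key)
        return True
```

Whether a message is processed or skipped as a duplicate, the subscriber can acknowledge it; only a failure during processing should leave it unacknowledged.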


Performance

The new architecture can give you more efficient resource usage. In the original architecture, every Compute Engine instance that you add when you scale out runs its own operating system, which consumes resources. With GKE, you can use server resources efficiently by running multiple containers on just a few servers (bin packing). You can scale containers out and in quickly, so the new architecture can handle short bursts of high load and scale in quickly when the tasks are finished.
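The idea behind bin packing can be shown with a small first-fit-decreasing heuristic. This is a textbook approximation chosen for illustration, not the algorithm that the Kubernetes scheduler uses, and the CPU requests and node capacity in the usage lines are hypothetical numbers in millicores.

```python
from typing import List


def first_fit_decreasing(requests: List[int], node_capacity: int) -> List[List[int]]:
    """Place container CPU requests onto as few nodes as the heuristic finds.

    Returns one list of requests per node.
    """
    nodes: List[List[int]] = []
    for request in sorted(requests, reverse=True):
        if request > node_capacity:
            raise ValueError(f"request {request} exceeds node capacity")
        for node in nodes:
            # Use the first node that still has room for this container.
            if sum(node) + request <= node_capacity:
                node.append(request)
                break
        else:
            # No existing node fits, so add a new one.
            nodes.append([request])
    return nodes


# Six containers (hypothetical requests) packed onto 2000-millicore nodes.
placement = first_fit_decreasing([500, 250, 250, 1000, 750, 250], 2000)
```

With these numbers, the six containers fit on two nodes, whereas giving each workload its own VM would run six operating systems.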


To deploy an example application that implements this architecture, see Deploy microservices that use Pub/Sub and GKE.

What's next

  • Read about DevOps, and learn more about the Architecture capability that's related to this reference architecture.
  • Take the DevOps quick check to understand where you stand in comparison with the rest of the industry.
  • For more reference architectures, diagrams, and best practices, explore the Cloud Architecture Center.