Using WebSockets

This page provides guidance and best practices for running WebSockets or other streaming services on Cloud Run and writing clients for such services.

WebSockets applications are supported on Cloud Run with no additional configuration required. However, WebSockets streams are HTTP requests still subject to the request timeout configured for your Cloud Run service, so you need to make sure this setting works well for your use of WebSockets such as implementing reconnects in your clients.

Even if you use session affinity on Cloud Run, which provides best effort affinity, WebSockets requests could still potentially end up at different instances, due to built-in load balancing. To solve this problem, you need to synchronize data between instances.

Note that WebSockets on Cloud Run are also supported if you are using Cloud Load Balancing.

Deploying a sample WebSockets service

Use Cloud Shell to quickly deploy a sample whiteboard service that uses WebSockets with Cloud Run: Deploy a sample

Or, if you want to deploy that sample whiteboard service manually:

Clone the Socket.IO repository locally using git command-line tool:
```
git clone https://github.com/socketio/socket.io.git
```
Navigate into the sample directory:
```
cd socket.io/examples/whiteboard/
```
Deploy a new Cloud Run service by building the service from source code using the Google Cloud CLI:
```
gcloud run deploy whiteboard --allow-unauthenticated --source=.
```
After the service is deployed, open two separate browser tabs and navigate to the service URL. Anything you draw in one tab should propagate to the other tab (and vice versa) since the clients are connected to the same instance over WebSockets.

WebSockets chat sample full tutorial

If you want a full code walkthrough, additional code samples are available in the topic Building a WebSocket Chat service for Cloud Run tutorial.

Best Practices

The most difficult part of creating WebSockets services on Cloud Run is synchronizing data between multiple Cloud Run instances. This is difficult because of the autoscaling and stateless nature of instances, and because of the limits for concurrency and request timeouts.

Handling request timeouts and client reconnects

WebSockets requests are treated as long-running HTTP requests in Cloud Run. They are subject to request timeouts (currently up to 60 minutes and defaults to 5 minutes) even if your application server does not enforce any timeouts.

Accordingly, if the client keeps the connection open longer than the required timeout configured for the Cloud Run service, the client will be disconnected when the request times out.

Therefore, WebSockets clients connecting to Cloud Run should handle reconnecting to the server if the request times out or the server disconnects. You can achieve this in browser-based clients by using libraries such as reconnecting-websocket or by handling "disconnect" events if you are using the SocketIO library.

Billing incurred when using WebSockets

A Cloud Run instance that has any open WebSocket connection is considered active, so CPU is allocated and billed.

Maximizing concurrency

WebSockets services are typically designed to handle many connections simultaneously. Since Cloud Run supports concurrent connections (up to 1000 per container), Google recommends that you increase the maximum concurrency setting for your container to a higher value than the default if your service is able to handle the load with given resources.

About sticky sessions (session affinity)

Because WebSockets connections are stateful, the client will stay connected to the same container on Cloud Run throughout the lifespan of the connection. This naturally offers a session stickiness within the context of a single WebSocket connection.

For multiple and subsequent WebSockets connections, you can configure your Cloud Run service to use session affinity, but this provides a best effort affinity, so WebSockets requests could still potentially end up at different instances. Clients connecting to your Cloud Run service might end up being serviced by different instances that do not coordinate or share data.

To mitigate this, you need to use an external data storage to synchronize state between Cloud Run instances, which is explained in the next section.

Synchronizing data between instances

You need to synchronize data to make sure clients connecting to a Cloud Run service receive the same data from the WebSockets connection.

For example, suppose you are building a chatroom service using WebSockets and set your maximum concurrency setting to 1000. If more than 1000 users connect to this service at the same time, they will be served by different instances, and therefore, they will not be able to see the same messages in the chatroom.

To synchronize data between your Cloud Run instances, such as receiving the messages posted to a chatroom from on all instances, you need an external data storage system, such as a database or a message queue.

If you use an external database such as Cloud SQL, you can send messages to the database and poll from the database periodically. However, note that Cloud Run instances do not have CPU when the container is not handling any requests. If your service primarily handles WebSockets requests, then the container will have CPU allocated as long as there is at least one client connected to it.

Message queues work better to synchronize data between Cloud Run containers in real-time, because the external message queues cannot address each instance to "push" data. Your services need to "pull" new messages from the message queue by establishing a connection to the message queue.

Google recommends that you use external message queue systems such as Redis Pub/Sub (Memorystore) or Firestore real-time updates that can deliver updates to all instances over connections initiated by the container instance.

Using Redis Pub/Sub

WebSockets chatroom service architecture

You can use the Redis Pub/Sub mechanism by creating a Redis instance from Memorystore. If you are using the Socket.IO library for WebSockets, you can use its redis adapter.

In this Redis-based architecture, each Cloud Run instance establishes a long-running connection to the Redis channel that contains the received messages (using the SUBSCRIBE command). Once the container instances receive a new message on the channel, they can send it to their clients over WebSockets in real-time.

Similarly, when a client emits a message using WebSockets, the instance that receives the message publishes the message to the Redis channel (using the PUBLISH command), and other instances that are subscribed to this channel will receive this message.

If you want a full code walkthrough, additional code samples are available in the topic Building a WebSocket Chat service for Cloud Run tutorial.