Packaging server applications as container images is quickly gaining traction across the tech landscape. Many game companies are also interested in using containers to improve VM utilization, as well as take advantage of the isolated run-time paradigm offered by containers. Despite this high interest, many game companies don't know where to start. We recommend using the container orchestration framework Kubernetes to deploy production-scale fleets of dedicated game servers.
This tutorial describes an expandable architecture for running real-time, session-based multiplayer dedicated game servers on Google Kubernetes Engine (GKE). A scaling manager process automatically starts and stops virtual machine instances as needed. Configuration of the machines as Kubernetes nodes is handled automatically by managed instance groups.
The online game structure presented in this tutorial is intentionally simple so that it's easy to understand and implement. Places where additional complexity might be useful are pointed out where appropriate.
Objectives
- Create a container image of OpenArena, a popular open source dedicated game server (DGS) that runs on Linux, using Docker. This container image adds only the binaries and necessary libraries to a base Linux image.
- Store the assets on a separate read-only persistent disk volume and mount them in the container at run time.
- Set up and configure basic scheduler processes using the Kubernetes and Google Cloud APIs to create or shut down nodes to meet demand.
Costs
This tutorial uses the following billable components of Google Cloud:
- GKE
- Persistent Disk
You can use the pricing calculator to generate a cost estimate based on your projected usage.
Before you begin
This tutorial is intended to be run from a Linux or macOS environment.
- In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.
- Enable the Compute Engine API.
- Install and initialize the Cloud SDK.
Note: For this tutorial, you cannot use Cloud Shell. You must install the Cloud SDK.
- Install kubectl, the command-line interface for Kubernetes:
gcloud components install kubectl
- Clone the tutorial's repository from GitHub:
git clone https://github.com/GoogleCloudPlatform/gke-dedicated-game-server.git
- Install Docker.
This tutorial does not run Docker commands as the root user, so be sure to also follow the post-installation instructions for managing Docker as a non-root user.
- (Optional) If you want to test a connection to the game server at the end of the tutorial, install the OpenArena game client. Running the game client requires a desktop environment. This tutorial includes instructions for testing using Linux or macOS.
Architecture
The Overview of Cloud Game Infrastructure page discusses the high-level components common to many online game architectures. In this tutorial, you implement a Kubernetes DGS cluster frontend service and a scaling manager backend service. A full production game infrastructure would also include many other frontend and backend services, but those are outside the scope of this tutorial.
Design constraints for this tutorial
To produce an example that is both instructive and simple enough to extend, this tutorial assumes the following game constraints:
- This is a match-based real-time game with an authoritative DGS that simulates the game state.
- The DGS communicates with the client over UDP.
- Each DGS process runs 1 match.
- All DGS processes generate approximately the same load.
- Matches have a maximum time length.
- DGS startup time is negligible, and pre-warming the dedicated game server process isn't necessary.
- When scaling down after peak, matches are not ended prematurely in an attempt to save cost. The priority is on avoiding impact to the player experience.
- If a DGS process encounters an error and cannot continue, the match state is lost and the player is expected to use the game client to join another match.
- The DGS process loads static assets from disk but does not require write access to the assets.
These constraints all have precedent within the game industry, and represent a real-world use case.
Preparing your GCP working environment
To make it easier to run gcloud
commands, you can set properties so that you
don't have to supply options for these properties with each command.
Set your default project, using your project ID for [PROJECT_ID]:
gcloud config set project [PROJECT_ID]
Set your default Compute Engine zone, using your preferred zone for [ZONE]:
gcloud config set compute/zone [ZONE]
Containerizing the dedicated game server
In this tutorial, you'll use OpenArena, which is described as "a community-produced deathmatch FPS based on GPL idTech3 technology." Although this game's technology is over fifteen years old, it's still a good example of a common DGS pattern:
- A server binary is compiled from the same code base as the game client.
- The only data assets included in the server binary are those necessary for the server to run the simulation.
- The game server container image adds only the binaries and libraries to the base OS container image that are required in order to run the server process.
- Assets are mounted from a separate volume.
This architecture has many benefits: it speeds image distribution, reduces the update load because only binaries are replaced, and consumes less disk space.
Creating the container image
A Dockerfile describes the image to be built. The Dockerfile for this tutorial is provided in the repository at openarena/Dockerfile. From the openarena/ directory, run the Docker build command to generate the container image and tag it as version 0.8.8 of the OpenArena server:
docker build -t openarena:0.8.8 .
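The repository's Dockerfile is the authoritative definition of the image. As a rough illustration of the pattern described above (server binaries and libraries only, with assets mounted separately at run time), a Dockerfile along these lines would do the job. The base image, package name, asset cleanup step, and entry point shown here are assumptions and may differ from the file in the repository.
# Illustrative sketch only; see openarena/Dockerfile in the repository for the real file.
FROM debian:stretch
# Install the OpenArena server binary and its runtime libraries.
RUN apt-get update && \
    apt-get install -y --no-install-recommends openarena-server && \
    rm -rf /var/lib/apt/lists/*
# Remove any bundled assets; this tutorial mounts them from a persistent disk instead.
RUN rm -rf /usr/share/games/openarena/baseoa/*
# The server communicates over UDP (the tutorial's firewall rule opens udp:27961-28061).
EXPOSE 27961/udp
ENTRYPOINT ["/usr/games/openarena-server"]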
Generating an assets disk
In most games, binaries are orders of magnitude smaller than assets. Because of this, it makes sense to create a container image that contains only binaries. Assets can be put on a persistent disk and attached to multiple VM instances that run the DGS container. This architecture saves money and eliminates the need to distribute assets to all VM instances.
Create a small Compute Engine VM instance using gcloud:
gcloud compute instances create openarena-asset-builder \
    --machine-type f1-micro --image-family debian-9 \
    --image-project debian-cloud
Create a persistent disk:
gcloud compute disks create openarena-assets --size=50GB \
    --type=pd-ssd --description="OpenArena data disk. \
Mount read-only at /usr/share/games/openarena/baseoa/"
The persistent disk must be separate from the boot disk, and you must configure it to remain undeleted when the virtual machine is removed. Kubernetes persistentVolume functionality works best in GKE with persistent disks that consist of a single ext4 file system without a partition table, as described in the Compute Engine documentation.
Attach the openarena-assets persistent disk to the openarena-asset-builder VM instance:
gcloud compute instances attach-disk openarena-asset-builder \
    --disk openarena-assets
Format the new disk.
Log in to the openarena-asset-builder VM instance and format the disk:
gcloud compute ssh openarena-asset-builder
Because the mkfs.ext4 command in the next step is a destructive command, be sure to confirm the device ID for the openarena-assets disk. If you're following this tutorial from the beginning and you are using a fresh project, the ID is /dev/sdb. Verify this using the lsblk command to look at the attached disks and their partitions:
sudo lsblk
The output shows the 10 GB OS disk sda with 1 partition sda1 and the 50 GB openarena-assets disk with no partition as device sdb:
NAME   MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda      8:0    0  10G  0 disk
└─sda1   8:1    0  10G  0 part /
sdb      8:16   0  50G  0 disk
Format the openarena-assets disk:
sudo mkfs.ext4 -m 0 -F -E \
    lazy_itable_init=0,lazy_journal_init=0,discard /dev/[DEVICE_ID]
Install OpenArena on the openarena-asset-builder VM instance and copy the compressed asset archives to the openarena-assets persistent disk.
For this game, the assets are .pk3 files located in the /usr/share/games/openarena/baseoa/ directory. To save some work, the following sequence of commands mounts the assets disk to this directory before installing, so all the .pk3 files are put on the disk by the install process. Be sure to use the device ID you verified previously:
sudo mkdir -p /usr/share/games/openarena/baseoa/
sudo mount -o discard,defaults /dev/[DEVICE_ID] \
    /usr/share/games/openarena/baseoa/
sudo apt-get update && sudo apt-get -y install openarena-server
Exit from the instance and then delete it:
exit
gcloud compute instances delete openarena-asset-builder
The disk is ready to be used as a persistent volume in Kubernetes.
When you implement persistent disks as part of your game development
pipeline, configure your build system to create the persistent disk with all the
asset files in an appropriate directory structure. This might take the form of a
script running gcloud
commands, or a GCP-specific plugin for your build
system of choice. It's also recommended that you create multiple copies of the
persistent disk and have VM instances connected to these copies in a balanced
manner both for additional throughput and to manage the risk of failure.
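For example, a build step that stamps out additional copies of a finished assets disk might look something like the following sketch. The snapshot and disk names are placeholders, not part of the tutorial's repository.
#!/usr/bin/env bash
# Sketch: snapshot the master assets disk and create several copies from the snapshot.
set -e
SNAPSHOT_NAME=openarena-assets-snapshot   # placeholder name
NUM_COPIES=3                              # placeholder count
gcloud compute disks snapshot openarena-assets --snapshot-names ${SNAPSHOT_NAME}
for i in $(seq 1 ${NUM_COPIES}); do
  gcloud compute disks create openarena-assets-copy-${i} \
      --source-snapshot ${SNAPSHOT_NAME} --type pd-ssd
done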
Setting up a Kubernetes cluster
Kubernetes is an open source community project and as such can be configured to run in most environments, including on-premises.
Creating a Kubernetes cluster on Kubernetes Engine
This tutorial uses a standard Kubernetes Engine cluster outfitted with the n1-highcpu machine type, which fits the usage profile of OpenArena.
Create a VPC network for the game named game:
gcloud compute networks create game
Create a firewall rule for OpenArena:
gcloud compute firewall-rules create openarena-dgs --network game \
    --allow udp:27961-28061
Use gcloud to create a 3-node cluster, with 4 virtual CPU cores on each node, that uses your game network:
gcloud container clusters create openarena-cluster \
    --network game --num-nodes 3 --machine-type n1-highcpu-4 \
    --addons KubernetesDashboard
After the cluster has started, set up your local shell with the proper Kubernetes authentication credentials to control your new cluster:
gcloud container clusters get-credentials openarena-cluster
Sidebar: How many vCPUs per virtual machine?
In a production cluster, the number of vCPUs you'll run on each machine is largely influenced by two factors:
- The largest number of concurrent DGS Pods you plan to run. There is a limit on the number of nodes that can be in a Kubernetes cluster pool (although the Kubernetes project plans to increase this with future releases). For example, if you run 1 DGS per virtual CPU (vCPU), a 1000-node cluster of n1-highcpu-2 machines provides capacity for only 2000 DGS Pods. In contrast, a 1000-node cluster of n1-highcpu-32 machines provides capacity for up to 32,000 Pods.
- Your VM instance granularity. The simplest way for resources to be added or removed from the cluster is in increments of a single VM instance of the type chosen during cluster creation. Therefore, don't choose a 32 vCPU machine if you want to be able to add or remove capacity in smaller amounts than 32 vCPUs at a time.
Sidebar: Why disable autoscaling and load balancing?
The managed instance groups feature that's used by default by GKE includes VM instance autoscaling and HTTP load balancing features. However, these features are not enabled for the cluster you created: the --addons KubernetesDashboard flag enables only the Kubernetes Dashboard add-on, which leaves the HttpLoadBalancing and HorizontalPodAutoscaling add-ons disabled.
HTTP load balancing is not needed because the DGS communicates with clients using UDP, not TCP. The autoscaler can currently only scale the instance group based on CPU usage, which can be a misleading indicator of DGS load. Many DGSs are designed to consume idle cycles in an effort to optimize the game's simulation.
As a result, many game developers implement a custom scaling manager process that is DGS aware to deal with the specific requirements of this type of workload. The managed instance group does still serve an important function, however—its default GKE image template includes all the necessary Kubernetes software and automatically registers the node with the control plane on startup.
Uploading the container image to Google Cloud
Google offers private Docker image storage in Container Registry (gcr.io).
Select the gcr.io region nearest to your GKE cluster (for example, us for the United States, eu for Europe, or asia for Asia, as noted in the documentation) and put the region information in an environment variable along with your project ID:
export GCR_REGION=[GCR_REGION] PROJECT_ID=[PROJECT_ID]
Tag your container image with the gcr.io registry name:
docker tag openarena:0.8.8 \
    ${GCR_REGION}.gcr.io/${PROJECT_ID}/openarena:0.8.8
Set up Docker authentication for Container Registry:
gcloud auth configure-docker
Upload the container image to the image repository:
docker push ${GCR_REGION}.gcr.io/${PROJECT_ID}/openarena:0.8.8
After the push is complete, the container image is available to run in your GKE cluster. Make a note of your final container image tag, because you'll put it in the Pod specification file later.
Configuring the assets disk in Kubernetes
The typical DGS does not need write access to the game assets, so you can have each DGS Pod mount the same persistent disk that contains read-only assets. This is accomplished using the persistentVolume and persistentVolumeClaim resources in Kubernetes.
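The repository supplies both manifests. In rough outline, they bind the openarena-assets disk as a read-only-many volume. The following is a sketch that matches the resource names shown in the output below, not necessarily the repository's exact files.
# asset-volume.yaml (sketch)
apiVersion: v1
kind: PersistentVolume
metadata:
  name: asset.volume
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadOnlyMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: assets
  gcePersistentDisk:
    pdName: openarena-assets
    fsType: ext4
    readOnly: true
---
# asset-volumeclaim.yaml (sketch)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: asset.disk.claim
spec:
  accessModes:
    - ReadOnlyMany
  storageClassName: assets
  resources:
    requests:
      storage: 50Gi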
Apply asset-volume.yaml, which contains the definition of a Kubernetes persistentVolume resource that binds to the assets disk you created before:
kubectl apply -f openarena/k8s/asset-volume.yaml
Apply asset-volumeclaim.yaml. It contains the definition of a Kubernetes persistentVolumeClaim resource, which lets Pods mount the assets disk:
kubectl apply -f openarena/k8s/asset-volumeclaim.yaml
Confirm that the volume is in Bound status by running the following command:
kubectl get persistentVolume
Expected output:
NAME           CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM                      STORAGECLASS   REASON    AGE
asset.volume   50Gi       ROX           Retain          Bound     default/asset.disk.claim   assets                   3m
Similarly, confirm that the claim is in Bound status:
kubectl get persistentVolumeClaim
Expected output:
NAME               STATUS    VOLUME         CAPACITY   ACCESSMODES   STORAGECLASS   AGE
asset.disk.claim   Bound     asset.volume   50Gi       ROX           assets         25s
Configuring the DGS Pod
Because scheduling and networking tasks are handled by Kubernetes, and because the startup and shutdown times of the DGS containers are negligible, in this tutorial, DGS instances spin up on demand.
Each DGS instance lasts only the length of a single game match, and a game match has a defined time limit specified in the OpenArena DGS server configuration file. When the match is complete, the container successfully exits. If players want to play again, they request another game. This design simplifies a number of Pod lifecycle aspects and forms the basis of the autoscaling policy discussed later in the scaling manager section.
Although this flow isn't seamless with OpenArena, it's only because the tutorial doesn't fork and change the game client code. In a commercially released game, requesting another match would be made invisible to the user behind previous match result screens and loading times. The code that requires the client to connect to a new server instance between matches doesn't represent additional development time, because that code is mandatory anyway for handling client reconnections for unforeseen circumstances such as network issues or crashing servers.
For the sake of simplicity, this tutorial assumes that the GKE nodes have the default network configuration, which assigns each node a public IP address and enables client connections.
Managing the dedicated game server process
In commercially produced game servers, all of the additional, non-DGS functionality that makes a DGS run well in a container is integrated directly into the DGS binaries whenever possible.
As a best practice, the DGS should avoid communicating directly with the matchmaker or scaling manager, and instead should expose its state to the Kubernetes API. External processes should read the DGS state from the appropriate Kubernetes endpoints rather than querying the server directly. More information about accessing the Kubernetes API directly can be found in the Kubernetes documentation.
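For example, a backend process could poll Pod status through kubectl (or the equivalent REST calls) instead of talking to the game server itself. The app=openarena label used here is a hypothetical selector, not something defined by this tutorial's Pod specification.
# Sketch: list the name and phase of every DGS Pod via the Kubernetes API.
kubectl get pods -l app=openarena \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.phase}{"\n"}{end}'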
Sidebar: Choosing the right Kubernetes resource for DGS Pods
At first glance, a single process running in a container with a constrained lifetime and defined success criteria would appear to be a use case for Kubernetes jobs, but in practice it's unnecessary to use jobs. DGS processes don't require the parallel execution functionality of jobs. They also don't require the ability to guarantee successes by automatically restarting (typically when a session-based DGS dies for some reason, the state is lost, and players simply join another DGS). Due to these factors, scheduling individual Kubernetes Pods is preferable for this use case.
In production, DGS Pods are started directly by your matchmaker using the
Kubernetes API. For the purposes of this tutorial, a human-readable YAML file
describing the DGS Pod resource is included in the tutorial repository at
openarena/k8s/openarena-pod.yaml
. When you create configurations for
dedicated game server Pods, pay close attention to the volume properties to
ensure that the asset disk can be mounted read-only in multiple Pods.
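As a rough sketch of what such a Pod specification might contain (the authoritative version is openarena/k8s/openarena-pod.yaml in the repository; the image path, port, hostNetwork setting, and restart policy shown here are assumptions):
apiVersion: v1
kind: Pod
metadata:
  name: openarena.dgs
spec:
  hostNetwork: true                 # assumption: expose the DGS on the node's public IP
  restartPolicy: Never              # assumption: a finished match is not restarted
  containers:
  - name: openarena
    image: us.gcr.io/[PROJECT_ID]/openarena:0.8.8   # replace with your image tag
    ports:
    - containerPort: 27961          # assumption: one UDP port in the firewall range
      protocol: UDP
    resources:
      requests:
        cpu: "1"                    # roughly 1 vCPU per DGS, as described later
      limits:
        cpu: "1"
    volumeMounts:
    - name: assets
      mountPath: /usr/share/games/openarena/baseoa/
      readOnly: true                # assets are shared read-only across Pods
  volumes:
  - name: assets
    persistentVolumeClaim:
      claimName: asset.disk.claim
      readOnly: true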
Setting up the scaling manager
The scaling manager is a process that scales the number of virtual machines used as GKE nodes, based on the current DGS load. Scaling is accomplished using a set of scripts that run forever, that inspect the total number of DGS Pods running and requested, and that resize the node pool as necessary. The scripts are packaged in Docker container images that include the appropriate libraries and the Cloud SDK. The Docker images can be created and pushed to gcr.io using the following procedure.
If necessary, put the gcr.io GCR_REGION value and your PROJECT_ID into environment variables for the build and push script. You can skip this step if you already did it earlier when you uploaded the container image.
export GCR_REGION=[GCR_REGION] PROJECT_ID=[PROJECT_ID]
Change to the script directory:
cd scaling-manager
Run the build script to build all the container images and push them to gcr.io:
./build-and-push.sh
Using a text editor, open the Kubernetes Deployment file at scaling-manager/k8s/openarena-scaling-manager-deployment.yaml.
The scaling manager scripts are designed to be run within a Kubernetes Deployment, which ensures that these processes are restarted in the event of a failure.
Change the environment variables to values for your Deployment:
- REGION (default [GCR_REGION]): Requires replacement. The region of your gcr.io repository.
- PROJECT_ID (default [PROJECT_ID]): Requires replacement. The name of your project.
- GKE_BASE_INSTANCE_NAME (default gke-openarena-cluster-default-pool-[REPLACE_ME]): Requires replacement. Different for every GKE cluster. To get the value for [REPLACE_ME], run the gcloud compute instance-groups managed list command.
- GCP_ZONE (default [ZONE]): Requires replacement. The name of the GCP zone that you specified at the beginning of this tutorial.
- K8S_CLUSTER (default openarena-cluster): The name of the Kubernetes cluster.
Return to the parent directory:
cd ..
Add the Deployment to your Kubernetes cluster:
kubectl apply \
    -f scaling-manager/k8s/openarena-scaling-manager-deployment.yaml
How nodes are scaled
To scale nodes, the scaling manager uses the Kubernetes API to look at current node usage. As needed, the manager resizes the Kubernetes Engine cluster's managed instance group that runs the underlying virtual machines.
Scaling concerns for DGS Pods
Common sticking points for DGS scaling include:
- Standard CPU and memory usage metrics often fail to capture enough information to drive game server instance scaling.
- Keeping a buffer of available, underutilized nodes is critical, because scheduling an optimized DGS container on an existing node takes a matter of seconds. However, adding a node can take minutes, which is an unacceptable latency for a waiting player.
- Many autoscalers aren't able to handle Pod shutdowns gracefully. It's important to drain Pods from nodes that are being removed. Shutting off a node with even one running match is often unacceptable.
Although the scripts supplied by this tutorial are basic, their simple design makes it easy to add additional logic. Typical DGSs have well-understood performance characteristics, and by making these into metrics, you can determine when to add or remove VM instances. Common scaling metrics are the number of DGS instances per CPU, as used in this tutorial, or the number of available player slots.
Scaling up
Scaling up requires no special handling in this tutorial. For simplicity, this
tutorial sets the
limits
and requests
Pod properties
in the Pod's YAML file (openarena/k8s/openarena-pod.yaml
) to reserve
approximately 1 vCPU for each DGS, which is sufficient for OpenArena.
Because the cluster was created using the n1-highcpu
instance family, which
has a fixed ratio of 600 MB of memory to 1 vCPU, there is sufficient
memory if 1 DGS Pod is scheduled per vCPU. Therefore, you can scale up based
on the number of Pods in the cluster compared to the number of CPUs in all nodes
in the cluster. This ratio determines the remaining resources available,
letting you add more nodes if the value falls below a threshold. This
tutorial adds nodes if more than 70% of the vCPUs are currently allocated to
Pods.
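A minimal version of that check could look like the following sketch. The tutorial's actual logic lives in scaling-manager.sh; the MIG_NAME variable and the assumption of one vCPU requested per DGS Pod are illustrative.
#!/usr/bin/env bash
# Sketch: add one node when more than 70% of cluster vCPUs are taken by DGS Pods.
THRESHOLD=70
# Total vCPUs across all nodes (capacity.cpu is reported in whole cores).
TOTAL_CPU=$(kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.capacity.cpu}{"\n"}{end}' \
    | awk '{sum += $1} END {print sum}')
# Assume each DGS Pod requests 1 vCPU, so counting Pods approximates allocated vCPUs.
DGS_PODS=$(kubectl get pods --no-headers 2>/dev/null | grep -c openarena.dgs)
if [ $(( DGS_PODS * 100 / TOTAL_CPU )) -gt ${THRESHOLD} ]; then
  NEW_SIZE=$(( $(kubectl get nodes --no-headers | wc -l) + 1 ))
  gcloud compute instance-groups managed resize ${MIG_NAME} \
      --size ${NEW_SIZE} --zone ${GCP_ZONE}
fi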
In a production online game backend, it is recommended that you accurately
profile DGS CPU, memory, and network usage, and then set the limits
and
requests
Pod properties appropriately. For many games, it makes sense to
create multiple Pod types for different DGS scenarios with different usage
profiles, such as game types, specific maps, or number of player slots. Such
considerations fall outside the scope of this tutorial.
Scaling down
Scaling down, unlike scaling up, is a multi-step process and one of the major
reasons to run a custom, Kubernetes-aware DGS scaling manager. In this tutorial,
scaling-manager.sh
automatically handles the following steps:
Selecting an appropriate node for removal. Because a full custom game-aware Kubernetes scheduler is outside the scope of this tutorial, the nodes are selected in the order they are returned by the API.
Marking the selected node as unavailable in Kubernetes. This prevents additional Pods from being started on the node.
Removing the selected node from the managed instance group using the abandon-instances command. This prevents the managed instance group from attempting to recreate the instance.
Separately, the node-stopper.sh
script monitors abandoned, unschedulable nodes
for the absence of DGS Pods. After all matches on a node have finished and the
Pods exit, the script shuts down the VM instance.
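Put together, a bare-bones version of this scale-down flow might look like the following sketch. The actual implementations are scaling-manager.sh and node-stopper.sh in the repository; MIG_NAME is a placeholder for your managed instance group name.
#!/usr/bin/env bash
# Sketch: cordon a node, abandon it from the managed instance group, then
# delete the VM once no DGS Pods remain on it.
NODE=$1   # name of the node selected for removal
# 1. Prevent new Pods from being scheduled on the node.
kubectl cordon ${NODE}
# 2. Abandon the instance so the managed instance group does not recreate it.
gcloud compute instance-groups managed abandon-instances ${MIG_NAME} \
    --instances ${NODE} --zone ${GCP_ZONE}
# 3. Wait for running matches to finish, then shut the VM down.
while kubectl get pods --field-selector spec.nodeName=${NODE} --no-headers \
      2>/dev/null | grep -q openarena; do
  sleep 60
done
gcloud compute instances delete ${NODE} --zone ${GCP_ZONE} --quiet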
Scaling the number of DGS Pods
In typical production game backends, the matchmaker controls when new DGS instances are added. Because DGS Pods are configured to exit when matches finish (refer to the design constraints earlier), no explicit action is necessary to scale down the number of DGS Pods. If there are not enough player requests coming into the matchmaker system to generate new matches, the DGS Pods slowly remove themselves from the Kubernetes cluster as matches end.
Testing the setup
So far, you've created the OpenArena container image and pushed it to the container registry, and you started the DGS Kubernetes cluster. In addition, you generated the game asset disk and configured it for use in Kubernetes, and you started the scaling manager deployment. At this point, it's time to start DGS Pods for testing.
Requesting a new DGS instance
In a typical production system, when the matchmaker process has appropriate players for a match, it directly requests an instance using the Kubernetes API. For the purposes of testing this tutorial's setup, you can directly make the request for an instance.
Open openarena/k8s/openarena-pod.yaml in a text editor, and find the line that specifies the container image to run.
Change the value to match your openarena container image tag, which you created earlier in this tutorial with the docker tag command.
Run the kubectl apply command, specifying the openarena-pod.yaml file:
kubectl apply -f openarena/k8s/openarena-pod.yaml
Wait for a short time and then confirm the status of the Pod:
kubectl get pods
The output looks similar to this:
NAME            READY     STATUS    RESTARTS   AGE
openarena.dgs   1/1       Running   0          25s
Connecting to the DGS
After the Pod has started, you can verify that you can connect to the DGS by launching the OpenArena client.
From a macOS or Linux desktop:
export NODE_NAME=$(kubectl get pod openarena.dgs \
    -o jsonpath="{.spec.nodeName}")
export DGS_IP=$(gcloud compute instances list \
    --filter="name=( ${NODE_NAME} )" \
    --format='table[no-heading](EXTERNAL_IP)')
openarena +connect ${DGS_IP}
Testing the scaling manager
The scaling manager scales the number of VM instances in the Kubernetes
cluster based on the number of DGS Pods. Therefore, testing the scaling manager
requires requests for a number of Pods over a period of time and checking that
the number of nodes scales appropriately. In order to see the nodes scale back
down, the match length in the server configuration file must have a time limit.
The tutorial server configuration file at openarena/single-match.cfg
places a
5-minute limit on the match, and is used by the tutorial DGS Pods by default. To
test, run the following script, which adds DGS Pods at regular intervals for
5 minutes:
./scaling-manager/tests/test-loader.sh
You should be able to see the number of nodes scale up and back down by running
kubectl get nodes
at regular intervals.
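For example, you could leave the following running in a second terminal while the test loader adds Pods:
# Refresh the node and Pod lists every 30 seconds while the test runs.
watch -n 30 'kubectl get nodes; echo; kubectl get pods'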
Cleaning up
To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.
Delete the project
The easiest way to eliminate billing is to delete the project that you created for the tutorial.
To delete the project:
- In the Cloud Console, go to the Manage resources page.
- In the project list, select the project that you want to delete, and then click Delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.
Delete the GKE cluster
If you don't want to delete the whole project, run the following command to delete the GKE cluster:
gcloud container clusters delete openarena-cluster
Delete your persistent disks
To delete a persistent disk:
In the Cloud Console, go to the Compute Engine Disks page.
Select the disk you want to delete.
Click the Delete button at the top of the page.
What's next
This tutorial sketches out a bare-bones architecture for running dedicated game servers in containers and autoscaling a Kubernetes cluster based on game load. You can add many features, such as seamless player transition from session to session, by programming some basic client-side support. To add other features, such as letting players form groups and moving the groups from server to server, you can create a separate platform service living alongside the matchmaking service. You can then use the service to form groups; send, accept, or reject group invitations; and send groups of players into dedicated game server instances together.
Another common feature is a custom Kubernetes scheduler capable of choosing nodes for your DGS Pods in a more intelligent, game-centric way. For most games a custom scheduler that packs Pods together is highly desirable, making it easy for you to prioritize the order in which nodes are removed when scaling down after peak.
More guides to running a DGS on GCP:
- Overview of Cloud Game Infrastructure
- Dedicated Game Server Migration Guide
- Setting Up a Minecraft Server on Compute Engine
- Try out other Google Cloud features for yourself. Have a look at our tutorials.