A well-tested pattern for building an online game backend uses a relational database, such as MySQL. This database stores game world state and essential persistence data. For basic session-based games, the database holds nothing more complicated than the final results of matches. For large, persistent-world, massively multiplayer online (MMO) games, it may be a hugely complicated set of interrelated tables holding player progression and inventory. The speed of queries in the database layer of your backend directly affects the user's experience of responsiveness in the game client.
Although this pattern is familiar, most game development teams don't have a dedicated database administrator, and as the database grows in size and the relationships it models grow in complexity, administration can become a task that many teams would rather delegate. For small- to medium-sized, asynchronous, turn-based mobile games, a database such as Google Cloud SQL can be an excellent choice. Cloud SQL for MySQL Second Generation offers a fully hosted and managed instance of MySQL with solid performance, minimal operations, and automated backups.
Service-fronted database pattern design
The microservices paradigm is useful for mobile game database backends. A common design is to have the database fronted by a database service, which is formed by a pool of worker processes that accepts query requests from the game frontend, runs those queries against the database, and returns the results.
Advantages of the service-fronted database pattern
Having an intermediary service query the database on behalf of your game servers has several advantages:
- Increased availability — Databases often limit the number of concurrent connections. Using a service decouples the number of game servers that can make requests to the database from the maximum connections allowed.
- Fault tolerance — Database services can be built to temporarily process requests if the database experiences an issue.
- Optimized requests — A database service can optimize database
- Query validation.
- Query prioritization.
- Query rate flow control.
- In-memory, read-through caching.
- Database abstraction — As long as the service contract is fulfilled, the database service and the database backing it can be replaced without modifications to the frontend game servers. This design makes it possible to use different databases in development or QA environments, or to migrate to a different database technology in production.
A service-fronted database pattern for games
The following diagram illustrates how you can build a robust service-fronted database pattern using Google Cloud Platform services.
A robust service-fronted database pattern can be built using the following components:
Dedicated game servers/frontend — Game client applications connect directly to the game servers, making them a frontend service. Dedicated game servers are typically custom executables, built with the game engine, that need to run on virtualized hardware such as Google Compute Engine VMs. If you are writing a game where the online interaction can be modeled with HTTP-style request/response semantics, Google App Engine is also a good fit.
Game server communication with database services — This is often modeled using the create, read, update, and delete (CRUD) access style, making a REST API or gRPC endpoint on Compute Engine a great choice.
API/RPC endpoint communication to a database worker pool — A classic case for queuing, popular choices are self-managed, open-source, queuing middlewares such as RabbitMQ or ZeroMQ. On Cloud Platform, you can use the secure, durable, and highly available Google Cloud Pub/Sub. It offers a managed queuing solution without any need to manage servers.
Database workers connecting to Cloud SQL Second Generation — Database workers can be written in any language that provides an up-to-date MySQL access method. These workers can be managed manually on Compute Engine, or packaged in Docker containers to be easily managed using the Kubernetes DSL on Google Container Engine.
There are a few limitations to consider when using Cloud SQL Second Generation:
- Database size limit of 10TB.
- Connection limit of 4000 concurrent connections.
- Lack of NDB (sharding) support, although replicas are supported.
To address these items:
Choose a data model that optimizes the amount of data stored.
Incorporate processes that move rarely accessed data out of the primary database and into a data warehouse, such as Google BigQuery.
Use a pattern where game servers access the database through a microservice. Even if the number of players is modest enough that your game servers could access the database directly, there are many advantages of decoupling the database layer from the game server: queuing, query rate leveling, connection failure tolerance, and so on. In addition, attempting to add a separate database access layer after your game becomes popular can cause downtime and loss of revenue.
Run game analytics and player telemetry against a read-only replica of the database. This prevents your analysis from impacting the database’s responsiveness. Cloud SQL for MySQL Second Generation allows standard MySQL replication, and you can size your second instance appropriately for the analytics queries to keeps costs marginal.
Sample design: Massively single-player social (MASS) game
One emergent game paradigm over the last decade consists of a massive number of single players simultaneously playing sessions online, with social mechanics such as unit borrowing, unit trading, and leaderboards being the only contact points between players. Examples of MASS games include Puzzle and Dragons™ and Monster Strike™. MASS mobile games are built to be economical with client/server communication. This makes it possible for users to enjoy their game even with limited or sporadic connectivity. The data model for a game like this, where nearly all persistent state storage is related to the meta-game (collecting units and maintaining player currency), yields two basic types of objects to store in the database. These objects can be easily manipulated using CRUD mechanisms.
Player Objects, which track:
- Game and real currencies.
- Total number of unit inventory slots.
Unit Objects, which track:
- Owner (player ID).
- Cost to acquire.
- Unit inventory.
For less than 100,000 players, this data model fits well into a relational database such as Cloud SQL Second Generation.
Mimus is a mock MASS mobile game application with a backend in the style of Puzzle and Dragons™ or Monster Strike™. It assumes that each player can be logged in from only one device at a time, and that they must complete any action before beginning another. With Mimus, you can run simulated workloads to evaluate the architecture's optimal capacity, which is typically based on the number of concurrent users (CCU).
You can find the Mimus source code at https://github.com/GoogleCloudPlatform/mimus-game-simulator.
Mimus client simulation within the Mimus server
In MASS games, the rate at which the game client generates database queries
can be controlled by the game developer by requiring the player to view
animations and interact with the game client to continue. Mimus simulates
these rate-limiting strategies with
sleep() calls. In this way, Mimus
simulates a reasonable approximation of per-client database load by running
as many processes as simulated players. This is efficiently orchestrated
using a containerized Mimus client/server pod, in a Kubernetes cluster, that
generates queries against the database.
The Mimus game client simulates communication with the backend server using a continual loop that calls the Mimus server procedures directly, choosing function calls based on the state of the player and their inventory.
The player actions simulated by Mimus include:
- Playing a round of the game.
- Buying currency.
- Spending currency.
- Leveling or evolving units.
Each of these actions are implemented as multiple CRUD interactions with the player or unit objects, manipulated by the client through calls to the Mimus server. The Mimus server makes these database requests using the synchronous (blocking) Mimus database API. This API is a Python module imported by the Mimus server, and can be configured to test different database backend implementations.
Mimus database API communication with the Mimus database worker pool
The following diagram illustrates the communication between the Mimus server and the database service.
The Mimus database API accepts batches of database queries, and returns the
results. It publishes these batches as Cloud Pub/Sub messages, and waits for
the results to be returned through Redis. Before sending a message, the
database API validates all values that will be written to the database and
tags the message with a unique transaction ID. The message is then published
Work topic in Cloud Pub/Sub. Next, the database API loops, polling for the
existence of the transaction ID as a Redis key. The results in the key's
value are then retrieved by the database API and returned to the Mimus server,
from which they are made available to the Mimus client.
Communication selection considerations
Mimus uses Cloud Pub/Sub for the pending query communication, because user actions require durability and reliable delivery. Mimus uses Redis to communicate the results, where durability and reliability is less critical. If the results are lost due to an application error or Redis failure, the Mimus Client queries again for the final results from the database. By using transactions in a relational database, you're guaranteed that either all the requested changes took place, or none of them. The rare case of Mimus needing to make a second request is considered an acceptable trade-off in return for the simplicity and speed of retrieval provided by Redis. To keep the Redis usage reasonable, request results are allowed to expire from Redis after thirty seconds.
Mimus database worker pool
The Mimus database worker pool contains multiple processes running on
Container Engine. Each running container instance polls the
Cloud Pub/Sub topic in an endless loop for new messages. When it receives a
message it runs the queries in the message using the
Python MySQLdb module.
A single message represents a relational transaction and may contain multiple
queries. All queries in the message must be completed before being committed
to the database. After the queries have completed (or failed), the database
worker publishes the results to Redis under the transaction ID received as
part of the original message.
The relational database backing the Mimus database service is a Cloud SQL for MySQL Second Generation instance. When creating the instance, you can select machine configurations of up to 32 cores and 208 GB of RAM, with disk sizes of up to 10 TB. In addition to having support for replicas, Cloud SQL Second Generation instances are configured to run regular backups by default. If you require additional MySQL tuning, you can set specific MySQL flags on the Cloud SQL instance. You can find more information about configuration in the Cloud SQL documentation.
Conclusions from testing Cloud SQL using Mimus
|Cloud SQL SG machine-type||Suggested number of concurrent users|
When using the Mimus test harness to simulate 100,000 concurrent users against a Cloud SQL for MySQL Second Generation instance that was created on an n1-highmem-16 Compute Engine instance, query response time remained under 2 seconds throughout the test.
If you are building a MASS mobile game and need to support hundreds of thousands of concurrent players, a service-fronted database pattern based on Cloud SQL for MySQL Second Generation can provide the needed performance. When your game grows in popularity, you can introduce additional Cloud SQL instances, either sharded or as replicas, and a Redis or Memcached caching strategy at the database service level to keep performance at acceptable levels.
For games with fewer projected concurrent users, you can use a smaller machine type to reduce costs.
For games with projected CCU in the millions, you should consider a NoSQL database, such as Google Cloud Datastore or Google Cloud Bigtable, with strong scalability and performance characteristics.
- Try out other Google Cloud Platform features for yourself. Have a look at our tutorials.