No tricks, just treats: Globally scaling the Halloween multiplayer Doodle with Open Match on Google Cloud
Joe Holley
Cloud Solutions Architect, Google Cloud Platform
The Doodles that appear on the Google homepage have long been a place for creativity and fun. Over the years, we’ve found that Doodles that have some element of gameplay have been among the most popular with users. So to celebrate Halloween this year, the team behind these delightful Doodles wanted to do something different and unexpected—bringing users together in a super fun real-time multiplayer game.
Because of their ease of access and viral popularity, Doodle games have historically reached incredibly high numbers of users around the world—some have even been played over a billion times in only four days of availability! Launching a global multiplayer Doodle game comes with considerable technical challenges, since it introduces matchmaking and dedicated game servers to the architecture.
But the Google Cloud gaming team loves a challenge, which is why we jumped at the opportunity to collaborate with the Google Doodle team on their first multiplayer Doodle game. We wanted to help them in the same way we help our external gaming customers—by simplifying the game infrastructure planning so they could focus on building the best game possible.
The end result was the successful launch of a very fun and popular ghost-themed game. If you didn't get a chance to play it on Halloween, you can find it here. Over a 65-hour period, the Doodle backend systems, co-designed by and running on Google Cloud, served over 100 million players in 62 countries, with a peak concurrent load of over 500,000 players during a 5-minute window.
Here’s a deeper look at how it came together.
Matchmaking
Designing a low-latency matchmaking service for a global pool of players required architecting for scale, reliability, and high availability, all three of which are difficult to achieve with traditional matchmakers. Working together with the Doodle team, the Google Cloud gaming team leveraged Open Match, an open source matchmaking framework co-founded with Unity, to meet the demands of the first multiplayer Doodle. Open Match's distributed microservice approach, combined with the versatility of Google Kubernetes Engine, helped us overcome many of these challenges: if we had more players than expected and needed to handle more load, we could simply add frontend containers to the appropriate deployment. Although in this scenario we chose to run Open Match on GKE, it can run on Kubernetes anywhere since it's an open source framework.
All the information required to match players was stored in Cloud Memorystore, our fully-managed in-memory data store service for Redis, which could be accessed by as many frontend workers as we needed. Once matches were made, they were sent to a global Cloud Pub/Sub topic to wait until a server was available. This meant all the work done to match players was durable in the event of an interruption in service.
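To make that flow concrete, here is a minimal in-memory sketch of the idea, not the Doodle team's actual code. A plain list stands in for the shared player data in Cloud Memorystore, a deque stands in for the Cloud Pub/Sub topic, and the `Matchmaker` class, its method names, and the match size of 4 are all hypothetical choices made for illustration.

```python
from collections import deque
from typing import Deque, List, Optional


class Matchmaker:
    """Toy matchmaker: a shared waiting pool feeds a queue of finished matches."""

    def __init__(self, match_size: int = 4) -> None:
        # match_size is a hypothetical value; the article does not state one.
        self.match_size = match_size
        # Stand-in for the shared store that any frontend worker could read.
        self.waiting: List[str] = []
        # Stand-in for the topic where finished matches wait for a server.
        self.pending: Deque[List[str]] = deque()

    def enqueue_player(self, player_id: str) -> None:
        """Record a player who is waiting for a match."""
        self.waiting.append(player_id)

    def make_matches(self) -> int:
        """Group waiting players into full matches; return how many were made."""
        created = 0
        while len(self.waiting) >= self.match_size:
            match = self.waiting[: self.match_size]
            del self.waiting[: self.match_size]
            self.pending.append(match)
            created += 1
        return created

    def next_match(self) -> Optional[List[str]]:
        """Hand the oldest finished match to a server, or None if none is queued."""
        return self.pending.popleft() if self.pending else None
```

Because finished matches sit in the queue until a server asks for one, the matching work is not lost if no server is free at that moment, which is the property the real topic provided.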
The final problem to solve was how to decide where players should play. With the game hosted in GCP regions across the world, past approaches to this problem might have required geographic DNS lookups per IP or complicated client-side routing rules. With GCP’s Global Load Balancer, however, we were able to connect all our GKE clusters around the world to a single load balancer and let it handle the backend routing. This required only one matchmaker DNS entry, which greatly simplified client logic and routing rules.
Game servers
We believe that game server workloads are moving towards Kubernetes, so we worked closely with the Doodle team to help them leverage Google Kubernetes Engine (GKE) for their game servers.
The Doodle team generated container images with each game server build, storing them in Google Container Registry (gcr.io) for easy access by the GKE clusters. With 1,100 nodes and 36,000 CPUs in clusters across 12 active GCP regions, the Doodle team was able to put game servers near the players to keep latency reasonable and the gameplay snappy. The game servers were configured to connect to the matchmaker on startup and report their address and port number. They would then wait for a match to be returned with a list of players to expect. Players received the address and port of the game server from Open Match and connected directly to the game server for their spooky sessions!
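The handshake described above can be sketched in a few lines. This is an illustrative in-memory model only: the `ServerDirectory` class, its methods, and the example addresses are hypothetical, and the real system exchanged this information between the game servers and Open Match over the network.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ServerAddress:
    host: str
    port: int


class ServerDirectory:
    """Tracks idle game servers and which server each player was sent to."""

    def __init__(self) -> None:
        self._idle: List[ServerAddress] = []
        self._assignments: Dict[str, ServerAddress] = {}

    def register(self, host: str, port: int) -> None:
        """Called by a game server on startup to report its address and port."""
        self._idle.append(ServerAddress(host, port))

    def assign(self, players: List[str]) -> Optional[ServerAddress]:
        """Give a match to the longest-waiting idle server, if there is one."""
        if not self._idle:
            return None  # the match keeps waiting until a server registers
        server = self._idle.pop(0)
        for player_id in players:
            self._assignments[player_id] = server
        return server

    def lookup(self, player_id: str) -> Optional[ServerAddress]:
        """Return the address a player should connect to directly."""
        return self._assignments.get(player_id)
```

In this sketch a server that has received a match leaves the idle list, so the same server is never handed two matches at once. How the real servers signaled that they were free again is not covered in this post.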
GKE successfully hosted over 100 million players during the time that the Doodle was on the Google landing page, using over two million compute hours in all.
Monitoring and support
When it came to launching the Doodle, we knew that visibility into game performance would be crucial. With this in mind, we used our logging best practices for the Doodle’s containers to make them easily searchable in Stackdriver, and then went about creating custom metrics and alerting on those metrics.
Our Google Cloud Professional Services and Support teams were highly engaged with the Doodle team and provided guidance leading up to and during the launch. This included technical support escalations, implementation guidance, architectural reviews, and verification of capacity.
As with any major launch, there are always challenges that must be addressed creatively. For example, we discovered early on that we had used the wrong load balancing capacity algorithm for our use case: we had it set to CPU usage, which wasn’t optimal for small machine counts. As a result, we had to reconfigure the load balancer to balance based on requests per second, all while still serving live traffic. This change, made around 12 hours after launch, balanced traffic across the regional clusters more effectively for this type of workload.
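As a rough illustration of what balancing on requests per second means, the sketch below gives each regional backend a target rate and fills backends in order of proximity, spilling any excess to the next one. It is a deliberately simplified model under our own assumptions, not the load balancer's actual algorithm, and the function name, region names, and numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def distribute_by_rate(
    demand_rps: int, backends: List[Tuple[str, int]]
) -> Dict[str, int]:
    """Split demand across backends listed nearest-first.

    Each backend is a (name, target requests per second) pair. A backend
    takes traffic up to its target; whatever is left moves to the next one.
    """
    allocation: Dict[str, int] = {}
    remaining = demand_rps
    for name, target_rps in backends:
        share = min(remaining, target_rps)
        allocation[name] = share
        remaining -= share
    if remaining > 0:
        # Demand beyond every target; a real balancer still has to place it.
        allocation["over_capacity"] = remaining
    return allocation
```

A request-rate target is a number the operator sets directly per backend, which makes the spillover point easy to reason about. CPU usage, by contrast, is a measured signal that can swing widely when a cluster has only a handful of machines.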
A spooky success story
After months of development and weeks of preparation, the 2018 Halloween Doodle went live in the first region at midnight, Australian time. Eighteen hours later, it was live around the world in 62 countries. With over 100 million players playing the first multiplayer Doodle, the stakes were high, but all the hard work leading up to the launch made things remarkably smooth. While this workload ran on GCP, thanks to the flexibility of Open Match and Google’s commitment to the open cloud, this architecture could be deployed anywhere Kubernetes can run.
To learn more about gaming solutions on Google Cloud, visit our website.