Unity Ads uses Memorystore to power up to 10 million operations per second

October 28, 2024
Eren Boz

Senior Software Engineer, Unity

Kyle Meggs

Senior Product Manager, Google Cloud

Editor’s note: Unity Ads, a mobile advertising platform, previously relied on its own self-managed Redis infrastructure and was searching for a solution that scales better across its use cases and reduces maintenance overhead. Unity migrated its workloads to Memorystore for Redis Cluster, a fully managed service designed for high-performance workloads. Its infrastructure now handles up to 10 million Redis operations per second on a single instance. The company has gained a more reliable and scalable infrastructure, reduced costs, and freed up time to focus on high-value activities.


For many users, handling 1 million operations per second is a remarkable feat, but it's business as usual at Unity Ads. Unity’s mobile performance ads solution, which serves ads to a vast network of mobile apps and games, easily manages this volume of operations every day, demonstrating the robust capabilities of our Redis clusters. Real-time ad requests, bidding, and ad selection involve multiple database operations, as does updating session data and tracking performance metrics — all translating to incredibly high performance demands. 

At Google Cloud, we know this exceptional demand calls for an infrastructure that is robust and highly scalable. Enter Memorystore for Redis Cluster, designed to handle the demanding workloads of industries like gaming, finance, and advertising, where speed and scale are critical. This fully managed service offers high throughput and large data capacity while maintaining microsecond latencies, so larger workloads can be consolidated into a single, high-performance cluster.

Serving up success with Memorystore for Redis Cluster

Prior to using Memorystore, Unity had multiple pain points with their previous Do-It-Yourself (DIY) setup. For one, they used different flavors of self-managed Redis clusters, ranging from Terraform module-based static clusters to Kubernetes operators. These required specialized knowledge, and scaling and maintaining them was time-consuming. With these DIY clusters, they often overprovisioned, mainly to mitigate potential downtime. But this overhead, along with the time spent managing the infrastructure, is not sustainable in the high-performance ad business, where every microsecond and fraction of a cent counts.

Memorystore offered a compelling solution for Unity Ads. It fit seamlessly into their existing setup, so the transition was straightforward. Its cost was comparable to their DIY solution but without the management overhead. As existing Google Cloud users, they also saw value in deepening their integration with the platform.

The most important feature? Scalability. One of the standout features of Memorystore for Redis Cluster is that it can scale with zero downtime. To adapt quickly to changing demands, users can expand their clusters to handle terabytes of keyspace with a simple click or command. Memorystore also includes intelligent features that enhance reliability and ease of use. The service automatically distributes nodes across zones for high availability and manages replica nodes, placing them in different zones from their primaries to protect against outages. This automated approach simplifies what would otherwise be a complex manual process.

All this sealed the deal for Unity Ads and they decided to migrate their use cases, including session data, central valuation cache, distributed locks and state management. The migration process went more smoothly than anticipated. Their most critical session data migration, handling up to 1 million Redis operations per second, was accomplished without disruption by implementing double-writing during the transition. Even more impressive was their valuation cache migration, serving up to half a million requests per second (translating to over 1 million Redis operations per second), which was completed in about 15 minutes with minimal service impact. Our team also successfully moved Unity’s distributed locks system to Memorystore, a critical component for ensuring they don't process the same event twice.
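The double-writing approach mentioned above can be sketched roughly as follows. This is a minimal illustration, not Unity's actual migration code: the `KeyValueStore` interface, the `DualWriteStore` wrapper, and the `session:42` key are all hypothetical, and an in-memory stand-in replaces the real Redis clients so the example runs without a server.

```python
from typing import Optional, Protocol


class KeyValueStore(Protocol):
    """Minimal interface both clusters are assumed to expose (hypothetical)."""

    def get(self, key: str) -> Optional[str]: ...
    def set(self, key: str, value: str) -> None: ...


class InMemoryStore:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class DualWriteStore:
    """Sends every write to both stores; reads come from the current primary."""

    def __init__(self, old: KeyValueStore, new: KeyValueStore) -> None:
        self.old = old
        self.new = new
        self.read_from_new = False  # flipped once the new store has warmed up

    def set(self, key: str, value: str) -> None:
        self.old.set(key, value)
        self.new.set(key, value)

    def get(self, key: str) -> Optional[str]:
        primary = self.new if self.read_from_new else self.old
        return primary.get(key)

    def cut_over(self) -> None:
        """Switch reads to the new store; writes keep going to both."""
        self.read_from_new = True


old_cluster, new_cluster = InMemoryStore(), InMemoryStore()
store = DualWriteStore(old_cluster, new_cluster)
store.set("session:42", "active")  # lands in both clusters
store.cut_over()
print(store.get("session:42"))  # served by the new cluster
```

In a sketch like this, data written before double-writing began exists only in the old store, so a cutover is only safe once older entries have expired or been copied across. Session data with a short lifetime fits that constraint naturally.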

Throughout this process, Eren Boz, a Senior Software Engineer at Unity, spearheaded the project with close support from Google Cloud. Eren has been instrumental in implementing and overseeing the transition to Memorystore for Redis Cluster. Let's hear from Eren about Unity's experience with the new Memorystore architecture.

Memorystore in action: Unity Ads’ version

Our journey with Memorystore for Redis Cluster has been transformative. One of the most immediate benefits we noticed was the improved stability of our infrastructure. With our previous DIY Redis clusters, we often encountered unpredictable performance issues that were difficult to pinpoint, because the clusters ran on multiple layers of virtualization (Kubernetes on top of cloud compute) where we did not have direct observability or control.

For instance, take a look at this CPU usage graph of individual nodes from our old self-managed setup:

https://storage.googleapis.com/gweb-cloudblog-publish/images/unity_1.max-600x600.png

Fig. 1: CPU usage of individual Redis cluster nodes

As you can see, there were frequent spikes and inconsistencies in CPU usage across different nodes. Under these conditions, it was difficult to maintain consistent performance, especially during high-traffic periods.

We also faced challenges with our DIY Redis clusters during Kubernetes node pool upgrades, which happen routinely whenever nodes are automatically upgraded to a new version of Kubernetes. You can see the p99 latency shoot through the roof!

https://storage.googleapis.com/gweb-cloudblog-publish/images/unity_2.max-1000x1000.png

Fig. 2: Redis client 99th percentile latencies during a Kubernetes node pool upgrade

Since transitioning to Memorystore, we no longer have to worry about such erratic behavior, which we suspect was the result of underlying Kubernetes networking or node issues.
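For readers less familiar with the p99 metric in Fig. 2, here is a minimal sketch of how a 99th percentile can be computed from raw latency samples using the nearest-rank method. The sample values are made up for illustration, and real monitoring systems typically use histogram-based estimates instead of sorting every sample.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: 98 fast calls and 2 slow ones.
latencies_ms = [1.0] * 98 + [250.0] * 2

print(percentile(latencies_ms, 50))  # median: 1.0
print(percentile(latencies_ms, 99))  # p99: 250.0
```

The median hides the two slow calls entirely, while the p99 exposes them. That is why tail percentiles are the usual way to spot disruptions like the one in Fig. 2.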

Also, thanks to Memorystore, we can now scale smoothly in production — another significant improvement. Here's a graph showing our client metrics while we scaled the cluster up 60% in size.

https://storage.googleapis.com/gweb-cloudblog-publish/images/unity_3.max-1300x1300.png

Fig. 3: Client operation throughputs using Memorystore for tracking various counters during a scale-up

The operation rate remained remarkably stable throughout the process, with only minor fluctuations. This level of smooth scaling was a game-changer for us, allowing us to adapt to changing demands without disrupting our services.
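To give a sense of what a 60% scale-up involves, the sketch below does the slot arithmetic for a Redis Cluster, which divides its keyspace into a fixed 16,384 hash slots. The shard counts (10 growing to 16) are hypothetical, since our actual cluster size is not stated here, and the even split is an assumption for illustration, not a description of how Memorystore rebalances internally.

```python
TOTAL_SLOTS = 16384  # fixed number of hash slots in a Redis Cluster


def slots_per_shard(shards: int) -> list[int]:
    """Split the hash slots as evenly as possible across the given shards."""
    if shards <= 0:
        raise ValueError("shards must be positive")
    base, extra = divmod(TOTAL_SLOTS, shards)
    return [base + 1 if i < extra else base for i in range(shards)]


def min_slots_to_move(old_shards: int, new_shards: int) -> int:
    """Slots the added shards must receive to reach an even layout."""
    if new_shards < old_shards:
        raise ValueError("this sketch only covers scaling up")
    return sum(slots_per_shard(new_shards)[old_shards:])


# Hypothetical 60% scale-up: 10 shards growing to 16.
moved = min_slots_to_move(10, 16)
print(f"{moved} of {TOTAL_SLOTS} slots ({moved / TOTAL_SLOTS:.1%}) move to new shards")
# prints "6144 of 16384 slots (37.5%) move to new shards"
```

Under these assumptions, more than a third of the keyspace changes owner during the resize, which is what makes a flat throughput line during the operation notable.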

The data shared above demonstrates the high performance and consistently low latency we've been able to achieve with Memorystore. This level of performance is crucial for our ad-serving platform, where every microsecond counts.

Gaining a return on innovation

In addition to performance benefits, we’ve also seen significant operational improvements from moving to Memorystore. We no longer spend time tuning and testing our Redis clusters for production readiness.

From a business perspective, we expect that by right-sizing our clusters and utilizing appropriate Committed Use Discounts, we should be able to achieve cost efficiencies comparable to our previous DIY solution, especially with the introduction of single-zone clusters to eliminate networking costs. With Memorystore, we're now getting a fully managed, much more reliable (99.99% SLA), and scalable solution at a cost similar to that of our previous self-managed Redis deployment.

Looking to the future, Memorystore has opened up new possibilities for how we architect our systems. We're now considering a “Memorystore-first” approach for many of our use cases. For example, when designing critical data systems, engineers often do not want to take risks and instead choose persistent database solutions such as Bigtable, regardless of whether the use case truly needs persistence or consistency. However, sometimes the use case only needs data to persist for around an hour, while databases such as Bigtable are better suited for data that will persist for months to years. With a hardened, reliable Redis Cluster solution such as Memorystore, we can avoid such shortcuts and optimize for use cases where data only needs to live for hours to days, saving on both costs and development time. All in all, the reliability and scalability of Memorystore allow us to confidently expand our use of Redis across more of our infrastructure to lower costs and improve performance.
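As an illustration of the short-lived data pattern described above, the sketch below writes a value that expires after one hour. The call shape `set(name, value, ex=seconds)` follows the redis-py client; everything else is hypothetical, including the `store_short_lived` helper, the `bid-context:abc` key, and the `FakeRedis` stand-in with an injectable clock that lets the example run without a server.

```python
import time
from typing import Callable, Optional

ONE_HOUR_SECONDS = 3600


class FakeRedis:
    """In-memory stand-in for the small subset of redis-py used here."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: dict[str, tuple[str, Optional[float]]] = {}

    def set(self, name: str, value: str, ex: Optional[int] = None) -> bool:
        expires_at = (self._clock() + ex) if ex is not None else None
        self._data[name] = (value, expires_at)
        return True

    def get(self, name: str) -> Optional[str]:
        entry = self._data.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[name]  # lazily drop expired entries on read
            return None
        return value


def store_short_lived(client, key: str, value: str) -> None:
    """Write a value that only needs to live for about an hour."""
    client.set(key, value, ex=ONE_HOUR_SECONDS)


now = [0.0]  # controllable clock so expiry can be shown without waiting
client = FakeRedis(clock=lambda: now[0])
store_short_lived(client, "bid-context:abc", "payload")
print(client.get("bid-context:abc"))  # still present: payload

now[0] += ONE_HOUR_SECONDS
print(client.get("bid-context:abc"))  # expired: None
```

With expiry handled by the store itself, no cleanup job is needed for data of this kind, which is part of what makes Redis attractive for the hours-to-days range.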

Another transformative benefit is the simplicity of enabling persistence (either AOF or RDB) on Memorystore. Our Kubernetes DIY Redis cluster didn’t support persistence, which limited our use cases to only caching scenarios with temporary data that we could afford to lose. With one-click persistence on Memorystore, we’re able to extend use cases and even mix use cases within the same cluster to increase utilization and lower our costs. 

In our business, every second counts. By freeing up our team to focus on core business innovation rather than infrastructure management, Memorystore is helping us stay competitive and deliver better results for our advertisers and publishers.

Get started

Ready to get started with your own Memorystore implementation? 
