Shazam: saving through scaling with Google Cloud Platform

Shazam is one of the most popular mobile apps in the world, with over 1 billion downloads since its launch in 2008. Users Shazam over 20 million times each day, trusting the app to match each song against its database of digital fingerprints in seconds. On occasions such as the Super Bowl or the Grammy Awards, Shazam’s dedicated GPUs face traffic peaks that require server capacity well above normal levels. That’s why Shazam avoids wasteful over-provisioning by autoscaling instances on Google Cloud Platform.

"We'd been running on our own bespoke non-elastic GPU infrastructure for about five years and had begun looking at ways to improve the flexibility of our deployment processes. Other cloud GPUs weren't fast enough for our needs, but the Google Cloud Platform tooling ecosystem, its pricing, and Google’s reputation convinced us. Rather than a lift and shift from our infrastructure to Google, we were able to build something from scratch in a little under two months. The team did a great job, but that’s also testament to the tooling in GCP. We were able to jump in with both feet and just go." - Ben Belchak, Head of Site Reliability Engineering, Shazam

Matching spikes in demand without over-provisioning

Apps are often subject to significant spikes in demand above normal levels, pushing companies to supply additional server capacity that stands idle for most of the year. To deliver high availability during major public events, Shazam maintained bare-metal servers on year-round contracts that were only used for a few hours of peak demand. The team looked for a more economical way to scale at speed for those events, with pricing that accurately reflected its needs.

To do that, Shazam chose Google Cloud Platform. Rapid autoscaling on Compute Engine means the team can provision for 50% of maximum demand instead of 100%, confident that GCP will respond to spikes when they occur. Compute Engine delivers new servers in minutes instead of the months experienced with bare-metal hardware, then shuts them down when they are no longer needed. Thanks to GCP’s pricing system, the company pays only for the minutes it uses. Stability is enhanced, too. Because Shazam replaced bare-metal servers hosting 8 virtual GPUs with VMs containing only one, a failing node causes less damage and can be repaired in minutes instead of hours. The need for failover server arrays has also been reduced: monitoring on GCP swiftly detects VM failures and automatically orders replacements that are ready in minutes, rather than leaving staff with problems that could take hours to fix.
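The cost argument above can be illustrated with a rough back-of-the-envelope sketch. Every figure here (node counts, peak hours, hourly rate) is a hypothetical placeholder rather than Shazam's actual data, and the sketch assumes, for simplicity, that a fixed node and an autoscaled node cost the same per hour. It compares keeping peak capacity running all year with keeping a 50% baseline and paying for extra nodes only during peak hours.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours in a non-leap year


def fixed_cost(peak_nodes, hourly_rate):
    """Cost of keeping enough nodes for peak demand running all year."""
    return peak_nodes * hourly_rate * HOURS_PER_YEAR


def autoscaled_cost(baseline_nodes, peak_nodes, peak_hours, hourly_rate):
    """Cost of a year-round baseline plus extra nodes billed only during peaks."""
    baseline = baseline_nodes * hourly_rate * HOURS_PER_YEAR
    burst = (peak_nodes - baseline_nodes) * hourly_rate * peak_hours
    return baseline + burst


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
peak_nodes = 100      # capacity needed during a major event
baseline_nodes = 50   # 50% of maximum demand, as described in the text
peak_hours = 10       # total hours per year spent at peak
hourly_rate = 1.0     # same assumed price per node-hour in both scenarios

fixed = fixed_cost(peak_nodes, hourly_rate)
auto = autoscaled_cost(baseline_nodes, peak_nodes, peak_hours, hourly_rate)
savings = 1 - auto / fixed

print(f"fixed: {fixed:.0f}, autoscaled: {auto:.0f}, savings: {savings:.1%}")
```

Under these assumptions the saving approaches the 50% of capacity that no longer runs year-round, because the few peak hours add very little to the bill. Real savings would depend on actual prices and traffic patterns.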

"The GCP interface is great. Almost every time you try to do something in the Google UI there’s a corresponding API call or a gcloud command there for you. We've used those a lot. We started off building everything by hand, clicking buttons and seeing if we liked the things that were built. Once we’d built something we liked, we could use the commands and API calls to automate it with tooling that we built on our side. It's a really good way to approach building things in the cloud, and something we really enjoyed." - Ben Belchak, Head of Site Reliability Engineering, Shazam

The snowball effect

Less than a month after adopting GCP, Shazam runs a third of its traffic through the platform. Now exploring GCP products beyond Compute Engine, the company anticipates economies of scale, with cost savings growing as more of its services are migrated.

"Other cloud providers are playing catch up on some of the tooling Google has built. Take the rapidity with which we could deploy this new service. I don't think we could have done that on any other platform. Google has kind of hit a snowballing effect, where more things are coming out all the time and existing products are always being improved. It’s compelling for us to know that the products we're using will continue to get better." - Ben Belchak, Head of Site Reliability Engineering, Shazam