Spotify chooses Google Cloud Platform to power data infrastructure
Guillaume Leygues
Lead Sales Engineer, Google Cloud Platform
It’s not every day you move a 75 million+ user company from a home-grown infrastructure to the cloud. But if you use Spotify, more and more of your musical experience will be delivered by Google Cloud Platform over the coming weeks and months — we’re partnering on an ambitious project to move Spotify’s backend into GCP.
Spotify aims to make music special for everyone. Today, the company hosts more than 2 billion playlists and gives consumers access to more than 30 million songs. Users can search for music across any device by artist, album, genre, playlist or record label, while features like Discover Weekly suggest personalized playlists for millions of people around the world.
Spotify has had engineers running its core infrastructure and buying or leasing data-center space, PC hardware and networking gear to provide a seamless experience for users. Time and again, though, the company asked whether it was worth tying up resources that could otherwise focus on innovative features and software.
Recently Spotify decided it didn’t want to be in the data center business, and chose Cloud Platform over the public cloud competition after careful review and testing. The company split its migration to Cloud Platform into two streams: a services track and a data track. Spotify runs its products on a multitude of tiny microservices, several of which are now being moved from on-premises data centers into Google’s cloud using our Cloud Storage, Compute Engine and other products.
With Compute Engine, teams can rely on consistent performance from ultra-high-IOPS SSD and local SSD storage capabilities. And with autoscaling, they can build resilient and cost-efficient applications that use just the resources they need at any given time. For storage, Spotify is now implementing Google Cloud Datastore and Google Cloud Bigtable. This rich fabric of storage services lets engineers work on complex back-end logic, instead of focusing on how to store the data and maintain databases. Spotify is also deploying Google’s Cloud Networking services, such as Direct Peering, Cloud VPN and Cloud Router, to transfer petabytes of data. This results in a fast, reliable and secure experience for users around the globe.
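To make the autoscaling idea concrete, here is a minimal sketch of target-utilization scaling. It is a simplified model written for illustration, not the actual Compute Engine autoscaler logic: the function name, its parameters and the default bounds are all assumptions. It asks how many instances would be needed so that each one carries roughly the target CPU load, then clamps the answer to configured limits.

```python
def recommended_instances(current_instances, observed_cpu_percent,
                          target_cpu_percent, min_instances=1, max_instances=10):
    """Return the instance count that would bring average CPU near the target.

    Simplified model (an assumption, not Google's autoscaler algorithm):
    total load is current_instances * observed_cpu_percent, and we want
    enough instances that each one runs at about target_cpu_percent.
    Percentages are integers so the arithmetic stays exact.
    """
    if not 0 < target_cpu_percent <= 100:
        raise ValueError("target_cpu_percent must be between 1 and 100")
    total_load = current_instances * observed_cpu_percent
    # Ceiling division: round up so we never under-provision.
    needed = -(-total_load // target_cpu_percent)
    # Clamp so the group never shrinks to zero or grows without limit.
    return max(min_instances, min(max_instances, needed))
```

Under this model, four instances running at 90% CPU against a 60% target would scale out to six, while the same four instances at 30% would scale in to two.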
On the data side of things, the company is adopting an entirely new technology stack. This means moving from Hadoop, MapReduce, Hive and a series of home-grown dashboarding tools to the latest in data processing tools, including Google Cloud Pub/Sub, Google Cloud Dataflow, Google BigQuery, and Google Cloud Dataproc.
With BigQuery and Cloud Dataproc, data teams can run complex queries and get answers in a minute or two, rather than hours. This lets Spotify perform more frequent, in-depth, interactive analysis, guiding product development, feature testing and more intelligent user-facing features. To gather and forward all events to its ecosystem, Spotify is using Cloud Pub/Sub, Google’s global service for messaging and streaming data. This gives teams the ability to process hundreds of thousands of messages per second, in a reliable, no-ops manner. And to power its ETL workloads, Spotify is deploying Cloud Dataflow, Google’s data processing service. This lets the company rely on a single cloud-based managed service for both batch and stream processing.
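The appeal of one service for both batch and stream processing is that the same transformation logic can serve both. The sketch below illustrates that idea in plain Python. It does not use the Cloud Dataflow SDK or Cloud Pub/Sub, and the event fields (`type`, `track`), the function names and the sample data are hypothetical, chosen only for illustration.

```python
from collections import Counter

# Hypothetical sample events; a real pipeline would read these from a
# file (batch) or a message subscription (stream).
EVENTS = [
    {"type": "play", "track": "a"},
    {"type": "play", "track": "b"},
    {"type": "skip", "track": "a"},
    {"type": "play", "track": "a"},
    {"type": "play", "track": "b"},
]


def count_plays(events):
    """Count play events per track; accepts any iterable of event dicts."""
    counts = Counter()
    for event in events:
        if event.get("type") == "play":
            counts[event["track"]] += 1
    return counts


def windowed(events, window_size):
    """Split a possibly unbounded event stream into fixed-size windows."""
    window = []
    for event in events:
        window.append(event)
        if len(window) == window_size:
            yield window
            window = []
    if window:
        # Emit the final partial window when the stream ends.
        yield window


# Batch: run the transform once over the whole bounded dataset.
batch_counts = count_plays(EVENTS)

# Stream: run the same transform on each window as events arrive.
stream_counts = [count_plays(w) for w in windowed(iter(EVENTS), 2)]
```

Here `count_plays` is written once and reused unchanged. Only the way events are grouped differs between the batch run and the windowed, stream-style run.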
What makes us most excited to work with Spotify is their company-wide focus on forward-looking user experiences. Now that they’ve begun using Google Cloud Platform, we can’t wait to see what Spotify builds next.
Join us for the GCP NEXT 2016 opening keynote, where we’ll feature a talk from Nicholas Harteau, VP of Engineering and Infrastructure at Spotify. You can also attend Spotify-led technical sessions to learn more about how the company is deploying Google BigQuery and Google Cloud Dataflow.