Priori Data builds big data analytics capabilities using Google Cloud Platform

App intelligence company Priori Data needed a comprehensive platform that would store an unlimited amount of data at a reasonable cost, provide querying tools for turning raw data into intelligence and quickly scale storage on demand. The company selected Google Cloud Platform because it can process vast amounts of data quickly, makes it easy to share data with customers and frees its engineers to work on data analytics rather than admin tasks.

Handling big data on a limited budget

Berlin-based Priori Data provides competitive intelligence and market data about mobile app downloads and revenue to stakeholders in the mobile app economy. The company was founded in 2013 as a five-person team with a big ambition: to provide market-based information about mobile apps, which would require gathering, analyzing and distributing data from multiple sources.

At launch, the company used non-cloud storage and a SQL-based system, which it quickly outgrew.

“We found the system incredibly limiting and unable to adapt to the scale we needed,” says Priori Data Founder and CEO Patrick Kane. “If we continued with it, we would have had to make significant investments to host our own data, which would have been extremely expensive.”

Instead, Priori Data turned to Google Cloud Platform.

“Google Cloud Platform was a godsend,” Kane says. “We only had to pay for what we used, and its scalability handled as much data as we needed. It also relieved us of having to hire a dedicated system administrator. When you’re working with a small budget, that’s a very big benefit.”

Putting queries on steroids

Priori Data gathers information from app stores and other sources around the world and stores it using Google Cloud Storage.

“The benefits of Google Cloud Storage go well beyond the immediate cost savings,” Kane explains. “If you pay a lot for storage, you have to make decisions up front about what data you want to preserve and what you don’t. With Google Cloud Storage we can keep data even if we don’t want to use it right away. That’s paid off for us because we’ve turned that data into information with commercial value.”

Priori Data uses Google BigQuery to process raw data and turn it into useful information for its customers. Queries that previously took hours now take minutes. This allows the company to deliver usable information, such as historical trend charts showing how competitors' pricing strategies have changed over time, to its customers more quickly. Because it can run more queries in a given time frame, it can also provide more in-depth intelligence.


Google Cloud Platform (GCP) also makes it easier for Priori Data to share information with its customers. The company uses Google Cloud Pub/Sub to pass messages between its various GCP services, and makes data available to customers via a Ruby on Rails web application hosted on Google Compute Engine.

“What’s great about Google Cloud Platform is its integration — all of the services speak the same language and are plug-and-play,” Kane says. “That gives us a centralized tool for all of our storage, processing and querying.”

Developer synergy and new markets

Using GCP, Priori Data has leveraged a small team to process vast amounts of data, share it with customers and partners and expand to new markets.

“App stores are global marketplaces,” Kane says. “Because of the affordability of Google Cloud Storage, we didn’t have to limit our scope to a specific geographic region, or limit how deeply into the data we went. We can target new markets without worrying about costs.”

He says an unintended benefit of GCP is that the integrated way its services work together has unleashed his staff’s creativity.

“Moving to Google Cloud Platform brought together two functional divisions in our company: the developer-engineer side and the data science side,” he says. “Having a single location for collecting information and processing it has resulted in synergy between the teams.”

Kane says GCP has been a key component of Priori Data’s success.

“On a biweekly basis, we query and export about 150 billion data objects, which totals about 100 terabytes of raw data,” he says. “Google Cloud Platform does the heavy lifting for us and does it at incredible speeds. We use 28 Google Compute Engine instances and don’t need a dedicated system admin for it. We’re covering 58 countries right now and expect to bring that up to 125 in the coming year, while tripling the number of our data partners. We can do all that because Google Cloud Platform is a force multiplier, letting us accomplish more with less effort and cost.”