Cabify: Taking a data-driven approach to ride-sharing using Google BigQuery

About Cabify

Operating in Spain, Portugal, and Latin America, Cabify enables its customers to hire a private vehicle and driver through its Cabify smartphone app.

Industries: Technology
Location: Spain

About Equinix

Global consultancy Equinix supports customers on their digital transformation journey by leveraging the power of digital ecosystems.

Cabify built its data pipeline on Google Cloud Platform around Google BigQuery, giving its data scientists more up-to-date data and reducing its operational burden through managed services.

Google Cloud Results

  • Reduces operational burden with managed services including Google App Engine and Cloud Functions
  • Integrates easily with a multi-cloud environment enabling simple data ingestion via Cloud Pub/Sub
  • Helps to control costs with on-demand query billing for BigQuery

Increased data agility facilitates profitable growth

Globally, the sharing economy relies on trust, but how people decide who they can trust varies from country to country. By tailoring the ride-hailing app model to Latin America, and focusing on the safety of its customers and the wellbeing of its drivers, Spanish company Cabify has grown rapidly over the past decade and now operates in 11 countries across Europe and Latin America. "It's a fast-moving sector, with new competitors, new regulations, and new services appearing all the time," says Sebastian Barrios, CTO/VP of Technology at Cabify.

Users book their ride through an app on their smartphone. With drivers operating in over 40 cities, that means tens of thousands of drivers reporting their location every four or five seconds. "We're dealing with real-life scenarios, where any downtime means missed customers and less food on the table for our drivers," says Sebastian. "It's a big responsibility."
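To give a rough sense of that scale, the reporting rate can be estimated with simple arithmetic. The snippet below is only a back-of-the-envelope sketch: the fleet size of 20,000 is a hypothetical stand-in for "tens of thousands", and the function name is illustrative rather than anything Cabify uses.

```python
def location_updates_per_second(drivers: int, interval_seconds: float) -> float:
    """Average reports per second if every driver reports once per interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return drivers / interval_seconds


# Hypothetical fleet size; the article only says "tens of thousands".
ASSUMED_DRIVERS = 20_000

low = location_updates_per_second(ASSUMED_DRIVERS, 5)   # one report every five seconds
high = location_updates_per_second(ASSUMED_DRIVERS, 4)  # one report every four seconds
print(f"{low:.0f} to {high:.0f} location updates per second")
```

Under that assumption, the platform would need to absorb several thousand location updates every second, before counting any other event type.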

Operating in a multi-cloud environment, Cabify wanted to improve the performance of its data processing and develop its data science infrastructure. It chose Google Cloud Platform (GCP) as the best fit for its needs.

"Our strategy is to outsource operational work as much as possible, and host as little as possible ourselves," says Sebastian. "That means we can focus on building the business towards long-term goals, such as diversifying our offering to cover all mobility needs."

Building a new data pipeline

When you're competing in a fast-paced market, getting quick feedback on features through data insights is important to staying ahead. Cabify wanted to improve the performance of its data infrastructure, to gain agility and improve reliability. "Every few seconds, we have to calculate the shortest route between various points, taking into account real-time traffic and different on-demand user needs," explains Sebastian. "Unreliable mobile networks make it even more complicated, so we wanted to optimize our data processing environments."

To do that, in 2018 Cabify migrated its data processing infrastructure to GCP, integrating new data science tools with the data pipeline on GCP. "All our apps, regardless of which cloud provider they are hosted on, report to GCP through Cloud Pub/Sub," says Sebastian. "Pub/Sub is great because it scales and responds automatically. We're currently moving towards a microservices architecture and, using Pub/Sub, we have been able to make our infrastructure event-driven, which reduces coupling, increases resiliency, and allows our teams to develop at a faster pace."
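The decoupling Sebastian describes can be illustrated with a toy publish/subscribe broker. This is a minimal in-memory sketch, not the Cloud Pub/Sub client API and not Cabify's code: the class `InMemoryBroker`, the topic name `driver-location`, and the event fields are all hypothetical, chosen only to show that a publisher does not need to know which services consume its events.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, object]
Handler = Callable[[Event], None]


class InMemoryBroker:
    """Toy stand-in for a message broker; not the Cloud Pub/Sub client API."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every event on the topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> int:
        """Deliver the event to each subscriber of the topic; return the count."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(event)
        return len(handlers)


broker = InMemoryBroker()
analytics_rows: List[Event] = []
notifications: List[str] = []

# Two independent consumers; the publisher knows nothing about either.
broker.subscribe("driver-location", analytics_rows.append)
broker.subscribe(
    "driver-location",
    lambda e: notifications.append(f"driver {e['driver_id']} moved"),
)

delivered = broker.publish(
    "driver-location",
    {"driver_id": "d-1", "lat": 40.4168, "lon": -3.7038},
)
print(delivered, notifications)
```

Adding a third consumer here means one more `subscribe` call and no change to the publisher, which is the reduced coupling the quote refers to. A managed service adds durability, retries, and scaling that this sketch leaves out.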

Data is gathered from multiple sources including drivers' cars and scooters, then transferred to Cloud Storage or Cloud SQL (or held in Cloud Bigtable if it needs additional processing) before being transferred to BigQuery. "Using BigQuery helps us to keep our costs down as we are only billed for the queries we run and the data used," says Sebastian. "It also offers excellent performance, as queries take seconds, whether you're processing megabytes or terabytes of data."
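The cost model in that quote, paying per query in proportion to the data it processes, comes down to one multiplication. The sketch below assumes a flat per-TiB rate and ignores any free tier or minimum charge; the rate of 5.0 is a placeholder for illustration, not a quoted Google price, and the function name is hypothetical.

```python
TIB = 2 ** 40  # bytes in one tebibyte
MIB = 2 ** 20  # bytes in one mebibyte


def on_demand_query_cost(bytes_processed: int, price_per_tib_usd: float) -> float:
    """Estimated cost of one query under a flat per-TiB on-demand rate."""
    if bytes_processed < 0 or price_per_tib_usd < 0:
        raise ValueError("inputs must be non-negative")
    return bytes_processed / TIB * price_per_tib_usd


# Placeholder rate for illustration only; check current pricing before relying on it.
PLACEHOLDER_PRICE_PER_TIB = 5.0

small = on_demand_query_cost(50 * MIB, PLACEHOLDER_PRICE_PER_TIB)  # a 50 MiB scan
large = on_demand_query_cost(2 * TIB, PLACEHOLDER_PRICE_PER_TIB)   # a 2 TiB scan
print(f"small query: ${small:.6f}, large query: ${large:.2f}")
```

Because cost tracks bytes processed rather than provisioned capacity, a team that runs few or narrow queries pays little, which fits the cost-control point Sebastian makes.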

Automating and optimizing non-critical tasks

Cabify also uses a number of GCP tools to automate and optimize internal processes. "We use Cloud Vision API to automatically process various driver and customer documents that are scanned into the system," says Sebastian. "We also use App Engine and Cloud Functions for our voice platform, where users can request a service through a voice assistant, as well as for bots on our internal platform. And we use Firebase Cloud Messaging for the push notifications that tell our users when their car is about to arrive."

"It's great that these tasks can be carried out on our behalf in a reliable and cost-efficient way, as we want to focus on our core value-adding solutions," adds Sebastian.

Supporting profitable business growth

Following the implementation of its new data infrastructure on GCP, Cabify has improved its data turnaround. "Thanks to BigQuery, we're able to work with near real-time data," says Sebastian. "Before, we were working with data that was 24 hours behind, whereas now it's a couple of minutes behind real time. That means our data science team is much more confident taking a KPI-led approach. We can define parameters and then track their impact much more accurately."

Queries are also much faster: those that used to take minutes or hours now take seconds, which makes the data science team more agile. From a financial perspective, GCP has also proved to be the right choice. "Cost control is very important to us, as doing business profitably is one of our key differentiators," says Sebastian. "We really like the pay-as-you-use ethos with GCP."

Now, Cabify plans to migrate the rest of its core infrastructure to GCP. "We have already migrated our testing and development environments across to Compute Engine and are planning to migrate a lot of our infrastructure next year. We anticipate an overall cost saving of around 30 percent."

"Our goal is to diversify our offering with different services that cover all our customers' mobility needs, and to continue growing the existing business profitably," says Sebastian. "That's a pretty exciting challenge, and with GCP we can focus on development, not operations."
