
African super app Yassir delivers on data with BigQuery migration

February 28, 2025
Hamdi Amroun

Head of AI, Yassir

Maniganda Perumal

Data Platform Lead, Yassir

Yassir is a super app, supporting the daily lives of users in more than 45 cities across Algeria, Morocco, Tunisia, South Africa, and Senegal who rely on our ride-hailing, last-mile delivery, and financial services solutions. These users are both consumers and vendors — including drivers, couriers, restaurants, and more — that use our platform to run their businesses. 

At Yassir, we process a wide variety of datasets to ensure we provide the best and most reliable solutions for our users across all of our offerings, and we depend on that data to continually improve those services. However, our previous infrastructure made unifying data and AI difficult. 

Previously, we had two separate data systems: one using Databricks for training and deploying machine learning models, and another using Google Cloud and BigQuery for storing and analyzing data. This setup led to several issues, such as formatting incompatibilities that we could not resolve. In addition, retrieving data from Databricks for processing within Google Cloud wasn’t possible, and this disconnect directly impacted our application performance.

These siloed environments meant our teams often had to duplicate work to develop and maintain data projects and pay to run separate environments. Despite all of this, teams still could not get the information they needed at the desired pace.

To address these issues, we decided to consolidate our data infrastructure with Google Cloud to bring all of these functions into one place. This migration would allow us to provide better access to data and more scalability, and create new opportunities to analyze, review, and improve performance.

Creating a more flexible, unified data platform

Our existing relationship with the Google Cloud team provided a strong foundation to not only resolve our data connectivity roadblocks but also implement new data processing workflows using BigQuery and deploy new AI and machine learning models with Vertex AI. Consolidating with a single data provider also gave us a centralized place to review and control expenses as well as simple, centralized data governance controls. As a growing company, being able to scale our cloud usage up or down to optimize costs allows us to test and iterate without a hard commitment to every project, and that flexibility is invaluable. 

We worked closely with the Google Cloud team to design a solution that aligns with our growth goals. This meant participating in technical and strategic workshops to help train our team on the ins and outs of BigQuery — and its real-time, governance, and open-source capabilities — empowering our engineers with the tools and resources they need to experiment. This collaborative approach allows us to nurture the type of engineering culture we want to promote at Yassir; rather than simply using out-of-the-box solutions, we can tackle more complex problems by adapting flexible, existing solutions to our specific use cases.

After conducting our internal compatibility reviews, we migrated individual models from our previous solution into Vertex AI to test their consistency, and now they’re up and running nearly autonomously. By migrating from Databricks to BigQuery and combining our own models with the models provided by Google Cloud, we’ve improved the performance and efficiency of our machine learning processes and better positioned ourselves for ongoing growth. We may not be processing petabytes of data yet, but we know that we have the capability to do so when needed.
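
One way to picture the consistency test mentioned above is to score the same inputs with the old and the migrated deployment and compare the outputs within a tolerance. The following is a minimal sketch under that assumption, not a description of Yassir's actual checks; the function name, the tolerances, and the sample scores are all hypothetical.

```python
import math


def predictions_consistent(old_preds, new_preds, rel_tol=1e-6, abs_tol=1e-9):
    """Return True if both deployments give matching scores for the same inputs.

    old_preds / new_preds are hypothetical lists of numeric scores produced by
    the previous and the migrated model for an identical batch of inputs.
    """
    if len(old_preds) != len(new_preds):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(old_preds, new_preds)
    )


# Illustrative scores only; these are not real model outputs.
old_scores = [0.12, 0.87, 0.45]
migrated_scores = [0.12, 0.87, 0.4500000001]  # tiny numeric drift
drifted_scores = [0.12, 0.80, 0.45]           # a real behavioral change

same = predictions_consistent(old_scores, migrated_scores)
changed = predictions_consistent(old_scores, drifted_scores)
```

A tolerance-based comparison is used here because small floating-point differences between serving environments are expected, while larger gaps suggest the migrated model behaves differently.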

https://storage.googleapis.com/gweb-cloudblog-publish/images/image1_tPfY8QD.max-900x900.png

Evolving from data processing to data insights

Our previously disconnected data solutions made it difficult to provide secure access to specific data for specific teams. Since we stored our data in BigQuery but deployed models with Databricks, granting a user or a team access to any information meant giving them the keys to everything. Now, we can implement role-based access controls and use Infrastructure as Code (IaC) Terraform scripts to automatically grant and revoke access to datasets for individuals or teams. Sharing data through Looker Studio Pro and directly providing access to BigQuery tables for our more technical users also means we can ensure the required data reaches the right users.
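
The automated granting and revoking described above can be thought of as a reconciliation between the access that is declared and the access that currently exists, which is the general idea behind declarative tools such as Terraform. The sketch below illustrates that idea in plain Python. It is not Yassir's Terraform configuration and does not call any Google Cloud API; the `Grant` type, dataset names, principals, and roles are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    """One access binding: a principal holding a role on a dataset."""
    dataset: str
    principal: str
    role: str


def plan_access_changes(desired, current):
    """Return (to_grant, to_revoke) so that current access converges on desired."""
    to_grant = desired - current    # declared but not yet in place
    to_revoke = current - desired   # in place but no longer declared
    return to_grant, to_revoke


# Hypothetical declared state (what a team's configuration says should exist).
desired = {
    Grant("rides_analytics", "group:ops-analysts@example.com", "READER"),
    Grant("payments", "group:finance@example.com", "READER"),
}

# Hypothetical live state (what is actually granted right now).
current = {
    Grant("rides_analytics", "group:ops-analysts@example.com", "READER"),
    Grant("payments", "user:former-contractor@example.com", "WRITER"),
}

to_grant, to_revoke = plan_access_changes(desired, current)
```

In this example the finance group would gain read access to the payments dataset and the former contractor's write access would be revoked, while the unchanged binding is left alone.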

With our data unified in BigQuery and connected to our machine learning models, we can better support everything from customer growth and retention to marketplace optimization by providing insights into product usage, customer data, and more. To ensure we’re hitting our internal and customer-related goals, we build dashboards on our operational and analytical datasets and monitor them closely.

Our operational dashboards give our sales and marketing teams the insights they need to better target and reach merchants and consumers. They also include insights into our staffing processes, helping us to gradually reduce delivery times, complete more rides faster, and improve how we support specific markets. We also have product-level detection and monitoring that help us provide real-time dynamic pricing and identify fraudulent trips and orders. Each data point we collect gives us more opportunities to build a more personalized and consistent customer experience. 

Our leadership team relies on our rapidly available datasets to drive strategic decision-making, including regional investment decisions to grow the business, macro-level plans for growth trajectories and marketing budgets, and identification of the areas of the business that need the most support or attention. These roadmap decisions are core to our overall growth strategy, and they wouldn’t be possible without the flexibility and scalability we’ve been able to achieve with BigQuery.
