
DevOps Award winner Moloco on ‘accelerating machine learning with DevOps’

August 16, 2023


Rob Martin

Senior Engineering Manager - Data & Machine Learning Infrastructure

Outlined in this post are the applications and achievements that earned Moloco the ‘Accelerating Machine Learning with DevOps’ award in the 2022 DevOps Awards. If you want to learn about the other winners and how they used DORA metrics and practices to grow their businesses, visit DORA.

Moloco - A leader in machine learning

Moloco is a global Machine Learning (ML) company with a mission to empower businesses of all sizes to grow through operational machine learning. Our team of data scientists, machine learning engineers, and software engineers develops highly accurate predictive models to monetize apps via user acquisition, in-app purchases, and other desired outcomes.

Founded in 2013 by engineers from Google and Oracle, Moloco entered mobile performance advertising because the sheer scale of its data processing made it an ideal application for building one of the world’s most advanced privately owned ML-powered engines.

Over the past ten years, Moloco has refined and optimized this engine specifically for mobile app advertising. As of 2023, our portfolio has expanded into additional areas of opportunity including retail media advertising and monetization for video streaming. As we continue to grow and expand, we’ve attracted top talent from Meta, LinkedIn, Microsoft, and Apple. Our vision is to bring on the best minds in machine learning to further scale and evolve Moloco’s ML platform capabilities.

Today, our ML-powered Demand Side Platform (DSP) operates at massive scale, processing ad requests from dozens of ad networks representing an ability to reach more than 6B devices globally.

Moloco scale: 14 trillion ad bid requests per month

Breaking down those 14,000,000,000,000 bid requests equates to up to 600 billion bid requests daily, requiring we ingest… 10 petabytes of data every day, more than most companies process in a lifetime.

Due to our precise model pricing, we filter down to 6 million requests processed per second through TensorFlow models on GKE.
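As a rough sanity check on these figures, the monthly and daily totals can be converted to per-second rates. The sketch below is simple arithmetic on the numbers quoted above. The 30-day month is an assumption, since the post does not say which month length it uses, and the variable names are illustrative.

```python
# Back-of-the-envelope conversion of the traffic figures quoted in the post.
# Assumption: a 30-day month (the post does not specify one).
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30

monthly_bid_requests = 14_000_000_000_000  # 14 trillion per month (from the post)
peak_daily_bid_requests = 600_000_000_000  # "up to 600 billion" per day (from the post)


def per_second(requests_per_day: float) -> float:
    """Convert a daily request count to an average per-second rate."""
    return requests_per_day / SECONDS_PER_DAY


average_daily = monthly_bid_requests / DAYS_PER_MONTH
average_per_second = per_second(average_daily)
peak_per_second = per_second(peak_daily_bid_requests)

print(f"average per day:    {average_daily / 1e9:.0f} billion")
print(f"average per second: {average_per_second / 1e6:.1f} million")
print(f"peak per second:    {peak_per_second / 1e6:.1f} million")
```

Under these assumptions the monthly total averages roughly 467 billion requests per day, or about 5.4 million per second, and the 600 billion daily peak corresponds to about 6.9 million per second. That is the same order of magnitude as the 6 million requests per second quoted for model inference.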

Ad exchanges require a response within 100ms of our receiving the bid request.

Network latency reduces the time actually available for processing to 50-80ms. In that time we must:

Select the ad creative most relevant to the target user
Select the ad format suitable for the space purchased
Determine the bid price for a real-time auction

All of this data processing and decisioning requires about 10 Deep Neural Network (DNN) ML models, each of which takes about 10ms for a prediction, an order of magnitude faster than most companies are capable of.
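One way to read these numbers: ten predictions at about 10ms each would add up to about 100ms if they ran strictly one after another, which is more than the 50-80ms left after network latency. The post does not describe how the models are actually scheduled, so the sketch below only compares two hypothetical extremes (fully sequential and fully parallel) against the stated budget. All function and constant names are illustrative.

```python
# Latency budget sketch using the figures from the post.
# Assumption: the real scheduling of the models is not described in the post;
# "sequential" and "parallel" are two hypothetical extremes for comparison.
MODEL_COUNT = 10                 # "about 10" DNN models
MODEL_LATENCY_MS = 10.0          # "about 10ms" per prediction
BUDGET_RANGE_MS = (50.0, 80.0)   # processing time left after network latency


def sequential_latency_ms(count: int, per_model_ms: float) -> float:
    """Total latency if every model waits for the previous one to finish."""
    return count * per_model_ms


def parallel_latency_ms(count: int, per_model_ms: float) -> float:
    """Total latency if all models run at once (bounded by the slowest model)."""
    return per_model_ms if count > 0 else 0.0


def fits(latency_ms: float, budget_ms: float) -> bool:
    """True when the latency stays within the processing budget."""
    return latency_ms <= budget_ms


sequential = sequential_latency_ms(MODEL_COUNT, MODEL_LATENCY_MS)
parallel = parallel_latency_ms(MODEL_COUNT, MODEL_LATENCY_MS)

for budget in BUDGET_RANGE_MS:
    print(
        f"budget {budget:.0f}ms: "
        f"sequential {sequential:.0f}ms fits={fits(sequential, budget)}, "
        f"parallel {parallel:.0f}ms fits={fits(parallel, budget)}"
    )
```

With these inputs the strictly sequential case misses both ends of the budget range, while the fully parallel case fits comfortably. A real serving path would presumably fall somewhere in between.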

Our DNNs also require persistent data from databases that answer queries in 1-5 milliseconds, or even nanoseconds, to return bid results.

This capability enables us to win bids and produce data-driven creatives with high conversion rates that win praise from our customers.

ML models in constant motion

To maintain optimal data processing, the entire system is continually retrained and redeployed to ensure performance remains competitive — in an environment that changes by the second, and can literally change with the weather.

Leveraging the scale of cloud infrastructure

Moloco’s lean team of only a couple hundred engineers built and operates this powerful privately owned ML-powered engine. As a team working for a bootstrapped startup, they quickly realized the value of leveraging planet-scale public cloud infrastructure, extensively applying ML automation, and consistently implementing DevOps best practices. Here’s how Moloco built an almost unrivaled machine learning platform on Google Cloud.

Moloco’s founding visionaries intimately understood Google’s pioneering work on scaling infrastructure, having themselves worked on Google’s scalable systems including Colossus, Flume, Bigtable, Borg, TensorFlow, and Dremel during their tenures there. With ready access to analogous systems on Google Cloud (Cloud Storage, Dataflow, Bigtable, GKE, TensorFlow, and BigQuery), our team validated there was indeed ample power to hyper-scale a global ML-powered platform.

Additionally, most Google Cloud services are managed and/or serverless, so Moloco is able to rely on the Google Site Reliability Engineering (SRE) team to operate, manage, and scale the infrastructure. This allows our teams to stay focused on exploring machine learning models and features, creating new product features, and entering new verticals. Together with Google’s global fiber optic network and global regions in the US, EMEA, and APAC, Moloco built and expanded a truly planet-scale machine learning platform.

Putting the “L” in ML

The vast scale of Google Cloud empowers a tremendous amount of automated processing: extracting insights from raw log data, downsampling the results for training models, parsing data so that Moloco’s data scientists can review it and run exploratory queries on trillions of rows of historical data, and joining data across thousands of tables to train models. All of this serves to win bids and fill the space with data-driven intelligence against a 100ms clock, and to repeat that with simultaneous volume at ever-expanding scale.

Results: Delivering machine learning at scale

Thanks to our ML infrastructure, we’ve achieved both technical and financial results. We’ve grown revenue by more than 5x from 2020 to 2022, and had revenues of more than $200M in 2022. Our hand-selected team has scaled Cloud DSP into a truly global deployment with up to 34,000 VM instances at daily peak. The platform’s growth has accelerated in the last two years, now processing more than 6 million bid requests per second with 80ms p95 latency.

To ensure we continually grow as a DevOps organization, we measure our success according to DORA’s four key metrics relating to machine learning models:

Deployment frequency: As we continually retrain and refine our machine learning models, we deploy new model versions to production 24 times per day.
Lead time for changes: When new models are published, thousands of bidding proxies retrieve and switch to the new version within 10 minutes.
Change failure rate: We validate model performance in pre-production via automated processes, leading us to recall less than 1% of our production machine learning models.
Time to restore service: When Moloco detects and corrects issues with a machine learning model, it takes less than 10 minutes to fully re-deploy.
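
To make the four metrics concrete, here is a minimal sketch of how they could be computed from a log of model deployments. The record layout, field names, and sample data are hypothetical and are not Moloco’s actual tooling or numbers. The sketch only illustrates the definitions used in the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class ModelDeployment:
    """Hypothetical record of one model release; field names are illustrative."""
    published_at: datetime                  # new model version published
    live_at: datetime                       # serving fleet has switched to it
    failed_at: Optional[datetime] = None    # problem detected in production
    restored_at: Optional[datetime] = None  # corrected version fully re-deployed


def deployment_frequency_per_day(deployments: List[ModelDeployment], days: float) -> float:
    """Deployment frequency: releases per day over the observed window."""
    return len(deployments) / days


def mean_lead_time(deployments: List[ModelDeployment]) -> timedelta:
    """Lead time for changes: average time from publication to live."""
    total = sum((d.live_at - d.published_at for d in deployments), timedelta())
    return total / len(deployments)


def change_failure_rate(deployments: List[ModelDeployment]) -> float:
    """Change failure rate: share of releases that had to be recalled."""
    failed = sum(1 for d in deployments if d.failed_at is not None)
    return failed / len(deployments)


def mean_time_to_restore(deployments: List[ModelDeployment]) -> Optional[timedelta]:
    """Time to restore: average time from detection to full re-deployment."""
    durations = [
        d.restored_at - d.failed_at
        for d in deployments
        if d.failed_at is not None and d.restored_at is not None
    ]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


# Made-up sample: one release per hour for a day, each live 8 minutes after publication.
start = datetime(2023, 1, 1)
deployments = [
    ModelDeployment(
        published_at=start + timedelta(hours=h),
        live_at=start + timedelta(hours=h, minutes=8),
    )
    for h in range(24)
]
# One release is recalled and replaced 9 minutes after the problem is detected.
deployments[5].failed_at = deployments[5].live_at + timedelta(minutes=20)
deployments[5].restored_at = deployments[5].failed_at + timedelta(minutes=9)

print(deployment_frequency_per_day(deployments, days=1))
print(mean_lead_time(deployments))
print(change_failure_rate(deployments))
print(mean_time_to_restore(deployments))
```

The sample data is chosen only to exercise each function. In particular, its failure rate of one release in 24 is higher than the less than 1% reported above.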

As Moloco continues to scale and evolve, we can rely on the solid foundation provided by Google Cloud to deliver programmatic performance for our customers and explore new frontiers in machine learning.

Stay tuned for the rest of the series highlighting the DevOps Award Winners and read the 2022 State of DevOps report to dive deeper into the DORA research.
