Optimizing Waze ad delivery using TensorFlow over Vertex AI
Stephen Csukas
Software Engineer, Waze
Daniel Marcous
Data Science Lead, Waze
Waze Ads
Waze is the world's largest community-based traffic and navigation app. As part of its offering, it lets advertisers put their businesses on the Waze map, so that ads on Waze reach consumers at key moments of their journey. Goals for advertising on Waze include getting customers to business locations, building brand awareness, and connecting with nearby customers at the right moments.
Waze uses several ad formats, the most prominent of which is called a “Pin”. Like a store sign, Pins inform and remind customers that a business is on or near their route.
Ad Serving @Waze
Waze Ads is a reservation platform, which means we commit to a fixed number of ad impressions in advance and then attempt to meet expected delivery based on the actual drives that occur. It is important to note that Waze only shows ads to users within a certain proximity of the advertised business location. Our ads inventory is thus highly correlated with traffic patterns, i.e. where and when people drive with Waze. After we set up an ads campaign, we choose the right time and place to show its ads so that we deliver on our commitment to the advertiser. We also have a planning tool to predict the quantity of sellable ads inventory based on traffic patterns and campaign setup, but that’s something for a different blog post :)
Following a locked and launched advertising campaign, “the life of a Waze ad” looks something like this:
1. Mobile client connects to server and “asks for pins to show” [Every few minutes for saving battery - this is important for what comes next]
2. Ad server gets request and scans for a list of candidate pins which advertise businesses in a certain proximity to the user’s location
3. Ad server ranks (and logs) all candidates according to internal logic (e.g. distance)
4. Mobile client gets a ranked list and saves it for later use
5. [Over the next few minutes] - map is shown on screen and client logic has the opportunity to show a pin ad
6. Mobile client scans the ranked list and displays a suitable number of pins that can fit the map on the user’s screen and are appropriate for its zoom level
7. Mobile client logs successfully displayed ads
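To make steps 2 and 3 concrete, here is a minimal sketch of a proximity scan followed by a distance-only ranking. It is not our production ad server: the `Pin` type, the `rank_candidates` function, the 5 km radius, and the list limit are all hypothetical names and values chosen for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


@dataclass(frozen=True)
class Pin:
    """Hypothetical candidate pin: an id plus the business location in degrees."""
    pin_id: str
    lat: float
    lon: float


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def rank_candidates(user_lat, user_lon, pins, radius_m=5_000.0, limit=10):
    """Step 2: keep pins within radius_m of the user. Step 3: rank nearest first."""
    scored = [(haversine_m(user_lat, user_lon, p.lat, p.lon), p) for p in pins]
    nearby = [(d, p) for d, p in scored if d <= radius_m]
    nearby.sort(key=lambda item: item[0])
    return [p for _, p in nearby[:limit]]
```

A ranking like this only knows where the user is at request time. It has no notion of where the map will be pointing over the next few minutes, which is what matters in step 6.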
Did you catch the issue in step 6?
Waze is a navigation app, meaning the user is driving!
The user’s visible map on screen constantly changes based on their destination, speed, traffic pattern, etc. These screen changes and alignments are important for providing the best user experience while navigating.
After performing a funnel-like drop analysis, we noticed that step 6 is where we lose ads in the funnel, even though the candidates are selected and ranked by distance from the user (steps 2 and 3). Moreover, the mobile client’s ability to find pins to display (step 6) is a direct result of the ads we choose to send to it (step 3). By making ad ranking (step 3) smarter, we can seamlessly unlock additional pin ads inventory, which helps Waze better uphold its delivery commitments.
What would that include though? Predicting where the user is going? Predicting where they’ll be in the next few minutes?
Unlocking lost inventory using ML
Google’s CEO (Sundar Pichai) once said: “Machine learning is a core, transformative way by which we’re rethinking how we’re doing everything”
As you can imagine, we’ve naturally approached solving this problem with ML.
The problem can easily be formulated as a learning-to-rank ML problem, where we rank candidate ads to maximize the likelihood that they are displayed in the mobile client.
We can debate the exact optimization goal, but ultimately, when we create a list that should serve the mobile client for the next few minutes, we want to meet expected ad delivery in that time window (given a candidate list of the same size).
Maximizing Display Probability
By matching the ad server’s logged candidates with the mobile client’s successfully displayed ads, we can create a labeled dataset to be used for supervised learning.
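The matching described above can be sketched as a simple join between the two logs. The record layout here is an assumption made for illustration: `request_id`, `pin_id`, and `features` are hypothetical field names, and we assume the client log can be tied back to the server request that produced the candidate list.

```python
def build_labeled_rows(candidate_log, display_log):
    """Label each logged candidate 1 if the client later logged it as displayed, else 0.

    candidate_log: iterable of dicts with 'request_id', 'pin_id', 'features' (server side)
    display_log:   iterable of dicts with 'request_id', 'pin_id' (client side)
    """
    # Set of (request, pin) pairs the client reported as successfully displayed.
    displayed = {(row["request_id"], row["pin_id"]) for row in display_log}
    return [
        {
            "request_id": c["request_id"],
            "pin_id": c["pin_id"],
            "features": c["features"],
            "label": 1 if (c["request_id"], c["pin_id"]) in displayed else 0,
        }
        for c in candidate_log
    ]
```

Every candidate the server sent becomes one training row, so pins that were sent but never shown appear as negative examples rather than being dropped.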
As mentioned before, a successful display is based on whether the user’s screen in the next few minutes (after getting the candidate list) will include candidate locations. To optimize that, we need to know the user’s current location, destination, current route (suggested by Waze to follow) and the locations of all candidate pins. We translate the above information to several features to be used in a supervised model.
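As a rough illustration of what distance-based features over these inputs could look like, the sketch below computes three of them for one candidate pin. The feature names, the planar approximation, and the choice of these three particular features are assumptions for the example, not the feature set used in our model.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def to_local_xy(lat, lon, ref_lat, ref_lon):
    """Project (lat, lon) in degrees to meters on a plane around a reference point.

    Equirectangular approximation, reasonable over a few kilometers.
    """
    x = math.radians(lon - ref_lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS_M
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_M
    return x, y


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b (2D tuples in meters)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection of p onto the line so it stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def pin_features(user, destination, route, pin):
    """user, destination, pin: (lat, lon); route: list of at least two (lat, lon) points."""
    ref_lat, ref_lon = user

    def proj(pt):
        return to_local_xy(pt[0], pt[1], ref_lat, ref_lon)

    pin_xy, dest_xy = proj(pin), proj(destination)
    route_xy = [proj(pt) for pt in route]
    return {
        "dist_user_to_pin_m": math.hypot(*pin_xy),
        "dist_pin_to_destination_m": math.hypot(pin_xy[0] - dest_xy[0], pin_xy[1] - dest_xy[1]),
        "dist_pin_to_route_m": min(
            point_segment_distance(pin_xy, a, b) for a, b in zip(route_xy, route_xy[1:])
        ),
    }
```

The intuition is that a pin close to the suggested route is more likely to end up on screen in the next few minutes than one that is equally close to the user but behind them or off to the side.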
The trained model assigns each pin a probability of being displayed, in real time, and these probabilities are taken into account in ranking. Note that they are not the sole contributor to ad ranking, as we still have multiple goals in choosing the right ad to show (e.g. user relevance).
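One simple way to combine a display probability with other ranking goals is a weighted blend, sketched below. This is purely illustrative: the post does not describe how the scores are actually combined, and the `relevance` score, the weights, and the function name are hypothetical.

```python
def rank_with_display_probability(candidates, w_display=0.7, w_relevance=0.3):
    """Rank pins by a weighted blend of display probability and relevance.

    candidates: list of dicts with 'pin_id', 'display_prob', 'relevance' (each in [0, 1]).
    Returns pin ids sorted by blended score, highest first.
    The weights are placeholder values, not tuned or real ones.
    """
    def score(c):
        return w_display * c["display_prob"] + w_relevance * c["relevance"]

    return [c["pin_id"] for c in sorted(candidates, key=score, reverse=True)]
```

Shifting the weights changes the trade-off: a higher `w_display` favors pins that are likely to fit the user’s upcoming map view, while a higher `w_relevance` favors the other goals.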
We chose to use TensorFlow to power this model. We were motivated by two requirements: the need to perform complex feature engineering on numeric (mostly distance-based) features, and the extreme scale of a real-time ad serving use case, with millions of predictions per second and a strict limit of under 70 ms on end-to-end latency.
As avid GCP users, we’ve used the Vertex AI suite to train and deploy this TF model and easily integrate with the rest of our data stack. The resulting architecture looks something like this:
It is worth saying that the above diagram, including the clean separation of concerns (based on FCDS philosophy), took us a few iterations to achieve. We first started with an offline model deployed to Vertex AI models and rigorous A/B testing to demonstrate value, before going for full productionization and automation of this flow (using TensorFlow Extended (TFX) over Vertex Pipelines).
Results
We launched our integration with Vertex AI to power our display probability model in late 2020. With the display probability score incorporated into ad ranking, we observed a lift of up to 19% in pins displayed per session in large markets including the US, Brazil, and France! Vertex AI delivered low-latency predictions within our performance parameters, and CPU-based autoscaling ensured smooth scaling of additional resources as ads traffic changed throughout the day.
Summary
By using ML to rank candidate ads by their display probability, we were able to increase the number of reserved impressions delivered per session, helping us keep our delivery commitments to advertisers.
There were many complexities involved in running ML at this scale in Waze. But luckily, thanks to Vertex AI we didn’t have to worry much about scale, latency, or devops and could focus on the ranking side. This was the first integration of such scale at Waze, and it paved the way for many more use cases in Ads, ETA modeling, drive suggestions and more. It allowed Waze to justify going all in on using TFX in Vertex AI.