Infrastructure options for RTB bidders (part 4)

This article focuses on the main tasks performed by a demand-side platform (DSP). These tasks include handling ad requests, using intelligent data, bidding, and handling events. In general, all of these tasks must happen quickly (within approximately a hundred milliseconds). The latency deadline is set by the sell-side platform (SSP) or by an ad exchange. The rest of this document uses 120 milliseconds as a standard deadline (Google's most common SLA).

This article is part of the following series:

See the overview for ad-tech terminology used throughout this series.


Apart from direct sales to advertisers, publishers also have the option to expose their inventory to programmatic buyers, who buy impressions through a real-time bidding (RTB) system. Publishers might do this to sell their remaining inventory or to reduce their management overhead. In RTB, the publisher's inventory is auctioned to buyers who bid for ad impressions.

To auction their inventory, publishers use SSPs, which work with ad exchanges and DSPs to automatically return the ad that won an auction. Read the bidder section in the overview for more detail.

The following diagram depicts a possible architecture of a DSP system without integrated ad delivery.

Possible architecture of a DSP system without integrated ad delivery

For details about administrative frontends such as campaign and bid managers, see user frontend (in part 1).

Notable differences between this architecture and the one depicted for the ad server described in part 3 include the following:

  • Machine learning prediction happens offline. The predictions are copied into an in-memory store using locality-sensitive hashing to create keys based on unique feature combinations. See other options for quick machine learning serving in the ad serving article (in part 3).
  • Data to be read by the bidders is stored in an in-memory, clustered, NoSQL database for fast reads.

Platform considerations

The platform considerations section (in part 1) covers most of what you need. However, bidders must fulfill a few requirements, as noted in this section.

Geographic locations

To minimize network latency, locate your bidders near major ad exchanges. Close proximity minimizes the round-trip time incurred during communication between bidders and ad exchanges. Ad exchanges often require a response within about 120 milliseconds after the bid request is sent. Timing out too often might affect the DSP's ability to bid:

  • In the short term, the bidder will miss out on the specific auction.
  • In the medium to long term, the SSP might start throttling the DSP and send fewer bid requests.
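As a sketch of how a bidder might protect itself against timing out, the following Python example enforces an internal deadline shorter than the exchange's and returns a no-bid instead of a late response. The budget values and function names are hypothetical assumptions, not a production design:

```python
import asyncio

EXCHANGE_DEADLINE_MS = 120   # deadline set by the exchange
NETWORK_BUDGET_MS = 40       # assumed round-trip allowance; tune per region
INTERNAL_BUDGET_S = (EXCHANGE_DEADLINE_MS - NETWORK_BUDGET_MS) / 1000.0

async def compute_bid(request: dict) -> dict:
    # Placeholder for user matching, segment lookup, and bid optimization.
    await asyncio.sleep(0.005)
    return {"price": 1.25, "ad_id": "ad-42"}

async def handle_bid_request(request: dict):
    """Return a bid, or None (a no-bid) if the internal budget is exceeded."""
    try:
        return await asyncio.wait_for(compute_bid(request), timeout=INTERNAL_BUDGET_S)
    except asyncio.TimeoutError:
        # Better to no-bid than to answer late and risk being throttled.
        return None

bid = asyncio.run(handle_bid_request({"id": "auction-1"}))
```

Answering with an explicit no-bid when the budget is blown keeps the SSP's view of your timeout rate low, which matters for the throttling behavior described above.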

Load balancing

Cloud Load Balancing includes support for 1 million+ queries per second at low latency in over 80 global locations. Your frontend will be available behind a single global IP address, which allows for a simpler DNS setup. Cloud Load Balancing also integrates with Cloud CDN.

When you use Kubernetes, a load balancer distributes traffic to the VM instances, and kube-proxy programs iptables to distribute traffic to endpoints. This method can affect network performance. For example, traffic could arrive at a node that doesn't contain the proper pod, which would add an extra hop. Google Kubernetes Engine (GKE) can eliminate that extra hop by using VPC-native clusters with Alias IPs and container-native load balancing.

Although Cloud Load Balancing is our recommended approach, you might consider setting up your own load balancing on either Compute Engine or GKE. Consider, for example, this frequency-capping use case:

  • A DSP frontend worker processes an ad request.
  • If the request is for a known (unique) user, the worker increments the frequency counters that track how often an ad or campaign has been served to that user.
  • If the user reaches a maximum threshold as specified by a cap, no more bids are placed for that (unique) user for a period of time.
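The steps above can be sketched in Python as follows; the cap values, function names, and in-memory layout are illustrative assumptions, not a production design:

```python
from collections import defaultdict
import time

# Hypothetical cap: stop bidding for a user after 5 ads within 3600 seconds.
MAX_IMPRESSIONS = 5
WINDOW_SECONDS = 3600

# Local cache on the frontend worker: user_id -> list of serve timestamps.
serve_log = defaultdict(list)

def should_bid(user_id: str, now: float = None) -> bool:
    """Return False once the user has hit the frequency cap."""
    now = now or time.time()
    # Drop serves that fall outside the capping window.
    serve_log[user_id] = [t for t in serve_log[user_id] if now - t < WINDOW_SECONDS]
    return len(serve_log[user_id]) < MAX_IMPRESSIONS

def record_serve(user_id: str, now: float = None) -> None:
    serve_log[user_id].append(now or time.time())

for _ in range(MAX_IMPRESSIONS):
    assert should_bid("user-1")
    record_serve("user-1")
print(should_bid("user-1"))  # False: the cap is reached, so no more bids
```

Note that this local cache is only correct if all of a user's requests reach the same worker, which is exactly the affinity question discussed next.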

Because RTB bidders deal with billions of requests per day, if you don't establish affinity between a (unique) user's bid requests and the DSP frontend worker that processes them, you must centralize the incoming events by region to aggregate the counters per (unique) user. The architecture shown previously in the overview depicts this centralized approach: collectors ingest events, Dataflow processes them, and values such as counters are finally incremented through a master Redis node.

An affinitized approach allows the same frontend worker to process all ad requests that pertain to the same (unique) user. The frontend can keep a local cache of its counters; this removes the dependency (for this use case) on the centralized processing. The result is less overhead and decreased latency.

Affinity between a requestor and the processor is usually established in the load balancer by parsing the incoming request's headers. However, ad exchanges typically strip this user information from the headers, so the request payload must be processed instead. Because payload-based affinity is not supported by Cloud Load Balancing, if you set up your own load balancer, consider software such as HAProxy.
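If you build your own affinitized tier, consistent hashing is one common way to pin a (unique) user to a worker while minimizing remapping when workers are added or removed. The following Python sketch illustrates the idea; the worker names are hypothetical:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Maps a user ID to a stable frontend worker, with minimal remapping
    when workers are added or removed."""

    def __init__(self, workers, vnodes=100):
        self.ring = []  # sorted list of (hash, worker) virtual nodes
        for worker in workers:
            for i in range(vnodes):
                h = self._hash(f"{worker}#{i}")
                bisect.insort(self.ring, (h, worker))

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def worker_for(self, user_id: str) -> str:
        # Walk clockwise on the ring to the first virtual node at or
        # after the user's hash, wrapping around at the end.
        h = self._hash(user_id)
        idx = bisect.bisect(self.ring, (h, "")) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["frontend-a", "frontend-b", "frontend-c"])
# All requests for the same user land on the same worker.
assert ring.worker_for("user-123") == ring.worker_for("user-123")
```

In practice the user ID would first be extracted from the request payload, since the headers cannot be relied on.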

Ultimately, you must make a decision: you can choose a managed service that offers a global infrastructure, or you can choose a custom build that can be adapted to specific use cases.

Connecting to your providers

Depending on your relationship with and proximity to the ad exchanges and SSPs, consider the following connection options:

  • Set up an interconnect through one of your partners and have the ad exchange or SSP do the same with the same partners. This option could provide an SLA for uptime, throughput reservation, and reduced egress costs.
  • If the ad exchange or SSP is also a Google Cloud customer, consider setting up VPC Network Peering, because this can reduce your networking costs. Note that this option comes with security implications.

If you are looking to further reduce latency, consider the following options:

  • Use Unix domain sockets, when supported, to communicate with a local data store.
  • Use protocols such as QUIC or UDP when possible.
  • Use protocol buffers for faster serialization and deserialization of structured data between data streams.

Handling bid requests

Bid requests are received by a frontend that has the same scaling requirements as outlined in the frontends section.

Bid requests are commonly serialized in JSON or protobuf format and often include the IP address, ad unit ID, ad size, user details, user agent, auction type, and maximum auction time. Many DSPs and SSPs work with the OpenRTB standard. But regardless of the standard or serialization used, your frontend code needs to parse the payload and extract the required fields and properties.

Your frontend will either discard the bid request by responding with a no-bid response (such as an HTTP status code 204), or it will proceed to the next step.
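As an illustration, the following Python sketch shows the kind of parse-and-validate logic a frontend might run. The field names follow OpenRTB conventions, but the validation rules are simplified assumptions:

```python
import json

REQUIRED_FIELDS = ("id", "imp")  # OpenRTB: auction ID and impression array

def parse_bid_request(payload: bytes):
    """Return the parsed request, or None to signal a no-bid (HTTP 204)."""
    try:
        request = json.loads(payload)
    except json.JSONDecodeError:
        return None
    if any(field not in request for field in REQUIRED_FIELDS):
        return None
    return request

payload = json.dumps({
    "id": "auction-1",
    "imp": [{"id": "1", "banner": {"w": 300, "h": 250}}],
    "device": {"ua": "Mozilla/5.0", "ip": "203.0.113.7"},
}).encode()

request = parse_bid_request(payload)
print(request["id"] if request else "204 No Content")  # auction-1
```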


A bidder performs the following tasks:

  • User matching: Identify the (unique) user.
  • Selecting segments: Retrieve and select the (unique) user's segments and their price.
  • Deciding whether to bid: Some bids are too expensive, and some ad requests might not match any existing campaigns. A bidder should be able to refuse a bid. This refusal saves processing time and resources.
  • Selecting relevant ads: If the bidder decides to bid, then the bidder must also select an ad. Selecting the right ad can improve the odds that a user might click and possibly generate a conversion.
  • Optimizing bids: A bidder should always try to find the minimum bid price that will still win the auction.
  • Building a bid response: Using OpenRTB or a custom application, build and return a bid response serialized in protobuf or JSON format. The response should include information such as the ad URL, the bid, and the win URL endpoint that can be called if the bid wins.
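To illustrate the last task, the following Python sketch builds a simplified OpenRTB-style bid response. The field names follow OpenRTB conventions, but this is a hedged sketch, not a complete or authoritative implementation:

```python
import json

def build_bid_response(request: dict, bid_price: float, ad_url: str, win_url: str) -> bytes:
    """Build a simplified OpenRTB-style bid response."""
    response = {
        "id": request["id"],              # echo the auction ID
        "seatbid": [{
            "bid": [{
                "impid": request["imp"][0]["id"],
                "price": bid_price,       # CPM bid price
                "adm": ad_url,            # ad markup or URL
                "nurl": win_url,          # win-notice endpoint, called if the bid wins
            }]
        }],
    }
    return json.dumps(response).encode()

request = {"id": "auction-1", "imp": [{"id": "1"}]}
response = json.loads(build_bid_response(
    request, bid_price=1.25,
    ad_url="https://ads.example.com/creative/42",
    win_url="https://bidder.example.com/win?auction=auction-1"))
```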

Machine learning often facilitates these bidding tasks. For example, ML can be used to predict the optimal price and whether the bid could be won before making a bid decision. This article focuses on infrastructure decisions, however. For details about training and serving ML models, see the following:

User matching

There are two commonly used ways to identify a (unique) user:

  • For advertising on mobile, you can use a resettable device identifier. Native apps built on Android or iOS platforms have access to the resettable device identifier.
  • For advertising on the web, advertising platforms can use cookies to store a resettable user identifier. Cookies, however, can only be read by the domain that created them, which makes sharing the identifier across platforms challenging. Ad-tech players can build tables to match independent cookies and get a broader view of a customer.

Match tables can be used between SSPs, DSPs, DMPs, and ad servers. Each partnership implements a different process, even though all of these processes are quite similar.

In real-time bidding, the cookie-matching process often happens well before the DSP receives a bid request. Either the DSP initiates a cookie match request with the SSP from some other ad impression, or the SSP initiates a cookie match with the DSP ("pixel push") on an unrelated ad impression. More often, the DSP initiates the sync, which does not necessarily occur on a publisher property but on an advertiser's property. How cookie matching works in real-time bidding is explained on the Google Ad Manager website and quite extensively on the web.

SSPs and DSPs should agree on who hosts the match table. This article assumes that the DSP hosts the user-matching data store. This data store must:

  • Retrieve the payload for a specific key within at most single-digit milliseconds.
  • Be highly available within a region.
  • Be present in all targeted regions to minimize networking latency.
  • Handle billions of reads per day.
  • Handle fast writes when updating current counters.

NoSQL databases are well suited for such workloads, because they can scale horizontally to support heavy loads and can retrieve single rows extremely quickly.

If you want a fully managed service that can retrieve values using a specific key in single-digit milliseconds, consider Cloud Bigtable. It provides high availability, and its QPS and throughput scale linearly with the number of nodes. At a conceptual level, data is stored in Bigtable using a format similar to the following:

key                internal_id   created
ssp_id#source_id   dsp_id        123456789
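The following Python sketch illustrates this key design; a plain dictionary stands in for the Bigtable table (in production this would be a read of a single row), and all IDs are hypothetical:

```python
# Stand-in for the Bigtable match table shown above.
match_table = {
    b"ssp_a#user_987": {"internal_id": "dsp_id_42", "created": 123456789},
}

def make_row_key(ssp_id: str, source_id: str) -> bytes:
    # Prefixing with the SSP ID groups each partner's rows together
    # while keeping keys short for fast lookups.
    return f"{ssp_id}#{source_id}".encode()

def match_user(ssp_id: str, ssp_user_id: str):
    """Return the DSP's internal user ID, or None if the user is unmatched."""
    row = match_table.get(make_row_key(ssp_id, ssp_user_id))
    return row["internal_id"] if row else None

print(match_user("ssp_a", "user_987"))  # dsp_id_42: matched user
print(match_user("ssp_a", "user_000"))  # None: bid without user data, or no-bid
```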

Selecting segments

Up to this point in the bidding process, the system has received an ad request, parsed it to extract the SSP's user ID, and matched it with its internal DSP's user ID.

With another lookup, the system can extract user segments from the (unique) user profile store, order the segments by price, and filter for the most appropriate segment. The following example shows the result of a lookup. (The example is simplified for clarity.)

key      segments                                                    updated
dsp_id   [{'dmp1': {'segment': 'a', 'type': 'cpm', 'price': 1}},     1530000000
          {'dmp1': {'segment': 'b', 'type': 'cpm', 'price': 2}},
          {'dmp2': {'segment': 'c', 'type': 'cpm', 'price': 1.5}}]

Depending on your ordering and filtering logic, you might want to promote some discrete fields (such as the data provider name) into the key. An efficient key design helps you scale and reduces query time. For advice on how to approach the key design, see Choosing a row key in the documentation for designing a Cloud Bigtable schema.
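The following Python sketch illustrates one possible ordering-and-filtering step over segments like those shown above; the budget threshold and field names are illustrative assumptions:

```python
# In-memory stand-in for a (unique) user's profile-store row.
segments = [
    {"provider": "dmp1", "segment": "a", "type": "cpm", "price": 1.0},
    {"provider": "dmp1", "segment": "b", "type": "cpm", "price": 2.0},
    {"provider": "dmp2", "segment": "c", "type": "cpm", "price": 1.5},
]

def select_segment(segments, max_price: float):
    """Keep segments the campaign can afford, then take the most valuable one."""
    affordable = [s for s in segments if s["price"] <= max_price]
    return max(affordable, key=lambda s: s["price"], default=None)

best = select_segment(segments, max_price=1.6)
print(best["segment"])  # c: the highest-priced segment within budget
```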

Although this article uses Cloud Bigtable as an example service for reading segments and for performing user ID matching, in-memory stores such as Redis or Aerospike might offer better performance, though at the cost of additional operational overhead. For more details, see heavy-read storing patterns (in part 1).

To get access to additional external user data, DSPs often work with data management platforms (DMPs) with whom they implement user matching techniques that are similar to those used with the SSP.

A DSP uses two primary sources of data to profile the user:

  • First-party user information: A DSP might have recognized a user before on an advertiser's property. The DSP might have gathered some information about the user, such as a cookie or device ID, that the DSP can now use to profile the user.
  • Third-party user information: If a DSP doesn't work with many advertisers, the DSP can identify (unique) users by using data from DMPs that work with many properties over the internet or through applications. User matching is done either through the publisher-provided user ID, shared by the DSP and DMP, or directly between DMPs and DSPs. In the latter case, the DSP recurrently loads data offline from the DMPs with whom the DSP also keeps a match table.

Third-party data can be loaded recurrently from an external location to Cloud Storage and then loaded to BigQuery. Or the data can be loaded in real time to your exposed endpoint, which fronts a messaging system.

Handling events

How to ingest and store events is covered in event management. In RTB, the following additional events are also collected:

  • Bid requests: These are similar to ad requests and are sent to an endpoint that you provide to the SSP or to the ad exchange.
  • Auction wins: On winning an auction, the SSP or ad exchange sends a notification back to a win endpoint, which is defined by the DSP earlier within its bid response.
  • Auction losses: SSPs or ad exchanges might notify DSPs that their bid did not win, but they rarely offer other information. Certain exchanges, such as Google's, communicate this feedback in the subsequent bid request.

These events are used in various services on the platform to do the following:

  • Update counters: Similar to the ad selection process, some counters such as caps or remaining budgets help to filter out irrelevant campaigns or ads.
  • Make bid-related decisions: By using historical bid win/loss data and prices, the system improves its bid/no-bid decision and optimizes bid prices.
  • Provide data for reporting: Similar to ad serving, users want to know how bids and ads perform.

Joining auction and post-win data

Your bidder needs to make decisions for every bid request. This is different from ad serving, where prices are calculated for a batch of ads. For this reason, joining data as soon as possible can improve bid decisions.

When handling bid request and auction events, note the following:

  • Ad serving sees differences of a few orders of magnitude between the number of impressions, clicks, and conversions. In RTB, a bid request, unlike an ad request, does not guarantee an impression (your bidder has to bid and win that bid).
  • The time between a won bid request and an impression is on the order of milliseconds.
  • The time between an impression and a click is on the order of seconds at most.
  • The time before a conversion can be minutes, days, or even weeks.

These patterns can cause a few problems when you join data, because your system might have to wait for something that might never happen (like a click after an impression). Your system might also have to wait a day or a week for an event to happen—for example, for a conversion after a click, or for a conversion not linked to a click (called a view-through conversion). Finally, the system might have won a bid that did not result in a rendered and billable impression.

When building your system, assume the following:

  • Data is ingested in real time through a messaging system such as Pub/Sub.
  • Your data processing tool, such as Dataflow, can manage waiting for a few milliseconds or even seconds for an impression or click to happen after a winning bid request.
  • The system must be able to join all bid requests, bid responses, won bids, impressions, clicks, and conversions.
  • The auction ID is common to all events.
  • Clicks and conversions that happen after a predefined period can be discarded; for example, when too much time has passed to be certain that the click or conversion is due to the specific auction.

Where you perform the join—in the data pipeline or after the data has been stored—is determined by whether you want to join data immediately or whether the join process can wait. If you decide to join the data immediately, you might implement a process similar to the following:

  1. Read the event from Pub/Sub or from Apache Kafka.
  2. Parse and extract the information, including the auction ID.
  3. Perform a key lookup on Bigtable to see if there are already events for the auction ID.
  4. Write the event to Bigtable if there is no pre-existing event.
  5. Join the event with existing data if there are pre-existing events.
  6. Write the event to BigQuery for historical analytics.
  7. Write the event to a fast data store directly if the sink exists, or write the event to a messaging system that in turn lets a group of collectors read from and write to the fast data store.

The following diagram depicts these steps.

Distributed architecture for handling events and bid requests
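The immediate-join steps can be sketched in Python as follows; dictionaries stand in for Bigtable and BigQuery, and the event shapes are illustrative assumptions:

```python
# Stand-in for Bigtable: auction_id -> partially joined event record.
pending = {}
warehouse = []   # stand-in for BigQuery's append-only historical table

def handle_event(event: dict) -> None:
    """Join events on auction ID as they arrive, in any order."""
    auction_id = event["auction_id"]
    if auction_id in pending:
        pending[auction_id].update(event)        # join with earlier events
    else:
        pending[auction_id] = dict(event)        # first event for this auction
    warehouse.append(dict(pending[auction_id]))  # append for historical analytics

# Events may arrive out of order across Pub/Sub and Dataflow workers.
handle_event({"auction_id": "a1", "impression": True})
handle_event({"auction_id": "a1", "bid_price": 1.25, "won": True})
print(pending["a1"])
```

Because the join is keyed only on the auction ID and each update is commutative, the out-of-order delivery noted above does not affect the final joined record.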

Keep the following in mind when implementing this solution:

  • Due to the distributed nature of Dataflow and Pub/Sub, data might not be ordered. However, that doesn't matter because the goal is to perform a full join as soon as possible.
  • You need to perform your own garbage collection in BigQuery by using a delete statement (oversimplified in the diagram). This can be scheduled with Cloud Composer or by using cron jobs.
  • Use the Bigtable expiration feature to discard clicks or conversions that happen after a predefined time.

You can also improve the workflow by using the timely and stateful processing functionality offered in Apache Beam. This approach lets you process ordered events without relying on Bigtable.

If you decide to use offline joins because you can tolerate a delay, the process looks similar to the following:

  1. Read the event from Pub/Sub or Apache Kafka.
  2. Parse and extract the information, including the auction ID.
  3. Append the processed records to a data warehouse such as BigQuery.
  4. Join data using the auction ID and newest timestamps every N intervals, where N is a predefined number of seconds, minutes, hours, or days. You can use Cloud Composer or cron jobs for this task.
  5. Using Cloud Composer or cron jobs, delete stale data—the data with the oldest timestamp per auction ID and event type—every N intervals.
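As an illustration of the periodic join and cleanup, the following Python sketch keeps only the newest record per auction ID and event type before joining. In practice these steps would be scheduled SQL jobs against BigQuery; the record shapes here are assumptions:

```python
# Stand-in for the append-only warehouse table: each record carries an
# auction ID, an event type, and an ingestion timestamp.
records = [
    {"auction_id": "a1", "event": "bid", "ts": 100, "price": 1.25},
    {"auction_id": "a1", "event": "impression", "ts": 150},
    {"auction_id": "a1", "event": "bid", "ts": 90, "price": 1.10},  # stale duplicate
]

def periodic_join(records):
    """Keep only the newest record per (auction_id, event type), then join
    per auction -- the batch equivalent of the scheduled warehouse job."""
    newest = {}
    for r in records:
        key = (r["auction_id"], r["event"])
        if key not in newest or r["ts"] > newest[key]["ts"]:
            newest[key] = r
    joined = {}
    for r in newest.values():
        joined.setdefault(r["auction_id"], {}).update(r)
    return joined

result = periodic_join(records)
print(result["a1"]["price"])  # 1.25: the stale bid record was discarded
```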

Exporting data close to the bidders

For infrastructure options about serving data, see heavy-read storing patterns.

Although the concepts for exporting data are similar to ad serving, bidders need to return the bid response within a predefined deadline. For this reason, some bidders might prefer to use stores that can handle sub-millisecond reads and writes, even if these stores require more operational overhead. Bidders commonly use Redis with local replicas or regional Aerospike. Learn more about infrastructure options to export data from real-time aggregations or offline analytics.

What's next