This article is an overview of a multi-part series about building advertising platforms on Google Cloud. Because these platforms consist of many different services and serve many different users, this series addresses both shared and specific infrastructure options.
How the series works
The series has two main axes:
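Shared infrastructure options. These articles address requirements that are common to different components of advertising platforms: Infrastructure options for serving advertising workloads (part 1) and Infrastructure options for data pipelines in advertising (part 2).
Specific infrastructure options. These articles address requirements that are particular to specific components of advertising platforms: Infrastructure options for ad servers (part 3) and Infrastructure options for RTB bidders (part 4).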
The following terms are used throughout this series and across the advertising industry:
- ad inventory: The ad slots being offered to buyers.
- ad slot: The space on a web or mobile page where the ad is displayed.
- ad tag: A small piece of code that includes parameters describing the ad slot.
- ad server: Technology used by ad serving platforms to deliver creatives to ad slots on a publisher's properties. Ad servers usually include features such as creative selection, counts, and serving.
- advertiser: An organization that wants to promote a product through different media, either directly or through other buyers.
- audience: The (unique) users who visit or use a publisher's property.
- audience segment: A selection, based on a subset of the taxonomy, that results in a set of (unique) users whom advertisers can target.
- buyer: Purchases ad slots to place creatives. Buyers can be networks, agencies, or advertisers.
- conversion: An action, predefined by an advertiser, that a user might take on the advertiser's property.
- CPA: Cost per action. What a buyer pays per action. Actions or conversions can have different goals, such as acquiring as many users as possible, retaining high-valued key customers, or getting targeted users to buy something on their website. An action might be downloading a whitepaper, signing up for a newsletter, or buying something on the advertiser's website.
- CPC: Cost per click. What a buyer pays per ad click.
- CPM: Cost per mille. What a buyer pays per thousand impressions.
- creative: Advertisement presented to the targeted user.
- CTR: Click-through rate. Number of clicks divided by number of impressions.
- CVR: Conversion rate. Number of conversions divided by number of clicks.
- DMP: Data management platforms provide additional user information to advertising technology (ad tech) players. These platforms might give access to a data dump, or sometimes they load the data to your platform, if you give them access to object storage such as Cloud Storage.
- impression: When an ad is fetched from its source, and is billable.
- publisher: In the context of this series, a publisher owns a set of digital properties such as websites or mobile apps that provide ad slots for hosting creatives.
- (publisher’s) property: The website, app, or game where the publisher will provide an ad slot.
- supplier: Offers ad slots available for purchase on behalf of multiple publishers.
- targeted user: The target of the ad; the person who is meant to view the ad.
- taxonomy: The classification of audience attributes, normally as a hierarchy.
- (unique) user: A user who can be targeted and is known or considered to be unique. Determining uniqueness is difficult and is often a best-effort guess, given factors such as multiple persons using the same device or the same person using different devices.
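The pricing and rate terms in this list reduce to simple arithmetic. The following Python sketch works through CTR, CPM, CPC, and CPA for one set of counts. The `CampaignStats` class, the function names, and the sample numbers are illustrative assumptions, not part of any advertising API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CampaignStats:
    """Hypothetical event counts for one campaign."""
    impressions: int
    clicks: int
    conversions: int


def click_through_rate(stats: CampaignStats) -> float:
    """CTR: clicks divided by impressions (0.0 when there are no impressions)."""
    return stats.clicks / stats.impressions if stats.impressions else 0.0


def cost_at_cpm(impressions: int, cpm: float) -> float:
    """Cost of the given impressions when the price is `cpm` per thousand."""
    return impressions / 1000 * cpm


def cost_per_click(spend: float, stats: CampaignStats) -> float:
    """CPC implied by a total spend. Assumes at least one click."""
    return spend / stats.clicks


def cost_per_action(spend: float, stats: CampaignStats) -> float:
    """CPA implied by a total spend. Assumes at least one conversion."""
    return spend / stats.conversions


# Sample numbers are made up for illustration.
stats = CampaignStats(impressions=200_000, clicks=400, conversions=20)
spend = cost_at_cpm(stats.impressions, cpm=2.50)

print(f"CTR:   {click_through_rate(stats):.2%}")
print(f"Spend: {spend:.2f}")
print(f"CPC:   {cost_per_click(spend, stats):.2f}")
print(f"CPA:   {cost_per_action(spend, stats):.2f}")
```

The same spend can be expressed per thousand impressions, per click, or per action, which is why buyers can compare offers priced under different models.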
Real-time bidding terms include the following:
- ad exchange: A marketplace for advertising that receives ad requests from SSPs. After receiving a request, the ad exchange expects to receive ads from all the DSPs with a bid attached before it selects the winning bid and returns it to the sell side. This transaction must happen quickly. For example, Google waits approximately 120 ms for buyers to return a bid.
- DSP: Demand-side platforms receive an ad request that they must answer by a time set by the SSP or ad exchange. The time allowed can be as low as 100 ms and range up to a few seconds. DSPs decide whether they want to bid. If they do, they must select an ad, determine a bid price, and return their offer to the ad exchange.
- RTB: Real-time bidding. The process of exposing an ad inventory (ad slots) to programmatic buyers through an online auction mechanism.
- SSP: Supply(sell)-side platforms are sometimes part of an ad server, or they exist as a standalone tool that receives ad requests from publishers or ad servers. SSPs usually send an ad request to the ad exchanges, but sometimes they send this request directly to DSPs. The SSP might enrich this request with additional audience context—for example, demographics—to increase the value of the ad slot. SSPs expect to receive the ad that won an auction, which they then return to the publisher or ad server.
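The response deadline that a DSP must meet can be illustrated with a timeout around the bid computation. The following Python sketch is a minimal illustration under stated assumptions: `compute_bid`, `bid_with_deadline`, the request shape, and the fixed price are all hypothetical, and a real bidder would also have to budget for network time inside the deadline.

```python
import asyncio
from typing import Optional


async def compute_bid(request: dict, work_seconds: float) -> dict:
    """Stand-in for ad selection and pricing; sleeps to simulate the work."""
    await asyncio.sleep(work_seconds)
    return {"request_id": request["id"], "price_cpm": 1.20}


async def bid_with_deadline(
    request: dict, deadline_seconds: float, work_seconds: float
) -> Optional[dict]:
    """Return a bid, or None (no bid) if the work overruns the deadline."""
    try:
        return await asyncio.wait_for(
            compute_bid(request, work_seconds), timeout=deadline_seconds
        )
    except asyncio.TimeoutError:
        # A late answer has no value to the exchange, so give up instead.
        return None


# The first request finishes in time; the second overruns a 100 ms deadline.
fast = asyncio.run(bid_with_deadline({"id": "req-1"}, 0.1, work_seconds=0.0))
slow = asyncio.run(bid_with_deadline({"id": "req-2"}, 0.1, work_seconds=0.5))
print(fast)
print(slow)
```

Returning no bid when the deadline is missed keeps the bidder from spending further resources on a request that the exchange has already closed.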
Additional terminology used specifically in this series:
- backend: A service or database used by the frontend to retrieve data or offload processing, for example, machine learning model training.
- customer: A platform user who uses the platform you are providing.
- frontend: A service that processes external requests.
- function: A specific capability offered by a service running on a platform.
- offline: Describes any process that does not feature in real-time decision making.
- online: Describes any process that must run as part of a real-time process.
- platform: A set of services that offers one of the major capabilities, such as ad serving, offering inventory, and bidding.
- platform user: A publisher, seller, buyer, advertiser, or other user who uses a platform UI.
- QPS: Queries per second.
- service: One or more functions that are offered as a set, typically as a single application.
- worker: The instance of a service performing a task. There are usually multiple workers running in parallel.
Different advertising platforms, such as ad servers, demand-side platforms, supply-side platforms, and ad exchanges, share a few functional components that work similarly in order to:
- Let platform users (suppliers, buyers) interact with the platform through a frontend UI.
- Handle requests, for example, for an ad or a bid.
- Manage events and data lifecycle such as impressions, clicks, conversions, and, potentially, bids won and lost.
The following diagram shows an architecture based on these components.
Most advertising platforms require a customer frontend, which usually consists of a UI backed by one or more different databases. The frontend must meet the following requirements:
- Be globally accessible at a latency that offers a good user experience.
- Be highly available so that customers can manage their preferences at any time.
- Scale with the demand, acknowledging that customers can use the platform at their discretion and at any time (given time zones and a global base of platform users).
Depending on the platform, these frontends might be used by suppliers and/or buyers, and they might form part of an ad server, DSP, SSP, or ad exchange. Each frontend offers different administrative capabilities and handles different advertising resources (ad creatives, bids, requests, demographics, and so on).
For more details about these concepts, see user frontend (in part 1).
Requests and ad selection
Ad selection is done by the platform when it receives a request. Requests might be ad requests generated from an ad tag in the usual ad-serving context. Or the requests might be bid requests coming from an SSP or ad exchange in an RTB context.
The component that selects an ad must:
- Be highly scalable: Ad-tech requests are often in the range of billions of daily requests.
- Be highly available: Given such a large scale, a single second of unavailability resulting in failed requests can have large business impacts.
- Offer minimum latency: Ads must be displayed as fast as possible to the targeted users, which affects how fast an ad must be selected. In RTB, latency is a critical requirement because SSPs or ad exchanges require bid responses to be returned within a specific period, which can be as low as 100 ms.
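To get a feel for the scale requirement, daily request counts can be converted to queries per second (QPS). The following Python sketch does that arithmetic. The peak-to-average ratio is an assumed input for illustration, because real traffic is not spread evenly over the day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_qps(daily_requests: int) -> float:
    """Average queries per second if traffic were spread evenly over a day."""
    return daily_requests / SECONDS_PER_DAY


def peak_qps(daily_requests: int, peak_to_average: float) -> float:
    """Estimated peak QPS for an assumed peak-to-average traffic ratio."""
    return average_qps(daily_requests) * peak_to_average


# Illustrative inputs: one billion daily requests and an assumed 3x peak.
daily = 1_000_000_000
print(f"average: {average_qps(daily):,.0f} QPS")
print(f"peak:    {peak_qps(daily, peak_to_average=3.0):,.0f} QPS")
```

One billion daily requests averages out to more than eleven thousand requests per second, so frontends and data stores have to be sized for sustained high throughput rather than occasional bursts.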
Components in the ad-selection process are:
- A frontend service that receives ad requests.
- One or several data stores used by frontends to make decisions.
- A selection algorithm that selects the ad.
Handling requests (in part 1) shows how to implement frontends, and Heavy-read storing patterns (in part 1) shows how to implement stores. For more specifics about ad servers and RTB bidders, read the relevant sections in the articles about ad servers (part 3) and bidders (part 4).
Both DSPs and ad servers make use of ad selection logic to profile the (unique) user, filter out non-relevant campaigns and ads, then select an ad. Moreover, the selection process for bidders also includes deciding whether to bid, determining the bid price, and possibly further optimizing their bid. You can find relevant links for both in Infrastructure options for serving advertising workloads (in part 1).
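The filter-then-select flow described above can be sketched in a few lines of Python. This is a simplified illustration, not the selection logic of any specific ad server: the `Ad` class, its fields, and the rule of picking the highest `priority` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Ad:
    """Hypothetical ad with simple segment targeting."""
    ad_id: str
    target_segments: FrozenSet[str]
    priority: int
    active: bool = True


def filter_ads(ads: List[Ad], user_segments: FrozenSet[str]) -> List[Ad]:
    """Keep active ads that target at least one of the user's segments."""
    return [ad for ad in ads if ad.active and ad.target_segments & user_segments]


def select_ad(ads: List[Ad], user_segments: FrozenSet[str]) -> Optional[Ad]:
    """Pick the eligible ad with the highest priority, or None if none match."""
    eligible = filter_ads(ads, user_segments)
    return max(eligible, key=lambda ad: ad.priority, default=None)


ads = [
    Ad("ad-sports", frozenset({"sports"}), priority=2),
    Ad("ad-travel", frozenset({"travel", "luxury"}), priority=5),
    Ad("ad-paused", frozenset({"travel"}), priority=9, active=False),
]
# In this sketch, the user profile is just a set of audience segments.
chosen = select_ad(ads, frozenset({"sports", "travel"}))
print(chosen.ad_id if chosen else "no ad")
```

A bidder would add further steps after selection, such as deciding whether to bid at all and at what price.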
Events and data management
Most of the decisions made in an advertising platform depend on data coming from different sources, including:
- Ad requests received by ad-server frontends.
- Bid requests received by DSP frontends.
- Bid wins and losses received by the DSP win endpoints.
- Impression events generated after an ad is served to the targeted user. In most cases, impressions are billable impressions. Billable impressions are rendered and considered as viewable.
- Click events generated when a targeted user clicks an ad. The number of events is likely to be a few orders of magnitude lower than the number of impressions.
- Conversion events generated when a targeted user performs the hoped-for action on an advertiser's property. The number of events is likely to be lower than the number of ad clicks.
- Semi-static data managed by platform users.
- Offline data that comes from analyzing historical events.
- Third-party data, such as user segments and associated prices, provided by external sources such as DMPs.
Machine learning, used instead of a rule-based system, is an important component: it can use historical data to train models offline and real-time data to train models online. These models can then be deployed locally so that individual components or services, such as ad servers, can make online predictions. The models can also be used to populate caches or key-value stores that serve already-made predictions.
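Serving already-made predictions from a key-value store can be sketched as follows. This Python example uses an in-memory dictionary in place of a real store, and `toy_model`, the key layout, and the default value are hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple

# Assumed key layout: (audience segment, ad ID).
PredictionKey = Tuple[str, str]


def build_prediction_cache(
    keys: Iterable[PredictionKey], model: Callable[[PredictionKey], float]
) -> Dict[PredictionKey, float]:
    """Offline step: score every key once and store the result."""
    return {key: model(key) for key in keys}


def lookup_prediction(
    cache: Dict[PredictionKey, float], key: PredictionKey, default: float = 0.001
) -> float:
    """Online step: a key-value read instead of a model call."""
    return cache.get(key, default)


def toy_model(key: PredictionKey) -> float:
    """Hypothetical stand-in for a trained click-probability model."""
    segment, _ad_id = key
    return 0.02 if segment == "sports" else 0.005


cache = build_prediction_cache(
    [("sports", "ad-1"), ("travel", "ad-1")], toy_model
)
print(lookup_prediction(cache, ("sports", "ad-1")))
print(lookup_prediction(cache, ("news", "ad-1")))  # unseen key, uses default
```

The trade-off is freshness for latency: the online path avoids running the model, but predictions are only as current as the last offline refresh.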
The platform must be able to:
- Handle terabytes of daily data to be collected, ingested, processed, and stored.
- Scale to billions of daily events when collecting, ingesting, processing, and storing.
- Provide options for real-time online and offline processing.
- Run processing tasks such as machine learning in a distributed environment.
- Automatically feed relevant data back to the intelligence database, either in real time through streaming, or later through batches.
For more detailed explanations on how to handle joins across bid requests, bid results, impressions, and clicks, see handling events (in part 3).
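As a small illustration of what joining events means, the following Python sketch matches clicks to impressions on a shared identifier and counts both per campaign. The field names (`impression_id`, `campaign_id`) are assumptions for this example, and the in-memory approach only shows the idea; at the volumes described above, a pipeline would do this with streaming or batch processing.

```python
from collections import defaultdict
from typing import Dict, List


def join_clicks_to_impressions(
    impressions: List[dict], clicks: List[dict]
) -> Dict[str, Dict[str, int]]:
    """Count impressions and clicks per campaign, joining on impression_id."""
    # Index impressions by ID; a repeated ID is counted only once.
    campaign_by_impression = {
        imp["impression_id"]: imp["campaign_id"] for imp in impressions
    }
    counts = defaultdict(lambda: {"impressions": 0, "clicks": 0})
    for campaign_id in campaign_by_impression.values():
        counts[campaign_id]["impressions"] += 1
    for click in clicks:
        campaign_id = campaign_by_impression.get(click["impression_id"])
        if campaign_id is not None:  # drop clicks with no matching impression
            counts[campaign_id]["clicks"] += 1
    return dict(counts)


impressions = [
    {"impression_id": "i1", "campaign_id": "c1"},
    {"impression_id": "i2", "campaign_id": "c1"},
    {"impression_id": "i3", "campaign_id": "c2"},
]
clicks = [{"impression_id": "i2"}, {"impression_id": "i9"}]
print(join_clicks_to_impressions(impressions, clicks))
```

Clicks that reference an unknown impression are dropped here; a real pipeline might instead hold them for a while, because events can arrive late or out of order.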
Ad servers generally consist of shared components and ad serving features, as shown in the following diagram.
As a core part of ad servers, ad serving requires:
- Low latency: Ads must be served quickly to make sure that targeted users see them (if scrolling allows) and that their ad viewing experience is not impaired.
- High availability: After an ad is selected, not serving it due to the platform being down would be wasteful and costly.
- Scalability: With billions of ad requests per day, many platforms have to serve billions of corresponding ads.
Even if some DSPs or SSPs serve ads from their infrastructure, this article assumes that they have implemented an ad server as part of their platform. To read more about serving ads, see serving the selected ad to the targeted user (in part 3).
The bidding process is detailed in Infrastructure options for RTB bidders (part 4).
DSPs generally consist of shared components with these requirements:
- Bid response must happen within a defined deadline, which makes latency critical.
- Bids are calculated per bid request, which makes the ad-selection logic complex. This extra logic in the algorithm is usually handled by an extra bidder service. For more detail, see bidding (in part 4).
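One widely used way to turn a prediction into a per-request bid is to convert the expected value of an impression into a CPM price. The following Python sketch shows that conversion as an assumption for illustration, not as the method this series prescribes. The function names, the floor price, and the cap are hypothetical.

```python
from typing import Optional


def expected_value_cpm(predicted_ctr: float, value_per_click: float) -> float:
    """Expected value of a thousand impressions, given a predicted CTR."""
    return predicted_ctr * value_per_click * 1000


def price_bid(
    predicted_ctr: float,
    value_per_click: float,
    floor_cpm: float,
    max_cpm: float,
) -> Optional[float]:
    """Return a CPM bid capped at max_cpm, or None to skip the auction."""
    value = expected_value_cpm(predicted_ctr, value_per_click)
    if value < floor_cpm:
        return None  # not worth bidding at or above the floor price
    return min(value, max_cpm)


# Illustrative inputs: 0.2% predicted CTR and a click valued at 0.50.
print(price_bid(0.002, 0.50, floor_cpm=0.5, max_cpm=5.0))
print(price_bid(0.0001, 0.50, floor_cpm=0.5, max_cpm=5.0))
```

Because the predicted CTR depends on the user, the ad slot, and the ad, this calculation has to run for each bid request, which is part of what makes bidder logic latency sensitive.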
The following diagram shows a high-level overview of a demand-side platform.
- For information about serving in ad tech, see Infrastructure options for serving advertising workloads (part 1).
- For information about managing data in ad tech, see Infrastructure options for data pipelines in advertising (part 2).
- For information about serving ad requests, see Infrastructure options for ad servers (part 3).
- For information about serving bid requests, see Infrastructure options for RTB bidders (part 4).
- Explore reference architectures, diagrams, tutorials, and best practices about Google Cloud. Take a look at our Cloud Architecture Center.