How to do serverless pixel tracking with GCP
Whether they’re opening a newsletter or visiting a shopping cart page, how users interact with web content is of great interest to publishers. One way to understand user behavior is with pixels: small, 1x1 transparent images embedded in a web property. When a pixel loads, it calls a web server, which records the request parameters passed in the URL so they can be processed later.
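To make the idea concrete, here is a minimal sketch of how a page or newsletter template might build a pixel URL that carries its tracking data as query parameters. The host `pixel.example.com`, the file name `pixel.png`, and the parameter names `event` and `uid` are hypothetical and chosen only for illustration.

```python
import html
from urllib.parse import urlencode

# Hypothetical values: the host, path, and parameter names are illustrative only.
PIXEL_BASE_URL = "https://pixel.example.com/pixel.png"


def build_pixel_url(params: dict) -> str:
    """Return the pixel URL with the tracking parameters as a query string."""
    return f"{PIXEL_BASE_URL}?{urlencode(params)}"


def build_pixel_tag(params: dict) -> str:
    """Return an HTML <img> tag that loads the 1x1 pixel."""
    # Escape the URL so that "&" becomes "&amp;" inside the HTML attribute.
    url = html.escape(build_pixel_url(params), quote=True)
    return f'<img src="{url}" width="1" height="1" alt="">'


if __name__ == "__main__":
    print(build_pixel_tag({"event": "newsletter_open", "uid": "1234"}))
```

Whatever serves the image never needs to interpret these parameters at request time. It only has to log the full request URL so the parameters can be extracted later.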
Adding a pixel is easy, but hosting it and processing the request can be challenging for various reasons:
- You need to set up, manage and monitor your ad servers
- Users are usually global, which means that you need ad servers around the world
- User traffic is spiky, so pixel servers must scale up to sustain peak load and scale down to limit spend
To address these challenges, we recently worked with GCP partner and professional services firm DoiT International to build a pixel tracking platform that relieves administrators from setting up or managing any servers. Instead, this serverless pixel tracking solution relies on managed GCP services, including:
- Google Cloud Storage: A global or regional object store that offers storage classes such as Standard, Nearline and Coldline, with different prices and SLAs depending on your needs. In our case, we used Standard, which offers low-millisecond latency
- Google HTTP(S) Load Balancer: A global anycast IP load balancing service that can scale to millions of QPS and has integrated logging. It can also use Cloud CDN to cache pixels closer to users at Google's edge locations, which avoids unnecessary requests to Google Cloud Storage
- BigQuery: Google's fully managed, petabyte-scale, low-cost enterprise data warehouse for analytics
- Stackdriver Logging: A logging system that allows you to store, search, analyze, monitor and alert on log data and events from GCP and Amazon Web Services (AWS). It supports Google load balancers and can export data to Cloud Storage, BigQuery or Pub/Sub
Here is how these services work together:
- A client calls a pixel URL whose image is served from Cloud Storage.
- A Google Cloud Load Balancer in front of Cloud Storage records the request to Stackdriver Logging, whether there was a cache hit or not.
- Stackdriver Logging exports every request to BigQuery as it comes in. BigQuery then acts as a storage and querying engine for ad-hoc analytics that can help business analysts better understand their users.
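The flow above ends with request logs that still carry the tracking data inside the request URL. As a rough sketch of the processing step, the snippet below pulls the query parameters out of log records and counts events by type. It assumes each record exposes `httpRequest.requestUrl` and `httpRequest.cacheHit`, following the `HttpRequest` structure used by log entries; check the schema of your own export before relying on these names. The sample records and the `event` and `uid` parameters are made up for illustration.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def extract_pixel_params(log_entry: dict) -> dict:
    """Pull the tracking parameters out of one request log record."""
    request = log_entry.get("httpRequest", {})
    query = urlsplit(request.get("requestUrl", "")).query
    # parse_qs returns a list of values per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(query).items()}
    params["cache_hit"] = bool(request.get("cacheHit", False))
    return params


def count_events(entries: list) -> Counter:
    """Count records per value of the (hypothetical) 'event' parameter."""
    return Counter(
        extract_pixel_params(entry).get("event", "unknown") for entry in entries
    )


# Illustrative records invented for this sketch, not real log output.
SAMPLE_ENTRIES = [
    {"httpRequest": {
        "requestUrl": "https://pixel.example.com/pixel.png?event=newsletter_open&uid=1234",
        "cacheHit": True}},
    {"httpRequest": {
        "requestUrl": "https://pixel.example.com/pixel.png?event=cart_view&uid=1234",
        "cacheHit": False}},
    {"httpRequest": {
        "requestUrl": "https://pixel.example.com/pixel.png?event=cart_view&uid=5678",
        "cacheHit": True}},
]

if __name__ == "__main__":
    for event, total in count_events(SAMPLE_ENTRIES).items():
        print(event, total)
```

In practice this kind of aggregation would more likely run as a query inside BigQuery. The Python version is only meant to show which fields matter and how the parameters are recovered from the logged URL.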
All of these services are fully managed and do not require you to set up any instances or VMs. You can learn more about this solution by:
- Reading the solution paper: Serverless Pixel Tracking Architecture
- Following the tutorial: How to Do Serverless Pixel Tracking
- Load-testing it: How to load test a serverless pixel tracking setup