Overview of Cloud CDN

The Google Cloud Content Delivery Network uses Google's global edge network to serve content closer to users, which accelerates your websites and applications.

Cloud CDN works with HTTP(S) load balancing to deliver content to your users. The HTTP(S) load balancer provides the frontend IP addresses and ports that receive requests, and the backends that respond to them.

Cloud CDN content can be sourced from various types of backends, such as Compute Engine VM instance groups and Cloud Storage buckets.

In Cloud CDN, these backends are also called origin servers. The following figure illustrates how responses from origin servers running on VM instances flow through an HTTP(S) load balancer before being delivered by Cloud CDN.

Figure: Cloud CDN response flow. Responses flow from origin servers through Cloud CDN to clients.

How Cloud CDN works

When a user requests content from an HTTP(S) load balancer, the request arrives at a Google Front End (GFE), which is located at the edge of Google's network as close as possible to the user.

If the load balancer's URL map routes traffic to a backend that has Cloud CDN configured, the GFE uses Cloud CDN.

The following figure shows a cache miss and a cache hit.

Figure: Cache miss and cache hit. The initial response is served by the origin server, while subsequent responses are served by the GFE from cache.

Cache hit

If the GFE looks in the Cloud CDN cache and finds a cached response to the user's request, the GFE sends the cached response to the user. This is called a cache hit. When a cache hit occurs, the GFE looks up the content by its cache key and responds directly to the user, shortening the round trip time and saving the origin server from having to process the request.

Cache miss

The first time that a piece of content is requested, the GFE determines that it can't fulfill the request from the cache. This is called a cache miss.

When a cache miss occurs, the GFE might attempt to get the content from a nearby cache. If the nearby cache has the content, it sends the content to the first cache by using cache-to-cache fill. Otherwise, the GFE forwards the request to the HTTP(S) load balancer.

The load balancer then forwards the request to one of your backends. This backend is the origin server for the content.

When the cache receives the content from the origin server, the GFE forwards it to the user.
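The hit-and-miss flow described above can be sketched as a toy in-memory cache. This is an illustrative model only, not how GFEs are implemented; the cache key here is simply the URL, whereas real cache keys can also incorporate protocol, host, and query string.

```python
# Simplified sketch of the hit/miss decision a CDN edge makes.
# This toy cache keys on the URL alone.

class EdgeCache:
    def __init__(self):
        self._store = {}  # cache key -> response body

    def get(self, url, fetch_from_origin):
        if url in self._store:
            return self._store[url], "hit"    # served from cache
        body = fetch_from_origin(url)         # cache miss: go to the origin
        self._store[url] = body               # fill the cache for next time
        return body, "miss"

edge = EdgeCache()
origin_calls = []

def origin(url):
    origin_calls.append(url)
    return f"content for {url}"

body1, status1 = edge.get("/logo.png", origin)  # first request: miss
body2, status2 = edge.get("/logo.png", origin)  # second request: hit
```

Note that the origin is contacted only once: the second request is answered entirely from the cache.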

As the first figure shows:

  1. Origin servers running on VM instances send HTTP(S) responses.
  2. The HTTP(S) load balancer distributes the responses to Cloud CDN.
  3. Cloud CDN delivers the responses to end users.

Caching content

If the origin server's response to a request is cacheable, Cloud CDN stores the response in the Cloud CDN cache for future requests.

Data transfer from a cache to a client is called cache egress. Data transfer to a cache is called cache fill. As illustrated in the following figure, cache fill can originate from another Cloud CDN cache or from the origin server.

Figure: Cache fill and cache egress. Cache fill is data transfer from an origin server to a cache or from one cache to another; cache egress is data transfer from a cache to a client.

On cache hits, you pay for cache egress bandwidth. On cache misses, including misses that result in cache-to-cache fill, you also pay for cache fill bandwidth. All else being equal, cache hits therefore cost less than cache misses. For detailed pricing information, refer to the pricing page.
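The billing rule above can be made concrete with a small cost model. The per-GB rates below are invented placeholders for illustration, not actual Cloud CDN prices; refer to the pricing page for real rates.

```python
# Illustrative cost model for the billing rule above. The rates are
# made-up placeholders, NOT real Cloud CDN prices.

EGRESS_RATE = 0.08  # $/GB, hypothetical cache egress rate
FILL_RATE = 0.01    # $/GB, hypothetical cache fill rate

def monthly_cost(total_gb, hit_ratio):
    """Every response incurs cache egress; only misses also incur cache fill."""
    egress = total_gb * EGRESS_RATE
    fill = total_gb * (1 - hit_ratio) * FILL_RATE
    return egress + fill

# All else equal, a higher hit ratio costs less because fill shrinks.
cost_low = monthly_cost(1000, hit_ratio=0.60)   # 80 + 4.0 = 84.0
cost_high = monthly_cost(1000, hit_ratio=0.95)  # 80 + 0.5 = 80.5
```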

No URL redirection

Cloud CDN doesn't perform any URL redirection. The Cloud CDN cache is located at the GFE. This means that:

  • Whether or not Cloud CDN is enabled, the URL that a client requests remains the same URL.
  • Whether or not there's a cache hit, the URL remains the same URL.

Cache hit ratio

The cache hit ratio is the percentage of times that a requested object is served from the cache. If the cache hit ratio is 60%, it means that the requested object is served from the cache 60% of the time and must be retrieved from the origin 40% of the time.
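The ratio is a straightforward calculation over hit and miss counts:

```python
def cache_hit_ratio(hits, misses):
    """Percentage of requests served from the cache."""
    total = hits + misses
    if total == 0:
        return 0.0
    return 100.0 * hits / total

# 60 hits and 40 misses give the 60% ratio from the example above.
ratio = cache_hit_ratio(hits=60, misses=40)  # 60.0
```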

In the Google Cloud Console, the cache hit ratio is reported for each origin in the Cache hit ratio column.
The percentage shown on this page represents a ratio calculated for a small time frame (the last few minutes). To view the cache hit ratio for a time period from 1 hour to 30 days, click the origin name and then click the Monitoring tab.

For information about how cache keys can affect the cache hit ratio, refer to Using cache keys. For troubleshooting information, refer to Cache hit ratio is low.

Inserting content into the cache

Caching is reactive in that an object is stored in a particular cache if a request goes through that cache and if the response is cacheable. An object stored in one cache does not automatically replicate into other caches; cache fill happens only in response to a client-initiated request. You cannot pre-load caches except by causing the individual caches to respond to requests.

When the origin server supports byte range requests, Cloud CDN can initiate multiple cache fill requests in reaction to a single client request. The section Requests initiated by Cloud CDN has more information about these requests.
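Because cache fill happens only in response to client requests, the closest thing to "pre-loading" is issuing those requests yourself so that the caches on the path fill. A minimal sketch follows; the URLs are hypothetical placeholders, and the `fetch` parameter exists so the function can be exercised without network access.

```python
import urllib.request

def warm(urls, fetch=None):
    """Request each URL once so caches along the path can fill.

    fetch(url) should return an HTTP status code. By default it uses
    urllib; pass your own callable to test offline.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url) as resp:
                resp.read()  # pull the whole body so the cache fill completes
                return resp.status
    return {url: fetch(url) for url in urls}

# Hypothetical URLs; replace with your own cacheable objects.
urls = ["https://cdn.example.com/app.js", "https://cdn.example.com/logo.png"]
```

Each cache fills independently, so warming one region's cache does not populate the others.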

Serving content from a cache

After you enable Cloud CDN, caching happens automatically for all cacheable content. Your origin server uses HTTP headers to indicate which responses should be cached and which should not. When you use a backend bucket, the origin server is Cloud Storage. When you use VM instances, the origin server is the web server software that you run on those instances. Caching details has additional information about what Cloud CDN caches and for how long.
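The header-driven decision can be sketched as follows. This checks only a simplified subset of Cache-Control directives; Cloud CDN's actual cacheability rules also consider response status codes, Set-Cookie headers, and content size limits.

```python
# Simplified sketch of how an origin signals cacheability via HTTP headers.
# Not Cloud CDN's full rule set; Cache-Control directives only.

def is_cacheable(headers):
    cc = headers.get("Cache-Control", "").lower()
    directives = {d.strip() for d in cc.split(",") if d.strip()}
    if "no-store" in directives or "private" in directives:
        return False  # the origin forbids shared caching
    # A positive freshness lifetime lets shared caches store the response.
    for d in directives:
        if d.startswith(("max-age=", "s-maxage=")):
            try:
                return int(d.split("=", 1)[1]) > 0
            except ValueError:
                return False
    return "public" in directives

print(is_cacheable({"Cache-Control": "public, max-age=3600"}))   # True
print(is_cacheable({"Cache-Control": "private, max-age=3600"}))  # False
```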

Cloud CDN uses caches in numerous locations around the world. Because of the nature of caches, it is impossible to predict whether a particular request will be served out of a cache. You can, however, expect that popular requests for cacheable content will be served from a cache a high percentage of the time, yielding significantly reduced latencies, reduced cost, and reduced load on your origin servers.

You can view logs to see what Cloud CDN is serving from cache.

Removing content from the cache

To remove an item from the cache, you can invalidate the cached content. For more information, read about cache invalidation and how to invalidate cached content.

Cache bypass

To bypass Cloud CDN, you can request an object directly from a Cloud Storage bucket or a Compute Engine VM. For example, a publicly readable Cloud Storage object can be requested at a URL of the following form, where BUCKET_NAME and OBJECT_NAME are placeholders:

    https://storage.googleapis.com/BUCKET_NAME/OBJECT_NAME

Eviction and expiration

For content to be served from a cache, it must have been inserted into the cache, it must not be evicted, and it must not be expired.

Eviction and expiration are two different concepts. They both affect what gets served, but they don't directly affect each other.

  • Eviction: Every cache has a limit on how much it can hold, but Cloud CDN continues to add content to caches even after they're full. To insert content into a full cache, the cache first removes something else to make room. This is called eviction. Caches are usually full, so they are constantly evicting content. They generally evict content that hasn't been accessed recently, regardless of the content's expiration time: the evicted content might or might not be expired. Setting an expiration time doesn't affect eviction.

    Unpopular content means content that hasn't been accessed in a while. "A while" and "unpopular" are both relative to the bulk of other items in the cache. As caches receive more traffic, they also evict more cached content.

    As with all large-scale caches, content can be evicted unpredictably, so no particular request is guaranteed to be served from the cache.

  • Expiration: Content in HTTP(S) caches can have a configurable expiration time. The expiration time informs the cache not to serve old content, even if the content hasn't been evicted.

    For example, consider a picture-of-the-hour URL. Its responses should probably be set to expire in under one hour. Otherwise, the served content might be an old picture from a cache.
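The distinction between eviction and expiration can be illustrated with a toy least-recently-used (LRU) cache that also tracks a per-entry TTL. This is a teaching sketch, not Cloud CDN's actual eviction algorithm.

```python
import time
from collections import OrderedDict

class TinyCache:
    """Eviction: when full, the LRU entry is removed regardless of its TTL.
    Expiration: an entry may still be present but is not served past its TTL."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()  # key -> (value, expires_at)

    def put(self, key, value, ttl, now=None):
        now = time.time() if now is None else now
        if key in self._entries:
            self._entries.move_to_end(key)
        self._entries[key] = (value, now + ttl)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the LRU entry

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._entries.get(key)
        if entry is None:
            return None                  # evicted or never inserted
        value, expires_at = entry
        if now >= expires_at:
            return None                  # still present, but expired
        self._entries.move_to_end(key)   # refresh recency on a hit
        return value

cache = TinyCache(capacity=2)
cache.put("a", "A", ttl=10, now=0)
cache.put("b", "B", ttl=10, now=0)
cache.put("c", "C", ttl=10, now=0)  # cache full: "a" (the LRU entry) is evicted
```

The `now` parameter is injected only so the behavior is deterministic; a real cache would use the clock directly.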

Requests initiated by Cloud CDN

When your origin server supports byte range requests, Cloud CDN can send multiple requests to the origin server in reaction to a single client request. Cloud CDN can initiate two types of requests: validation requests and byte range requests. For more information, see Support for byte range requests.
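A validation request is typically a conditional HTTP request built from the cached response's ETag or Last-Modified metadata; a 304 Not Modified answer lets the cache keep serving its stored copy without re-downloading it. A minimal sketch of that generic HTTP mechanism:

```python
# Sketch of cache revalidation: the cache asks the origin whether its
# stored copy is still fresh. This illustrates standard HTTP conditional
# requests, not Cloud CDN internals.

def build_validation_headers(cached):
    """cached holds the stored response's metadata (etag and/or
    last_modified, both optional)."""
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers

def handle_validation_response(status, cached_body, fresh_body=None):
    if status == 304:
        return cached_body  # origin confirms the stored copy is still valid
    return fresh_body       # 200: origin sent a new version to store and serve

headers = build_validation_headers({"etag": '"abc123"'})
```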

Data location settings of other Cloud Platform Services

Note that using Cloud CDN means data may be stored at serving locations outside of the region or zone of your origin server. This is normal and how HTTP caching works on the Internet. Under the Service Specific Terms of the Google Cloud Platform Terms of Service, the Data Location Setting that is available for certain Cloud Platform Services will not apply to Core Customer Data for the respective Cloud Platform Service when used with other Google products and services (in this case the Cloud CDN service). If you do not want this outcome, please do not use the Cloud CDN service.

Support for Google-managed SSL certificates

You can use Google-managed certificates when Cloud CDN is enabled.

Cloud Armor restriction

Google Cloud Armor is not supported for Cloud CDN. If you try to associate a Google Cloud Armor security policy with a backend service that has Cloud CDN enabled, the configuration is rejected. Similarly, if you attempt to enable Cloud CDN on a backend service that has an associated Google Cloud Armor security policy, the configuration process fails.

What's next