Google Cloud CDN and HTTP(S) load balancing
The Cloud CDN content delivery network works with HTTP(S) load balancing to deliver content to your users. Load-balanced content can come from the following backend types:
- Instance groups of Compute Engine VM instances
- Cloud Storage buckets
In the first figure, responses from VM instances flow through an HTTP(S) load balancer before being delivered by Cloud CDN. In the second figure, responses come directly from Cloud Storage.
When a user requests content from your site, that request passes through network locations at the edges of Google's network, usually far closer to the user than your backends. Cloud CDN uses caches at these network locations to store responses generated by your backends.
Cache hits, misses, fill, and egress
The first time a piece of content is requested, the cache sees that it can't fulfill the request. This is called a cache miss. The cache might attempt to get the content from a nearby cache. If the nearby cache has the content, it sends it to the first cache via cache-to-cache fill. Otherwise, the request is forwarded to the HTTP(S) load balancer. The load balancer in turn forwards the request to one of your Google Compute Engine instances or Google Cloud Storage buckets.
When the cache receives the content, it forwards the content to the user. If the content is cacheable, the cache also stores it, along with a cache key, for future requests. When users request the same content again, the cache responds directly, shortening the round-trip time and sparing your backend instances from processing the request. This is called a cache hit.
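The hit/miss flow above can be sketched as a simple lookup keyed by a cache key. This is an illustrative model only, not Cloud CDN's implementation; the `origin_fetch` callback stands in for the load balancer forwarding a miss to your backend.

```python
# Minimal sketch of the cache hit/miss flow (illustrative, not Cloud CDN internals).
class EdgeCache:
    def __init__(self, origin_fetch):
        self.store = {}              # cache key -> cached response body
        self.origin_fetch = origin_fetch

    def get(self, uri):
        key = uri                    # real cache keys also include protocol and host
        if key in self.store:        # cache hit: respond directly from the cache
            return self.store[key], "hit"
        body = self.origin_fetch(uri)  # cache miss: forward to the load balancer/origin
        self.store[key] = body         # store the cacheable response for future requests
        return body, "miss"

cache = EdgeCache(lambda uri: "content for " + uri)
print(cache.get("/logo.png"))   # first request is a miss, filled from the origin
print(cache.get("/logo.png"))   # repeat request is a hit, served from cache
```

A real cache would also honor response headers and expire entries; this sketch only shows why repeat requests never reach the backend.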
Data transfer from a cache to a client is called cache egress. Data transfer into a cache is called cache fill. As illustrated in figure 3, cache fill can originate from one of your backends or from another Cloud CDN cache.
Cloud CDN uses caches in numerous locations around the world. Caching is reactive in that an object is stored in a particular cache if a request goes through that cache and if the response is cacheable. An object stored in one cache does not automatically replicate into other caches; cache-to-cache fill happens only in response to a client-initiated request. You cannot pre-load caches except by causing the individual caches to respond to requests.
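Because caching is reactive, the only way to "pre-load" content is to request it through the caches. A hypothetical warm-up script (the URL list and the `warm` helper are assumptions, not a Cloud CDN feature) might look like this:

```python
# Sketch: fill whichever edge cache serves this client by simply requesting content.
# There is no pre-load API; a cache stores an object only after serving a request for it.
import urllib.request

def warm(urls, fetch=None):
    """Request each URL; cacheable responses are then stored at the serving edge.

    `fetch` is injectable for testing; by default it performs a real HTTP GET.
    Returns a mapping of URL -> number of bytes transferred.
    """
    if fetch is None:
        fetch = lambda url: urllib.request.urlopen(url).read()
    return {url: len(fetch(url)) for url in urls}
```

Note that this warms only the caches on the path from the machine running the script, not every Cloud CDN location worldwide.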
On cache hits, you pay for cache egress bandwidth. On cache misses, including misses that result in cache-to-cache fill, you additionally pay for cache fill bandwidth. All else being equal, cache hits therefore cost less than cache misses. For detailed pricing information, refer to the pricing page.
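The cost difference can be made concrete with a small calculation. The rates below are invented for illustration only; see the pricing page for actual numbers.

```python
# Hypothetical per-GB rates, NOT actual Cloud CDN pricing.
EGRESS_PER_GB = 0.08   # assumed cache egress rate
FILL_PER_GB = 0.01     # assumed cache fill rate

def request_cost(gigabytes, hit):
    """Cost of serving a response: hits pay egress only; misses also pay cache fill."""
    cost = gigabytes * EGRESS_PER_GB
    if not hit:
        cost += gigabytes * FILL_PER_GB
    return cost
```

With any positive fill rate, a hit is always cheaper than a miss of the same size, which is why a high cache hit ratio lowers your bill.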
Content served from cache
After you enable Cloud CDN, caching happens automatically for all cacheable content. The web server software running on your instance uses HTTP headers to indicate which responses should be cached and which should not. Caching Details has additional information about what Cloud CDN caches and for how long.
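An origin marks a response cacheable with HTTP headers such as `Cache-Control`. As a sketch (using Python's standard `http.server` purely for illustration; your real origin would be whatever web server runs on your instances), a handler might set the headers like this:

```python
# Sketch of an origin response marked cacheable via Cache-Control.
from http.server import BaseHTTPRequestHandler

class OriginHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello from origin"
        self.send_response(200)
        # "public, max-age=3600" tells Cloud CDN (and any other cache)
        # that this response may be cached and reused for up to an hour.
        self.send_header("Cache-Control", "public, max-age=3600")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

Responses without cacheable headers (or marked `private` or `no-store`) are passed through to clients without being stored.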
Because of the nature of caches, it is impossible to predict whether a particular request will be served out of cache. You can, however, expect that popular requests for cacheable content will be served from cache a high percentage of the time, yielding significantly reduced latencies, reduced cost, and reduced load on any instances involved.
Data location settings of other Cloud Platform Services
Using Cloud CDN means that data may be stored at serving locations outside the region or zone of your origin server; this is how HTTP caching on the internet normally works. Under the Service Specific Terms of the Google Cloud Platform Terms of Service, the Data Location Setting that is available for certain Cloud Platform Services will not apply to Core Customer Data for the respective Cloud Platform Service when used with other Google products and services (in this case, the Cloud CDN service). If you do not want this outcome, do not use the Cloud CDN service.
What's next
- Learn how to enable Cloud CDN for your HTTP(S) load-balanced instances and storage buckets.