Setting up Cloud CDN with third-party object storage

You can use an external backend when the content is hosted either on-premises or in another cloud. The external backend lets you serve the content from Google's Cloud CDN.

This document walks through the process of setting up third-party object storage—such as Amazon Simple Storage Service (Amazon S3) or Azure Blob storage—as an external backend for Cloud CDN. External backends and Cloud CDN work in conjunction with an external HTTP(S) load balancer.

Architecture

To create the external backend, you create an internet network endpoint group (NEG) that points to the third-party storage service as the backend for the load balancer. Internet NEGs are used for external backends.

To set up the third-party storage bucket as a backend, you must do the following:

  1. Prepare the third-party storage bucket to serve content.
  2. Create an internet NEG that uses the bucket's FQDN.
  3. Configure the external HTTP(S) load balancer with the internet NEG as the backend.
  4. Test the setup.

Preparing the bucket to serve content

Before you begin the setup in Google Cloud, make sure that the bucket is configured correctly. These instructions assume that you are using Amazon S3 and that you have the required permissions to change the Amazon S3 bucket and its objects.

  1. Make sure that the Amazon S3 bucket and the objects in the bucket are public. For instructions, see the AWS knowledge base. For example: "How can I grant public read access to some objects in my Amazon S3 bucket?"

  2. Make sure that the content meets the cacheability requirements, listed in Cacheable content. If you need to add object metadata, see the AWS knowledge base. For example: Editing object metadata.

  3. Note the Amazon S3 bucket's endpoint (the FQDN). You need this information when you set up the internet NEG. To get the endpoint information, follow the instructions provided in the AWS knowledge base. For example: Accessing a bucket. You can also get the Amazon S3 endpoint URL from the object's overview page.
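
As a quick check on step 2, you can inspect the response headers that the bucket returns for an object. The following is a minimal sketch, not part of the official setup: cache_control_of is a hypothetical helper name, and the object URL in the usage comment is only a placeholder. It prints the Cache-Control value so that you can compare it against the requirements listed in Cacheable content.

```shell
# Hypothetical helper: read raw HTTP response headers on stdin and print the
# value of the Cache-Control header, if one is present.
cache_control_of() {
  tr -d '\r' | awk '
    {
      i = index($0, ":")
      if (i == 0) next                      # status line or blank line
      name = tolower(substr($0, 1, i - 1))  # header names are case-insensitive
      if (name == "cache-control") {
        value = substr($0, i + 1)
        sub(/^[ \t]+/, "", value)           # trim leading whitespace
        print value
      }
    }'
}

# Example usage (assumes curl is installed; the URL is a placeholder):
#   curl -sI "http://backend.example.com/cart/id/1223515/image.jpg" | cache_control_of
```

If the function prints nothing, the object has no Cache-Control header, and you might need to add object metadata as described in step 2.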

Creating an internet NEG that uses the bucket's hostname

For simplicity, this example uses the FQDN backend.example.com. Make sure to replace this with your third-party storage bucket's FQDN, which might look something like unique-name-bucket.s3-us-west-1.amazonaws.com. The FQDN is the hostname only, without the http:// or https:// scheme and without a trailing slash.
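
Storage consoles often show a full endpoint URL, while the internet NEG endpoint takes only the hostname. As a minimal sketch (endpoint_to_fqdn is a hypothetical helper name, not part of any CLI), the following shell function reduces such a URL to the hostname:

```shell
# Hypothetical helper: strip the scheme, any path, and any port from an
# endpoint URL so that only the hostname (FQDN) remains.
endpoint_to_fqdn() {
  host="${1#*://}"    # drop "http://" or "https://" if present
  host="${host%%/*}"  # drop the first "/" and everything after it
  host="${host%%:*}"  # drop ":port" if present
  printf '%s\n' "$host"
}

# Example usage:
#   endpoint_to_fqdn 'http://unique-name-bucket.s3-us-west-1.amazonaws.com/'
```

The sketch uses only POSIX parameter expansion, so it makes no network calls and does not check that the hostname resolves.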

This guide uses an example to teach the fundamentals of using an external backend (sometimes called a custom origin) in an external HTTP(S) load balancer. An external backend is an endpoint that is external to Google Cloud. When you use an external backend with an external HTTP(S) load balancer, you can improve performance by using Cloud CDN caching.

The guide steps through how to configure a global external HTTP(S) load balancer with a Cloud CDN-enabled backend service that proxies to an external backend server at backend.example.com.

In the example, the load balancer accepts HTTPS requests from clients, and proxies these requests as HTTP/2 to the external backend. This example assumes that the external backend supports HTTP/2.

Other options would be to configure a load balancer to accept HTTP or HTTP/2 requests, and use HTTPS when proxying requests to the external backend.

This guide assumes that you have already set up a load balancer and you are adding a new external backend.

A sample architecture looks like this:

S3 bucket use case for external backends

In the diagram, www.example.com has a load balancer frontend with the IP address 120.1.1.1. When there is a cache miss, user requests for /cart/id/1223515 are fetched from the external backend by way of HTTP/2. All other incoming traffic is directed to either the Google Cloud backend service with Compute Engine VMs or to the backend bucket, based on the URL map.

Before you begin

Before following this guide, familiarize yourself with the following:

Permissions

To follow this guide, you need to create an internet NEG and create or modify an external HTTP(S) load balancer in a project. You should be either a project owner or editor, or you should have both of the following Compute Engine IAM roles.

Task                                          Required role
Create and modify load balancer components    Network Admin
Create and modify NEGs                        Compute Instance Admin

Configuring a load balancer with an external backend

This guide shows you how to configure and test an internet NEG.

Setup overview

Setting up an internet NEG involves doing the following:

  • Defining the internet endpoint in an internet NEG.
  • Adding an internet NEG as the backend to a backend service.
  • Defining which user traffic to map to this backend service by configuring your external HTTP(S) load balancer's URL map.
  • Allowlisting the necessary IP ranges.

This example creates the following resources:

  • A forwarding rule with the 120.1.1.1 IP address directs incoming requests to a target proxy.
    • The networkTier of the forwarding rule must be PREMIUM.
  • The target proxy checks each request against the URL map to determine the appropriate backend service for the request.
    • For external backends, the target proxy must be TargetHttpProxy or TargetHttpsProxy. This example uses TargetHttpsProxy.
  • Cloud CDN, when enabled on the backend service (optional), caches responses and serves them from Cloud CDN caches.
  • The backend service configuration directs traffic to one internet NEG. This internet NEG contains the network endpoint where the external HTTP(S) load balancer sends traffic when there is a Cloud CDN cache miss.
  • This example includes a user-defined request header, which is required when the external backend expects a particular value for the HTTP request's Host header.

The setup looks like this:

Cloud CDN with Amazon S3 bucket backend

Creating the NEG and internet endpoint

Console

  1. In the Google Cloud Console, go to the Network endpoint groups page.

    Go to the Network endpoint groups page

  2. Click Create network endpoint group.
  3. Enter the name of the network endpoint group: example-fqdn-neg.
  4. For Network endpoint group type, select Network endpoint group (Internet).
  5. For Default port, enter 443.
  6. For New network endpoint, select Fully qualified domain name and port.
  7. For the FQDN, enter backend.example.com.
  8. For Port type, select Default, and verify that Port number is 443.
  9. Click Create.

gcloud

  1. Create an internet NEG, and set the --network-endpoint-type to internet-fqdn-port (the hostname and port where your external backend can be reached):

    gcloud compute network-endpoint-groups create example-fqdn-neg \
        --network-endpoint-type="internet-fqdn-port" --global
    
  2. Add your endpoint to the NEG. If a port isn't specified, the port selection defaults to port 80 (HTTP) or 443 (HTTPS; HTTP/2) depending on the protocol configured in the backend service. Make sure to include the --global flag:

    gcloud compute network-endpoint-groups update example-fqdn-neg \
        --add-endpoint="fqdn=backend.example.com,port=443" \
        --global
    
  3. List the created internet NEG:

    gcloud compute network-endpoint-groups list --global
    

    Output:

    NAME                LOCATION   ENDPOINT_TYPE        SIZE
    example-fqdn-neg    global     INTERNET_FQDN_PORT   1
    

  4. List the endpoint within that NEG:

    gcloud compute network-endpoint-groups list-network-endpoints example-fqdn-neg \
        --global
    

    Output:

    INSTANCE   IP_ADDRESS   PORT   FQDN
                                   backend.example.com
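
The first two gcloud steps above can be collected into one parameterized function. This is a sketch under stated assumptions: run and create_internet_neg are hypothetical helper names, and DRY_RUN is an illustrative variable that defaults to printing the commands rather than running them. The gcloud commands and flags themselves are the ones shown in the steps above.

```shell
# Hypothetical wrapper: print the command when DRY_RUN is 1 (the default in
# this sketch); otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: create a global internet NEG and add one FQDN endpoint.
# Arguments: NEG name, backend FQDN, optional port (defaults to 443).
create_internet_neg() {
  neg_name="$1"
  fqdn="$2"
  port="${3:-443}"
  run gcloud compute network-endpoint-groups create "$neg_name" \
      --network-endpoint-type=internet-fqdn-port --global
  run gcloud compute network-endpoint-groups update "$neg_name" \
      --add-endpoint="fqdn=${fqdn},port=${port}" --global
}

# Example usage (prints the two commands; set DRY_RUN=0 to execute them):
#   create_internet_neg example-fqdn-neg backend.example.com
```

Printing the commands first makes it easier to review the NEG name and FQDN before any resource is created.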
    

Adding an external backend to a load balancer

The following example updates an existing load balancer.

In the existing load balancer, the default service is a Google Cloud service. The example modifies the existing URL map by adding a path matcher that sends all requests for /cart/id/1223515 to the images backend service, which is associated with the internet NEG.

Console

Create the backend service and add the internet NEG

  1. In the Google Cloud Console, go to the Load balancing page.

    Go to the Load balancing page

  2. To add the backend service to an existing load balancer, select your external HTTP(S) load balancer, click Menu and then select Edit.
  3. Click Backend configuration.
  4. In the Create or select backend services & backend buckets pull-down menu, select Backend services > Create a backend service.
  5. Set the name of the backend service to images.
  6. For Backend type, select Internet network endpoint group.
  7. Select the protocol that you intend to use from the load balancer to the internet NEG. For this example, select HTTP/2.
  8. Under New backend > Internet network endpoint group, select example-fqdn-neg, and then click Done.
  9. Select Enable Cloud CDN.
  10. Retain the default cache mode and TTL settings.
  11. In Advanced configurations, under Custom request headers, click Add header.
    1. For Header name, enter Host.
    2. For Header value, enter backend.example.com.
  12. Click Create.
  13. Keep the window open to continue.

Attach the backend service to an existing URL map

  1. Click Host and path rules.
  2. The first row or rows have Google Cloud services in the right column, and one of them is already populated with the default rule Any unmatched (default) for Hosts and Paths.
  3. Ensure that there is a row with images selected in the right column. If it doesn't exist, click Add host and path rule, and select images. Populate the other fields as follows:
    1. In Hosts, enter *.
    2. In Paths, enter /cart/id/1223515.

Review and finalize

  1. Click Review and finalize.
  2. Compare your settings to what you intended to create.
  3. If everything looks correct, click Update to apply the changes to your external HTTP(S) load balancer.

gcloud

  1. Create a new backend service for the NEG:

    gcloud compute backend-services create images \
       --global \
       --enable-cdn \
       --protocol=HTTP2
    
  2. Configure the backend service to add the custom request header Host: backend.example.com to the request:

    gcloud compute backend-services update images \
       --custom-request-header "Host: backend.example.com" --global
    
  3. Use the backend-services add-backend command to add the internet NEG to the backend service:

    gcloud compute backend-services add-backend images \
      --network-endpoint-group "example-fqdn-neg" \
      --global-network-endpoint-group \
      --global
    
  4. Attach the new backend service to the load balancer's URL map by creating a new matching rule to direct requests to that backend:

    gcloud compute url-maps add-path-matcher EXAMPLE_URL_MAP \
      --default-service=GCP_SERVICE_EXAMPLE \
      --path-matcher-name=CUSTOM_ORIGIN_PATH_MATCHER_EXAMPLE \
      --backend-service-path-rules=/CART/ID/1223515=IMAGES
    

    Replace the following:

    • EXAMPLE_URL_MAP: the name of your existing URL map
    • GCP_SERVICE_EXAMPLE: the name of an existing default backend service
    • CUSTOM_ORIGIN_PATH_MATCHER_EXAMPLE: a name for the new path matcher
    • /CART/ID/1223515: the path to match (in this example, /cart/id/1223515)
    • IMAGES: the name of the new backend service with the attached internet NEG (in this example, images)
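
In URL map path rules, a path without a trailing /* is generally treated as an exact match, so requests for objects beneath /cart/id/1223515 (such as /cart/id/1223515/image.jpg) would likely need a /* rule as well. The following is a minimal sketch that builds a rule value covering both cases. The name path_rules_for is a hypothetical helper, and whether you want the wildcard rule depends on your URL layout.

```shell
# Hypothetical helper: build a value for --backend-service-path-rules that
# sends both the exact path and everything beneath it to one backend service.
# Pass the path without a trailing slash.
path_rules_for() {
  path="$1"
  service="$2"
  case "$path" in
    /*) ;;  # paths in a URL map begin with "/"
    *)
      echo "path must start with /: $path" >&2
      return 1
      ;;
  esac
  printf '%s=%s,%s/*=%s\n' "$path" "$service" "$path" "$service"
}

# Example usage with the placeholders from the command above:
#   gcloud compute url-maps add-path-matcher EXAMPLE_URL_MAP \
#     --default-service=GCP_SERVICE_EXAMPLE \
#     --path-matcher-name=CUSTOM_ORIGIN_PATH_MATCHER_EXAMPLE \
#     --backend-service-path-rules="$(path_rules_for /cart/id/1223515 images)"
```

The function rejects a path without a leading slash, which catches the common mistake of writing cart/id/1223515.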

Allowlisting the necessary IP ranges

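Before an external HTTP(S) load balancer can send requests to your internet NEG, the external backend must allow traffic from the IP ranges that Google uses for these requests. To find the ranges, query the _cloud-eoips.googleusercontent.com DNS TXT record by using a tool like dig or nslookup.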

For example, run the following dig command:

dig TXT _cloud-eoips.googleusercontent.com | grep -Eo 'ip4:[^ ]+' | cut -d':' -f2

The output lists the IP ranges. For example, it might contain two ranges, as follows:

34.96.0.0/20
34.127.192.0/18

Note the IP ranges and ensure that these ranges are allowed by your firewall or cloud access control list (ACL).
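
To confirm that a source address seen in your firewall or bucket access logs falls inside one of these ranges, a small CIDR check can help. The following is a minimal sketch in shell arithmetic. The names ip_to_int and ip_in_cidr are hypothetical helpers, the code handles IPv4 only, and it assumes a shell with 64-bit arithmetic and octets written without leading zeros.

```shell
# Hypothetical helper: convert a dotted IPv4 address to an integer.
ip_to_int() {
  old_ifs="$IFS"
  IFS=.
  set -- $1          # split the address into four octets
  IFS="$old_ifs"
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed (exit status 0) if the address in $1 is inside
# the CIDR range in $2, for example 34.96.0.0/20.
ip_in_cidr() {
  ip_int="$(ip_to_int "$1")"
  net_int="$(ip_to_int "${2%/*}")"
  prefix="${2#*/}"
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  [ $(( ip_int & mask )) -eq $(( net_int & mask )) ]
}

# Example usage:
#   ip_in_cidr 34.96.1.5 34.96.0.0/20 && echo "inside the range"
```

Because the published ranges can change, rerun the dig query before relying on any hard-coded values.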

For more information, see Authenticating requests.

Testing the external HTTP(S) load balancer

Now that you have configured your load balancer, you can start sending traffic to the load balancer's IP address. If you configured a domain, you can send traffic to the domain name as well. However, DNS propagation can take time to complete so you can start by using the IP address for testing.

  1. Go to the Load balancing page in the Google Cloud Console.
    Go to the Load balancing page
  2. Click the load balancer that you just configured.
  3. Note the IP Address of the load balancer.
  4. Test the load balancer in a web browser. If you created an HTTP load balancer, go to http://IP_ADDRESS. If you created an HTTPS load balancer, go to https://IP_ADDRESS. Replace IP_ADDRESS with the load balancer's IP address. You should see the content that the matching backend serves for the requested path.

    If that does not work and you are using a Google-managed certificate, confirm that your certificate resource's status is ACTIVE. For more information, see Google-managed SSL certificate resource status.

    Alternatively, you can use curl from your local machine's command line. Replace IP_ADDRESS with the load balancer's IPv4 address.

    If you are using a Google-managed certificate, test the domain that points to the load balancer's IP address (www.example.com in this guide's example). For example:

    curl -s 'https://www.example.com:443' --resolve www.example.com:443:IP_ADDRESS
    

  5. (Optional) If you are using a custom domain, you might need to wait for the updated DNS settings to propagate. Then test your domain (for example, www.example.com) in the web browser.

    For help with troubleshooting, see Troubleshooting external backend and internet NEG issues.

Testing Cloud CDN

Test 1: Hitting the bucket endpoint directly

This test uses the time and wget commands from a VM. The example downloads /cart/id/1223515/image.jpg from the bucket backend.example.com.

From the output, you can see that the overall request takes 780 ms. This is the time to retrieve a 3.3 MB image from Amazon S3 directly.

time wget backend.example.com/cart/id/1223515/image.jpg
--2020-06-26 18:22:46--  backend.example.com/cart/id/1223515/image.jpg
Resolving backend.example.com (backend.example.com)... 52.219.120.233
Connecting to backend.example.com (backend.example.com)|52.219.120.233|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3447106 (3.3M) [image/jpeg]
Saving to: '/cart/id/1223515/image.jpg.47'
/cart/id/1223515/image.jpg.47                                                 100%[==============================================================================================================================================>]   3.29M  6.25MB/s    in 0.5s
2020-06-26 18:22:47 (6.25 MB/s) - '/cart/id/1223515/image.jpg.47' saved [3447106/3447106]
real    0m0.780s
user    0m0.003s
sys     0m0.012s

Test 2: First request through Cloud CDN

This test uses the load balancer's IP address to retrieve the /cart/id/1223515/image.jpg file. Because this is the first request, it should be a miss and Cloud CDN should fetch the image from the origin, which is Amazon S3. From the output, you can see that the request took 844 ms.

time wget http://LOAD_BALANCER_IP_ADDRESS/cart/id/1223515/image.jpg
--2020-06-26 18:19:27--  http://LOAD_BALANCER_IP_ADDRESS/cart/id/1223515/image.jpg
Connecting to LOAD_BALANCER_IP_ADDRESS:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3447106 (3.3M) [image/jpeg]
Saving to: '/cart/id/1223515/image.jpg.44'
/cart/id/1223515/image.jpg.44                                                 100%[==============================================================================================================================================>]   3.29M  8.23MB/s    in 0.4s
2020-06-26 18:19:28 (8.23 MB/s) - '/cart/id/1223515/image.jpg.44' saved [3447106/3447106]
real    0m0.844s
user    0m0.003s
sys     0m0.012s

Test 3: Second request through Cloud CDN

This test requests the same file again through the same load balancer IP address, LOAD_BALANCER_IP_ADDRESS. Because the first request populated the cache, the response should now be served from the Cloud CDN cache and arrive faster than in the first two tests.

From the output, you can see that the request took just 18 ms.

time wget http://LOAD_BALANCER_IP_ADDRESS/cart/id/1223515/image.jpg
--2020-06-26 18:19:29--  http://LOAD_BALANCER_IP_ADDRESS/cart/id/1223515/image.jpg
Connecting to LOAD_BALANCER_IP_ADDRESS:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3447106 (3.3M) [image/jpeg]
Saving to: '/cart/id/1223515/image.jpg.45'
/cart/id/1223515/image.jpg.45                                                 100%[==============================================================================================================================================>]   3.29M  --.-KB/s    in 0.008s
2020-06-26 18:19:29 (423 MB/s) - '/cart/id/1223515/image.jpg.45' saved [3447106/3447106]
real    0m0.018s
user    0m0.001s
sys     0m0.010s
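
To compare the three tests, the real values reported by time can be converted to milliseconds and divided. The following is a minimal sketch; real_to_ms and speedup are hypothetical helper names, and the sketch assumes the 0m0.780s format shown in the output above.

```shell
# Hypothetical helper: convert a "real" value printed by time, such as
# 0m0.780s, to whole milliseconds.
real_to_ms() {
  printf '%s\n' "$1" | awk -F'[ms]' '{ printf "%.0f\n", ($1 * 60 + $2) * 1000 }'
}

# Hypothetical helper: print how many times faster the second timing is than
# the first, rounded to one decimal place. Both arguments are in milliseconds.
speedup() {
  awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.1f\n", slow / fast }'
}

# Example usage with the timings from tests 1 and 3:
#   speedup "$(real_to_ms 0m0.780s)" "$(real_to_ms 0m0.018s)"   # prints 43.3
```

With the numbers from this guide, the cached response in test 3 is roughly 43 times faster than fetching the object directly from the bucket in test 1. Single measurements like these vary from run to run, so treat the ratio as an illustration.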

Verifying by using logs

Logs for Cloud CDN are associated with the external HTTP(S) load balancer that your Cloud CDN-enabled backends are attached to. Using the logs, you can check whether a request is a hit or a miss. You can read more about Cloud CDN logs in Viewing logs.
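
One way to look for cache hits is to filter the load balancer's log entries. The following is a minimal sketch that only builds a filter string; cdn_hit_filter is a hypothetical helper name, and the resource type and field names (http_load_balancer, backend_service_name, httpRequest.cacheHit) are assumptions that you should verify against Viewing logs.

```shell
# Hypothetical helper: build a Cloud Logging filter for requests that one
# backend service answered from the cache. The field names are assumptions to
# verify against the Cloud CDN logging documentation.
cdn_hit_filter() {
  printf 'resource.type="http_load_balancer" AND resource.labels.backend_service_name="%s" AND httpRequest.cacheHit=true\n' "$1"
}

# Example usage (assumes gcloud is authenticated for the right project):
#   gcloud logging read "$(cdn_hit_filter images)" --limit=10
```

If the filter returns entries for the second and later requests but not for the first, that is consistent with the timings in tests 2 and 3.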

Limitations

  • The third-party bucket and the objects must be public. External backends don't support signed URLs or signed cookies.

What's next